Graphify: Give Your AI Coding Assistant a Map of Your Entire Codebase

AI coding assistants are powerful, but they have a fundamental problem: they work by reading files one at a time. Ask Claude Code or Cursor to understand your authentication flow, and it will grep through files, read them sequentially, and try to piece together how things connect. This works, but it is slow and expensive, and the assistant often misses connections that span multiple files.

Graphify solves this by doing the opposite. Instead of making your AI read files on demand, it pre-builds a queryable knowledge graph of your entire codebase — code, SQL schemas, docs, PDFs, images, even videos — and gives your assistant a map it can navigate instantly.

One command, typed inside your AI assistant:

/graphify .

That’s it. Your codebase becomes a graph.


What is Graphify?

Graphify is an open-source AI coding assistant skill with 61k+ GitHub stars and a YC S26 badge. It parses your project into a knowledge graph — a network of nodes (functions, classes, modules, concepts, design decisions) and edges (calls, imports, inheritance, dependencies) — then gives your AI assistant tools to query that graph instead of reading raw files.

The output is three files in graphify-out/:

graphify-out/
├── graph.html       # interactive visual browser — click nodes, filter, search
├── GRAPH_REPORT.md  # highlights: key concepts, surprising connections, suggested questions
└── graph.json       # the full graph — query it anytime without re-reading files

What sets it apart from other code intelligence tools is scope. Graphify handles not just code but the full context around it: SQL schema files, infrastructure defined in Terraform HCL, Markdown docs, research PDFs, architecture diagrams, and even video walkthroughs, all in the same graph. App code, database schema, and infrastructure live in one place.


How it Helps AI Agents

The problem with file-by-file reading

When you ask an AI agent “how does our authentication flow work?”, it typically:

  1. Searches for relevant files
  2. Reads them one by one
  3. Tries to build a mental model of connections
  4. Often misses things that are several hops away

This is slow, burns tokens, and breaks down on large codebases where the answer is spread across dozens of files.

What Graphify gives the agent instead

After running /graphify ., your assistant has access to:

  • Instant relationship queries — “what calls validateToken?” returns a pre-computed list from the graph, not a file search
  • Call chains — “trace the path from the API endpoint to the database” follows edges through the graph without reading intermediate files
  • Cross-cutting connections — the graph reveals links between things in different modules that a grep would never surface
  • The “why” layer — inline comments (# NOTE:, # WHY:, # HACK:), docstrings, and design rationale from docs are extracted as separate nodes linked to the code they explain
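To make the first bullet concrete, here is a minimal sketch of why a relationship query can be answered without a file search. The edge tuples and the names `build_reverse_index` and `who_calls` are hypothetical, chosen for illustration; they are not Graphify's actual schema or API.

```python
from collections import defaultdict

# Hypothetical edge list: (source, relation, target).
# This is an illustrative shape, not Graphify's real graph.json schema.
EDGES = [
    ("login_handler", "calls", "validateToken"),
    ("refresh_handler", "calls", "validateToken"),
    ("validateToken", "calls", "load_public_key"),
    ("auth_middleware", "imports", "validateToken"),
]


def build_reverse_index(edges):
    """Map each (target, relation) pair to the sorted sources pointing at it."""
    index = defaultdict(set)
    for source, relation, target in edges:
        index[(target, relation)].add(source)
    return {key: sorted(sources) for key, sources in index.items()}


# Built once, when the graph is loaded.
CALLERS = build_reverse_index(EDGES)


def who_calls(name):
    """Answer 'what calls <name>?' with a dictionary lookup."""
    return CALLERS.get((name, "calls"), [])


print(who_calls("validateToken"))  # ['login_handler', 'refresh_handler']
```

The index is built once, so each later question costs a lookup rather than another pass over the source files.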

Instead of reading files, the agent runs queries like:

/graphify query "what connects auth to the database?"
/graphify path "UserService" "DatabasePool"
/graphify explain "RateLimiter"

Each query returns a targeted subgraph — just the relevant nodes and edges — rather than dumping entire files into context.
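Conceptually, a path query like the one above is a graph search. The sketch below uses a plain breadth-first search over a hypothetical set of directed edges. Graphify's real traversal may work differently, and every node name except UserService and DatabasePool is invented for the example.

```python
from collections import deque

# Hypothetical directed edges (source -> target), for illustration only.
EDGES = [
    ("UserService", "AuthRepository"),
    ("UserService", "Logger"),
    ("AuthRepository", "QueryBuilder"),
    ("QueryBuilder", "DatabasePool"),
    ("Logger", "Config"),
]


def shortest_path(edges, start, goal):
    """Return the shortest chain of nodes from start to goal, or None."""
    adjacency = {}
    for source, target in edges:
        adjacency.setdefault(source, []).append(target)

    queue = deque([[start]])
    visited = {start}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == goal:
            return path
        for neighbor in adjacency.get(node, []):
            if neighbor not in visited:
                visited.add(neighbor)
                queue.append(path + [neighbor])
    return None


print(shortest_path(EDGES, "UserService", "DatabasePool"))
# ['UserService', 'AuthRepository', 'QueryBuilder', 'DatabasePool']
```

Only the four nodes on the path reach the assistant's context. The Logger and Config branch is explored by the search but never returned.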


How it Saves Tokens

Token savings are the most concrete benefit. Here’s why:

Without Graphify, an agent answering “what does the payment module depend on?” might:

  • Read payment.py (2,000 tokens)
  • Read models.py (3,500 tokens)
  • Read database.py (1,800 tokens)
  • Read config.py (900 tokens)
  • Read utils.py (1,200 tokens)

Total: ~9,400 tokens just to answer one question, most of which is irrelevant content around the answer.

With Graphify, the same question returns a subgraph of dependency edges — maybe 200–400 tokens of structured node/relationship data containing exactly the answer, nothing else.
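The arithmetic behind that comparison is easy to check. The snippet below reuses the illustrative token counts from this example; they are round numbers for the sake of argument, not measurements, and the `savings` helper is a hypothetical name.

```python
# Illustrative per-file token counts from the example above.
FILE_READS = {
    "payment.py": 2000,
    "models.py": 3500,
    "database.py": 1800,
    "config.py": 900,
    "utils.py": 1200,
}


def savings(file_tokens, subgraph_tokens):
    """Return (times smaller, fraction of tokens saved) for one query."""
    total = sum(file_tokens.values())
    return total / subgraph_tokens, 1 - subgraph_tokens / total


total = sum(FILE_READS.values())
print(f"file reads: {total} tokens")  # file reads: 9400 tokens

for subgraph in (200, 400):
    ratio, saved = savings(FILE_READS, subgraph)
    print(f"{subgraph}-token subgraph: {ratio:.1f}x smaller, {saved:.1%} saved")
# 200-token subgraph: 47.0x smaller, 97.9% saved
# 400-token subgraph: 23.5x smaller, 95.7% saved
```

Under these assumptions a single question costs roughly 24 to 47 times fewer tokens, and that gap repeats for every question asked in a session.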

Graphify also installs pre-tool hooks in supported assistants (Claude Code, Gemini CLI). These hooks fire automatically before the assistant reaches for a file-reading tool and nudge it toward querying the graph first. On Claude Code, the hook intercepts Read and Glob tool calls and reminds the assistant to check the graph before opening files. The result is fewer file reads per task, compounding savings across a long coding session.

The GRAPH_REPORT.md file further reduces cold-start costs — when the assistant needs a broad architecture overview, it reads the 1–2 page report instead of scanning the whole codebase.


Installation

The PyPI package name is graphifyy (double y). The CLI command is graphify.

# Recommended — uv manages PATH automatically
uv tool install graphifyy

# Alternatives
pipx install graphifyy
pip install graphifyy

Then register the skill with your AI assistant:

graphify install

This writes a skill file to your AI assistant’s config directory. From that point on, typing /graphify . inside your assistant triggers the graph build.

Optional extras

Graphify’s core (code parsing) works fully offline with no API key. Everything else is opt-in:

# PDF support
uv tool install "graphifyy[pdf]"

# Word/Excel documents
uv tool install "graphifyy[office]"

# Terraform/HCL files
uv tool install "graphifyy[terraform]"

# Video/audio transcription (local, via faster-whisper)
uv tool install "graphifyy[video]"

# MCP server mode
uv tool install "graphifyy[mcp]"

# Local LLM via Ollama
uv tool install "graphifyy[ollama]"

# Everything
uv tool install "graphifyy[all]"

Using Graphify with Different Tools

Graphify supports 20+ AI coding assistants. After installing the base package, run the platform-specific install command once:

Claude Code

graphify install
# or explicitly:
graphify claude install

Claude Code gets the deepest integration: a PreToolUse hook intercepts Read and Glob tool calls and nudges the agent to query the graph first. Then, in Claude Code:

/graphify .

Cursor

graphify cursor install

This writes .cursor/rules/graphify.mdc with alwaysApply: true, so Cursor includes the graph instructions in every conversation automatically.

/graphify .

Codex (OpenAI)

graphify install --platform codex

Also add multi_agent = true under [features] in ~/.codex/config.toml. In Codex, use $graphify instead of /graphify:

$graphify .

Gemini CLI

graphify gemini install

Gemini CLI gets a BeforeTool hook, the same pre-tool interception as Claude Code.

/graphify .

VS Code Copilot Chat

graphify vscode install

/graphify .

Aider

graphify aider install

OpenCode

graphify opencode install

GitHub Copilot CLI

graphify copilot install

Core Commands

These commands build, update, and query the graph. Commands prefixed with / run inside your assistant; bare graphify commands run in a terminal:

# Build the graph
/graphify .

# Re-extract only changed files (fast incremental update)
/graphify . --update

# Query the graph
/graphify query "what connects auth to the database?"
/graphify path "UserService" "DatabasePool"
/graphify explain "RateLimiter"

# Generate an architecture page with Mermaid call-flow diagrams
graphify export callflow-html

# Auto-rebuild on every git commit (no API cost for code-only changes)
graphify hook install

# Add external content to the graph
/graphify add https://arxiv.org/abs/1706.03762    # a paper
/graphify add <youtube-url>                        # a video walkthrough
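An incremental update only pays off if the tool can cheaply tell which files changed. One common way to do that is to fingerprint file contents and compare against the previous run. The sketch below shows that general technique; it is an assumption about how such a step could work, not a description of what `--update` does internally, and the `fingerprint` and `changed_files` helpers are hypothetical.

```python
import hashlib


def fingerprint(content):
    """Stable content hash; any edit to the bytes changes the digest."""
    return hashlib.sha256(content).hexdigest()


def changed_files(previous, current_contents):
    """Compare stored digests with current contents.

    previous: {path: digest} saved by the last build (hypothetical manifest).
    current_contents: {path: bytes} read from disk for this build.
    Returns (changed_or_new, removed, new_manifest).
    """
    current = {path: fingerprint(data) for path, data in current_contents.items()}
    changed = sorted(p for p, digest in current.items() if previous.get(p) != digest)
    removed = sorted(p for p in previous if p not in current)
    return changed, removed, current


# First build: every file counts as new.
v1 = {"auth.py": b"def login(): ...", "db.py": b"POOL = None"}
changed, removed, manifest = changed_files({}, v1)
print(changed, removed)  # ['auth.py', 'db.py'] []

# Second build: one file edited, one added, one deleted.
v2 = {"auth.py": b"def login(user): ...", "cache.py": b"TTL = 60"}
changed, removed, manifest = changed_files(manifest, v2)
print(changed, removed)  # ['auth.py', 'cache.py'] ['db.py']
```

Only the files in `changed` would need re-extraction, and nodes belonging to files in `removed` could be dropped from the graph.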

The Graph Report

Every build generates GRAPH_REPORT.md with four sections:

God nodes — the most-connected concepts in your project. These are the things everything flows through. Understanding them unlocks the whole codebase.

Surprising connections — links between things that live in different files or modules, ranked by how unexpected they are. These are the connections a code review would miss.

The “why” layer — inline comments (# NOTE:, # WHY:, # HACK:), docstrings, and design rationale from docs extracted as separate nodes linked to the code they explain. Your institutional knowledge, made queryable.

Suggested questions — 4–5 questions the graph is uniquely positioned to answer, tailored to what it found in your project.

Every relationship is tagged EXTRACTED, INFERRED, or AMBIGUOUS, so you always know what was found directly and what was guessed.
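Two of these ideas combine naturally: ranking nodes by how connected they are, and deciding how much to trust each edge. The sketch below counts node degree over a hypothetical edge list and optionally restricts the count to EXTRACTED edges. The data shape and the `most_connected` helper are assumptions for illustration, and Graphify's own ranking may use a different measure.

```python
from collections import Counter

# Hypothetical edges: (source, target, provenance tag).
EDGES = [
    ("api.routes", "AuthService", "EXTRACTED"),
    ("AuthService", "TokenStore", "EXTRACTED"),
    ("AuthService", "UserRepository", "EXTRACTED"),
    ("BillingJob", "AuthService", "INFERRED"),
    ("UserRepository", "DatabasePool", "EXTRACTED"),
    ("ReportExporter", "DatabasePool", "AMBIGUOUS"),
]

ALL_TAGS = ("EXTRACTED", "INFERRED", "AMBIGUOUS")


def most_connected(edges, allowed_tags=ALL_TAGS, top=3):
    """Rank nodes by degree, counting only edges with an allowed tag."""
    degree = Counter()
    for source, target, tag in edges:
        if tag in allowed_tags:
            degree[source] += 1
            degree[target] += 1
    # Sort by degree (descending), then by name so ties are deterministic.
    return sorted(degree.items(), key=lambda item: (-item[1], item[0]))[:top]


print(most_connected(EDGES))
# [('AuthService', 4), ('DatabasePool', 2), ('UserRepository', 2)]

print(most_connected(EDGES, allowed_tags=("EXTRACTED",), top=2))
# [('AuthService', 3), ('UserRepository', 2)]
```

Filtering to EXTRACTED edges gives a more conservative picture: AuthService stays on top, but DatabasePool drops out because one of its two links was only a guess.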


Team Workflow

Graphify’s output is designed to be committed to git, so the whole team starts with a map:

# One person builds and commits
/graphify .
git add graphify-out/
git commit -m "chore: update knowledge graph"

# Everyone pulls and their assistant immediately has context
git pull

Install the git hook to keep it current automatically:

graphify hook install

This sets up a post-commit hook that rebuilds the AST portion of the graph after each commit — zero API cost, since code is parsed locally via tree-sitter. It also installs a git merge driver so graph.json is never left with conflict markers when two developers commit at the same time.
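To see why a structure-aware merge avoids conflict markers, consider what a text merge does to JSON: two branches that each append a node touch the same lines and collide. Merging the parsed data instead sidesteps that. The sketch below is a naive union merge under an assumed `nodes`/`edges` layout. It is not Graphify's merge driver, the `merge_graphs` helper is hypothetical, and it ignores deletions entirely.

```python
import json


def merge_graphs(ours, theirs):
    """Union two graph snapshots; on a node-id clash, keep our version."""
    nodes = {node["id"]: node for node in theirs["nodes"]}
    nodes.update({node["id"]: node for node in ours["nodes"]})

    # Edges are identified by (source, relation, target), so duplicates collapse.
    edge_keys = {
        (edge["source"], edge["relation"], edge["target"])
        for graph in (ours, theirs)
        for edge in graph["edges"]
    }
    return {
        "nodes": [nodes[node_id] for node_id in sorted(nodes)],
        "edges": [
            {"source": s, "relation": r, "target": t}
            for s, r, t in sorted(edge_keys)
        ],
    }


ours = {
    "nodes": [{"id": "auth"}, {"id": "db"}],
    "edges": [{"source": "auth", "relation": "calls", "target": "db"}],
}
theirs = {
    "nodes": [{"id": "auth"}, {"id": "cache"}],
    "edges": [
        {"source": "auth", "relation": "calls", "target": "cache"},
        {"source": "auth", "relation": "calls", "target": "db"},
    ],
}

merged = merge_graphs(ours, theirs)
print(json.dumps(merged, indent=2))  # always valid JSON, never conflict markers
```

Because the output is sorted and rebuilt from parsed data, both developers end up with the same file regardless of who merges first.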

For large teams, you can run a shared MCP server so everyone points at one graph without running graphify locally:

python -m graphify.serve graphify-out/graph.json --transport http --host 0.0.0.0 --port 8080 --api-key "$SECRET"

Privacy

Code files are processed entirely locally via tree-sitter — nothing leaves your machine for code-only corpora, and no API key is required. Video and audio are also transcribed locally via faster-whisper.

Docs, PDFs, and images go through your AI assistant’s model API — whichever backend your IDE session uses. For data-residency requirements, use --backend ollama for fully local inference on everything.

There is no telemetry, no usage tracking, and no analytics.


This post is licensed under CC BY 4.0 by the author.