AI-Native Security Analysis

xgrep is a static application security testing (SAST) tool built as infrastructure for AI agents. It turns your codebase into a queryable call graph and exposes it as tools that agents call, so they can reason about vulnerabilities the way attackers do.

The bottleneck isn't detection. It's triage.

You scan. You get 500 findings. Your team triages for days. Next sprint, repeat.

500+

Findings without context

A scanner reports “SQL injection at line 42.” But is that function reachable from an HTTP handler? Is user input actually flowing through it? A flat list can't tell you.

60%+

False positive rate

Most SAST findings are false positives — dead code, sanitized paths, internal-only functions. Without structural context, every finding looks equally dangerous.

Days

Manual triage time

Engineers manually trace call chains, check for sanitizers, verify reachability. For every single finding. Every sprint. This doesn't scale.

AI should call SAST, not the other way around.

xgrep exposes three capabilities as composable tools:

Scan

Find pattern matches and taint flows

Graph

Extract and query function call chains

Inspect

Navigate symbols, definitions, and references

AI Agent (Claude, GPT, custom) → xgrep scan / xgrep graph / xgrep inspect → Your Codebase

What the code graph reveals

A graph turns “vulnerability at line 42” into a structural question with a structural answer. The examples below are real output from gin-gonic/gin, the most popular Go web framework.

Reachability

Is a function actually called from a user-facing entry point? The graph traces the full call path: here, from gin's HTTP request handler to its error writer. If no route leads from any handler to the sink, user input cannot reach it through the call graph, and the finding is not exploitable that way.

$ xgrep graph paths "handleHTTPRequest" "serveError"
gin.go::Engine.handleHTTPRequest -> gin.go::serveError
gin.go::Engine.handleHTTPRequest -> gin.go::serveError

Two paths found — one per call site in the request handler.
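A minimal Python sketch of the idea behind a path query, assuming the call graph is available as a list of (caller, callee, call site) edges. The edge list, the call-site numbers, and the `find_paths` helper are illustrative assumptions, not xgrep's internals. Keeping one edge per call site is what lets a single caller/callee pair produce two paths.

```python
from collections import defaultdict

# Hypothetical edges: (caller, callee, call-site id). The ids are placeholders.
EDGES = [
    ("Engine.handleHTTPRequest", "serveError", 1),
    ("Engine.handleHTTPRequest", "serveError", 2),
    ("serveError", "debugPrint", 3),
]


def find_paths(edges, source, target, max_depth=10):
    """Enumerate call paths from source to target, one per chain of call sites."""
    outgoing = defaultdict(list)
    for caller, callee, site in edges:
        outgoing[caller].append((callee, site))

    paths = []

    def walk(node, trail, seen):
        if node == target:
            paths.append(trail)
            return
        if len(trail) >= max_depth:
            return
        for callee, site in outgoing[node]:
            if callee in seen:  # skip cycles such as recursion
                continue
            walk(callee, trail + [(node, callee, site)], seen | {callee})

    walk(source, [], {source})
    return paths


for path in find_paths(EDGES, "Engine.handleHTTPRequest", "serveError"):
    hops = ["Engine.handleHTTPRequest"] + [callee for _, callee, _ in path]
    print(" -> ".join(hops))
```

An empty result is the interesting case for triage: no path from the handler means the sink is not reachable along any edge the graph knows about.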

Blast radius

How many callers depend on a function? A bug in a utility called from 18 places is very different from one called from 2. The graph quantifies impact instantly.

$ xgrep graph callers "debugPrint"
debug.go::debugPrint called by:
  Context.initFormCache [certain] (context.go:644)
  debugPrintRoute [certain] (debug.go:37)
  debugPrintLoadTemplate [certain] (debug.go:52)
  Engine.Run [certain] (gin.go:544)
  Engine.RunTLS [certain] (gin.go:562)
  Engine.RunUnix [certain] (gin.go:582)
  Engine.RunQUIC [certain] (gin.go:631)
  ...

18 distinct callers across the framework. A change here ripples everywhere.
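An agent can turn a caller listing like the one above into structured data. The Python sketch below assumes the line layout shown in the sample output (name, confidence in brackets, file and line in parentheses); that layout is inferred from the example, not from a documented format, and `parse_callers` is a hypothetical helper.

```python
import re
from collections import Counter

# A shortened copy of the sample output above.
SAMPLE = """\
debug.go::debugPrint called by:
  Context.initFormCache [certain] (context.go:644)
  debugPrintRoute [certain] (debug.go:37)
  Engine.Run [certain] (gin.go:544)
  Engine.RunTLS [certain] (gin.go:562)
  ...
"""

# Assumed line shape: "  <name> [<confidence>] (<file>:<line>)"
CALLER_LINE = re.compile(
    r"^\s+(?P<name>\S+) \[(?P<confidence>\w+)\] \((?P<file>[^:()]+):(?P<line>\d+)\)$"
)


def parse_callers(text):
    """Return one dict per caller line; header and '...' lines are skipped."""
    callers = []
    for line in text.splitlines():
        match = CALLER_LINE.match(line)
        if match:
            callers.append({
                "name": match["name"],
                "confidence": match["confidence"],
                "file": match["file"],
                "line": int(match["line"]),
            })
    return callers


def callers_per_file(callers):
    """Count callers per file as a rough view of where the impact lands."""
    return Counter(caller["file"] for caller in callers)
```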

Full context in one query

Triage needs more than a line number. One query returns the function's definition, callers, callees, and source — formatted for an LLM to consume.

$ xgrep graph context "serveError" --depth 1

### serveError (function, L764-779) ← FOCUS

Called by: Engine.handleHTTPRequest
Calls: ?::c.Next, ?::c.writermem.Written, debugPrint, ...

func serveError(c *Context, code int, defaultMessage []byte) {
	c.writermem.status = code
	c.Next()
	if c.writermem.Written() {
		return
	}
	...

Definition, neighborhood, and source in a single LLM-ready payload.
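The depth parameter can be read as a hop limit around the focus function. A rough Python sketch of that idea, assuming a plain caller-to-callees mapping; the `CALLS` data and the `neighborhood` helper are illustrative, and xgrep's own traversal may differ.

```python
from collections import deque

# Hypothetical slice of the graph around serveError.
CALLS = {
    "Engine.handleHTTPRequest": ["serveError"],
    "serveError": ["debugPrint"],
    "debugPrint": [],
}


def neighborhood(calls, focus, depth):
    """Map each function within `depth` call edges of `focus` to its distance.

    Edges are followed in both directions, so callers and callees are included.
    """
    neighbors = {name: set(callees) for name, callees in calls.items()}
    for caller, callees in calls.items():
        for callee in callees:
            neighbors.setdefault(callee, set()).add(caller)

    distance = {focus: 0}
    queue = deque([focus])
    while queue:
        node = queue.popleft()
        if distance[node] == depth:
            continue  # do not expand past the hop limit
        for nxt in neighbors.get(node, ()):
            if nxt not in distance:
                distance[nxt] = distance[node] + 1
                queue.append(nxt)
    return distance
```

With depth 1 the result holds the focus, its direct callers, and its direct callees, which matches the shape of the payload above.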

Lists vs. graphs

What a scanner tells you

CRITICAL: SQL injection at src/users.py:23

CRITICAL: SQL injection at src/search.py:67

WARNING: Hardcoded secret at src/config.py:11

WARNING: Weak crypto at src/auth/hash.py:28

4 findings. Which ones are real? Which are urgent? You have to figure it out yourself.

What the graph adds

CRITICAL: SQL injection at src/users.py:23

Function: get_user_by_id

Called by: handle_login (HTTP, user-facing)

Path: handle_login → get_user_by_id → cursor.execute

No sanitizer. Reachable. True positive.

CRITICAL: SQL injection at src/search.py:67

Function: run_admin_report

Called by: admin_cron_job (internal only)

No user-facing callers found.

Not user-reachable. Lower priority.
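The two verdicts above come down to one question: does any user-facing entry point reach the flagged function? A small Python sketch under stated assumptions: the call map mirrors the example, the set of entry points is supplied by hand, and `triage` is a hypothetical helper rather than an xgrep command.

```python
from collections import deque

# Call map mirroring the example above.
CALLS = {
    "handle_login": ["get_user_by_id"],
    "get_user_by_id": ["cursor.execute"],
    "admin_cron_job": ["run_admin_report"],
    "run_admin_report": ["cursor.execute"],
}
ENTRY_POINTS = {"handle_login"}  # assumed: the user-facing HTTP handlers


def shortest_path(calls, source, target):
    """Breadth-first search over call edges; returns the path or None."""
    parents = {source: None}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            path = []
            while node is not None:
                path.append(node)
                node = parents[node]
            return path[::-1]
        for callee in calls.get(node, ()):
            if callee not in parents:
                parents[callee] = node
                queue.append(callee)
    return None


def triage(calls, function, entry_points):
    """Report whether `function` is reachable from any entry point, with one path."""
    for entry in sorted(entry_points):
        path = shortest_path(calls, entry, function)
        if path:
            return {"reachable": True, "path": path}
    return {"reachable": False, "path": None}
```

Reachability alone does not settle a finding. A reachable function still needs the sanitizer check, and an unreachable one is lower priority rather than safe.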

How the graph is built

xgrep parses your code with tree-sitter, extracts function definitions and call sites, and builds a directed property graph. Functions are nodes. Calls are edges. The graph is cached and rebuilt only when source files change.

Parse

Tree-sitter parses every file into an AST

Extract

Functions, methods, classes become nodes

Link

Call sites become directed edges with confidence levels

Query

Ask questions: callers, callees, paths, reachability

Confidence levels:
certain: direct function call
inferred: method call on known type
uncertain: dynamic dispatch, reflection
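The parse, extract, and link steps can be approximated in a few lines. This sketch swaps in Python's standard `ast` module for tree-sitter so it runs without extra dependencies (it needs Python 3.9+ for `ast.unparse`). The confidence rules are a simplification chosen for illustration: every attribute call is tagged inferred with no type resolution, which is not necessarily how xgrep decides.

```python
import ast

SOURCE = '''
def get_user_by_id(cursor, user_id):
    return cursor.execute(f"SELECT * FROM users WHERE id = {user_id}")

def handle_login(cursor, request):
    return get_user_by_id(cursor, request["id"])
'''


def build_call_graph(source):
    """Return (caller, callee, confidence, line) edges for one Python module."""
    tree = ast.parse(source)
    functions = [
        node for node in ast.walk(tree)
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef))
    ]
    defined = {fn.name for fn in functions}

    edges = []
    for fn in functions:
        # Note: calls inside nested functions are also attributed to the outer one.
        for node in ast.walk(fn):
            if not isinstance(node, ast.Call):
                continue
            target = node.func
            if isinstance(target, ast.Name):
                callee = target.id
                # A bare name defined in this module is a direct call.
                confidence = "certain" if callee in defined else "uncertain"
            elif isinstance(target, ast.Attribute):
                callee = ast.unparse(target)
                confidence = "inferred"  # simplification: receiver type not resolved
            else:
                callee = ast.unparse(target)
                confidence = "uncertain"  # e.g. calling the result of another call
            edges.append((fn.name, callee, confidence, node.lineno))
    return edges


for edge in build_call_graph(SOURCE):
    print(edge)
```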

Try it on real projects

Real xgrep output from popular open-source projects. Pick a language and reproduce every example.

Flask — First query builds the graph — no separate step

Bash

# Try it yourself:
# git clone --depth 1 https://github.com/pallets/flask
# cd flask
$ xgrep graph callees "wsgi_app"
Graph cached to .xgrep/graph.json (1784 nodes, 5930 edges)
src/flask/app.py::Flask.wsgi_app calls:
  ?::self.request_context [inferred] (src/flask/app.py:1592)
  ?::ctx.push [inferred] (src/flask/app.py:1596)
  ?::self.full_dispatch_request [inferred] (src/flask/app.py:1597)
  ?::self.handle_exception [inferred] (src/flask/app.py:1600)
  ?::sys.exc_info [inferred] (src/flask/app.py:1602)
  ...
  ?::ctx.pop [inferred] (src/flask/app.py:1616)
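The output shows the graph being cached on first use, and the graph is rebuilt only when source files change. How xgrep detects a change is not described here; one common approach is a content fingerprint, sketched below in Python. The function names, the stamp file, and the choice of SHA-256 are all assumptions for illustration.

```python
import hashlib
from pathlib import Path


def source_fingerprint(root, suffixes=(".py",)):
    """Hash the relative path and contents of every matching source file."""
    digest = hashlib.sha256()
    for path in sorted(Path(root).rglob("*")):
        if path.is_file() and path.suffix in suffixes:
            digest.update(str(path.relative_to(root)).encode())
            digest.update(path.read_bytes())
    return digest.hexdigest()


def graph_is_stale(root, stamp_file):
    """Return True if sources changed since the last call, and record the new state.

    `stamp_file` is a hypothetical file holding the previous fingerprint.
    """
    stamp = Path(stamp_file)
    current = source_fingerprint(root)
    if stamp.exists() and stamp.read_text() == current:
        return False
    stamp.parent.mkdir(parents=True, exist_ok=True)
    stamp.write_text(current)
    return True
```

Hashing contents rather than comparing modification times avoids a rebuild when a file is touched without being edited, at the cost of reading every file.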

Flask — What does the request lifecycle call?

Bash

$ xgrep graph callees "full_dispatch_request"
src/flask/app.py::Flask.full_dispatch_request calls:
  ?::warnings.warn [inferred] (src/flask/app.py:1002)
  ?::request_started.send [inferred] (src/flask/app.py:1013)
  ?::self.preprocess_request [inferred] (src/flask/app.py:1014)
  ?::self.dispatch_request [inferred] (src/flask/app.py:1016)
  ?::self.handle_user_exception [inferred] (src/flask/app.py:1018)
  ?::self.finalize_request [inferred] (src/flask/app.py:1019)

Flask — Full context with source code

Bash

$ xgrep graph context "dispatch_request" --depth 1

# Code context for Flask.dispatch_request

### Flask.dispatch_request (method, L966-990) ← FOCUS

Calls: ?::self.raise_routing_exception, ?::getattr, ?::self.make_default_options_response, ?::self.ensure_sync(self.view_functions[rule.endpoint]), ?::self.ensure_sync

    def dispatch_request(self, ctx: AppContext) -> ft.ResponseReturnValue:
        """Does the request dispatching.  Matches the URL and returns the
        return value of the view or error handler.  This does not have to
        ...
        req = ctx.request
        if req.routing_exception is not None:
            self.raise_routing_exception(req)
        ...

MCP Server

Connect xgrep to any MCP-compatible AI agent. Scan, fix, graph, and code-intelligence tools in one server.

Start the server

Bash

xgrep mcp

Tool	Description
Scan
scan	Run rules against target code
Fix
fix_verify	Preview a fix and re-scan it, without writing to disk
fix_apply	Apply a verified fix that cleared the finding
Graph
graph_build	Build function call graph
graph_callers	Find all callers of a function
graph_callees	Find all functions called by a function
graph_paths	Find call paths between two functions
graph_context	Get N-hop neighborhood with source code
graph_reachable	Find all reachable functions
Inspect
codebase_overview	High-level summary: languages, packages, entry points
symbol_search	Search functions, types, and variables by name
go_to_definition	Find the definition at a position
find_references	Find all usages of a symbol
find_implementations	Find all types implementing an interface
file_symbols	List all symbols defined in a file
hover	Docs, type info, and parameters — like an IDE tooltip
get_ranges	Bulk symbol info for a line range of a file
impact_analysis	Blast radius of changing a symbol, with risk score
dependency_graph	Upstream and downstream call dependencies
text_search	Trigram-indexed text and regex search across the codebase

Configure in Claude Desktop, Cursor, or any MCP client

JSON

{
  "mcpServers": {
    "xgrep": {
      "command": "xgrep",
      "args": ["mcp"]
    }
  }
}
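If the client already has a config file with other servers in it, the xgrep entry should be merged in rather than pasted over the whole file. A short Python sketch of that merge; where the config file lives depends on the client and is not covered here, and `add_xgrep_server` is a hypothetical helper.

```python
import json


def add_xgrep_server(config_text):
    """Return config JSON with the xgrep MCP server added, keeping other entries."""
    config = json.loads(config_text) if config_text.strip() else {}
    servers = config.setdefault("mcpServers", {})
    servers["xgrep"] = {"command": "xgrep", "args": ["mcp"]}
    return json.dumps(config, indent=2)


# Example: an existing config that already registers another (made-up) server.
existing = '{"mcpServers": {"other": {"command": "other-server", "args": []}}}'
print(add_xgrep_server(existing))
```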

Structured remediation context

Everything an LLM needs to understand and fix a vulnerability, in one payload.

xgrep -f rules/ --json src/

JSON

{
  "check_id": "sql-injection",
  "path": "src/db/query.py",
  "start": {"line": 42, "col": 5},
  "extra": {
    "message": "SQL injection via string formatting",
    "severity": "CRITICAL",
    "fix": "cursor.execute(\"SELECT * FROM users WHERE id = %s\", (user_id,))",
    "remediation_context": {
      "fix_hint": "Use parameterized queries instead of string formatting.",
      "surrounding_code": "... [10 lines before and after] ...",
      "graph_node": {
        "function": "get_user_by_id",
        "callers": ["handle_login", "api_get_user"],
        "callees": ["cursor.execute"]
      },
      "metavariables": {
        "$QUERY": "f\"SELECT * FROM users WHERE id = {user_id}\""
      }
    }
  }
}
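A sketch of how an agent-side script might condense this payload into a few lines for a prompt. It reads only fields that appear in the example above; the `summarize` helper is hypothetical, and real output may carry more or fewer fields.

```python
# A trimmed copy of the example finding above.
FINDING = {
    "check_id": "sql-injection",
    "path": "src/db/query.py",
    "start": {"line": 42, "col": 5},
    "extra": {
        "message": "SQL injection via string formatting",
        "severity": "CRITICAL",
        "remediation_context": {
            "fix_hint": "Use parameterized queries instead of string formatting.",
            "graph_node": {
                "function": "get_user_by_id",
                "callers": ["handle_login", "api_get_user"],
                "callees": ["cursor.execute"],
            },
        },
    },
}


def summarize(finding):
    """Build a short plain-text summary of one finding for an LLM prompt."""
    extra = finding["extra"]
    context = extra.get("remediation_context", {})
    node = context.get("graph_node", {})
    lines = [
        f'[{extra["severity"]}] {finding["check_id"]} '
        f'at {finding["path"]}:{finding["start"]["line"]}',
        f'Message: {extra["message"]}',
    ]
    if node:
        lines.append(f'Function: {node["function"]}')
        lines.append(f'Callers: {", ".join(node["callers"]) or "none"}')
        lines.append(f'Callees: {", ".join(node["callees"]) or "none"}')
    if "fix_hint" in context:
        lines.append(f'Hint: {context["fix_hint"]}')
    return "\n".join(lines)


print(summarize(FINDING))
```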

The agent does not apply this blindly. It drives the verified autofix harness over MCP through fix_verify and fix_apply. Each fix is one of three kinds: deterministic (applied automatically), assisted (the agent satisfies a fix contract), or advisory. Every edit must re-scan clean before it lands; xgrep never writes an unverified fix.
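The verify-then-apply rule can be illustrated with a toy loop in Python. Everything here is a stand-in: `toy_scan` is a one-line pattern check, not an xgrep rule, and `verify_fix` and `apply_fix` only mimic the behavior described for fix_verify and fix_apply (preview in memory, re-scan, write only if clean).

```python
from pathlib import Path


def toy_scan(source):
    """Stand-in scanner: return line numbers that pass an f-string to execute()."""
    return [
        number
        for number, line in enumerate(source.splitlines(), start=1)
        if 'execute(f"' in line
    ]


def verify_fix(source, old, new, scan=toy_scan):
    """Apply the edit in memory and re-scan. Nothing is written to disk."""
    if old not in source:
        return None, scan(source)
    candidate = source.replace(old, new, 1)
    return candidate, scan(candidate)


def apply_fix(path, old, new, scan=toy_scan):
    """Write the fix only if the re-scan of the candidate comes back clean."""
    path = Path(path)
    candidate, remaining = verify_fix(path.read_text(), old, new, scan)
    if candidate is None or remaining:
        return False  # fix did not apply, or the finding is still present
    path.write_text(candidate)
    return True


BEFORE = 'cursor.execute(f"SELECT * FROM users WHERE id = {user_id}")\n'
OLD = 'f"SELECT * FROM users WHERE id = {user_id}"'
NEW = '"SELECT * FROM users WHERE id = %s", (user_id,)'
```

The key property is that a fix which leaves the finding in place never reaches disk.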

A complete AI security triage in 5 steps.

Triage workflow

Bash

# 1. Scan (rules are compiled into the binary)
xgrep --json --with-overview src/ > findings.json

# 2. Get context for a finding
xgrep graph context get_user_by_id --depth 2

# 3. Trace the call path from the request handler to the sink
xgrep graph paths handle_request cursor.execute

# 4. Check if sanitization exists in the call chain
# (AI reads the graph output and checks for validation)

# 5. Classify: true positive / false positive / needs review

Research backs this approach. SastBench (arXiv:2601.02941) shows LLM agents with structured tool access significantly outperform both standalone SAST and LLMs given static code dumps.