Getting Started
What xgrep is, the accuracy-first design goals behind it, and how to go from install to first scan.
xgrep is a fast, Semgrep-compatible code scanner written in Go. It scans codebases using Semgrep YAML rule syntax and tree-sitter for language-aware, AST-based pattern matching.
New here? Install it and run your first scan — it works with zero configuration:
xgrep scan .
What xgrep does
One binary, several capabilities — each links to its section:
Scan for vulnerabilities and secrets
The core of xgrep. Built-in security rules with taint/dataflow analysis that follow untrusted input to dangerous sinks, plus secrets scanning that's on by default. Refine with the exploitability tier, scan only what a PR changed, hunt secrets across git history, and confirm live credentials. Output as text, JSON, SARIF, or GitLab SAST. See what's detected per language — JavaScript/TypeScript, Java, C#, Python, Go, Ruby, Swift, and more.
Navigate and understand code
Code intelligence beyond scanning: xgrep inspect for codebase overviews, symbol search, go-to-definition, references, and change-impact analysis; and xgrep graph for callers, callees, call paths, and source-inlined neighborhoods.
Fit into your workflow
Integrations: run in CI with GitHub Code Scanning or GitLab SAST and exit-code gating, get real-time diagnostics over the LSP server, or expose everything to AI agents over the MCP server.
Power AI agents
xgrep is built to be an AI-agent backend: packaged Claude Code skills, an agent guide, and structured --json everywhere.
Write your own rules
Extend the built-ins with custom rules in the Semgrep-compatible format — write them, test them, and run them alongside the built-in corpus.
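As a rough illustration of the Semgrep-compatible format, a custom rule might look like the sketch below. The keys (`rules`, `id`, `languages`, `severity`, `message`, `patterns`, `pattern`, `pattern-not`) are standard Semgrep rule syntax; the rule id and message are made up for this example, and it is an assumption that xgrep accepts this exact combination of operators. How the rule file is passed to xgrep is not shown here.

```yaml
rules:
  - id: example-shell-exec-non-literal   # hypothetical id, for illustration only
    languages: [go]
    severity: WARNING
    message: Shell command is built from a non-literal string
    patterns:
      # Match any call that hands a command string to `sh -c` ...
      - pattern: exec.Command("sh", "-c", $CMD)
      # ... except when that string is a plain string literal.
      - pattern-not: exec.Command("sh", "-c", "...")
```

The `pattern-not` clause is an example of adding context to cut noise rather than loosening the pattern itself.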
Start here
- Installation — install the prebuilt binary via npm.
- First scan — find a real vulnerability in one command.
Design goals
xgrep optimizes for accuracy: when it reports a vulnerability, it should be real and exploitable. False positives are what kill SAST tools — once a scanner cries wolf, people stop reading its output and real bugs slip through. Every rule and engine change is judged against these goals:
- Report exploitable issues, not imperfect code. The bar for a security finding is "exploitable," not "technically imperfect." A technically-true-but-harmless match is treated as noise.
- Earn precision through dataflow/reachability, not by weakening detection. Prefer firing when untrusted input actually reaches a dangerous sink over matching code shape alone. Relaxing what counts as a bug to cut noise also loses real bugs — add context, don't loosen the pattern.
- Separate correctness from security. A code smell (e.g. an unescaped . in a hostname regex) is a low-severity correctness note; an exploitable bug is a security finding. Smells must never drown out confirmed vulnerabilities.
- Calibrate severity and confidence to exploitability. HIGH/CRITICAL only when impact is demonstrable; uncertain findings are low-confidence "review" items, clearly distinct from confirmed ones.
- Prefer AST and semantic analysis over regex. Tree-sitter ASTs and taint dataflow are more precise than text patterns, and are the default.
- Never suppress a true positive to lower a count. If xgrep finds a real bug — even a minor one — the fix belongs in the code, not in muting the rule.
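To make the difference between matching code shape and following dataflow concrete, here is a toy model in Go. It is not xgrep's engine: `Value`, `Concat`, `ShapeMatch`, and `TaintMatch` are hypothetical names invented for this sketch, and real taint analysis works on the AST rather than on wrapped values at runtime.

```go
package main

import "fmt"

// Value is a hypothetical string wrapper that records whether any part of it
// came from untrusted input. It is a toy model, not xgrep's engine.
type Value struct {
	Text    string
	Tainted bool
}

// Concat joins two values; the result is tainted if either operand is.
func Concat(a, b Value) Value {
	return Value{Text: a.Text + b.Text, Tainted: a.Tainted || b.Tainted}
}

// ShapeMatch models a shape-only rule: every concatenated query that reaches
// the sink is reported, whatever its origin.
func ShapeMatch(query Value) bool {
	return true
}

// TaintMatch models a dataflow rule: report only when untrusted input
// actually reaches the sink.
func TaintMatch(query Value) bool {
	return query.Tainted
}

func main() {
	prefix := Value{Text: "SELECT * FROM users WHERE id = "}
	constant := Concat(prefix, Value{Text: "1"})
	fromUser := Concat(prefix, Value{Text: "1 OR 1=1", Tainted: true})

	// Prints:
	// shape=true taint=false
	// shape=true taint=true
	for _, q := range []Value{constant, fromUser} {
		fmt.Printf("shape=%v taint=%v\n", ShapeMatch(q), TaintMatch(q))
	}
}
```

The shape-only check reports both queries, including the harmless constant one. The taint-aware check reports only the query that carries user input, which is the kind of precision the goals above aim for.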