Getting Started

What xgrep is, the accuracy-first design goals behind it, and how to go from install to first scan.

xgrep is a fast, Semgrep-compatible code scanner written in Go. It scans codebases using Semgrep YAML rule syntax and tree-sitter for language-aware, AST-based pattern matching.

New here? Install it and run your first scan — it works with zero configuration:

xgrep scan .

What xgrep does

One binary, several capabilities — each links to its section:

Scan for vulnerabilities and secrets

The core of xgrep. Built-in security rules with taint/dataflow analysis that follow untrusted input to dangerous sinks, plus secrets scanning that's on by default. Refine with the exploitability tier, scan only what a PR changed, hunt secrets across git history, and confirm live credentials. Output as text, JSON, SARIF, or GitLab SAST. See what's detected per language — JavaScript/TypeScript, Java, C#, Python, Go, Ruby, Swift, and more.

Code intelligence beyond scanning: xgrep inspect provides codebase overviews, symbol search, go-to-definition, references, and change-impact analysis, and xgrep graph provides callers, callees, call paths, and source-inlined neighborhoods.

Fit into your workflow

Integrations: run in CI with GitHub Code Scanning or GitLab SAST and exit-code gating, get real-time diagnostics over the LSP server, or expose everything to AI agents over the MCP server.

Power AI agents

xgrep is built to be an AI-agent backend: packaged Claude Code skills, an agent guide, and structured --json everywhere.

Write your own rules

Extend the built-ins with custom rules in the Semgrep-compatible format — write them, test them, and run them alongside the built-in corpus.
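As a rough illustration of the Semgrep rule format mentioned above, a minimal custom rule might look like the sketch below. The rule id, message, and target pattern are hypothetical and chosen only for illustration. The fields follow standard Semgrep YAML syntax, and this sketch assumes xgrep accepts them unchanged.

```yaml
rules:
  # Hypothetical rule id; pick any unique name for your own rules.
  - id: example-go-shell-command
    languages: [go]
    severity: WARNING
    message: Command string is passed to a shell; check where $CMD comes from.
    # $CMD is a Semgrep metavariable that matches any expression.
    pattern: exec.Command("sh", "-c", $CMD)
```

A shape-only rule like this matches every such call, including harmless ones with a constant command string, so in practice it would serve as a starting point that you narrow with more context.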

Start here

  • Installation — install the prebuilt binary via npm.
  • First scan — find a real vulnerability in one command.

Design goals

xgrep optimizes for accuracy: when it reports a vulnerability, it should be real and exploitable. False positives are what kill SAST tools — once a scanner cries wolf, people stop reading its output and real bugs slip through. Every rule and engine change is judged against these goals:

  1. Report exploitable issues, not imperfect code. The bar for a security finding is "exploitable," not "technically imperfect." A technically-true-but-harmless match is treated as noise.
  2. Earn precision through dataflow/reachability, not by weakening detection. Prefer firing when untrusted input actually reaches a dangerous sink over matching code shape alone. Relaxing what counts as a bug to cut noise also loses real bugs — add context, don't loosen the pattern.
  3. Separate correctness from security. A code smell (e.g. an unescaped . in a hostname regex) is a low-severity correctness note; an exploitable bug is a security finding. Smells must never drown out confirmed vulnerabilities.
  4. Calibrate severity and confidence to exploitability. HIGH/CRITICAL only when impact is demonstrable; uncertain findings are low-confidence "review" items, clearly distinct from confirmed ones.
  5. Prefer AST and semantic analysis over regex. Tree-sitter ASTs and taint dataflow are more precise than text patterns, and are the default.
  6. Never suppress a true positive to lower a count. If xgrep finds a real bug — even a minor one — the fix belongs in the code, not in muting the rule.
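To make goals 2 and 3 concrete, the Go sketch below shows two functions with the same code shape (string concatenation into a SQL query) but different exploitability. The function names and queries are hypothetical and are not taken from xgrep's rules or test corpus. The sketch illustrates the distinction the goals describe and does not show actual xgrep output.

```go
package main

import "fmt"

// buildFixedQuery concatenates only values defined in this function.
// A shape-only pattern ("concatenation into SQL") would match it, but no
// untrusted input can reach the query, so it is not exploitable.
func buildFixedQuery() string {
	table := "users"
	return "SELECT id FROM " + table + " WHERE active = 1"
}

// buildUserQuery concatenates caller-supplied text into the query.
// If name originates from a request, untrusted input reaches a SQL sink,
// which is the source-to-sink flow that dataflow analysis looks for.
func buildUserQuery(name string) string {
	return "SELECT id FROM users WHERE name = '" + name + "'"
}

func main() {
	fmt.Println(buildFixedQuery())
	// A crafted value changes the structure of the query itself.
	fmt.Println(buildUserQuery("x' OR '1'='1"))
}
```

Under the goals above, only the second function would warrant a security finding. The first is at most a low-severity note.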