Secrets Scanning

Find committed credentials — API keys, tokens, and private keys — across your code and its full git history. On by default.

A committed credential is an immediate compromise. xgrep scans for secrets by default — alongside the security rules, with no extra flag — so a leaked key surfaces the moment you scan:

xgrep scan .

[Screenshot: xgrep reporting a committed GitHub token and Stripe key, each with revoke-and-rotate remediation guidance]

To scan only for secrets, select the category:

xgrep scan --category secrets .

What it detects

xgrep ships 270+ detectors covering 150+ providers — cloud platforms (AWS, GCP, Azure), source forges (GitHub, GitLab), and payment, messaging, AI/LLM, and observability vendors — plus generic high-entropy API keys and PEM private keys.

Detection is tuned for precision, in line with xgrep's accuracy-first goals:

  • Provider-specific patterns (prefixes, lengths, checksums) rather than blanket entropy heuristics, so a real ghp_… token matches but a random UUID doesn't.
  • Placeholder filtering — well-known example/dummy values (such as AWS's AKIAIOSFODNN7EXAMPLE) are recognized and not reported.
  • Severity that tracks reality — a detected secret is reported at a reduced severity until validation confirms it's live.

Every finding explains the blast radius and the fix: revoke, rotate, and move the value to a secret store or environment variable.
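
As a rough illustration of the precision ideas above (provider-specific patterns plus placeholder filtering), here is a minimal Python sketch. It is not xgrep's implementation: the two regexes are assumptions based on the publicly documented shapes of classic GitHub personal access tokens and AWS access key IDs, the rule names are hypothetical, and real detectors may add checks such as checksums.

```python
import re

# Assumed shapes, not xgrep's actual rules:
#   classic GitHub PAT = "ghp_" + 36 alphanumeric characters
#   AWS access key ID  = "AKIA" + 16 uppercase letters or digits
DETECTORS = (
    ("github-pat", re.compile(r"\bghp_[A-Za-z0-9]{36}\b")),
    ("aws-access-key-id", re.compile(r"\bAKIA[A-Z0-9]{16}\b")),
)

# Well-known documentation values that match a pattern but are not secrets.
PLACEHOLDERS = {"AKIAIOSFODNN7EXAMPLE"}


def find_secrets(text):
    """Return (rule_name, matched_value) pairs, skipping known placeholders."""
    findings = []
    for name, pattern in DETECTORS:
        for match in pattern.finditer(text):
            value = match.group(0)
            if value in PLACEHOLDERS:
                continue  # example/dummy value: recognized and not reported
            findings.append((name, value))
    return findings
```

Because matching keys on a provider prefix and an exact length, a UUID such as 123e4567-e89b-12d3-a456-426614174000 produces no finding here, whereas a blanket entropy threshold could flag it.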

Supported providers

Alongside the type-based detectors — generic high-entropy API keys, PEM/PKCS private keys, JSON Web Tokens (JWTs), and PII (credit-card numbers, IBANs, US SSNs) — xgrep ships dedicated, provider-specific detectors for:

1Password, Adafruit, Adobe, Age, Airtable, Algolia, Alibaba, Anthropic, Artifactory, Asana, AssemblyAI, Atlassian, Authress, AWS, Azure, Beamer, Bitbucket, Bittrex, Brevo (Sendinblue), Cerebras, Cisco, Civo, ClickHouse, Clojars, Cloudflare, Codecov, Cohere, Coinbase, Confluent, Contentful, curl (netrc), Cursor, Databricks, Datadog, Deepgram, DeepSeek, DigitalOcean, Discord, Doppler, Drone CI, Dropbox, Duffel, Dynatrace, EasyPost, Elastic, ElevenLabs, Endor Labs, Etsy, Exoscale, Facebook, Fastly, Figma, Finicity, Finnhub, Flickr, Flutterwave, Fly.io, Frame.io, Freemius, FreshBooks, Gitea, GitHub, GitLab, Gitter, GoCardless, Google Cloud (GCP), Grafana, Greptile, Groq, Harness, HashiCorp, Heroku, HubSpot, Hugging Face, Infomaniak, Infracost, Intercom, Intra 42, JFrog, Kraken, KuCoin, LaunchDarkly, LightOn, Linear, LinkedIn, Lob, Looker, Mailchimp, Mailgun, Mapbox, Mattermost, MaxMind, MessageBird, Microsoft, Mistral, MongoDB, Netlify, New Relic, New York Times, Notion, npm, NVIDIA, Octopus Deploy, Okta, Ollama, OpenAI, OpenRouter, OpenShift, OVH, Perplexity, Plaid, PlanetScale, Polymarket, PostHog, Postman, Prefect, Private AI, Pulumi, PyPI, RapidAPI, ReadMe, Replicate, RubyGems, Scaleway, Scalingo, Sendbird, SendGrid, Sentry, SettleMint, Shippo, Shopify, Sidekiq, Slack, Snyk, Sourcegraph, Square, Squarespace, Stability AI, Stripe, Sumo Logic, Telegram, Together AI, Travis CI, Twilio, Twitch, Twitter/X, Typeform, UpCloud, Upstage, HashiCorp Vault, Vercel, Weights & Biases, xAI, Yandex, and Zendesk.

New providers are added regularly — run xgrep scan --category secrets to scan with the current set.

Always reported, even in tests

xgrep's production scope drops security findings in test and example paths by default — but secrets are always kept, wherever they live. A credential committed to a test fixture is just as compromised as one in production code.

Scanning git history (--history)

A secret that was committed and later deleted still lives in the repository's history, so it's still compromised. A normal scan only sees the working tree; --history walks the full commit history, scans the content each commit introduced, and catches secrets that no longer exist in the current tree — reporting who introduced each one and when:

xgrep scan --history --category secrets .
xgrep scan --history --category secrets --since 2024-01-01 .   # bound the walk for speed

History scanning reads only the local .git object store (no network) and de-duplicates each secret to the earliest commit that introduced it. See the CLI reference for --since, --max-commits, and the commit-provenance fields in JSON/SARIF.
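
The de-duplication step can be pictured with a small Python sketch. It assumes the history walk has already produced one record per (secret, commit) hit. The Introduction type and its field names are hypothetical and do not reflect xgrep's JSON/SARIF provenance fields.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Introduction:
    # Hypothetical record shape for one secret seen in one commit.
    secret: str
    commit: str
    author: str
    date: str  # ISO 8601, so plain string comparison orders chronologically


def earliest_introductions(hits):
    """Keep, for each distinct secret, only the hit from the oldest commit."""
    earliest = {}
    for hit in hits:
        seen = earliest.get(hit.secret)
        if seen is None or hit.date < seen.date:
            earliest[hit.secret] = hit
    # Oldest first, so the report reads as a timeline.
    return sorted(earliest.values(), key=lambda h: h.date)
```

A secret that appears in ten later commits still yields a single finding, attributed to the author and date of the commit that first introduced it.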

Decoding encoded payloads (--decode)

Secrets are often committed one encoding layer deep — a base64 environment blob, a hex string, a URL-encoded value, or a gzipped config. A normal scan sees only the opaque outer blob and misses the credential inside. --decode decodes those payloads (base64/hex/url/gzip) and re-runs the secret rules over the decoded content:

xgrep scan --decode --category secrets .
xgrep scan --history --decode --category secrets .   # combine with history scanning

Each decoded finding records its decode chain in metadata.decoded-from (e.g. base64 > gzip). The flag is opt-in (off by default) and hermetic: decoding is a local, deterministic transform that makes no network calls. See the CLI reference for details.
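
To make the decode-then-rescan idea concrete, here is a minimal Python sketch under several assumptions: the whole input is treated as one payload (a real scanner would first extract candidate substrings), a single stand-in regex plays the role of the secret rules, the depth limit of 3 is arbitrary, and the chain is written outermost layer first. None of this is taken from xgrep's source.

```python
import base64
import binascii
import gzip
import re
import zlib
from urllib.parse import unquote_to_bytes

SECRET = re.compile(rb"ghp_[A-Za-z0-9]{36}")  # stand-in for the secret rules
HEX = re.compile(r"(?:[0-9a-fA-F]{2})+")


def decoders(data):
    """Return (layer_name, decoded_bytes) for every decoder that accepts data."""
    out = []
    if data[:2] == b"\x1f\x8b":  # gzip magic bytes
        try:
            out.append(("gzip", gzip.decompress(data)))
        except (OSError, EOFError, zlib.error):
            pass
    try:
        text = data.decode("ascii").strip()
    except UnicodeDecodeError:
        return out  # binary data: the text-based decoders do not apply
    if "%" in text:
        out.append(("url", unquote_to_bytes(text)))
    if HEX.fullmatch(text):
        out.append(("hex", bytes.fromhex(text)))
    try:
        out.append(("base64", base64.b64decode(text, validate=True)))
    except binascii.Error:
        pass
    return out


def scan_decoded(data, chain=(), max_depth=3):
    """Return (secret, decode_chain) pairs exposed by one or more decode layers."""
    findings = []
    if chain:  # report only what a decode step revealed
        for match in SECRET.finditer(data):
            findings.append((match.group().decode(), " > ".join(chain)))
    if len(chain) < max_depth:
        for name, decoded in decoders(data):
            if decoded and decoded != data:
                findings.extend(scan_decoded(decoded, chain + (name,), max_depth))
    return findings
```

Every decoder that accepts the input is tried, because formats overlap (a hex string is often valid base64 too). Wrong guesses simply decode to bytes that match no rule, so they add work but no findings.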

Validating live secrets (--validate)

Detection finds strings that look like credentials; --validate confirms whether one is actually live by probing its provider, then raises a confirmed finding to full severity while leaving revoked/invalid ones low:

xgrep scan --validate --category secrets .

Validation is opt-in and off by default; it is the one mode that makes outbound network calls. Each candidate is sent only to its own provider's fixed endpoint and is never logged or written to disk. Validators currently ship for GitHub, GitLab, Slack, and Stripe tokens (more to follow). See the CLI reference for the full validation_state semantics.

In CI

Secrets scanning runs in CI like any other scan. Because findings can be emitted as SARIF or GitLab SAST reports, leaked credentials show up in GitHub Code Scanning or the GitLab Vulnerability Report. Pair history scanning with a scheduled job to audit a repository's entire past:

xgrep scan --history --category secrets --sarif -o secrets.sarif .
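
If a CI job needs to act on the report itself, for example to print a short summary or fail the build, the SARIF file can be read with a few lines of Python. This sketch relies only on the standard SARIF 2.1.0 layout (runs, results, ruleId, locations); it does not assume any xgrep-specific properties, and the function name is hypothetical.

```python
def summarize_sarif(sarif):
    """Return one 'ruleId path:line' string per result in a SARIF 2.1.0 log."""
    lines = []
    for run in sarif.get("runs", []):
        for result in run.get("results", []):
            rule = result.get("ruleId", "<unknown rule>")
            locations = result.get("locations") or [{}]
            physical = locations[0].get("physicalLocation", {})
            uri = physical.get("artifactLocation", {}).get("uri", "<unknown file>")
            line = physical.get("region", {}).get("startLine", "?")
            lines.append(f"{rule} {uri}:{line}")
    return lines
```

A scheduled job could load secrets.sarif with json.load, pass the result to summarize_sarif, print each line, and exit non-zero when the list is not empty.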
