Analysis Mode
Use xgrep's built-in native analyzers via mode: analysis rules.
Some checks are too context-dependent to express as a pattern. "Is this imported
name ever used?", "does this self.x read refer to an attribute the class never
defines?", "is this regex vulnerable to ReDoS?" — answering these accurately
needs whole-file or semantic analysis, not text/AST shape matching. xgrep ships
these as native analyzers, dispatched with mode: analysis.
A pattern rule that tries to approximate one of these checks is almost always a false-positive factory; the native analyzer is the precise implementation.
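To make the "whole-file" point concrete, here is a minimal sketch of an unused-import check written with Python's standard `ast` module. It is an illustration only, not xgrep's implementation; the function name `unused_imports` and the `SAMPLE` source are hypothetical. The check has to see every name reference in the file before it can report anything, which is exactly what a single-node pattern cannot do.

```python
import ast


def unused_imports(source: str) -> list[str]:
    """Return imported names that are never referenced anywhere in `source`."""
    tree = ast.parse(source)
    imported = set()  # names bound by import statements
    used = set()      # every identifier that appears as a Name node
    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            for alias in node.names:
                # `import a.b` binds `a`; `import a.b as c` binds `c`.
                imported.add(alias.asname or alias.name.split(".")[0])
        elif isinstance(node, ast.ImportFrom):
            for alias in node.names:
                if alias.name != "*":  # star imports bind unknown names
                    imported.add(alias.asname or alias.name)
        elif isinstance(node, ast.Name):
            used.add(node.id)
    return sorted(imported - used)


SAMPLE = """\
import os
import sys
from json import dumps as to_json

print(sys.argv, to_json({}))
"""

print(unused_imports(SAMPLE))  # ['os']
```

Even this toy version ignores `__all__`, re-exports, and names used only inside strings, which hints at why a dedicated analyzer is preferable to an approximation.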
Writing an analysis-mode rule
Set mode: analysis and name an analyzer with analyzer:. The rule carries no
pattern — the analyzer produces all findings:

    rules:
      - id: my-undefined-attribute-rule
        languages: [python]
        severity: ERROR
        message: Attribute is never defined on the class; this raises AttributeError.
        mode: analysis
        analyzer: undefined-attribute
        metadata:
          category: correctness

Rules of the mode (enforced at load time, mirroring how mode: taint forbids
pattern):
- analyzer: is required and must name a known analyzer; an unknown name is a load error, not a silent no-op.
- A pattern clause (pattern, pattern-regex, pattern-either, …) is not allowed alongside mode: analysis.
- Writing analyzer: alone implies mode: analysis.
- languages: still scopes which files the rule runs on; each analyzer is also internally gated to the language(s) it supports.
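The load-time constraints above can be sketched as a small validation function. This is a hypothetical illustration, not xgrep's loader: `check_analysis_rule`, `RuleLoadError`, and the two constant sets are invented names, `KNOWN_ANALYZERS` holds only a few entries from the table, and `PATTERN_KEYS` lists only the three clauses the text names explicitly.

```python
# Hypothetical sketch of the load-time checks; not xgrep's actual loader.
KNOWN_ANALYZERS = {"unused-import", "undefined-attribute", "regex-semi-anchored"}
PATTERN_KEYS = {"pattern", "pattern-regex", "pattern-either"}


class RuleLoadError(ValueError):
    """Raised when a rule violates the analysis-mode constraints."""


def check_analysis_rule(rule: dict) -> dict:
    """Return a normalized copy of `rule`, or raise RuleLoadError."""
    rule = dict(rule)  # do not mutate the caller's rule
    if "analyzer" in rule:
        # Writing analyzer: alone implies mode: analysis.
        rule.setdefault("mode", "analysis")
    if rule.get("mode") != "analysis":
        # Other modes (or analyzer: with a different explicit mode) are
        # outside what the text specifies; this sketch leaves them alone.
        return rule
    analyzer = rule.get("analyzer")
    if analyzer is None:
        raise RuleLoadError("mode: analysis requires analyzer:")
    if analyzer not in KNOWN_ANALYZERS:
        # An unknown name is a load error, not a silent no-op.
        raise RuleLoadError(f"unknown analyzer: {analyzer}")
    clashes = sorted(PATTERN_KEYS & rule.keys())
    if clashes:
        raise RuleLoadError(f"pattern clause not allowed with mode: analysis: {clashes}")
    return rule


shorthand = check_analysis_rule(
    {"id": "r1", "languages": ["python"], "analyzer": "unused-import"}
)
print(shorthand["mode"])  # analysis
```

Failing at load time means a typo in an analyzer name surfaces immediately instead of producing a rule that silently never fires.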
Built-in analyzers
| analyzer: | Languages | Detects |
|---|---|---|
| unused-import | python | Imported names never referenced in the file. |
| unreachable-code | python | Statements after a return/raise/break/continue in the same suite. |
| non-callable | python | A call whose callee is provably not callable. |
| no-effect | python | A bare comparison/arithmetic statement whose result is discarded. |
| inconsistent-return | python | A function that returns a value on some paths and None/implicitly on others. |
| conflicting-signature | python | A method overriding a local base method with an incompatible arity. |
| equality-types | python | ==/!= between operands of known incompatible types (b"x" == "x"). |
| wrong-arg-constructor | python | A constructor / self-method call with an argument count __init__ cannot accept. |
| wrong-arg-call | python | A module-function call with an incompatible positional-argument count. |
| first-param-not-self | python | An instance method whose first parameter is not self. |
| invalid-escape | python | A non-raw string/bytes literal with an unrecognized backslash escape ("\d"). |
| undefined-attribute | python | A self.x read where x is never defined on the class or a local base. |
| use-before-def | python | A local variable read on a path where it is never bound first (UnboundLocalError). |
| empty-except | python | A try whose except handler body is a single pass, silently swallowing the error (CWE-390). |
| redundant-comparison | python | A comparison made constant by a preceding elif/guard over the same operands (CWE-570). |
| duplicate-binding | javascript, typescript | A name bound twice in one parameter list or destructuring pattern. |
| prototype-pollution | javascript, typescript | A for…in / Object.keys() copy into an object without prototype-key guards. |
| incomplete-url-scheme | javascript, typescript | A scheme allowlist recognizing some but not all of javascript:/data:/vbscript:. |
| incomplete-url-substring | javascript, typescript | A substring hostname check not anchored to the URL's host. |
| incomplete-html-sanitization | javascript, typescript | A .replace() chain missing some attribute-breaking characters. |
| misspelled-identifier | javascript, typescript | An identifier word-part edit-distance-1 from a more common part used nearby. |
| unsafe-cert-trust | java | TLS endpoint identification disabled or never enabled. |
| implicit-pending-intent | java | A mutable implicit PendingIntent reaching a broadcast/activity sink (CWE-927). |
| sensitive-broadcast | java | Sensitive data broadcast in Intent extras without a permission (CWE-927). |
| insecure-basic-auth | java | A Basic auth header sent after a plaintext http:// URL (CWE-522). |
| lock-order | java | Inconsistent lock acquisition order across methods (deadlock, CWE-833). |
| type-narrowing | java | A compound assignment narrowing a wider type (int += long, CWE-190). |
| ruby-incomplete-sanitization | ruby | A .sub/.sub! that removes only the first occurrence of a metacharacter. |
| regex-hostname-dot | any | An unescaped . in a hostname regex. |
| regex-unanchored-hostname | any | A hostname regex lacking start/end anchors. |
| regex-semi-anchored | any | An anchor (^/$) that binds only one alternation branch (^a\|b). |
| regex-useless-escape | any | Unnecessary backslash escapes in a regex. |
| regex-bad-tag-filter | any | A regex HTML/XML tag filter that can be bypassed. |
| regex-unmatchable-caret | any | A ^ in a position where it can never match. |
| regex-unmatchable-dollar | any | A $ in a position where it can never match. |
| regex-suspicious-range | any | A suspicious character range in a […] class (cross-case, digit-to-letter). |
| regex-suspicious-character | any | A suspicious escape such as \a (bell) or \b (backspace) in a regex. |
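As a rough illustration of three of the Python rows, the snippet below contains code of the shape that the undefined-attribute, use-before-def, and equality-types descriptions refer to, and shows how each one misbehaves at runtime. The class, function, and variable names are hypothetical, and whether xgrep reports each exact line depends on the analyzer's own rules; the runtime behaviour shown is plain Python.

```python
class Cart:
    def __init__(self):
        self.items = []

    def total(self):
        # undefined-attribute shape: `discount` is never assigned on Cart.
        return len(self.items) - self.discount


def first_even(numbers):
    for n in numbers:
        if n % 2 == 0:
            found = n
    # use-before-def shape: `found` is unbound when no element is even.
    return found


# equality-types shape: bytes never compare equal to str in Python 3.
MAGIC_MATCHES = b"PK" == "PK"


def outcome(call):
    """Run `call` and return the name of the exception it raises, if any."""
    try:
        call()
    except Exception as exc:
        return type(exc).__name__
    return None


print(outcome(Cart().total))                # AttributeError
print(outcome(lambda: first_even([1, 3])))  # UnboundLocalError
print(MAGIC_MATCHES)                        # False
```

Each of these only fails on a particular path or input, which is why they tend to survive testing and are worth catching statically.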
The any-language regex analyzers are scoped by the rule's own languages:
field and adapt to the language internally (for example, Ruby hostname anchors
are \A/\z rather than ^/$).
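The regex rows are easier to appreciate with concrete inputs. The sketch below uses Python's `re` module to show what goes wrong with three of the flagged shapes (regex-hostname-dot, regex-unanchored-hostname, regex-semi-anchored) and how a fully anchored, escaped pattern behaves instead. The hostnames and variable names are made up for illustration.

```python
import re

# regex-hostname-dot shape: the unescaped "." matches any character.
loose_dot = re.compile(r"^www.example.com$")
dot_bypass = bool(loose_dot.search("wwwXexample.com"))

# regex-unanchored-hostname shape: matches anywhere inside a longer host.
unanchored = re.compile(r"example\.com")
anchor_bypass = bool(unanchored.search("example.com.evil.test"))

# regex-semi-anchored shape: "^" binds only the first branch, so "b"
# matches anywhere in the input.
semi = re.compile(r"^a|b")
semi_bypass = bool(semi.search("xxb"))

# Escaped dots, and anchors applied to the whole alternation via a group.
strict_host = re.compile(r"^www\.example\.com$")
strict_alt = re.compile(r"^(?:a|b)$")

print(dot_bypass, anchor_bypass, semi_bypass)             # True True True
print(bool(strict_host.search("wwwXexample.com")))        # False
print(bool(strict_host.search("www.example.com")))        # True
print(bool(strict_alt.search("xxb")))                     # False
```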
Configuration
Analysis-mode rules take no parameters in metadata. When an analyzer needs to
vary, it does so by analyzer identity (the regex kinds are separate
analyzer: names rather than a kind option) or by intrinsic behaviour. A
future structured options: block is the planned home for any genuinely
rule-configurable value.
See also
- Writing rules: the rule format and supported features.
- Taint analysis: the other non-search mode.