Secure vLLM servers with cnspec
Scan vLLM inference servers against security best practices with cnspec.
Scan a vLLM inference server to assess its externally observable HTTP posture. cnspec probes the server's HTTP routes to find anonymously accessible endpoints, exposed documentation and metrics, permissive CORS configuration, and other risks before an attacker finds them.
cnspec probes remote HTTP routes only. Host-local configuration, environment variables, and internode network controls are outside the scope of this scan.
Prerequisites
To scan a vLLM server with cnspec, you must have:
- cnspec installed on your workstation
- A vLLM inference server you are authorized to scan, reachable over HTTP
Only scan vLLM endpoints you are authorized to assess. The scan sends HTTP requests to the supplied URL.
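To make "sends HTTP requests" concrete, the following is a minimal manual spot check, not a description of how cnspec works internally. The helper names classify_status and probe are hypothetical, the status-code mapping is a deliberate simplification, and paths other than /v1/models (for example /docs, /openapi.json, and /metrics) are assumed from common FastAPI and Prometheus defaults rather than taken from this guide.

```shell
# Hypothetical helpers for a manual spot check; not part of cnspec.

# Map the HTTP status of an unauthenticated request to a rough verdict.
classify_status() {
  case "$1" in
    2??)     echo "anonymously-accessible" ;;
    401|403) echo "auth-required" ;;
    404)     echo "not-present" ;;
    *)       echo "inconclusive" ;;
  esac
}

# Send one unauthenticated GET and print: path, status code, verdict.
probe() {
  base_url=$1
  path=$2
  # -s: silent, -o /dev/null: discard the body, -w: print only the status code.
  # curl prints 000 when it receives no HTTP response at all.
  status=$(curl -s -o /dev/null -w '%{http_code}' --max-time 10 "${base_url}${path}") || true
  printf '%s %s %s\n' "$path" "$status" "$(classify_status "$status")"
}
```

For example, probe http://localhost:8000 /v1/models prints one line for that route. A 2xx response to a request that carries no token suggests the route is reachable anonymously. The scan remains the authoritative check.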
Connect to a vLLM server
To test access, open a cnspec shell against the server endpoint:
cnspec shell vllm http://localhost:8000
cnspec> vllm.server { baseUrl reachable version }
vllm.server: {
baseUrl: "http://localhost:8000"
reachable: true
version: "0.6.3"
}
If the server requires authentication, pass a bearer token so cnspec can run authenticated comparison probes:
cnspec shell vllm https://vllm.example.com --api-key YOUR_API_KEY
You can also set the VLLM_API_KEY environment variable to omit the --api-key flag:
export VLLM_API_KEY='YOUR_API_KEY'
cnspec shell vllm https://vllm.example.com
Additional flags:
- --insecure disables TLS certificate verification.
- --timeout sets the HTTP request timeout in seconds (default 10).
Scan a vLLM server
To scan a vLLM server:
cnspec scan vllm https://vllm.example.com --api-key YOUR_API_KEY
Understand scan output
When a scan completes, cnspec prints a summary of all the checks it ran, grouped by policy. Each check shows a pass or fail result. For example:
✓ Pass: Ensure the connection uses TLS
✕ Fail: Ensure FastAPI documentation is not anonymously exposed
✓ Pass: Ensure Prometheus metrics are not anonymously exposed
At the end of the output, cnspec shows a risk score from 0 (no risk) to 100 (highest risk). Failed checks include remediation guidance to help you fix issues.
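If you save scan output to a file, a small filter can list only the failed checks. The sketch below assumes the output lines look like the example above ("✓ Pass:" and "✕ Fail:" prefixes). Real output may differ in spacing or include terminal color codes, and summarize_checks is a hypothetical helper name, not a cnspec feature.

```shell
# Hypothetical helper: summarize saved scan output read from stdin.
# Assumes check lines contain "✓ Pass:" or "✕ Fail:" as in the example above.
summarize_checks() {
  awk '
    /✓ Pass:/ { pass++ }
    /✕ Fail:/ {
      fail++
      sub(/^.*✕ Fail: */, "")   # keep only the check title
      failed[fail] = $0
    }
    END {
      printf "passed=%d failed=%d\n", pass, fail
      for (i = 1; i <= fail; i++) print "  - " failed[i]
    }
  '
}
```

One way to use it is to redirect the scan output to a file (for example, scan.txt) and then run summarize_checks < scan.txt.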
Scan with the Mondoo vLLM Security policy
Mondoo maintains an out-of-the-box Mondoo vLLM Security policy that checks for exposed documentation, metrics, development routes, permissive CORS, and other HTTP posture risks.
Mondoo Platform users: Enable the policy in your space. In the Mondoo Console, go to Findings > Policies, search for "vLLM", and add the policy. All future scans of your vLLM servers automatically evaluate against it. To learn more, read Manage Policies.
Open source users: Pass the policy bundle URL directly to cnspec:
cnspec scan vllm https://vllm.example.com --api-key YOUR_API_KEY \
--policy-bundle https://raw.githubusercontent.com/mondoohq/cnspec/refs/heads/main/content/mondoo-vllm-security.mql.yaml
You can also create your own policies to meet your specific requirements.
Explore a vLLM server
Run cnspec shell vllm https://vllm.example.com --api-key YOUR_API_KEY to open the interactive shell.
Review the server posture summary
cnspec> vllm.server {
usesTls
docsExposed
openapiExposed
metricsExposed
devEndpointsExposed
corsAllowsAnyOrigin
}
Audit per-endpoint probe results
cnspec> vllm.endpoints { path method category present anonymousAccessible requiresAuth }
vllm.endpoints: [
0: {
path: "/v1/models"
method: "GET"
category: "inference"
present: true
anonymousAccessible: true
requiresAuth: false
}
...
]
Check metrics exposure
cnspec> vllm.metrics { prometheusExposed loadEndpointExposed loadTrackingVisible }
Example security checks
Ensure the connection uses TLS
cnspec> vllm.server.usesTls == true
Ensure FastAPI documentation is not anonymously exposed
cnspec> vllm.server.docsExposed == false
Ensure development and profiler routes are not exposed
cnspec> vllm.server.devEndpointsExposed == false
cnspec> vllm.server.profilerEndpointsExposed == false
Ensure CORS does not accept any origin
cnspec> vllm.server.corsAllowsAnyOrigin == false
Ensure Prometheus metrics are not anonymously exposed
cnspec> vllm.metrics.prometheusExposed == false
Find endpoints reachable without authentication
cnspec> vllm.endpoints.where( anonymousAccessible ) { path category }
Learn more
- To learn more about how the MQL query language works, read Write Effective MQL.
- To learn about all the vLLM resources and properties you can query, read the Mondoo vLLM Resource Pack Reference.