Secure vLLM servers with cnspec

Scan vLLM inference servers against security best practices with cnspec.

Scan a vLLM inference server to assess its externally observable HTTP posture. cnspec probes the server's HTTP routes to find anonymously accessible endpoints, exposed documentation and metrics, permissive CORS configuration, and other risks before an attacker finds them.

cnspec probes remote HTTP routes only. Host-local configuration, environment variables, and internode network controls are outside the scope of this scan.
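To make "anonymously accessible" concrete, the following is a minimal sketch of a manual spot check with curl. It is not cnspec's implementation. The helper names `classify_status` and `probe_route`, the labels, and the mapping from status codes to labels are all assumptions made for illustration, and a real server may answer differently (for example, returning 404 for routes it wants to hide).

```shell
# Hypothetical helper (not part of cnspec): label the HTTP status code
# returned for a request sent without credentials.
classify_status() {
  case "$1" in
    2??)     echo "anonymous" ;;      # served without credentials
    401|403) echo "auth-required" ;;  # anonymous callers are rejected
    404)     echo "absent" ;;         # route is not served
    *)       echo "unknown" ;;        # redirects, server errors, or no response (000)
  esac
}

# Send one unauthenticated GET and print "<path> <label>".
# $1 is the base URL, $2 is the route path.
probe_route() {
  code=$(curl -s -o /dev/null -w '%{http_code}' --max-time 10 "$1$2")
  echo "$2 $(classify_status "$code")"
}

# Usage (the paths are examples):
#   probe_route http://localhost:8000 /v1/models
#   probe_route http://localhost:8000 /metrics
```

A check like this covers one route at a time and only the anonymous case. cnspec additionally compares anonymous and authenticated responses when you supply an API key.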

Prerequisites

To scan a vLLM server with cnspec, you must have cnspec installed and network access to the server's HTTP endpoint.

Only scan vLLM endpoints you are authorized to assess. The scan sends HTTP requests to the supplied URL.

Connect to a vLLM server

To test access, open a cnspec shell against the server endpoint:

cnspec shell vllm http://localhost:8000
cnspec> vllm.server { baseUrl reachable version }
vllm.server: {
  baseUrl: "http://localhost:8000"
  reachable: true
  version: "0.6.3"
}

If the server requires authentication, pass a bearer token so cnspec can run authenticated comparison probes:

cnspec shell vllm https://vllm.example.com --api-key YOUR_API_KEY

You can also set the VLLM_API_KEY environment variable to omit the --api-key flag:

export VLLM_API_KEY='YOUR_API_KEY'
cnspec shell vllm https://vllm.example.com

Additional flags:

  • --insecure disables TLS certificate verification.
  • --timeout sets the HTTP request timeout in seconds (default 10).

Scan a vLLM server

To scan a vLLM server:

cnspec scan vllm https://vllm.example.com --api-key YOUR_API_KEY
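If you run several vLLM servers, you can wrap the same command in a loop. The following is a sketch under stated assumptions: `scan_vllm_fleet` and the one-URL-per-line file format are hypothetical, the API key is expected in VLLM_API_KEY, and the function only counts non-zero exit statuses without interpreting what cnspec means by them.

```shell
# Hypothetical helper: scan every URL listed in a file, one per line.
# Blank lines and lines starting with "#" are skipped.
# The API key is read by cnspec from the VLLM_API_KEY environment variable.
scan_vllm_fleet() {
  failed=0
  while IFS= read -r url || [ -n "$url" ]; do
    case "$url" in ''|'#'*) continue ;; esac
    echo "Scanning $url"
    # Redirect stdin so cnspec cannot consume the remaining lines of the URL file.
    cnspec scan vllm "$url" </dev/null || failed=$((failed + 1))
  done < "$1"
  echo "Targets with a non-zero cnspec exit status: $failed"
  [ "$failed" -eq 0 ]
}

# Usage:
#   export VLLM_API_KEY='YOUR_API_KEY'
#   scan_vllm_fleet vllm-targets.txt
```

Only list servers you are authorized to assess, since every entry receives scan traffic.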

Understand scan output

When a scan completes, cnspec prints a summary of all the checks it ran, grouped by policy. Each check shows a pass or fail result. For example:

✓ Pass:  Ensure the connection uses TLS
✕ Fail:  Ensure FastAPI documentation is not anonymously exposed
✓ Pass:  Ensure Prometheus metrics are not anonymously exposed

At the end of the output, cnspec shows a risk score from 0 (no risk) to 100 (highest risk). Failed checks include remediation guidance to help you fix issues.

Scan with the Mondoo vLLM Security policy

Mondoo maintains an out-of-the-box Mondoo vLLM Security policy that checks for exposed documentation, metrics, development routes, permissive CORS, and other HTTP posture risks.

Mondoo Platform users: Enable the policy in your space. In the Mondoo Console, go to Findings > Policies, search for "vLLM", and add the policy. All future scans of your vLLM servers are automatically evaluated against it. To learn more, read Manage Policies.

Open source users: Pass the policy bundle URL directly to cnspec:

cnspec scan vllm https://vllm.example.com --api-key YOUR_API_KEY \
  --policy-bundle https://raw.githubusercontent.com/mondoohq/cnspec/refs/heads/main/content/mondoo-vllm-security.mql.yaml

You can also create your own policies to meet your specific requirements.

Explore a vLLM server

Run cnspec shell vllm https://vllm.example.com --api-key YOUR_API_KEY to open the interactive shell.

Review the server posture summary

cnspec> vllm.server {
    usesTls
    docsExposed
    openapiExposed
    metricsExposed
    devEndpointsExposed
    corsAllowsAnyOrigin
}

Audit per-endpoint probe results

cnspec> vllm.endpoints { path method category present anonymousAccessible requiresAuth }
vllm.endpoints: [
  0: {
    path: "/v1/models"
    method: "GET"
    category: "inference"
    present: true
    anonymousAccessible: true
    requiresAuth: false
  }
  ...
]

Check metrics exposure

cnspec> vllm.metrics { prometheusExposed loadEndpointExposed loadTrackingVisible }

Example security checks

Ensure the connection uses TLS

cnspec> vllm.server.usesTls == true

Ensure FastAPI documentation is not anonymously exposed

cnspec> vllm.server.docsExposed == false

Ensure development and profiler routes are not exposed

cnspec> vllm.server.devEndpointsExposed == false
cnspec> vllm.server.profilerEndpointsExposed == false

Ensure CORS does not accept any origin

cnspec> vllm.server.corsAllowsAnyOrigin == false

Ensure Prometheus metrics are not anonymously exposed

cnspec> vllm.metrics.prometheusExposed == false

Find endpoints reachable without authentication

cnspec> vllm.endpoints.where( anonymousAccessible ) { path category }
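The queries above can also be run from a script instead of the interactive shell. The sketch below assumes your cnspec version provides the `run` subcommand with a `-c` option that takes an MQL query; this page does not document that form, so verify it with `cnspec run --help` first. The function name `list_anonymous_endpoints` is hypothetical.

```shell
# Assumption: "cnspec run <provider> <target> -c <query>" is available in
# this cnspec version. The query is the same one shown for the shell.
# $1 is the server URL; the API key is read from VLLM_API_KEY if set.
list_anonymous_endpoints() {
  cnspec run vllm "$1" -c 'vllm.endpoints.where( anonymousAccessible ) { path category }'
}

# Usage:
#   list_anonymous_endpoints https://vllm.example.com
```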
