Detecting API Drift from OpenAPI + Docs
A technical deep-dive into how DocsCI diffs your OpenAPI specification against your documentation to detect parameter drift, schema mismatches, and deprecated endpoints — before they reach your users.
The core problem: two sources of truth
Every API-first company eventually ends up with two sources of truth: the OpenAPI spec (or equivalent schema file) that the backend team maintains, and the developer documentation that the docs team writes and publishes. These two documents are supposed to describe the same thing, but they diverge constantly because they're maintained by different people with different tools and different priorities.
When your OpenAPI spec says a parameter is required and your docs say it's optional, one of them is wrong. DocsCI's job is to find the mismatch and tell you about it before a developer spends 45 minutes debugging a 422.
Step 1: Parsing the OpenAPI spec
DocsCI accepts OpenAPI specs in JSON or YAML, versions 2.0 (Swagger) and 3.x. We parse the spec into a normalized internal representation, an endpoint registry in which each entry describes the contract for one operation:
// Placeholder types so this snippet type-checks on its own;
// the real definitions are not shown in this article.
type JSONSchema = Record<string, unknown>;
type ResponseContract = { description: string; schema?: JSONSchema };

// Normalized endpoint contract (internal representation)
type EndpointContract = {
  method: "GET" | "POST" | "PUT" | "DELETE" | "PATCH";
  path: string; // "/users/{id}"
  pathParams: Param[];
  queryParams: Param[];
  requestBody?: {
    required: boolean;
    schema: JSONSchema;
    requiredFields: string[];
  };
  responses: Record<string, ResponseContract>;
  deprecated: boolean;
  tags: string[];
};

type Param = {
  name: string;
  in: "path" | "query" | "header" | "cookie";
  required: boolean;
  schema: JSONSchema;
  description?: string;
};

// Example: POST /users from OpenAPI
const contract: EndpointContract = {
  method: "POST",
  path: "/users",
  pathParams: [],
  queryParams: [],
  requestBody: {
    required: true,
    schema: { type: "object" },
    requiredFields: ["email", "plan"], // ← plan is required
  },
  responses: {
    "201": { description: "User created" },
    "422": { description: "Validation error" },
  },
  deprecated: false,
  tags: ["users"],
};
Step 2: Extracting claims from documentation
Extracting what the documentation claims about an API is harder than parsing a spec. Docs are written in natural language, with code examples, tables, and prose. We use three extraction strategies in combination:
Code example parsing
curl, fetch, and SDK examples are parsed to extract the HTTP method, path, request body fields, and headers. A curl example like `curl -X POST /users -d '{"email":"..."}'` tells us the docs claim email is a valid field for POST /users.
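A minimal sketch of this kind of extraction, assuming a curl command that uses `-X`, `-H`, and a single-quoted JSON `-d` body. `CurlClaims` and `extractCurlClaims` are hypothetical names for illustration, not DocsCI's parser, and a production extractor would need real shell tokenization rather than regular expressions.

```typescript
// Hypothetical claim shape for illustration; not DocsCI's internal type.
type CurlClaims = {
  method: string;
  path: string;
  bodyFields: Record<string, string>; // field name -> JSON type seen in the example
  headers: string[];
};

function jsonTypeOf(value: unknown): string {
  if (value === null) return "null";
  if (Array.isArray(value)) return "array";
  return typeof value;
}

function extractCurlClaims(command: string): CurlClaims | null {
  // Join backslash-continued lines into a single line.
  const flat = command.replace(/\\\s*\n/g, " ");

  // First argument that looks like an http(s) URL or an absolute path.
  const target = flat.match(/\s(https?:\/\/[^\s'"]+|\/[^\s'"]*)/);
  if (!target) return null;
  const path =
    target[1].replace(/^https?:\/\/[^\/]+/, "").replace(/[?#].*$/, "") || "/";

  const data = flat.match(/(?:-d|--data(?:-raw)?)\s+'([^']*)'/);
  const explicit = flat.match(/(?:-X|--request)\s+([A-Za-z]+)/);
  // curl sends POST when -d is given without -X, and GET otherwise.
  const method = explicit ? explicit[1].toUpperCase() : data ? "POST" : "GET";

  const bodyFields: Record<string, string> = {};
  if (data) {
    try {
      const parsed: unknown = JSON.parse(data[1]);
      if (parsed !== null && typeof parsed === "object" && !Array.isArray(parsed)) {
        const obj = parsed as Record<string, unknown>;
        for (const key of Object.keys(obj)) bodyFields[key] = jsonTypeOf(obj[key]);
      }
    } catch {
      // Body is not JSON (for example form-encoded data): leave bodyFields empty.
    }
  }

  const headers: string[] = [];
  const headerPattern = /(?:-H|--header)\s+["']([^"':]+):/g;
  let m: RegExpExecArray | null;
  while ((m = headerPattern.exec(flat)) !== null) headers.push(m[1]);

  return { method, path, bodyFields, headers };
}
```

A field that is absent from `bodyFields` is only evidence that the example does not show it; whether that counts as drift is decided later, when the claims are compared against the spec.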
# Extracted from this curl example:
curl -X POST https://api.example.com/users \
  -H "Content-Type: application/json" \
  -d '{"email":"user@example.com"}'
# Claims extracted:
# - endpoint: POST /users
# - request body field: email (string)
# - required fields: email (only field shown)
# - no mention of: plan
Parameter table extraction
Many API docs have a parameters table with columns like 'Name', 'Type', 'Required', 'Description'. We parse these tables and extract structured claims about each parameter's type and required status.
| Parameter | Type   | Required | Description  |
|-----------|--------|----------|--------------|
| email     | string | Yes      | User email   |
| plan      | string | No       | Subscription |
# ↑ Docs claim: plan is optional
# OpenAPI says: plan is required → DRIFT FINDING
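As a rough illustration of table extraction, the sketch below parses a pipe-delimited Markdown table into per-parameter claims. It assumes the table has a header row with a name-like column and a `Required` column; `ParamClaim`, `parseParamTable`, and the accepted header spellings are assumptions made for this example rather than DocsCI's actual rules.

```typescript
// Hypothetical claim shape for illustration.
type ParamClaim = { name: string; type: string; required: boolean };

// Split "| a | b |" into ["a", "b"].
function splitRow(line: string): string[] {
  return line
    .trim()
    .replace(/^\|/, "")
    .replace(/\|$/, "")
    .split("|")
    .map((cell) => cell.trim());
}

// Index of the first header cell that matches one of the accepted names.
function columnIndex(header: string[], names: string[]): number {
  for (let i = 0; i < header.length; i++) {
    if (names.indexOf(header[i]) !== -1) return i;
  }
  return -1;
}

function parseParamTable(markdown: string): ParamClaim[] {
  const rows = markdown.split("\n").filter((l) => l.trim().charAt(0) === "|");
  if (rows.length < 2) return [];

  const header = splitRow(rows[0]).map((h) => h.toLowerCase());
  const nameCol = columnIndex(header, ["name", "parameter", "field"]);
  const typeCol = columnIndex(header, ["type"]);
  const requiredCol = columnIndex(header, ["required"]);
  if (nameCol === -1 || requiredCol === -1) return [];

  const claims: ParamClaim[] = [];
  for (const row of rows.slice(1)) {
    const cells = splitRow(row);
    // Skip the |---|---| separator row and rows that are too short.
    if (cells.every((c) => /^:?-+:?$/.test(c))) continue;
    if (cells.length <= Math.max(nameCol, requiredCol)) continue;
    claims.push({
      name: cells[nameCol].replace(/`/g, ""),
      type: typeCol === -1 || typeCol >= cells.length ? "unknown" : cells[typeCol],
      required: /^(yes|true|required)$/i.test(cells[requiredCol]),
    });
  }
  return claims;
}
```

Each resulting claim can then be compared with the matching spec parameter; a claim with `required: false` for a field the spec requires is the drift case shown in the table.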
Prose extraction (AI-assisted)
For descriptions like 'The plan field is required when creating a user', we use a lightweight NLP model to extract claims. This catches drift in prose that doesn't appear in tables or code examples.
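The model itself is not shown here, but the shape of a prose claim can be illustrated with a much cruder stand-in. The sketch below uses a regular expression, not an NLP model, to pull "X field is required/optional" statements out of text; `ProseClaim` and `extractProseClaims` are hypothetical names, and the pattern covers only this one phrasing.

```typescript
// Hypothetical claim shape for illustration.
type ProseClaim = { field: string; required: boolean; sentence: string };

function extractProseClaims(text: string): ProseClaim[] {
  const claims: ProseClaim[] = [];
  // Naive sentence split: runs of text ending in ., ! or ?
  const sentences = text.match(/[^.!?]+[.!?]?/g) || [];
  const pattern =
    /(?:the\s+)?`?([A-Za-z_][A-Za-z0-9_]*)`?\s+(?:field|parameter|param)\s+is\s+(not\s+required|required|optional)/i;

  for (const raw of sentences) {
    const sentence = raw.trim();
    const match = sentence.match(pattern);
    if (!match) continue;
    claims.push({
      field: match[1],
      // "optional" and "not required" both count as a claim of optionality.
      required: match[2].toLowerCase() === "required",
      sentence,
    });
  }
  return claims;
}
```

Keeping the source sentence on each claim makes it possible to point a finding back at the exact wording that needs to change.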
Step 3: The diff algorithm
Once we have a normalized spec contract and a normalized set of documentation claims, we run a structured diff. The diff rules, in order of severity:
| Drift type | Example | Severity |
|---|---|---|
| Required param not in docs | OpenAPI: plan required. Docs: no mention of plan | critical |
| Optional in docs, required in spec | Docs: plan is optional. Spec: plan is required | critical |
| Wrong type in docs | Docs: user_id is string. Spec: user_id is integer | critical |
| Deprecated endpoint documented as current | Spec: GET /v1/users deprecated=true. Docs: no deprecation notice | warning |
| Extra field in docs not in spec | Docs show field 'metadata'. Spec doesn't define it | warning |
| Response field not documented | Spec returns 'created_at'. Docs don't mention created_at | info |
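The rules in the table can be sketched as a single comparison function. This is a simplified illustration under stated assumptions: `SpecOperation`, `DocClaims`, `diffOperation`, and the rule identifiers are hypothetical, and the inputs are flattened to one operation with a flat list of fields rather than a full schema.

```typescript
type Severity = "critical" | "warning" | "info";
type Finding = { severity: Severity; rule: string; message: string };

// Hypothetical, flattened view of one operation in the spec.
type SpecOperation = {
  id: string; // e.g. "POST /users"
  deprecated: boolean;
  fields: Record<string, { type: string; required: boolean }>;
  responseFields: string[];
};

// Hypothetical, merged claims the docs make about the same operation.
type DocClaims = {
  mentionsDeprecation: boolean;
  fields: Record<string, { type?: string; required?: boolean }>;
  responseFields: string[];
};

function has(obj: object, key: string): boolean {
  return Object.prototype.hasOwnProperty.call(obj, key);
}

function diffOperation(spec: SpecOperation, docs: DocClaims): Finding[] {
  const findings: Finding[] = [];
  const add = (severity: Severity, rule: string, message: string) =>
    findings.push({ severity, rule, message });

  for (const name of Object.keys(spec.fields)) {
    const s = spec.fields[name];
    if (!has(docs.fields, name)) {
      if (s.required) add("critical", "required-param-missing", `${spec.id}: required "${name}" is not in the docs`);
      continue;
    }
    const d = docs.fields[name];
    if (s.required && d.required === false) {
      add("critical", "optional-in-docs", `${spec.id}: "${name}" is optional in docs, required in spec`);
    }
    if (d.type !== undefined && d.type !== s.type) {
      add("critical", "wrong-type", `${spec.id}: "${name}" is ${d.type} in docs, ${s.type} in spec`);
    }
  }

  for (const name of Object.keys(docs.fields)) {
    if (!has(spec.fields, name)) add("warning", "extra-field", `${spec.id}: "${name}" is not defined in the spec`);
  }

  if (spec.deprecated && !docs.mentionsDeprecation) {
    add("warning", "deprecated-undocumented", `${spec.id}: deprecated in spec, no notice in docs`);
  }

  for (const name of spec.responseFields) {
    if (docs.responseFields.indexOf(name) === -1) {
      add("info", "response-field-undocumented", `${spec.id}: response field "${name}" is not documented`);
    }
  }

  // Most severe findings first.
  const rank: Record<Severity, number> = { critical: 0, warning: 1, info: 2 };
  return findings.sort((a, b) => rank[a.severity] - rank[b.severity]);
}
```

Leaving `required` and `type` optional on the docs side matters: a code example that simply omits a field makes no claim about whether it is required, so only an explicit `required: false` triggers the "optional in docs" rule.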
Step 4: PR comments with AI-generated fixes
For each drift finding, DocsCI generates a precise GitHub PR comment pointing to the exact file and line where the documentation makes the incorrect claim. The comment includes the finding, the OpenAPI source, and an AI-generated suggested fix:
## ⚠️ DocsCI: API Drift — docs/api/users.md:47

**Finding:** POST /users — parameter `plan` is documented as optional but is required in the OpenAPI spec (v2.3.1).

**OpenAPI source:** components/schemas/CreateUserRequest/required[1]

**Suggested fix:**
```diff
- | plan | string | No | Subscription plan |
+ | plan | string | Yes | Subscription plan (required) |
```

Or in prose:
```diff
- The `plan` field is optional.
+ The `plan` field is required. Valid values: `free`, `pro`, `enterprise`.
```
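To make the structure of such a comment concrete, here is a small sketch that renders one finding into a Markdown body with a diff suggestion. It loosely mirrors the layout of the comment above; `DriftFinding` and `renderComment` are hypothetical names, and the suggested replacement line is passed in rather than generated, since the AI step is outside the scope of this example.

```typescript
// Hypothetical finding shape for illustration.
type DriftFinding = {
  file: string;
  line: number;
  endpoint: string; // e.g. "POST /users"
  summary: string;
  specPointer: string; // where in the spec the contract is defined
  before: string; // the documentation line as it is today
  after: string; // the suggested replacement
};

function renderComment(f: DriftFinding): string {
  // Built piecewise so the sketch itself can live inside a fenced block.
  const fence = "`" + "`" + "`";
  return [
    "## DocsCI: API Drift in " + f.file + ":" + f.line,
    "",
    "**Finding:** " + f.endpoint + ": " + f.summary,
    "",
    "**OpenAPI source:** " + f.specPointer,
    "",
    "**Suggested fix:**",
    fence + "diff",
    "- " + f.before,
    "+ " + f.after,
    fence,
  ].join("\n");
}
```

Because the body carries the file and line, the same finding can be attached as an inline review comment at that position or folded into a summary comment.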
Integrating with your CI pipeline
Add your OpenAPI spec URL to the DocsCI step in your GitHub Actions workflow, and drift detection runs automatically on every PR:
- name: Run DocsCI with drift detection
  run: |
    tar czf docs.tar.gz docs/ *.md
    curl -sf -X POST https://snippetci.com/api/runs/queue \
      -H "Authorization: Bearer ${{ secrets.DOCSCI_TOKEN }}" \
      -F "docs_archive=@docs.tar.gz" \
      -F "openapi_url=https://staging-api.example.com/openapi.json" \
      | jq -e '.status == "passed"'
# Drift findings appear as inline PR comments
# Critical findings fail the CI check
# Warnings are reported but don't block merge
Detect drift in your docs today
Connect your OpenAPI spec and get a full drift report in 5 minutes. Free tier available.