FactiveAPI Documentation

The FactiveAPI lets you verify any content for factual accuracy. Submit text, URLs, PDFs, videos, or images and receive structured results with individual claims, verdicts, and source citations.

Quickstart

Get from zero to your first verified claim in under 2 minutes.

1. Get your API key

Create an account at factivelabs.com/register and generate an API key from your dashboard under Account → API Access.

2. Install the Python SDK (optional)

pip install factivelabs

3. Verify your first claim

from factivelabs import FactiveClient

client = FactiveClient(api_key="YOUR_API_KEY")

result = client.verify_text(
    text="The Great Wall of China is visible from space."
)

for claim in result.claims:
    print(f"{claim.verdict}: {claim.text}")
    print(f"  Explanation: {claim.explanation}")
    for source in claim.sources:
        print(f"  Source: {source.url}")

Or use cURL directly:

curl -X POST https://api.factivelabs.com/api/v1/verify \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"content": "The Great Wall of China is visible from space."}'

Tip: Try the API without signing up in the Playground. No API key is required for basic usage.

Authentication

All API requests require a Bearer token in the Authorization header:

Authorization: Bearer fctv_live_sk_abc123...

API keys are prefixed with fctv_live_sk_ for live keys and fctv_test_sk_ for test keys. Keys are hashed (SHA-256) before storage — we never store your key in plaintext.

Generate and manage keys from Account → API Access in your dashboard.
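
If you are calling the API without the SDK, the header can be attached using only Python's standard library. The following is a minimal sketch, not the SDK's implementation: build_verify_request is a hypothetical helper name, and the prefix check simply mirrors the two key prefixes listed above.

```python
import json
import urllib.request

BASE_URL = "https://api.factivelabs.com"
KEY_PREFIXES = ("fctv_live_sk_", "fctv_test_sk_")  # prefixes documented above


def build_verify_request(api_key: str, content: str) -> urllib.request.Request:
    """Build (but do not send) an authenticated POST /api/v1/verify request."""
    if not api_key.startswith(KEY_PREFIXES):
        raise ValueError("API key should start with fctv_live_sk_ or fctv_test_sk_")
    body = json.dumps({"content": content}).encode("utf-8")
    return urllib.request.Request(
        f"{BASE_URL}/api/v1/verify",
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )


# Hypothetical usage; sending requires a real key and network access:
# with urllib.request.urlopen(build_verify_request(key, "Some claim.")) as resp:
#     result = json.load(resp)
```

Building the request separately from sending it makes the header logic easy to unit-test without touching the network.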

Base URL

https://api.factivelabs.com

All endpoints are versioned under /api/v1/. The current version is v1.

Verify Content

The primary endpoint is POST /api/v1/verify. Send content in the request body and receive structured verification results.

Request Body

{
  "content": "Text to verify",
  "content_type": "text",       // see /api/reference#verify for the full 18-type list
  "stream": false,              // Enable SSE streaming
  "async": false,               // Enable async job mode
  "max_claims": 50,             // Max claims to extract (1-200)
  "skip_table_claims": true,    // Skip granular table/statistical claims
  "second_pass": false          // Enable Sonnet second-pass for disputed claims
}

Response

{
  "id": "fc_abc123",
  "status": "complete",
  "input_type": "text",
  "title": "",
  "extracted_text": "The Great Wall of China is visible from space.",
  "claims": [
    {
      "text": "The Great Wall of China is visible from space",
      "verdict": "disputed",
      "summary": "Disputed — not visible to the unaided eye from orbit.",
      "explanation": "This is a common misconception. The Great Wall is not visible to the unaided eye from low Earth orbit...",
      "corrected_text": "The Great Wall of China is not visible from space with the naked eye.",
      "categories": ["science", "geography"],
      "verified_by": "Sonar",
      "skipped": false,
      "context_flags": {"misconception": true},
      "unclear_reason": null,
      "sources": [
        {
          "title": "NASA - Great Wall of China",
          "url": "https://www.nasa.gov/...",
          "domain": "nasa.gov",
          "publisher": "NASA",
          "snippet": "The Great Wall can barely be seen from the shuttle..."
        }
      ],
      "start_offset": 0,
      "end_offset": 45
    }
  ],
  "counts": {"total": 1, "confirmed": 0, "disputed": 1, "inconclusive": 0, "skipped": 0},
  "usage": {
    "claim_count": 1,
    "skipped_claims": 0,
    "cost_usd": 0.01
  }
}

Response Modes

Synchronous (default)

The request blocks until all claims are extracted and verified. Returns the complete result in a single JSON response. Best for short content (under 5,000 characters).

Streaming (SSE)

Set "stream": true to receive Server-Sent Events. The first event delivers the extracted text and document title — display it immediately. Claims and verdicts then stream in as they complete. This lets you build real-time UIs where content appears instantly while fact-checking runs in parallel.

event: text_extracted
data: {"text": "Full extracted content...", "title": "Document Title", "content_type": "pdf", "char_count": 4821}

event: claim_extracted
data: {"claim": {"text": "...", "sentence_text": "...", "sentence_start": 0, "sentence_end": 84, "span_start": 12, "span_end": 78}}

event: claim_verified
data: {"claim": {"text": "...", "verdict": "confirmed", "summary": "...", "skipped": false, "context_flags": {}, "unclear_reason": null, "sources": [...]}, "skipped": false}

event: claim_verified
data: {"claim": {"text": "...", "verdict": "disputed", "summary": "...", "skipped": false, "context_flags": {"misconception": true}, "unclear_reason": null, "sources": [...]}, "skipped": false}

event: highlights
data: {"highlights": [...], "source_length": 1842, "source_hash": "a1b2c3d4e5f6"}

event: complete
data: {"id": "fc_abc123", "status": "complete", "title": "", "claims": [...], "counts": {"total": 5, "confirmed": 3, "disputed": 1, "inconclusive": 1, "skipped": 0}, "usage": {"claim_count": 5, "skipped_claims": 0, "cost_usd": 0.05}}

The text_extracted event is always the first event emitted. It contains the full extracted text from whatever you submitted (PDF, URL, YouTube, etc.) along with the document title. Use it to display content to your users immediately — no need to extract content yourself.

The highlights event is emitted after all claims are verified. It contains pre-resolved positioning regions that map each verdict back to the source text. See Claim Positioning below.
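
Outside the browser, the event stream shown above can be split by hand, since it is plain text. The sketch below is one way to do that in Python. parse_sse is a hypothetical helper, it handles only the event: and data: fields used in the examples above, and it assumes every data payload is JSON.

```python
import json
from typing import Iterable, Iterator, Tuple


def parse_sse(lines: Iterable[str]) -> Iterator[Tuple[str, dict]]:
    """Yield (event_name, payload) pairs from an iterable of SSE lines."""
    event, data = "message", []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line.startswith("event:"):
            event = line[len("event:"):].strip()
        elif line.startswith("data:"):
            data.append(line[len("data:"):].strip())
        elif line == "" and data:
            # A blank line ends the event; multi-line data is joined with "\n".
            yield event, json.loads("\n".join(data))
            event, data = "message", []
    if data:  # stream ended without a trailing blank line
        yield event, json.loads("\n".join(data))


sample = [
    "event: text_extracted",
    'data: {"text": "Hello", "title": "", "content_type": "text", "char_count": 5}',
    "",
    "event: complete",
    'data: {"id": "fc_abc123", "status": "complete"}',
    "",
]
for name, payload in parse_sse(sample):
    print(name, sorted(payload))
```

Because the function takes any iterable of lines, it can be fed from a streaming HTTP response as easily as from a list in a test.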

Asynchronous (polling)

Set "async": true to immediately receive a job ID. Poll GET /api/v1/jobs/{job_id} to check status. Best for large documents and batch processing.

// Submit
POST /api/v1/verify  {"content": "...", "async": true}
// Response (HTTP 202 Accepted):
// {"id": "fc_abc123def456", "status": "queued", "created_at": "2026-05-14T10:30:00Z", "message": "..."}

// Poll
GET /api/v1/jobs/fc_abc123def456

// While processing:
// {
//   "id": "fc_abc123def456",
//   "status": "processing",
//   "progress": {"claims_extracted": 8, "claims_verified": 3, "claims_total": 12},
//   "result": null,
//   "created_at": "2026-05-14T10:30:00Z",
//   "completed_at": null
// }

// When complete:
// {
//   "id": "fc_abc123def456",
//   "status": "complete",
//   "progress": {"claims_extracted": 12, "claims_verified": 12, "claims_total": 12},
//   "result": { "id": "fc_abc123", "claims": [...], "counts": {...}, "usage": {...} },
//   "created_at": "2026-05-14T10:30:00Z",
//   "completed_at": "2026-05-14T10:30:45Z"
// }

Verdicts Explained

Every verified claim receives one of three verdicts:

confirmed — The claim is factually supported by evidence from multiple reliable sources.
disputed — The claim contradicts evidence. The explanation details what's wrong and what the evidence actually says.
inconclusive — Evidence is insufficient, conflicting, or the claim is too subjective to verify definitively.

Each verdict comes with a plain-English explanation and one or more source citations with URLs.

Input Formats

The content_type parameter determines how content is processed. Pick the type that best describes what you're sending — the API uses this to choose the right extraction path.

Web & raw input

text — Plain text. Pass via the content field.
html — Raw HTML. Pass via the content field. We strip tags and extract the article body.
url — Web page URL. Pass via the url field. We fetch and extract the article text.
youtube — YouTube video URL. Pass via the url field. We fetch the transcript and verify claims.
tiktok — TikTok video URL. Pass via the url field. Transcript extraction and verification.

Documents

pdf — Base64-encoded PDF. Pass via the file field. OCR runs automatically on scanned documents.
docx — Base64-encoded DOCX. Pass via the file field. Headings and lists are preserved.
doc — Base64-encoded legacy Word .doc. Pass via the file field.
rtf — Base64-encoded RTF. Pass via the file field.
image — Base64-encoded image. Pass via the file field. We run OCR to extract text, then verify.

Social media pastes

For these types, paste the post text into the content field. The API applies format-aware extraction (handling @ mentions, hashtags, replies, etc.).

twitter — A copy-pasted tweet or thread.
reddit — A copy-pasted Reddit post or comment.
instagram — A copy-pasted Instagram caption or comment.

AI assistant transcripts

For these types, paste the AI assistant's reply (and optionally the prompt) into the content field. The API recognizes assistant-output formatting and prioritizes factual claims over conversational scaffolding.

chatgpt — Pasted ChatGPT output.
claude — Pasted Claude output.
gemini — Pasted Gemini output.
perplexity — Pasted Perplexity answer.
gist — Generic AI-paste handler when the source isn't one of the above.
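
The mapping from content_type to request field described in this section can be captured in a small helper. This is a sketch under the field assignments listed above; build_body and the three set names are hypothetical, and file types are assumed to arrive as raw bytes that still need base64 encoding.

```python
import base64

CONTENT_FIELD = {"text", "html", "twitter", "reddit", "instagram",
                 "chatgpt", "claude", "gemini", "perplexity", "gist"}
URL_FIELD = {"url", "youtube", "tiktok"}
FILE_FIELD = {"pdf", "docx", "doc", "rtf", "image"}


def build_body(content_type: str, value) -> dict:
    """Place `value` in the field that the given content_type expects."""
    if content_type in CONTENT_FIELD:
        return {"content_type": content_type, "content": value}
    if content_type in URL_FIELD:
        return {"content_type": content_type, "url": value}
    if content_type in FILE_FIELD:
        # File types take raw bytes here and are sent base64-encoded.
        encoded = base64.b64encode(value).decode("ascii")
        return {"content_type": content_type, "file": encoded}
    raise ValueError(f"unknown content_type: {content_type!r}")


print(build_body("youtube", "https://www.youtube.com/watch?v=VIDEO_ID"))
```

Rejecting unknown types locally surfaces typos before they turn into a 400 from the API.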

Claim Positioning

When verification completes, the API returns pre-resolved highlight regions that map each verdict back to the source text. This gives you everything you need to build inline highlighting, annotations, or underlines in your UI — without any offset math on your end.

What you get

Each highlight region includes three layers of positioning data:

Matched text (text) — the exact words to highlight. This is the recommended primary locator. Find this string in your rendered content with a simple indexOf() search — no offset arithmetic required. This works regardless of how your content was rendered (markdown, HTML, plain text).
Character offsets (start, end) — exact positions in the source text as submitted to the API. Use these as a fallback when the text search doesn't find a match — extract the substring from your original source text and search for that instead.
Original sentence (answer_span) — the full sentence the claim was extracted from. Useful for context display or as a last-resort search key.

Recommended approach: Use the text field as your primary positioning method. It's rendering-agnostic and works even when your displayed content differs from the raw source (e.g., after markdown rendering, link insertion, or reformatting). Only fall back to start/end offsets when text can't be found.

Divergence detection

The highlights payload also includes source_length and source_hash (MD5 fingerprint of the source text at computation time). Compare these against your local copy to detect whether the text has changed since highlights were computed. If it has, the text field is especially valuable since it doesn't depend on character offsets being correct.
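
A divergence check along these lines could look like the sketch below. It rests on several assumptions that this page does not confirm: that source_length counts characters, that source_hash is the hexadecimal MD5 of the UTF-8 encoded text, and that the hash may be truncated (the examples show 12 characters). highlights_match_source is a hypothetical helper name.

```python
import hashlib


def highlights_match_source(local_text: str, source_length: int, source_hash: str) -> bool:
    """Return True when local_text appears to be the text the highlights were computed on."""
    if not source_hash or len(local_text) != source_length:
        return False
    digest = hashlib.md5(local_text.encode("utf-8")).hexdigest()
    # startswith() tolerates a truncated fingerprint; with a full
    # 32-character hash it behaves as an equality check.
    return digest.startswith(source_hash.lower())


text = "The Great Wall of China is visible from space."
fingerprint = hashlib.md5(text.encode("utf-8")).hexdigest()[:12]
print(highlights_match_source(text, len(text), fingerprint))
```

If the check fails, skip the start/end offsets and rely on the text field alone.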

Overlap resolution

Highlights are pre-resolved server-side: overlapping regions are merged with verdict priority (disputed > inconclusive > confirmed). Disputed regions are never dropped. You'll never receive overlapping regions — however, if you're assembling highlights from multiple SSE events or applying them to a different text than the API processed, you should implement client-side overlap resolution. See Highlight Integration below for a complete example.

Example payload

// In streaming mode, listen for the highlights event:
event: highlights
data: {
  "highlights": [
    {
      "start": 42,
      "end": 108,
      "verdict": "disputed",
      "text": "often celebrated as the only man-made structure visible from space",
      "answer_span": "Few structures capture the imagination ...",
      "claim_index": 2,
      "claim_indices": [2],
      "tooltip": "The Great Wall is not visible from space with the naked eye."
    },
    {
      "start": 215,
      "end": 264,
      "verdict": "confirmed",
      "text": "stretches over 13,000 miles across northern China",
      "answer_span": "The wall stretches over 13,000 miles ...",
      "claim_index": 4,
      "claim_indices": [4],
      "tooltip": "Confirmed by multiple sources."
    }
  ],
  "source_length": 1842,
  "source_hash": "a1b2c3d4e5f6"
}

See the Highlight object in the API Reference for the complete field list.

Highlight Integration

This guide walks through building client-side highlight rendering using the FactiveAPI's positioning data. The approach works in any DOM environment (vanilla JS, React, Vue, etc.).

Step 1: Receive highlights

In streaming mode ("stream": true), highlights arrive as an SSE event after all claims are verified. In synchronous mode, they're included in the response body.

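// SSE streaming: listen for each named event type separately.
// SSE event listeners only fire for events whose `event:` header matches
// the name you pass to addEventListener. The default 'message' listener
// does NOT receive named events like `event: highlights`.
// Note: the browser's EventSource only issues GET requests and cannot set an
// Authorization header, so the URL below is assumed to be a route on your own
// backend that forwards to POST /api/v1/verify with "stream": true.
const evtSource = new EventSource('/api/v1/verify?stream=true', { ... });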

evtSource.addEventListener('text_extracted', (event) => {
  const data = JSON.parse(event.data);
  // data: { text, title, content_type, char_count }
  displaySourceText(data.text, data.title);
});

evtSource.addEventListener('claim_verified', (event) => {
  const data = JSON.parse(event.data);
  // data.claim: { text, verdict, summary, sources, ... }
  renderClaim(data.claim);
});

evtSource.addEventListener('highlights', (event) => {
  const data = JSON.parse(event.data);
  // data: { highlights: [...], source_length, source_hash }
  applyHighlights(data.highlights);
});

evtSource.addEventListener('complete', (event) => {
  const data = JSON.parse(event.data);
  evtSource.close();
});

// Synchronous mode
const result = await fetch('/api/v1/verify', { ... }).then(r => r.json());
applyHighlights(result.highlights);  // available after verification completes

Step 2: Locate highlights in your content

Use the text field to find each highlight's position in your rendered content. Fall back to start/end offsets only when the text search misses.

function locateHighlight(hl, renderedText, rawSourceText) {
  // Primary: search for the exact text in rendered content
  if (hl.text) {
    const idx = renderedText.indexOf(hl.text);
    if (idx !== -1) {
      return { start: idx, end: idx + hl.text.length };
    }
  }

  // Fallback: use start/end offsets against the raw source text,
  // then search for that substring in the rendered content
  if (hl.start != null && hl.end != null && rawSourceText) {
    const expected = rawSourceText.substring(hl.start, hl.end);
    if (expected) {
      const idx = renderedText.indexOf(expected);
      if (idx !== -1) {
        return { start: idx, end: idx + expected.length };
      }
    }
  }

  // If neither works, the content has diverged from what the API processed.
  // Log a warning — don't silently mask the mismatch.
  console.warn('Could not locate highlight:', hl.text?.substring(0, 50));
  return null;
}

Don't use fuzzy matching. If a highlight can't be located, that's a signal that your content has diverged from what the API processed. Fuzzy matching masks these issues. Let mismatches surface so you can diagnose them — typically it means your content was modified between submission and rendering.

Step 3: Resolve overlaps (client-side)

The API pre-resolves overlaps server-side, so you typically won't see them. But if you're combining highlights from multiple requests, or your text search places two highlights on overlapping characters, resolve them with verdict priority: disputed (highest) > inconclusive > confirmed (lowest).

const VERDICT_PRIORITY = {
  disputed: 0,     // highest — always wins
  inconclusive: 1,
  confirmed: 2     // lowest
};

function resolveOverlaps(ranges, textLength) {
  // For each character, keep only the highest-priority verdict
  const charOwner = new Array(textLength);

  for (const range of ranges) {
    const priority = VERDICT_PRIORITY[range.verdict] ?? 99;
    // One shared owner object per range, so the identity check below can
    // tell when consecutive characters belong to the same range
    const owner = { ...range, priority };
    for (let c = range.start; c < range.end; c++) {
      if (!charOwner[c] || priority < charOwner[c].priority) {
        charOwner[c] = owner;
      }
    }
  }

  // Collapse into contiguous, non-overlapping segments
  const segments = [];
  let segStart = -1, segOwner = null;

  for (let c = 0; c <= textLength; c++) {
    const owner = charOwner[c] || null;
    if (owner === segOwner) continue;
    if (segOwner) {
      // Spread first so the clipped start/end override the range's originals
      segments.push({ ...segOwner, start: segStart, end: c });
    }
    segStart = c;
    segOwner = owner;
  }

  return segments;
}

Step 4: Render highlights in the DOM

Walk the container's text nodes with a TreeWalker, split them at segment boundaries, and wrap the target portions in styled <span> elements.

function renderHighlights(container, segments) {
  // Remove previous highlights (unwrap spans, keep text)
  container.querySelectorAll('.hl-span').forEach(span => {
    const parent = span.parentNode;
    while (span.firstChild) parent.insertBefore(span.firstChild, span);
    parent.removeChild(span);
  });
  container.normalize();

  // Collect text nodes with their character offsets
  const walker = document.createTreeWalker(container, NodeFilter.SHOW_TEXT);
  const textNodes = [];
  let offset = 0, node;
  while ((node = walker.nextNode())) {
    textNodes.push({ node, start: offset, end: offset + node.nodeValue.length });
    offset += node.nodeValue.length;
  }

  // Apply segments in reverse order (preserves earlier offsets)
  for (let i = segments.length - 1; i >= 0; i--) {
    const seg = segments[i];

    for (const tn of textNodes) {
      if (tn.end <= seg.start || tn.start >= seg.end) continue;

      const nodeStart = Math.max(seg.start, tn.start) - tn.start;
      const nodeEnd = Math.min(seg.end, tn.end) - tn.start;
      let target = tn.node;

      // Split text node at boundaries
      if (nodeEnd < target.nodeValue.length) target.splitText(nodeEnd);
      if (nodeStart > 0) target = target.splitText(nodeStart);

      // Wrap in highlight span
      const span = document.createElement('span');
      span.className = `hl-span hl-${seg.verdict}`;
      span.dataset.verdict = seg.verdict;
      span.dataset.claimIndex = seg.claimIndex;
      span.title = seg.tooltip || '';
      target.parentNode.insertBefore(span, target);
      span.appendChild(target);

      // No break: a segment that crosses an inline element (e.g. <a> or <em>)
      // continues into the following text node
    }
  }
}

Step 5: Style the highlights

Apply colors that distinguish verdict types at a glance. Here's a minimal CSS starting point:

/* Base highlight */
.hl-span {
  border-radius: 3px;
  padding: 1px 0;
  cursor: pointer;
  transition: opacity 0.15s;
}

/* Disputed — red/pink */
.hl-disputed   { background: rgba(239, 68, 68, 0.18); border-bottom: 2px solid #ef4444; }

/* Inconclusive — amber */
.hl-inconclusive { background: rgba(245, 158, 11, 0.15); border-bottom: 2px solid #f59e0b; }

/* Confirmed — green */
.hl-confirmed  { background: rgba(34, 197, 94, 0.12); border-bottom: 2px solid #22c55e; }

/* Dark mode */
@media (prefers-color-scheme: dark) {
  .hl-disputed   { background: rgba(239, 68, 68, 0.25); }
  .hl-inconclusive { background: rgba(245, 158, 11, 0.22); }
  .hl-confirmed  { background: rgba(34, 197, 94, 0.18); }
}

Putting it all together

Here's the complete flow in ~30 lines:

// 1. Receive highlights from the API (named event, NOT 'message')
evtSource.addEventListener('highlights', (event) => {
  const data = JSON.parse(event.data);

  const container = document.getElementById('article-content');
  const renderedText = container.textContent;
  const rawSource = myOriginalSourceText; // the text you submitted to the API

  // 2. Locate each highlight in rendered content
  const ranges = [];
  for (const hl of data.highlights) {
    const pos = locateHighlight(hl, renderedText, rawSource);
    if (pos) {
      ranges.push({ ...pos, verdict: hl.verdict, tooltip: hl.tooltip,
                     claimIndex: hl.claim_index });
    }
  }

  // 3. Resolve any overlaps
  const segments = resolveOverlaps(ranges, renderedText.length);

  // 4. Render in the DOM
  renderHighlights(container, segments);
});

Framework adapters: This guide uses vanilla JavaScript, but the same pattern works in React (wrap in a useEffect), Vue (use a watch), or any framework. The key insight is that you're operating on the DOM's textContent, not HTML — so the approach is rendering-agnostic.

Text Extraction

The POST /api/v1/extract-text endpoint extracts plain text from any supported input format — no claim extraction, no verification, no billing. Use it to preview document content before running it through verify, or to store extracted text in your own database.

DOCX extraction preserves heading hierarchy (as markdown #/##), bulleted and numbered lists with indent levels, tables as pipe-delimited rows, and text from embedded images via OCR.

Every response includes a title field — a human-readable document title extracted automatically from file metadata, the first heading, or the filename. For YouTube videos, this is the video title. For URLs, the page title.

POST /api/v1/extract-text
{
  "content_type": "docx",
  "file": "UEsDBBQAAAAIAB..."  // base64-encoded DOCX
}

// Response:
{
  "id": "et_a1b2c3d4e5f6",
  "status": "complete",
  "input_type": "docx",
  "text": "# Introduction\n\nThis thesis examines...",
  "title": "Czech Independence and the Chicago School",
  "char_count": 48210,
  "processing_time_ms": 340
}

See the API Reference for the full parameter list and title extraction strategy per content type.

Claim Extraction

The POST /api/v1/extract endpoint runs ProRata's claim extraction engine without the verification step. It decomposes text into individual, atomic claims — each one a self-contained verifiable statement.

This is significantly faster and cheaper than full verification ($0.002/claim vs $0.01/claim), making it ideal for high-volume content decomposition, pre-processing for your own pipeline, or AI agent reasoning.

Request

POST /api/v1/extract

{
  "content": "Einstein failed math. The Great Wall is visible from space.",
  "content_type": "text",
  "max_claims": 100
}

Response

{
  "id": "ex_abc123",
  "status": "complete",
  "claims": [
    {
      "text": "Einstein failed math",
      "sentence": "Einstein failed math.",
      "start_offset": 0,
      "end_offset": 20,
      "filtered": false
    },
    {
      "text": "The Great Wall is visible from space",
      "sentence": "The Great Wall is visible from space.",
      "start_offset": 22,
      "end_offset": 58,
      "filtered": false
    }
  ],
  "claims_count": 2,
  "usage": {
    "claims_extracted": 2,
    "claims_filtered": 0,
    "content_length": 58,
    "cost_usd": 0.004
  }
}

Extract-Text vs Extract vs Verify

Choose the right endpoint for your use case:

Extract-Text (/api/v1/extract-text) — Returns plain text and a document title. No claims, no verification. Free. Use when you need to preview or ingest content from files and URLs.
Extract (/api/v1/extract) — Returns claims only. No verdicts, no sources, no verification. ~2-5x faster, ~5x cheaper than verify. Use when you need to decompose content into claims for your own downstream processing.
Verify (/api/v1/verify) — Returns claims with verdicts (confirmed/disputed/inconclusive), explanations, and source citations. Use when you need end-to-end fact-checking.
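
To compare the endpoints on cost for a given workload, the per-claim prices quoted above can be plugged into a small calculator. This sketch assumes billing is strictly linear per claim with no minimums or volume discounts, which this page does not state; estimate_cost is a hypothetical helper. Decimal is used so that sums of small prices stay exact.

```python
from decimal import Decimal

PRICE_PER_CLAIM = {
    "extract-text": Decimal("0"),     # free
    "extract": Decimal("0.002"),      # USD per extracted claim
    "verify": Decimal("0.01"),        # USD per verified claim
}


def estimate_cost(endpoint: str, claims: int) -> Decimal:
    """Estimated USD cost for `claims` claims on the given endpoint."""
    return PRICE_PER_CLAIM[endpoint] * claims


for endpoint in ("extract-text", "extract", "verify"):
    print(endpoint, estimate_cost(endpoint, 1000))
```

The figures agree with the sample responses on this page: two extracted claims cost 0.004 USD and five verified claims cost 0.05 USD.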

Streaming Input

Fact-check content as it's being generated — LLM streaming output, live transcription, real-time captions, any progressive text source. The pattern: buffer your producer's output until you reach a paragraph break, then POST that paragraph to /api/v1/verify/paragraph/stream. Each call is fully self-contained — no session, no shared state.

Contract — read before integrating

One POST = one complete paragraph. Buffer your tokens locally until you see a paragraph break (\n\n), then POST that paragraph as a single request. Do not stream individual tokens.
Retry on transient failure. A 5xx response or network error should be retried with backoff (recommended: 3 attempts, 0.25s / 0.6s / 1.0s). A 4xx response means the paragraph was rejected and should not be retried.
Each paragraph is independent. Pass earlier paragraphs (or the user's question) as context to disambiguate references like "this," "the company," or "as mentioned above."
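
The retry rule in the contract above can be sketched as follows. The send callable is a hypothetical stand-in for whatever performs the POST and returns (status, body). The code reads the recommendation as one initial try followed by up to three retries, waiting 0.25s, 0.6s and 1.0s before each; that is one interpretation of "3 attempts", not a confirmed specification.

```python
import time
from typing import Callable, Sequence, Tuple


def post_with_retry(
    send: Callable[[], Tuple[int, dict]],
    delays: Sequence[float] = (0.25, 0.6, 1.0),
    sleep: Callable[[float], None] = time.sleep,
) -> Tuple[int, dict]:
    """Call send(); retry on 5xx or network errors, never on 4xx."""
    last_error = None
    for attempt in range(len(delays) + 1):
        try:
            status, body = send()
        except OSError as exc:           # network-level failure
            last_error = exc
        else:
            if status < 500:
                return status, body      # success, or a 4xx the caller must handle
            last_error = RuntimeError(f"server error {status}")
        if attempt < len(delays):
            sleep(delays[attempt])
    raise last_error
```

Passing sleep as a parameter keeps the function testable without real delays.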

Per-paragraph request:

POST /api/v1/verify/paragraph/stream
{
  "text": "First complete paragraph from the LLM.",
  "context": "Optional: the user's question, or earlier paragraphs"
}

SSE events you'll receive (per paragraph):

event: paragraph_claims
data: {"paragraph_id": "pg_abc123def456", "claims": [{"text": "Claim text...", "sentence": "Original sentence containing the claim.", "start_offset": 0, "end_offset": 49}]}

event: verify_result
data: {"text": "Claim text...", "verdict": "confirmed", "summary": "...", "explanation": "...", "sources": [...], "verified_by": "Sonar"}

event: done
data: {"paragraph_id": "pg_abc123def456", "chars_received": 412, "total_claims": 3, "processing_time_ms": 9420}

Recommended client loop:

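import asyncio

async def verify_llm_stream(llm, verify_paragraph, prior_context=""):
    # llm and verify_paragraph are supplied by your application:
    # llm.stream() yields tokens, verify_paragraph() POSTs one paragraph.
    buffer = ""
    tasks = []  # keep references so pending tasks are not garbage-collected
    async for token in llm.stream():
        buffer += token
        while "\n\n" in buffer:
            paragraph, buffer = buffer.split("\n\n", 1)
            if paragraph.strip():
                # Schedule without awaiting; multiple paragraphs verify in parallel
                tasks.append(asyncio.create_task(verify_paragraph(paragraph, prior_context)))

    # Don't forget the trailing buffer once the LLM finishes
    if buffer.strip():
        tasks.append(asyncio.create_task(verify_paragraph(buffer, prior_context)))

    # Wait for every in-flight verification and collect the results
    return await asyncio.gather(*tasks)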

The factivelabs Python SDK's verify_stream(...) helper does this for you automatically — handles buffering, parallel paragraph verification, retries, and event aggregation.

JSON variant: if you don't need progressive UI updates, use POST /api/v1/verify/paragraph instead. Same input, same per-paragraph contract, but you receive a single JSON response with all verified claims.

Use cases: AI chatbot output verification, live meeting transcription, broadcast fact-checking, real-time document editing, voice agent monitoring.

Batch Processing

Submit up to 100 documents in a single request using POST /api/v1/verify/batch:

{
  "items": [
    {"content": "Claim one to verify", "content_type": "text"},
    {"url": "https://example.com/article", "content_type": "url"},
    {"content": "Another claim to check", "content_type": "text"}
  ]
}

Each item is processed independently. Results are returned as an array in the same order. Batch limits vary by plan (Free: 5, Pro: 50, Enterprise: 100).
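
If you have more documents than your plan allows per request, you can split them client-side. The sketch below uses the plan limits quoted above; chunk_items and the lowercase plan keys are hypothetical names chosen for illustration.

```python
from typing import Iterator, List

BATCH_LIMITS = {"free": 5, "pro": 50, "enterprise": 100}  # from the plan limits above


def chunk_items(items: List[dict], plan: str) -> Iterator[dict]:
    """Yield request bodies for POST /api/v1/verify/batch, each within the plan's limit."""
    size = BATCH_LIMITS[plan]
    for start in range(0, len(items), size):
        yield {"items": items[start:start + size]}


docs = [{"content": f"Claim {i}", "content_type": "text"} for i in range(12)]
for body in chunk_items(docs, "free"):
    print(len(body["items"]))
```

Since results come back in input order, concatenating the per-batch result arrays restores the original document order.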

Async Jobs

When you set "async": true, the API returns a job ID immediately. Use this to process large documents without holding a connection open.

GET /api/v1/jobs/{job_id}

// Pending (verification still running):
{
  "id": "fc_abc123def456",
  "status": "processing",
  "progress": {"claims_extracted": 8, "claims_verified": 3, "claims_total": 12},
  "result": null,
  "created_at": "2026-05-14T10:30:00Z",
  "completed_at": null
}

// Complete — the full verify response is in `result`:
{
  "id": "fc_abc123def456",
  "status": "complete",
  "progress": {"claims_extracted": 12, "claims_verified": 12, "claims_total": 12},
  "result": { "id": "fc_abc123", "claims": [...], "counts": {...}, "usage": {...} },
  "created_at": "2026-05-14T10:30:00Z",
  "completed_at": "2026-05-14T10:30:45Z"
}

// Failed (no error field is currently surfaced; status='failed' with result=null):
{"id": "fc_abc123def456", "status": "failed", "progress": {"claims_extracted": 0, "claims_verified": 0, "claims_total": 0}, "result": null, "created_at": "2026-05-14T10:30:00Z", "completed_at": "2026-05-14T10:30:00Z"}

Poll every 2-5 seconds. Jobs expire after 24 hours.
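
A polling loop for the job states above might look like this sketch. fetch_job is a hypothetical callable that performs GET /api/v1/jobs/{job_id} and returns the parsed JSON; the 600-second timeout is an arbitrary client-side choice, not an API limit.

```python
import time
from typing import Callable


def wait_for_job(
    fetch_job: Callable[[str], dict],
    job_id: str,
    interval: float = 3.0,
    timeout: float = 600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Poll until the job is complete or failed; return the final job object."""
    waited = 0.0
    while True:
        job = fetch_job(job_id)
        if job["status"] == "complete":
            return job
        if job["status"] == "failed":
            raise RuntimeError(f"job {job_id} failed")
        if waited >= timeout:
            raise TimeoutError(f"job {job_id} still {job['status']} after {waited:.0f}s")
        sleep(interval)
        waited += interval
```

On success, the verification output is available under the returned object's result key.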

Private Corpus

Fact-check claims against your own documents instead of (or in addition to) the public web. Useful for internal-knowledge verification — product specs, policy docs, compliance handbooks, scientific literature, anything you don't want fact-checked against random web pages.

How the corpus works

Upload documents via POST /api/v1/private-corpus/upload (multipart, 1–10 files per call). Supported types: pdf, docx, doc, rtf, txt, md, html.
Wait for ingestion. Documents move queued → processing → embedding → ready. Poll GET /api/v1/private-corpus to watch the state. Most documents reach ready in 5–30 seconds.
Verify against the corpus by adding "use_private_corpus": true to any /verify or /verify/paragraph request.

Combining corpus with the web

The optional corpus_mode field on POST /api/v1/verify controls fallback behavior. (Note: corpus_mode is not currently supported on POST /api/v1/verify/paragraph; paragraph requests run in corpus_only mode when use_private_corpus: true.)

"corpus_only" (default) — corpus only. Returns inconclusive if no corpus document covers the claim.
"corpus_and_web" — corpus first, fall back to the public web on inconclusive. Useful when the corpus only partially covers the claim's domain.

Improving routing accuracy

The corpus has a one-paragraph scope description — auto-generated from your documents and used to decide whether a claim should be routed to the corpus at all. View or edit it via GET /api/v1/private-corpus/scope and PUT /api/v1/private-corpus/scope. You can also pass a one-off corpus_scope field on individual /verify requests to override the saved description for that call.

Example

// 1. Upload some docs (multipart/form-data)
POST /api/v1/private-corpus/upload
Content-Type: multipart/form-data
files=@policies.pdf
files=@handbook.docx

// 2. Wait for them to be ready
GET /api/v1/private-corpus
// → [{"doc_id":"doc_abc","status":"ready", ...}, ...]

// 3. Verify against the corpus
POST /api/v1/verify
{
  "content": "Our return policy is 30 days for unused items.",
  "use_private_corpus": true,
  "corpus_mode": "corpus_and_web"
}

Documents are isolated by API key prefix — one customer cannot read another customer's corpus. If you're sub-tenanting, isolation is further scoped to (api_key, end_user_id) so each of your end-users gets their own private corpus under your account. See the Private Corpus reference for the full endpoint list including audit logs and document chunk inspection.

Source Blacklist

Exclude specific domains from the web sources we consult. Pass an exclude_domains array on any verify request and we'll filter those domains out of every web search performed during fact-checking — chunks from blacklisted domains never reach the verifier.

How it works

Per request, not stored. The API does not persist your blacklist, so send the list on every request. Your app is responsible for storing each user's preferred exclusions.
Bare hostnames. Use just the domain — no protocol, no path. "reuters.com", not "https://www.reuters.com/".
Subdomains are excluded automatically. Adding "nytimes.com" also excludes "opinion.nytimes.com", "cooking.nytimes.com", etc.
Cap: 100 domains per request. Anything beyond that returns a 400.
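
Because users often paste full URLs, it can help to clean the list before sending it. The sketch below reduces entries to bare hostnames and enforces the cap locally. normalize_domains and is_excluded are hypothetical helpers; stripping a leading "www." is an assumption based on the example above, and is_excluded only mirrors the documented subdomain rule on the client side, it is not the server's matching code.

```python
from urllib.parse import urlsplit

MAX_DOMAINS = 100  # per-request cap documented above


def normalize_domains(entries):
    """Reduce user-supplied entries to unique bare hostnames, preserving order."""
    seen, result = set(), []
    for entry in entries:
        entry = entry.strip().lower()
        # urlsplit only finds a hostname when a scheme or "//" is present.
        host = urlsplit(entry if "//" in entry else "//" + entry).hostname or ""
        if host.startswith("www."):
            host = host[4:]
        if host and host not in seen:
            seen.add(host)
            result.append(host)
    if len(result) > MAX_DOMAINS:
        raise ValueError(f"{len(result)} domains exceeds the cap of {MAX_DOMAINS}")
    return result


def is_excluded(hostname: str, exclude_domains) -> bool:
    """True if hostname equals a listed domain or is one of its subdomains."""
    hostname = hostname.lower()
    return any(hostname == d or hostname.endswith("." + d) for d in exclude_domains)


print(normalize_domains(["https://www.reuters.com/", "Reuters.com", "infowars.com"]))
```

Matching on "." + domain rather than a plain suffix keeps an unrelated host such as notnytimes.com from being treated as a subdomain of nytimes.com.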

Gemini-grounded fallback paths cannot honor this filter. When a claim falls through to the Gemini fallback verifier (rare, for ambiguous cases the primary verifier can't resolve), the blacklist is not applied at that layer. Those claims may still cite blacklisted sources. Check the verified_by field on each claim if you need to know which path ran.

Example

POST /api/v1/verify
{
  "content": "Your claim text here.",
  "exclude_domains": ["reuters.com", "infowars.com", "examplespam.net"]
}

Works the same way on POST /api/v1/verify/paragraph and its streaming variant.

Sub-tenanting

If you're building a multi-user product on top of FactiveLabs, pass an end_user_id on each request to isolate your end-users' data while keeping all activity billed to your account.

How it works

One API key, many sub-tenants. Your single API key identifies your business as the customer. The optional end_user_id identifies which of your users a particular request belongs to.
Per-sub-tenant isolation. Private corpora, audit logs, and corpus scope descriptions are scoped to (api_key, end_user_id). User A cannot see User B's uploads, even though both live under your account.
Rolled-up billing. All sub-tenant activity bills to your account. Per-sub-tenant usage breakdowns are available in your FactiveLabs dashboard.
Optional. Omit end_user_id entirely and your account behaves as a single tenant — the same behavior as before this field existed.

Format

Any stable string you control, up to 128 characters. Most callers use their own internal user ID (a UUID, a numeric primary key, an opaque hash). The value is opaque to FactiveLabs — we don't parse it or look it up against any external system.

Example

// Verify a claim for one of your end-users
POST /api/v1/verify
{
  "content": "...",
  "end_user_id": "user_a8f3c1b2"
}

// Upload a document to that end-user's private corpus
POST /api/v1/private-corpus/upload
X-End-User-Id: user_a8f3c1b2
<multipart body with files>

// Verify against that end-user's corpus
POST /api/v1/verify
{
  "content": "...",
  "use_private_corpus": true,
  "end_user_id": "user_a8f3c1b2"
}

For the private-corpus endpoints, end_user_id is passed as a header (X-End-User-Id) for cleaner ergonomics with multipart uploads. For all other endpoints, it's a JSON body field.
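
The header-versus-body rule can be wrapped in one small helper so callers don't have to remember it. This is a hedged sketch: attach_end_user_id is a hypothetical name, and it assumes every private-corpus endpoint lives under the /api/v1/private-corpus/ prefix (only /upload appears above).

```python
PRIVATE_CORPUS_PREFIX = "/api/v1/private-corpus/"  # assumed common prefix


def attach_end_user_id(path, end_user_id, headers=None, body=None):
    """Return (headers, body) with end_user_id placed where the endpoint expects it."""
    headers = dict(headers or {})  # copy so the caller's dicts are not mutated
    body = dict(body or {})
    if path.startswith(PRIVATE_CORPUS_PREFIX):
        headers["X-End-User-Id"] = end_user_id
    else:
        body["end_user_id"] = end_user_id
    return headers, body
```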

Pick a stable identifier. Once a user's corpus is keyed to an end_user_id, changing it for that same user means they lose access to their existing uploads. Use your immutable internal user ID, not anything mutable like an email address.
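
One way to get a stable, opaque value is to hash your immutable internal ID. The sketch below is illustrative only: end_user_id_for, the "myapp" namespace, and the "user_" prefix are all hypothetical choices, not anything the API requires.

```python
import hashlib

MAX_END_USER_ID_LENGTH = 128  # documented maximum


def end_user_id_for(internal_user_id, namespace="myapp"):
    """Derive an opaque, stable end_user_id from an immutable internal ID."""
    material = f"{namespace}:{internal_user_id}".encode("utf-8")
    digest = hashlib.sha256(material).hexdigest()
    end_user_id = f"user_{digest[:32]}"
    if len(end_user_id) > MAX_END_USER_ID_LENGTH:
        raise ValueError("end_user_id exceeds 128 characters")
    return end_user_id
```

The same input always yields the same output, so the corpus key survives email or username changes. Changing the namespace later would change every derived ID, so treat it as fixed too.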

Error Handling

The API uses standard HTTP status codes. Error responses include a JSON body with details:

{
  "error": {
    "code": "rate_limit_exceeded",
    "message": "You have exceeded the rate limit of 50 requests per minute.",
    "retry_after": 45
  }
}

400 — Bad request (invalid parameters, missing content)
401 — Unauthorized (missing or invalid API key)
403 — Forbidden (plan limit exceeded, account disabled)
404 — Not found (invalid job ID)
413 — Content too large (exceeds plan's character limit)
429 — Rate limit exceeded (check Retry-After header)
500 — Internal server error
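
A client will typically want to retry on 429 and wait as long as the server asks. The sketch below shows one way to do that, under stated assumptions: call_with_retry and its send callable are hypothetical, Retry-After is assumed to be a number of seconds (like the retry_after body field), and retrying 5xx responses is a common convention rather than documented guidance.

```python
import time


def call_with_retry(send, max_attempts=4, sleep=time.sleep):
    """Call send() until it returns a non-retryable response.

    send() is a hypothetical callable returning (status, headers, body),
    where headers is a dict and body is the parsed JSON (or None).
    """
    for attempt in range(1, max_attempts + 1):
        status, headers, body = send()
        retryable = status == 429 or status >= 500
        if not retryable or attempt == max_attempts:
            return status, headers, body
        # Prefer the server's hint; fall back to exponential backoff (1s, 2s, 4s).
        hint = headers.get("Retry-After")
        if hint is None:
            hint = (body or {}).get("error", {}).get("retry_after")
        delay = float(hint) if hint is not None else float(2 ** (attempt - 1))
        sleep(delay)
```

Passing sleep as a parameter keeps the helper easy to test without real delays. Client errors such as 400, 401, 403, 404, and 413 are returned immediately because repeating the same request would not change the outcome.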

Rate Limits

Rate limits are applied per API key. The current limits by plan:

Free: 50 requests/minute, 200,000 character limit per request
Pro: 500 requests/minute, 200,000 character limit per request
Growth: 1,000 requests/minute, 200,000 character limit per request
Scale: 5,000 requests/minute, 200,000 character limit per request
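
Content longer than the per-request limit has to be split before submission. The following is a rough sketch, not an official recommendation: split_for_verify is a hypothetical helper, it assumes the limit is counted the way Python's len() counts characters, and it assumes paragraphs are separated by blank lines.

```python
MAX_CHARS = 200_000  # documented per-request character limit


def split_for_verify(text, limit=MAX_CHARS):
    """Split text into pieces of at most `limit` characters, preferring paragraph breaks."""
    chunks, current = [], ""
    for para in text.split("\n\n"):
        # Hard-split any single paragraph that is itself over the limit.
        pieces = [para[i:i + limit] for i in range(0, len(para), limit)] or [""]
        for piece in pieces:
            candidate = piece if not current else current + "\n\n" + piece
            if len(candidate) <= limit:
                current = candidate
            else:
                chunks.append(current)
                current = piece
    if current:
        chunks.append(current)
    return chunks
```

Each returned chunk can then be sent as its own verify request. Claims that span a chunk boundary may lose context, so splitting at paragraph breaks is preferred over fixed offsets.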

Rate limit information is included in response headers:

X-RateLimit-Limit: 50
X-RateLimit-Remaining: 47
X-RateLimit-Reset: 1711843200
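
These headers can be read after each response to decide whether to pause. A minimal sketch, assuming X-RateLimit-Reset is a Unix timestamp in seconds (the example value is consistent with that) and that headers arrive as a plain dict with the exact casing shown; rate_limit_status is a hypothetical name.

```python
import time


def rate_limit_status(headers, now=None):
    """Summarize the X-RateLimit-* headers from one response."""
    now = time.time() if now is None else now
    limit = int(headers["X-RateLimit-Limit"])
    remaining = int(headers["X-RateLimit-Remaining"])
    # Assumption: the reset value is a Unix timestamp in seconds.
    reset_at = int(headers["X-RateLimit-Reset"])
    return {
        "limit": limit,
        "remaining": remaining,
        "seconds_until_reset": max(0.0, reset_at - now),
        "exhausted": remaining <= 0,
    }
```

When exhausted is true, sleeping for seconds_until_reset before the next call avoids a guaranteed 429.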

Python SDK

The official Python SDK provides a typed, convenient wrapper around the REST API.

pip install factivelabs

Sync client

from factivelabs import FactiveClient

client = FactiveClient(api_key="YOUR_API_KEY")

# Verify text
result = client.verify_text(text="Some claim to check")

# Verify a URL (auto-detects YouTube and TikTok)
result = client.verify_url(url="https://example.com/article")

# Verify a file (PDF, DOCX, image — auto-detected from extension)
result = client.verify_file(file="/path/to/report.pdf")

# Access results
for claim in result.claims:
    print(claim.verdict, claim.text, claim.explanation)

Async client

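import asyncio
from factivelabs import AsyncFactiveClient

async def main():
    client = AsyncFactiveClient(api_key="YOUR_API_KEY")
    return await client.verify_text(text="Some claim to check")

result = asyncio.run(main())  # await is only valid inside a coroutine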
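
The async client is most useful when several checks run concurrently. The sketch below caps the number of in-flight requests to help stay under your plan's rate limit. verify_many is a hypothetical helper that accepts any coroutine function with a text keyword argument, such as the async client's verify_text, and the default of 5 is an arbitrary choice.

```python
import asyncio


async def verify_many(verify, texts, max_concurrency=5):
    """Run verify(text=...) for every text, at most max_concurrency at a time."""
    gate = asyncio.Semaphore(max_concurrency)

    async def one(text):
        async with gate:
            return await verify(text=text)

    # gather() returns results in the same order as the inputs.
    return await asyncio.gather(*(one(t) for t in texts))
```

Inside a coroutine this would be called as results = await verify_many(client.verify_text, ["Claim 1", "Claim 2"]).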

Batch verification

# Submit a batch and block until all jobs finish
submission = client.submit_batch(items=[
    {"text": "Claim 1"},
    {"text": "Claim 2"},
    {"url": "https://example.com/article"},
])
results = client.wait_for_batch(submission)

for r in results:
    for claim in r.claims:
        print(claim.verdict, claim.text)

Private corpus

# Upload a document
client.corpus.upload(files=["/path/to/policies.pdf"])

# List documents and their ingestion status
docs = client.corpus.list()

# Verify against the corpus
result = client.verify_text(
    text="Our return policy is 30 days for unused items.",
    use_private_corpus=True,
    corpus_scope="Internal customer-facing policies and the company handbook.",
)

Streaming paragraphs (LLM output verification)

# verify_stream() handles buffering, parallel paragraph verification,
# retries, and event aggregation for you.
result = client.verify_stream(text_stream=my_llm_stream_iter)
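
For local experiments, my_llm_stream_iter can be replaced with a stand-in generator. This sketch assumes verify_stream() accepts any iterator of str chunks, which the text above implies but does not state; fake_llm_stream is a hypothetical helper.

```python
def fake_llm_stream(text, chunk_size=16):
    """Yield text in small chunks, roughly the way a streaming LLM response arrives."""
    for start in range(0, len(text), chunk_size):
        yield text[start:start + chunk_size]
```

Under that assumption, result = client.verify_stream(text_stream=fake_llm_stream(draft)) would exercise the same code path as a live model stream.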

Not in the SDK yet. The exclude_domains (Source Blacklist) and end_user_id (Sub-tenanting) parameters are currently available only via raw HTTP. Send them as JSON body fields (or as the X-End-User-Id header for private-corpus endpoints). SDK helpers are on the roadmap.
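
Until SDK support lands, both parameters can be sent from Python with the standard library alone. The sketch below builds the request without sending it; build_verify_request is a hypothetical helper, while the URL, headers, and field names are the ones documented above.

```python
import json
import urllib.request

BASE_URL = "https://api.factivelabs.com"


def build_verify_request(api_key, content, exclude_domains=None, end_user_id=None):
    """Build (but do not send) a POST /api/v1/verify request."""
    body = {"content": content}
    if exclude_domains:
        body["exclude_domains"] = list(exclude_domains)
    if end_user_id:
        body["end_user_id"] = end_user_id
    return urllib.request.Request(
        BASE_URL + "/api/v1/verify",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

To send it, pass the returned object to urllib.request.urlopen() and parse the response with json.load(). Note that urlopen raises urllib.error.HTTPError for 4xx and 5xx responses, so error bodies have to be read from the exception.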