SDK Reference

This page documents every public method on the DeepCitation client class and the standalone utility functions exported from the deepcitation package.

For the REST API endpoints, see API Reference. The SDK methods below are wrappers around these endpoints with additional convenience features.


Client Class

Constructor

import { DeepCitation } from "deepcitation";

const dc = new DeepCitation({
  apiKey: process.env.DEEPCITATION_API_KEY,
});

| Option | Type | Required | Description |
| --- | --- | --- | --- |
| apiKey | string | Yes | Your DeepCitation API key. Must start with sk-dc- and be at least 20 characters. |
| apiUrl | string | No | Override the API base URL. Defaults to https://api.deepcitation.com. Must use HTTPS (except http://localhost for local development). The client throws ValidationError at construction if this constraint is not met. |
| maxRetries | number | No | Maximum retries for transient network failures (connection drops, DNS errors). Uses exponential backoff with jitter: 2^(attempt-1) * 100ms ± 10%, capped at 16s. Does not retry HTTP error responses (4xx/5xx). Default: 3. |
| requestSource | string | No | Tag identifying request origin (e.g. "playground"). Sent as X-Request-Source header. |
| onLatestVersion | (latestVersion: string) => void | No | Callback invoked when the API responds with a latest SDK version header. Useful for detecting when a newer SDK version is available. |
| convertedPdfDownloadPolicy | "url_only" \| "always" \| "never" | No | Controls when converted PDF download URLs are included in file responses. "url_only" (default) includes a signed download URL; "always" includes both URL and raw bytes; "never" omits the converted PDF entirely. |
| logger | DeepCitationLogger | No | Custom logger for SDK internals. Implements optional debug, info, warn, and error methods. Pass console to log to stdout. |
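
The maxRetries backoff formula in the table above can be sketched as a small function. This is an illustration rather than the SDK's internal code: retryDelayMs is a hypothetical helper, and applying the 16s cap after jitter (rather than before) is an assumption.

```typescript
// Sketch of the documented delay: 2^(attempt-1) * 100ms ± 10%, capped at 16s.
// retryDelayMs is a hypothetical helper, not an SDK export.
const BASE_DELAY_MS = 100;
const MAX_DELAY_MS = 16000;
const JITTER_RATIO = 0.1;

function retryDelayMs(attempt: number, random: () => number = Math.random): number {
  if (Math.floor(attempt) !== attempt || attempt < 1) {
    throw new RangeError("attempt must be a positive integer");
  }
  const exponential = BASE_DELAY_MS * Math.pow(2, attempt - 1);
  // Map random() in [0, 1) onto a jitter factor in [0.9, 1.1).
  const jitterFactor = 1 + (random() * 2 - 1) * JITTER_RATIO;
  // Assumption: the cap is applied after jitter, so no delay exceeds 16s.
  return Math.min(MAX_DELAY_MS, Math.round(exponential * jitterFactor));
}

// Delays for the first five attempts with the jitter pinned to its midpoint.
const delaySchedule = [1, 2, 3, 4, 5].map((attempt) => retryDelayMs(attempt, () => 0.5));
console.log(delaySchedule);
```

With the jitter pinned to its midpoint, the first five delays are 100, 200, 400, 800, and 1600 ms. The cap only comes into play from the ninth attempt onward, which matters only if maxRetries is raised well above its default of 3.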

File Preparation

prepareAttachments(files)

Upload one or more files and extract text with line IDs for LLM prompts. This is the primary method for preparing source documents.

const { fileDataParts, deepTextPagesByAttachmentId } = await dc.prepareAttachments([
  { file: pdfBuffer, filename: "report.pdf" },
  { file: imageBuffer, filename: "chart.png" },
]);

const attachmentId = fileDataParts[0].attachmentId;

| Parameter | Type | Description |
| --- | --- | --- |
| files | FileInput[] | Array of { file, filename } objects. file can be File, Blob, or Buffer. |

Returns: PrepareAttachmentsResult { fileDataParts: PreparedAttachment[], deepTextPagesByAttachmentId: Record<string, string[]> }


uploadFile(file, options?)

Upload a single file. Lower-level than prepareAttachments — use when you need fine-grained control over individual uploads.

const result = await dc.uploadFile(pdfBuffer, {
  filename: "report.pdf",
  attachmentId: "custom-id-123",
  endUserId: "user-456",
});

| Parameter | Type | Description |
| --- | --- | --- |
| file | File \| Blob \| Buffer | The file to upload |
| options.filename | string | Override filename |
| options.attachmentId | string | Custom attachment ID (auto-generated if omitted) |
| options.endUserId | string | Your end-user identifier for usage attribution |

Returns: UploadFileResponse — includes canonical deepTextPages


prepareUrl(options)

Convert a web page or hosted document to PDF and prepare it for verification.

const result = await dc.prepareUrl({
  url: "https://example.com/article",
  filename: "article.pdf",
  skipCache: false,
});

| Parameter | Type | Description |
| --- | --- | --- |
| options.url | string | URL of the web page or document |
| options.filename | string | Custom filename for the converted document |
| options.attachmentId | string | Custom attachment ID |
| options.skipCache | boolean | Force fresh conversion, bypass URL cache (default: false) |
| options.endUserId | string | Your end-user identifier |

Returns: UploadFileResponse (includes deepTextPages, urlSource, and urlCache fields)


convertToPdf(input)

Convert an Office document (DOCX, XLSX, PPTX) to PDF without preparing it for verification.

const { downloadUrl } = await dc.convertToPdf({
  url: "https://example.com/report.docx",
});

| Parameter | Type | Description |
| --- | --- | --- |
| input | ConvertFileInput \| string | URL or conversion options |

Returns: ConvertFileResponse


prepareConvertedFile(options)

Prepare a previously converted PDF for citation verification.

const result = await dc.prepareConvertedFile({
  convertedFileUrl: downloadUrl,
  filename: "report.pdf",
});

Returns: UploadFileResponse (includes deepTextPages)


Citation Verification

verify(input, citations?)

Convenience wrapper that parses citations from raw LLM output, groups them by attachment, and verifies each group.

const { verifications } = await dc.verify({
  llmOutput: response.content,
  outputImageFormat: "avif",
});

| Parameter | Type | Description |
| --- | --- | --- |
| input.llmOutput | string | Raw LLM output containing [N] markers and <<<CITATION_DATA>>> block |
| input.outputImageFormat | "avif" \| "jpeg" \| "png" | Proof image format (default: "avif") |
| input.fileDataParts | Array<{ attachmentId: string; filename?: string }> | File metadata for Zero Data Retention / post-expiry scenarios |
| input.endUserId | string | Your end-user identifier |
| citations | Record<string, Citation> | Pre-parsed citations (if omitted, parsed from llmOutput) |

Returns: VerifyCitationsResponse { verifications: Record<string, Verification> }

verify() calls getAllCitationsFromLlmOutput() internally. Use verifyAttachment() when you extract and manage citations yourself.


verifyAttachment(attachmentId, citations, options?)

Verify explicit citations against a specific attachment. Use this when you manage citation extraction yourself.

const citations = getAllCitationsFromLlmOutput(response.content);
const { verifications } = await dc.verifyAttachment(attachmentId, citations, {
  outputImageFormat: "avif",
});

| Parameter | Type | Description |
| --- | --- | --- |
| attachmentId | string | The attachment ID from prepareAttachments() |
| citations | CitationInput | Map of citation keys to Citation objects |
| options.outputImageFormat | "avif" \| "jpeg" \| "png" | Proof image format (default: "avif") |
| options.endUserId | string | Your end-user identifier |

Returns: VerifyCitationsResponse { verifications: Record<string, Verification> }


Batch & Iterative Verification

verifyBatch(citations, options?)

Verify citations across multiple attachments in a single request. Each citation must include an attachmentId field. Use this when you have citations from multiple documents and want a single API call.

const allCitations = getAllCitationsFromLlmOutput(response.content);
const { verifications } = await dc.verifyBatch(allCitations, {
  outputImageFormat: "avif",
});

| Parameter | Type | Description |
| --- | --- | --- |
| citations | Record<string, Citation> | Citations keyed by citation key. Each citation must include an attachmentId field. |
| options.outputImageFormat | "avif" \| "jpeg" \| "png" | Proof image format (default: "avif") |
| options.endUserId | string | Your end-user identifier |

Returns: VerifyCitationsResponse { verifications: Record<string, Verification> }

verifyBatch() is equivalent to calling verifyAttachment() per attachment but batches them into one network round-trip. Use verify() for the simplest case — it calls verifyBatch() internally after parsing citations from raw LLM output.
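
To make the "per attachment" grouping concrete, here is a minimal sketch of splitting a citation map by attachmentId. CitationLike and groupByAttachment are hypothetical stand-ins (the SDK exports groupCitationsByAttachmentId for this purpose), and filing citations without an attachmentId under "" is an assumption.

```typescript
// Hypothetical minimal citation shape; the SDK's Citation type has more fields.
interface CitationLike {
  attachmentId?: string;
  sourceMatch?: string;
}

function groupByAttachment<T extends CitationLike>(
  citations: Record<string, T>,
): Map<string, Record<string, T>> {
  const groups = new Map<string, Record<string, T>>();
  for (const [key, citation] of Object.entries(citations)) {
    // Assumption: citations without an attachmentId are grouped under "".
    const attachmentId = citation.attachmentId ?? "";
    const group = groups.get(attachmentId) ?? {};
    group[key] = citation;
    groups.set(attachmentId, group);
  }
  return groups;
}

const groupedCitations = groupByAttachment({
  a1: { attachmentId: "att-1", sourceMatch: "revenue grew 12%" },
  b2: { attachmentId: "att-2", sourceMatch: "net income" },
  c3: { attachmentId: "att-1", sourceMatch: "operating margin" },
});

// Each entry corresponds to one verifyAttachment()-style call;
// verifyBatch() would send all of the groups in a single request.
groupedCitations.forEach((group, attachmentId) => {
  console.log(attachmentId, Object.keys(group));
});
```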


verifyIterative(attachmentId, citations, options)

Verify citations with iterative refinement. Each failed citation is retried up to maxAttempts times using the correction callback, letting you fix citations programmatically instead of discarding them.

const { verifications } = await dc.verifyIterative(
  attachmentId,
  citations,
  {
    maxAttempts: 3,
    outputImageFormat: "avif",
    onAttemptComplete: async (attempt, history, citationKey) => {
      if (attempt.status === "not_found") {
        // Shorten the source match and retry
        const original = citations[citationKey];
        return { ...original, sourceMatch: original.sourceMatch.split(" ").slice(0, 2).join(" ") };
      }
      return null; // stop retrying
    },
  },
);

| Parameter | Type | Description |
| --- | --- | --- |
| attachmentId | string | The attachment ID from prepareAttachments() |
| citations | CitationInput | Map of citation keys to Citation objects |
| options.maxAttempts | number | Maximum verification passes per citation. Default: 3. |
| options.outputImageFormat | "avif" \| "jpeg" \| "png" | Proof image format (default: "avif") |
| options.endUserId | string | Your end-user identifier |
| options.onAttemptComplete | (attempt: LlmSearchAttempt, history: LlmSearchAttempt[], citationKey: string) => Promise<Citation \| { citation: Citation; isFalsePositiveRejection?: boolean } \| null \| undefined> | Called after each non-terminal attempt. Return an amended Citation to retry, { citation, isFalsePositiveRejection: true } to flag a false positive, or null/undefined to stop. |

Returns: VerifyCitationsResponse { verifications: Record<string, Verification> }
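
The retry semantics described above can be illustrated with a local simulation. Everything in this sketch is a hypothetical stand-in: verification is replaced by a plain substring search, the callback is synchronous for brevity (the SDK's is async), and "non-terminal" is assumed to mean an attempt that neither succeeded nor used up maxAttempts.

```typescript
// Hypothetical stand-ins used only to illustrate the control flow.
interface SketchCitation {
  sourceMatch: string;
}
interface SketchAttempt {
  status: "found" | "not_found";
  sourceMatch: string;
}
type CorrectionCallback = (
  attempt: SketchAttempt,
  previous: SketchAttempt[],
) => SketchCitation | null;

function verifyWithRetries(
  citation: SketchCitation,
  sourceText: string,
  maxAttempts: number,
  onAttemptComplete: CorrectionCallback,
): SketchAttempt[] {
  const attempts: SketchAttempt[] = [];
  let current = citation;
  for (let i = 0; i < maxAttempts; i++) {
    // Stand-in for one verification pass: a plain substring search.
    const found = sourceText.indexOf(current.sourceMatch) !== -1;
    const attempt: SketchAttempt = {
      status: found ? "found" : "not_found",
      sourceMatch: current.sourceMatch,
    };
    attempts.push(attempt);
    // Terminal attempts (success, or the last allowed pass) skip the callback.
    if (found || i === maxAttempts - 1) break;
    const amended = onAttemptComplete(attempt, attempts);
    if (amended === null) break; // the callback declined to retry
    current = amended;
  }
  return attempts;
}

const attemptLog = verifyWithRetries(
  { sourceMatch: "revenue grew sharply" },
  "Total revenue grew by 12% year over year.",
  3,
  (attempt) => {
    // Drop the last word on each retry; give up once a single word remains.
    const words = attempt.sourceMatch.split(" ");
    return words.length > 1 ? { sourceMatch: words.slice(0, -1).join(" ") } : null;
  },
);
console.log(attemptLog);
```

In this run the first pass misses, the callback shortens the match to "revenue grew", and the second pass succeeds, so the log holds two attempts.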


Attachment Management

getAttachment(attachmentId, options?)

Retrieve full attachment metadata including page renders, verifications, and extracted text.

const attachment = await dc.getAttachment("abc123");

Returns: AttachmentResponse (includes canonical deepTextPages)


deleteAttachment(attachmentId)

Permanently delete an attachment and all associated data. Irreversible.

const { deleted } = await dc.deleteAttachment("abc123");

Returns: DeleteAttachmentResponse { attachmentId, deleted: true }


extendExpiration(options)

Extend the expiration date of an attachment.

const { expiresAt } = await dc.extendExpiration({
  attachmentId: "abc123",
  duration: "year", // "month" or "year"
});

| Parameter | Type | Description |
| --- | --- | --- |
| options.attachmentId | string | The attachment to extend |
| options.duration | "month" \| "year" | Extension period (30 or 365 days) |

Returns: ExtendExpirationResponse { attachmentId, expiresAt, previousExpiresAt }


Standalone Utility Functions

These functions are imported directly from deepcitation — they don’t require a client instance.

Citation Parsing

Core Parsing

import {
  getAllCitationsFromLlmOutput,
  parseCitationResponse,
  parseCitationData,
  citationDataToCitation,
  groupCitationsByAttachmentId,
  groupCitationsByAttachmentIdObject,
  getCitationKey,
  normalizeCitationType,
  extractCitationsFromMarkers,
} from "deepcitation";

| Function | Signature | Description |
| --- | --- | --- |
| getAllCitationsFromLlmOutput | (llmOutput: string) => Record<string, Citation> | Parse <<<CITATION_DATA>>> block from LLM output. Returns {} on failure; never throws. |
| parseCitationResponse | (llmOutput: string) => ParsedCitationResult | Parse LLM output into { visibleText, citations, markerMap } for rendering. |
| parseCitationData | (json: unknown) => CitationData \| null | Parse and validate a single raw citation data object. Returns null if the object lacks a required id field. |
| citationDataToCitation | (data: CitationData) => Citation | Convert a parsed CitationData object to a Citation for verification. |
| groupCitationsByAttachmentId | (citations: Record<string, Citation>) => Map<string, Record<string, Citation>> | Group citations by their attachmentId for per-attachment verification. Returns a Map. |
| groupCitationsByAttachmentIdObject | (citations: Record<string, Citation>) => Record<string, Record<string, Citation>> | Same as groupCitationsByAttachmentId but returns a plain object instead of a Map. |
| getCitationKey | (citation: Citation) => string | Generate a stable 16-char hash key for a citation. Used as the dictionary key in verification results. |
| normalizeCitationType | (type: string) => CitationType | Normalize a raw type string (e.g. "audio", "video") to a CitationType value. |
| extractCitationsFromMarkers | (text: string) => Record<string, Citation> | Extract citations from LLM output that has only [N] markers but no <<<CITATION_DATA>>> block. Uses the surrounding sentence as sourceContext. |

Text Manipulation

import {
  stripCitations,
  replaceCitationMarkers,
  extractVisibleText,
  hasCitationData,
  getCitationMarkerIds,
} from "deepcitation";

| Function | Signature | Description |
| --- | --- | --- |
| stripCitations | (llmResponse: string) => string | Remove all citation artifacts from LLM output: strips both [N] markers and the <<<CITATION_DATA>>> block. Returns clean readable text. |
| replaceCitationMarkers | (text: string, options?) => string | Replace [N] markers with custom content. Pass a replacer function, showSourceMatch: true to substitute anchor text, or showVerificationStatus: true to append status indicators. Default behavior strips markers. |
| extractVisibleText | (llmResponse: string) => string | Split the LLM output at the <<<CITATION_DATA>>> delimiter and return only the visible portion above it. Does not strip [N] markers. |
| hasCitationData | (text: string) => boolean | Check whether a string contains a <<<CITATION_DATA>>> block. |
| getCitationMarkerIds | (text: string) => number[] | Return all citation marker IDs found in text, in order of appearance. Handles both [N] and [anchor](cite:N) formats. |
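
The difference between splitting at the delimiter and stripping markers is easiest to see side by side. The functions below are hypothetical stand-ins written for illustration, not the SDK's implementations. They handle only numeric [N] markers, and trimming trailing whitespace from the visible text is an assumption.

```typescript
const DATA_DELIMITER = "<<<CITATION_DATA>>>"; // value documented under Constants

// Keep only the text above the citation data block; [N] markers are left in place.
function visibleTextSketch(llmOutput: string): string {
  const index = llmOutput.indexOf(DATA_DELIMITER);
  const visible = index === -1 ? llmOutput : llmOutput.slice(0, index);
  return visible.replace(/\s+$/, ""); // assumption: trailing whitespace is dropped
}

// Remove numeric [N] markers together with the whitespace directly before them.
function stripMarkersSketch(text: string): string {
  return text.replace(/\s*\[\d+\]/g, "");
}

// Collect marker IDs in order of appearance.
function markerIdsSketch(text: string): number[] {
  const ids: number[] = [];
  const pattern = /\[(\d+)\]/g;
  let match = pattern.exec(text);
  while (match !== null) {
    ids.push(Number(match[1]));
    match = pattern.exec(text);
  }
  return ids;
}

const sampleOutput = [
  "Revenue grew 12% [1] while costs fell [2].",
  "",
  DATA_DELIMITER,
  "(citation JSON omitted)",
  "<<<END_CITATION_DATA>>>",
].join("\n");

const visibleSample = visibleTextSketch(sampleOutput);
console.log(visibleSample); // markers kept
console.log(stripMarkersSketch(visibleSample)); // markers removed
console.log(markerIdsSketch(visibleSample));
```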

Type Guards

import {
  isDocumentCitation,
  isUrlCitation,
  isAudioVideoCitation,
} from "deepcitation";

| Function | Signature | Description |
| --- | --- | --- |
| isDocumentCitation | (c: Citation) => c is DocumentCitation | Narrow a Citation union to DocumentCitation (type "document"). |
| isUrlCitation | (c: Citation) => c is UrlCitation | Narrow a Citation union to UrlCitation (type "url"). |
| isAudioVideoCitation | (c: Citation) => c is AudioVideoCitation | Narrow a Citation union to AudioVideoCitation (type "audio" or "video"). |
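
The narrowing these guards provide can be sketched with a trimmed-down union. Only the type discriminants ("document", "url", "audio", "video") come from the table above. The interfaces, their other fields, and the guard names below are hypothetical.

```typescript
// Hypothetical shapes: only the `type` discriminants are taken from the table above.
interface DocCitationSketch {
  type: "document";
  pageNumber: number;
}
interface UrlCitationSketch {
  type: "url";
  url: string;
}
interface AvCitationSketch {
  type: "audio" | "video";
  startTime: string;
}
type CitationSketch = DocCitationSketch | UrlCitationSketch | AvCitationSketch;

function isDocSketch(c: CitationSketch): c is DocCitationSketch {
  return c.type === "document";
}
function isUrlSketch(c: CitationSketch): c is UrlCitationSketch {
  return c.type === "url";
}
function isAvSketch(c: CitationSketch): c is AvCitationSketch {
  return c.type === "audio" || c.type === "video";
}

// After each guard, TypeScript allows access to fields specific to that variant.
function describeCitation(c: CitationSketch): string {
  if (isDocSketch(c)) return `page ${c.pageNumber}`;
  if (isUrlSketch(c)) return c.url;
  if (isAvSketch(c)) return `at ${c.startTime}`;
  throw new Error("unreachable: all variants handled");
}

console.log(describeCitation({ type: "document", pageNumber: 3 }));
console.log(describeCitation({ type: "video", startTime: "00:01:05.000" }));
```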

Prompt Wrapping

import { wrapCitationPrompt, wrapSystemCitationPrompt } from "deepcitation/prompts";

| Function | Signature | Description |
| --- | --- | --- |
| wrapCitationPrompt | (options: WrapCitationPromptOptions) => WrapCitationPromptResult | Wrap both system and user prompts with citation instructions. Returns { enhancedSystemPrompt, enhancedUserPrompt }. |
| wrapSystemCitationPrompt | (options: WrapSystemPromptOptions) => string | Wrap only the system prompt. Use when you manage user prompt construction yourself. |

See Prompts for details on what these functions inject.

Verification Helpers

import { getCitationStatus, validateUploadFile } from "deepcitation";

| Function | Signature | Description |
| --- | --- | --- |
| getCitationStatus | (verification: Verification) => CitationStatus | Derive UI status (isVerified, isPartialMatch, isMiss, isPending) from a verification result. |
| validateUploadFile | (file: unknown) => { valid: boolean, error?: string } | Validate a file before uploading (checks size, type). |

Rendering

prepareCitations(input, options?)

Convert raw LLM output into a normalized intermediate representation (IR) that any rendering adapter can consume. This is the formal boundary between parsing/verification logic and custom renderers — prepare once, render to multiple formats.

import { prepareCitations, type CitationAdapter } from "deepcitation";

const ir = prepareCitations(llmOutput, {
  verifications,
  sourceLabels: { [attachmentId]: "Annual Report 2024" },
});

// Walk the IR to build custom output
let output = "";
for (const seg of ir.segments) {
  if (seg.type === "text") {
    output += seg.text;
  } else {
    // seg.type === "citation"
    const citation = ir.citations.find(c => c.citationNumber === seg.citationNumber);
    // renderCitationBadge stands for your own rendering function (it is not imported from the SDK here)
    if (citation) output += renderCitationBadge(citation);
  }
}

| Parameter | Type | Description |
| --- | --- | --- |
| input | string \| ParsedCitationResult | Raw LLM output string, or a pre-parsed result from parseCitationResponse(). |
| options.verifications | VerificationRecord | Verification results keyed by citation key. Populates isVerified, isPartialMatch, etc. on each citation in the IR. |
| options.sourceLabels | Record<string, string> | Display labels keyed by attachmentId (use "" for URL citations). Pre-resolved onto each ResolvedCitation in the IR. |

Returns: CitationIR { segments: ReadonlyArray<TextSegment \| CitationSegment>, citations: ReadonlyArray<ResolvedCitation> }

| Type | Description |
| --- | --- |
| CitationIR | The normalized IR: { segments, citations }. Adapters consume this; they never call parseCitationResponse directly. |
| CitationAdapter<TOptions, TOutput> | A pure function (ir: CitationIR, options?: TOptions) => TOutput. Implement this to target a new render format (email, PDF, Notion, etc.). |
| ResolvedCitation | A citation with verification status and a pre-computed sourceLabel from the fallback chain. |
| PrepareCitationsOptions | Input options for prepareCitations(): { verifications?, sourceLabels? }. |
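
As an illustration of the adapter idea, here is a minimal plain-text adapter written against local mirrors of the IR. The Sketch* types are hypothetical and include only the fields mentioned on this page (type, text, citationNumber, isVerified, sourceLabel). The output format and option names are likewise made up for the example.

```typescript
// Hypothetical, trimmed-down mirrors of the IR shapes described above.
type SketchSegment =
  | { type: "text"; text: string }
  | { type: "citation"; citationNumber: number };

interface SketchResolved {
  citationNumber: number;
  isVerified: boolean;
  sourceLabel: string;
}

interface SketchIR {
  segments: ReadonlyArray<SketchSegment>;
  citations: ReadonlyArray<SketchResolved>;
}

interface PlainTextOptions {
  verifiedMark?: string;
  unverifiedMark?: string;
}

// A pure adapter: (ir, options?) => output, with no parsing and no network calls.
function plainTextAdapter(ir: SketchIR, options: PlainTextOptions = {}): string {
  const verifiedMark = options.verifiedMark ?? "✓";
  const unverifiedMark = options.unverifiedMark ?? "?";
  const byNumber = new Map<number, SketchResolved>();
  ir.citations.forEach((c) => byNumber.set(c.citationNumber, c));

  const body = ir.segments
    .map((seg) => {
      if (seg.type === "text") return seg.text;
      const resolved = byNumber.get(seg.citationNumber);
      const mark = resolved ? (resolved.isVerified ? verifiedMark : unverifiedMark) : "";
      return `[${seg.citationNumber}${mark}]`;
    })
    .join("");

  // Footnote list built from the pre-resolved source labels.
  const notes = ir.citations.map((c) => `[${c.citationNumber}] ${c.sourceLabel}`);
  return notes.length > 0 ? `${body}\n\n${notes.join("\n")}` : body;
}

const renderedText = plainTextAdapter({
  segments: [
    { type: "text", text: "Revenue grew 12%" },
    { type: "citation", citationNumber: 1 },
    { type: "text", text: "." },
  ],
  citations: [{ citationNumber: 1, isVerified: true, sourceLabel: "Annual Report 2024" }],
});
console.log(renderedText);
```

Because the adapter depends only on the IR, the same prepared result could be passed to other adapters (for example an HTML or email renderer) without parsing the LLM output again.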

Display Utilities

These helpers are useful when building custom citation renderers.

import {
  toSuperscript,
  getIndicator,
  humanizeLinePosition,
  INDICATOR_SETS,
  SUPERSCRIPT_DIGITS,
} from "deepcitation";

| Export | Signature / Type | Description |
| --- | --- | --- |
| toSuperscript | (n: number) => string | Convert a number to Unicode superscript characters (e.g. 3 → "³"). |
| getIndicator | (set: IndicatorSet, index: number) => string | Get the indicator character at a given index from an IndicatorSet. Wraps when index exceeds the set length. |
| humanizeLinePosition | (pos: LinePosition) => string | Format a LinePosition as a human-readable string (e.g. "line 5" or "lines 5–7"). |
| INDICATOR_SETS | Record<IndicatorStyle, IndicatorSet> | Predefined indicator sets: "numbers" (1, 2, 3…), "letters" (a, b, c…), "symbols" (†, ‡, §…). |
| SUPERSCRIPT_DIGITS | string[] | Lookup array mapping digits 0–9 to their Unicode superscript equivalents. |
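
The digit mapping and the wrap-around behavior can be sketched as follows. These are hypothetical re-implementations for illustration. Treating an indicator set as an ordered list of strings with a zero-based index is an assumption about IndicatorSet.

```typescript
// Unicode superscript forms of the digits 0 through 9.
const SUPERSCRIPTS_SKETCH = ["⁰", "¹", "²", "³", "⁴", "⁵", "⁶", "⁷", "⁸", "⁹"];

// Map each decimal digit of a non-negative integer to its superscript character.
function superscriptSketch(n: number): string {
  return String(n)
    .split("")
    .map((digit) => SUPERSCRIPTS_SKETCH[Number(digit)] ?? digit)
    .join("");
}

// Assumption: a set is an ordered list and the index is zero-based, wrapping by modulo.
function indicatorSketch(set: ReadonlyArray<string>, index: number): string {
  if (set.length === 0) throw new RangeError("indicator set must not be empty");
  return set[index % set.length] ?? "";
}

const symbolSet = ["†", "‡", "§"];
console.log(superscriptSketch(12));
console.log(indicatorSketch(symbolSet, 4)); // index 4 wraps to position 1
```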

Error Classes

All errors extend DeepCitationError and include code, isRetryable, and statusCode properties.

import {
  AuthenticationError,   // 401/403 — fix the API key
  PaymentRequiredError,  // 402 — add or update payment method
  RateLimitError,        // 429 — rate limit exceeded
  ValidationError,       // 400/404/413 — fix the input
  ServerError,           // 5xx — safe to retry
  NetworkError,          // Network failure — safe to retry
} from "deepcitation";

| Class | Code | Status | Retryable | When thrown |
| --- | --- | --- | --- | --- |
| AuthenticationError | DC_AUTH_INVALID | 401, 403 | No | Missing, invalid, or expired API key |
| PaymentRequiredError | DC_PAYMENT_REQUIRED | 402 | No | Free tier exhausted, spend cap hit, or payment failed. Includes a billingCode field with the server-side billing reason. |
| RateLimitError | DC_RATE_LIMITED | 429 | Yes | API rate limit exceeded |
| ValidationError | DC_VALIDATION_ERROR | 400, 404, 413 | No | Bad input: wrong file format, oversized file, or invalid attachment ID |
| ServerError | DC_SERVER_ERROR | 5xx | Yes | API server error |
| NetworkError | DC_NETWORK_ERROR | n/a | Yes | Timeout, DNS failure, or connection refused |

See Error Handling for retry patterns and the isRetryable flag.
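
The status-to-error mapping in the table can be expressed as a small lookup, which is handy when testing your own retry logic. The codes and retryable flags are copied from the table. The helper functions are hypothetical and not part of the SDK.

```typescript
type ErrorCodeSketch =
  | "DC_AUTH_INVALID"
  | "DC_PAYMENT_REQUIRED"
  | "DC_RATE_LIMITED"
  | "DC_VALIDATION_ERROR"
  | "DC_SERVER_ERROR"
  | "DC_NETWORK_ERROR";

// Map an HTTP status to the documented error code; undefined for statuses the table omits.
function errorCodeForStatus(status: number): ErrorCodeSketch | undefined {
  if (status === 401 || status === 403) return "DC_AUTH_INVALID";
  if (status === 402) return "DC_PAYMENT_REQUIRED";
  if (status === 429) return "DC_RATE_LIMITED";
  if (status === 400 || status === 404 || status === 413) return "DC_VALIDATION_ERROR";
  if (status >= 500 && status <= 599) return "DC_SERVER_ERROR";
  return undefined;
}

// Codes marked "Yes" in the Retryable column.
const RETRYABLE_CODES: ReadonlyArray<ErrorCodeSketch> = [
  "DC_RATE_LIMITED",
  "DC_SERVER_ERROR",
  "DC_NETWORK_ERROR",
];

function isRetryableCode(code: ErrorCodeSketch): boolean {
  return RETRYABLE_CODES.indexOf(code) !== -1;
}

console.log(errorCodeForStatus(503), isRetryableCode("DC_SERVER_ERROR"));
```

Since the client's built-in maxRetries handling covers only network failures, retrying after a 429 or 5xx response is left to application code.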


Constants

Delimiters

import {
  CITATION_DATA_START_DELIMITER,  // "<<<CITATION_DATA>>>"
  CITATION_DATA_END_DELIMITER,    // "<<<END_CITATION_DATA>>>"
  SDK_VERSION,                    // Current SDK version string
} from "deepcitation";

Prompt Format Constants

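The SDK exports four citation prompts in three formats: standard, audio/video, and compact (which has two prompt variants), plus JSON output schemas and reminder strings. Pass these directly to LLM APIs that accept raw system prompt text, or use them to build custom wrapCitationPrompt alternatives.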

import {
  // Standard document format (default)
  CITATION_PROMPT,
  CITATION_JSON_OUTPUT_FORMAT,
  CITATION_REMINDER,

  // Audio/video format (timestamps instead of page/line references)
  AV_CITATION_PROMPT,
  CITATION_AV_JSON_OUTPUT_FORMAT,
  CITATION_AV_REMINDER,

  // Compact format (omits source_context/reasoning — ~80–135 fewer tokens per citation)
  COMPACT_CITATION_PROMPT,
  COMPACT_CITATION_SCENARIO2_PROMPT,
  COMPACT_CITATION_JSON_OUTPUT_FORMAT,
} from "deepcitation";

| Constant | Format | When to use |
| --- | --- | --- |
| CITATION_PROMPT | Standard | Default for document/PDF citations. Includes source_context and reasoning fields. |
| CITATION_JSON_OUTPUT_FORMAT | Standard | JSON schema object for use with structured output APIs (e.g. OpenAI response_format). |
| CITATION_REMINDER | Standard | Short reminder string to append to user prompts. |
| AV_CITATION_PROMPT | Audio/Video | Replaces page_id/line_ids with timestamps (start_time/end_time in HH:MM:SS.SSS). |
| CITATION_AV_JSON_OUTPUT_FORMAT | Audio/Video | JSON schema for AV timestamp citations. |
| CITATION_AV_REMINDER | Audio/Video | Reminder variant that mentions timestamps. |
| COMPACT_CITATION_PROMPT | Compact | Omits source_context and reasoning from LLM output (reconstructed offline via hydrate). Use for latency-sensitive pipelines. Saves ~80–135 tokens per citation. |
| COMPACT_CITATION_SCENARIO2_PROMPT | Compact | Variant for annotating pre-existing user text (text is frozen; only [N] markers are inserted). |
| COMPACT_CITATION_JSON_OUTPUT_FORMAT | Compact | JSON schema for compact citations (n, k, p, l only). |

See Prompts for when to choose each format.

Prompt ID Compression

When attachment IDs are long (e.g. UUIDs), compressing them reduces token count in large prompts.

import { compressPromptIds, decompressPromptIds, type CompressedResult } from "deepcitation";

// Compress: replace full IDs with minimal unique prefixes
const { compressed, prefixMap } = compressPromptIds(promptObject, attachmentIds);

// Decompress: restore full IDs from prefixes
const restored = decompressPromptIds(compressed, prefixMap);

| Export | Signature | Description |
| --- | --- | --- |
| compressPromptIds | <T>(obj: T, ids: string[] \| undefined) => CompressedResult<T> | Replace all occurrences of each ID in obj with its minimal unique prefix. Returns the compressed object and a prefixMap for decompression. Throws if a safe prefix cannot be found. |
| decompressPromptIds | <T>(compressed: T \| string, prefixMap: Record<string, string>) => T \| string | Restore full IDs from a compressed object or string using the prefixMap from compressPromptIds. |
| CompressedResult<T> | { compressed: T, prefixMap: Record<string, string> } | Return type of compressPromptIds. Pass prefixMap to decompressPromptIds to reverse the operation. |
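
To show what "minimal unique prefix" means in practice, here is a string-only sketch. It is not the SDK's algorithm. The helper names, the minimum prefix length of 4, and the direction of the map (prefix to full ID) are all assumptions, and unlike compressPromptIds it does not check whether a prefix collides with other text in the prompt.

```typescript
// For each ID, find the shortest prefix (at least minLength long) that no other ID shares.
function minimalUniquePrefixes(ids: string[], minLength = 4): Record<string, string> {
  const prefixMap: Record<string, string> = {}; // prefix -> full ID (assumed direction)
  for (const id of ids) {
    let chosen: string | undefined;
    for (let len = Math.min(minLength, id.length); len <= id.length; len++) {
      const prefix = id.slice(0, len);
      const clashes = ids.some((other) => other !== id && other.indexOf(prefix) === 0);
      if (!clashes) {
        chosen = prefix;
        break;
      }
    }
    if (chosen === undefined) throw new Error(`No unique prefix found for ${id}`);
    prefixMap[chosen] = id;
  }
  return prefixMap;
}

function escapeRegExp(text: string): string {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Replace every key of `mapping` in one pass so replacements are never re-scanned.
function replaceKeys(text: string, mapping: Record<string, string>): string {
  const keys = Object.keys(mapping).sort((a, b) => b.length - a.length);
  if (keys.length === 0) return text;
  const pattern = new RegExp(keys.map(escapeRegExp).join("|"), "g");
  return text.replace(pattern, (match) => mapping[match] ?? match);
}

function compressTextSketch(text: string, prefixMap: Record<string, string>): string {
  const idToPrefix: Record<string, string> = {};
  for (const [prefix, id] of Object.entries(prefixMap)) idToPrefix[id] = prefix;
  return replaceKeys(text, idToPrefix);
}

function decompressTextSketch(text: string, prefixMap: Record<string, string>): string {
  return replaceKeys(text, prefixMap);
}

const samplePrompt = "Cite 3f2a9c1e-aaaa and 3f2b7d44-bbbb.";
const samplePrefixMap = minimalUniquePrefixes(["3f2a9c1e-aaaa", "3f2b7d44-bbbb"]);
const compressedPrompt = compressTextSketch(samplePrompt, samplePrefixMap);
console.log(compressedPrompt);
console.log(decompressTextSketch(compressedPrompt, samplePrefixMap) === samplePrompt);
```

The sketch throws when one ID is a full prefix of another, since no prefix of the shorter ID can then be unique. That is one concrete case of a safe prefix not existing.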

© 2026 DeepCitation — a product of FileLasso, Inc.