SDK Reference

This page covers every public method on the DeepCitation client class and every standalone utility function exported from the deepcitation package.

For the REST API endpoints, see API Reference. The SDK methods below are wrappers around these endpoints with additional convenience features.

Client Class

Constructor

import { DeepCitation } from "deepcitation";

const dc = new DeepCitation({
  apiKey: process.env.DEEPCITATION_API_KEY,
});

Option	Type	Required	Description
`apiKey`	`string`	Yes	Your DeepCitation API key. Must start with `sk-dc-` and be at least 20 characters.
`apiUrl`	`string`	No	Override the API base URL. Defaults to `https://api.deepcitation.com`. Must use HTTPS (except `http://localhost` for local development). The client throws `ValidationError` at construction if this constraint is not met.
`maxRetries`	`number`	No	Maximum retries for transient network failures (connection drops, DNS errors). Uses exponential backoff with jitter: `2^(attempt-1) * 100ms ± 10%`, capped at 16s. Does not retry HTTP error responses (4xx/5xx). Default: `3`.
`requestSource`	`string`	No	Tag identifying request origin (e.g. `"playground"`). Sent as `X-Request-Source` header.
`onLatestVersion`	`(latestVersion: string) => void`	No	Callback invoked when the API responds with a latest SDK version header. Useful for detecting when a newer SDK version is available.
`convertedPdfDownloadPolicy`	`"url_only" | "always" | "never"`	No	Controls when converted PDF download URLs are included in file responses. `"url_only"` (default) includes a signed download URL; `"always"` includes both URL and raw bytes; `"never"` omits the converted PDF entirely.
`logger`	`DeepCitationLogger`	No	Custom logger for SDK internals. Implements optional `debug`, `info`, `warn`, and `error` methods. Pass `console` to log to stdout.
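
The retry schedule described for `maxRetries` can be worked through with a small calculation. The following is a minimal sketch of the documented formula, not the SDK's source: `backoffDelayMs` and its constants are hypothetical names, and applying the 16s cap before the jitter is an assumption.

```typescript
// Hypothetical helper mirroring the documented formula:
// 2^(attempt-1) * 100ms, +/- 10% jitter, capped at 16s.
const BASE_DELAY_MS = 100;
const MAX_DELAY_MS = 16_000;
const JITTER_RATIO = 0.1;

function backoffDelayMs(attempt: number, random: () => number = Math.random): number {
  // attempt is 1-based: 1 -> 100ms, 2 -> 200ms, 3 -> 400ms, ...
  const exponential = 2 ** (attempt - 1) * BASE_DELAY_MS;
  // Assumption: the cap is applied before jitter is added.
  const capped = Math.min(exponential, MAX_DELAY_MS);
  // random() in [0, 1) maps to a jitter factor in [-0.1, +0.1).
  const jitter = capped * JITTER_RATIO * (random() * 2 - 1);
  return Math.round(capped + jitter);
}

// With jitter pinned to zero, the first three retries wait 100, 200, and 400 ms.
const schedule = [1, 2, 3].map((attempt) => backoffDelayMs(attempt, () => 0.5));
```

Under these assumptions, the default of three retries adds well under a second of total waiting before the client gives up.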

File Preparation

`prepareAttachments(files)`

Upload one or more files and extract text with line IDs for LLM prompts. This is the primary method for preparing source documents.

const { fileDataParts, deepTextPagesByAttachmentId } = await dc.prepareAttachments([
  { file: pdfBuffer, filename: "report.pdf" },
  { file: imageBuffer, filename: "chart.png" },
]);

const attachmentId = fileDataParts[0].attachmentId;

Parameter	Type	Description
`files`	`FileInput[]`	Array of `{ file, filename }` objects. `file` can be `File`, `Blob`, or `Buffer`.

Returns: PrepareAttachmentsResult — { fileDataParts: PreparedAttachment[], deepTextPagesByAttachmentId: Record<string, string[]> }

`uploadFile(file, options?)`

Upload a single file. Lower-level than prepareAttachments — use when you need fine-grained control over individual uploads.

const result = await dc.uploadFile(pdfBuffer, {
  filename: "report.pdf",
  attachmentId: "custom-id-123",
  endUserId: "user-456",
});

Parameter	Type	Description
`file`	`File | Blob | Buffer`	The file to upload
`options.filename`	`string`	Override filename
`options.attachmentId`	`string`	Custom attachment ID (auto-generated if omitted)
`options.endUserId`	`string`	Your end-user identifier for usage attribution

Returns: UploadFileResponse — includes canonical deepTextPages

`prepareUrl(options)`

Convert a web page or hosted document to PDF and prepare it for verification.

const result = await dc.prepareUrl({
  url: "https://example.com/article",
  filename: "article.pdf",
  skipCache: false,
});

Parameter	Type	Description
`options.url`	`string`	URL of the web page or document
`options.filename`	`string`	Custom filename for the converted document
`options.attachmentId`	`string`	Custom attachment ID
`options.skipCache`	`boolean`	Force fresh conversion, bypass URL cache (default: `false`)
`options.endUserId`	`string`	Your end-user identifier

Returns: UploadFileResponse (includes deepTextPages, urlSource, and urlCache fields)

`convertToPdf(input)`

Convert an Office document (DOCX, XLSX, PPTX) to PDF without preparing it for verification.

const { downloadUrl } = await dc.convertToPdf({
  url: "https://example.com/report.docx",
});

Parameter	Type	Description
`input`	`ConvertFileInput | string`	URL or conversion options

Returns: ConvertFileResponse

`prepareConvertedFile(options)`

Prepare a previously converted PDF for citation verification.

const result = await dc.prepareConvertedFile({
  convertedFileUrl: downloadUrl,
  filename: "report.pdf",
});

Returns: UploadFileResponse (includes deepTextPages)

Citation Verification

`verify(input, citations?)`

Convenience wrapper that parses citations from raw LLM output, groups them by attachment, and verifies each group.

const { verifications } = await dc.verify({
  llmOutput: response.content,
  outputImageFormat: "avif",
});

Parameter	Type	Description
`input.llmOutput`	`string`	Raw LLM output containing `[N]` markers and `<<<CITATION_DATA>>>` block
`input.outputImageFormat`	`"avif" | "jpeg" | "png"`	Proof image format (default: `"avif"`)
`input.fileDataParts`	`Array<{ attachmentId: string; filename?: string }>`	File metadata for Zero Data Retention / post-expiry scenarios
`input.endUserId`	`string`	Your end-user identifier
`citations`	`Record<string, Citation>`	Pre-parsed citations (if omitted, parsed from `llmOutput`)

Returns: VerifyCitationsResponse — { verifications: Record<string, Verification> }

verify() calls getAllCitationsFromLlmOutput() internally. Use verifyAttachment() when you extract and manage citations yourself.

`verifyAttachment(attachmentId, citations, options?)`

Verify explicit citations against a specific attachment. Use this when you manage citation extraction yourself.

const citations = getAllCitationsFromLlmOutput(response.content);
const { verifications } = await dc.verifyAttachment(attachmentId, citations, {
  outputImageFormat: "avif",
});

Parameter	Type	Description
`attachmentId`	`string`	The attachment ID from `prepareAttachments()`
`citations`	`CitationInput`	Map of citation keys to Citation objects
`options.outputImageFormat`	`"avif" | "jpeg" | "png"`	Proof image format (default: `"avif"`)
`options.endUserId`	`string`	Your end-user identifier

Returns: VerifyCitationsResponse — { verifications: Record<string, Verification> }

Batch & Iterative Verification

`verifyBatch(citations, options?)`

Verify citations across multiple attachments in a single request. Each citation must include an attachmentId field. Use this when you have citations from multiple documents and want a single API call.

const allCitations = getAllCitationsFromLlmOutput(response.content);
const { verifications } = await dc.verifyBatch(allCitations, {
  outputImageFormat: "avif",
});

Parameter	Type	Description
`citations`	`Record<string, Citation>`	Citations keyed by citation key. Each citation must include an `attachmentId` field.
`options.outputImageFormat`	`"avif" | "jpeg" | "png"`	Proof image format (default: `"avif"`)
`options.endUserId`	`string`	Your end-user identifier

Returns: VerifyCitationsResponse — { verifications: Record<string, Verification> }

verifyBatch() is equivalent to calling verifyAttachment() per attachment but batches them into one network round-trip. Use verify() for the simplest case — it calls verifyBatch() internally after parsing citations from raw LLM output.

`verifyIterative(attachmentId, citations, options)`

Verify citations with iterative refinement. Each failed citation is retried up to maxAttempts times using the onAttemptComplete callback, letting you fix citations programmatically instead of discarding them.

const { verifications } = await dc.verifyIterative(
  attachmentId,
  citations,
  {
    maxAttempts: 3,
    outputImageFormat: "avif",
    onAttemptComplete: async (attempt, history, citationKey) => {
      if (attempt.status === "not_found") {
        // Shorten the source match and retry
        const original = citations[citationKey];
        return { ...original, sourceMatch: original.sourceMatch.split(" ").slice(0, 2).join(" ") };
      }
      return null; // stop retrying
    },
  },
);

Parameter	Type	Description
`attachmentId`	`string`	The attachment ID from `prepareAttachments()`
`citations`	`CitationInput`	Map of citation keys to Citation objects
`options.maxAttempts`	`number`	Maximum verification passes per citation. Default: `3`.
`options.outputImageFormat`	`"avif" | "jpeg" | "png"`	Proof image format (default: `"avif"`)
`options.endUserId`	`string`	Your end-user identifier
`options.onAttemptComplete`	`(attempt: LlmSearchAttempt, history: LlmSearchAttempt[], citationKey: string) => Promise<Citation | { citation: Citation; isFalsePositiveRejection?: boolean } | null | undefined>`	Called after each non-terminal attempt. Return an amended `Citation` to retry, `{ citation, isFalsePositiveRejection: true }` to flag a false positive, or `null`/`undefined` to stop.

Returns: VerifyCitationsResponse — { verifications: Record<string, Verification> }

Attachment Management

`getAttachment(attachmentId, options?)`

Retrieve full attachment metadata including page renders, verifications, and extracted text.

const attachment = await dc.getAttachment("abc123");

Returns: AttachmentResponse (includes canonical deepTextPages)

`deleteAttachment(attachmentId)`

Permanently delete an attachment and all associated data. Irreversible.

const { deleted } = await dc.deleteAttachment("abc123");

Returns: DeleteAttachmentResponse — { attachmentId, deleted: true }

`extendExpiration(options)`

Extend the expiration date of an attachment.

const { expiresAt } = await dc.extendExpiration({
  attachmentId: "abc123",
  duration: "year", // "month" or "year"
});

Parameter	Type	Description
`options.attachmentId`	`string`	The attachment to extend
`options.duration`	`"month" | "year"`	Extension period (30 or 365 days)

Returns: ExtendExpirationResponse — { attachmentId, expiresAt, previousExpiresAt }

Standalone Utility Functions

These functions are imported directly from deepcitation — they don’t require a client instance.

Citation Parsing

Core Parsing

import {
  getAllCitationsFromLlmOutput,
  parseCitationResponse,
  parseCitationData,
  citationDataToCitation,
  groupCitationsByAttachmentId,
  groupCitationsByAttachmentIdObject,
  getCitationKey,
  normalizeCitationType,
  extractCitationsFromMarkers,
} from "deepcitation";

Function	Signature	Description
`getAllCitationsFromLlmOutput`	`(llmOutput: string) => Record<string, Citation>`	Parse `<<<CITATION_DATA>>>` block from LLM output. Returns `{}` on failure — never throws.
`parseCitationResponse`	`(llmOutput: string) => ParsedCitationResult`	Parse LLM output into `{ visibleText, citations, markerMap }` for rendering.
`parseCitationData`	`(json: unknown) => CitationData | null`	Parse and validate a single raw citation data object. Returns `null` if the object lacks a required `id` field.
`citationDataToCitation`	`(data: CitationData) => Citation`	Convert a parsed `CitationData` object to a `Citation` for verification.
`groupCitationsByAttachmentId`	`(citations: Record<string, Citation>) => Map<string, Record<string, Citation>>`	Group citations by their `attachmentId` for per-attachment verification. Returns a `Map`.
`groupCitationsByAttachmentIdObject`	`(citations: Record<string, Citation>) => Record<string, Record<string, Citation>>`	Same as `groupCitationsByAttachmentId` but returns a plain object instead of a `Map`.
`getCitationKey`	`(citation: Citation) => string`	Generate a stable 16-char hash key for a citation. Used as the dictionary key in verification results.
`normalizeCitationType`	`(type: string) => CitationType`	Normalize a raw type string (e.g. `"audio"`, `"video"`) to a `CitationType` value.
`extractCitationsFromMarkers`	`(text: string) => Record<string, Citation>`	Extract citations from LLM output that has only `[N]` markers but no `<<<CITATION_DATA>>>` block. Uses the surrounding sentence as `sourceContext`.
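
To make the grouping step concrete, here is a minimal sketch of what grouping a citation record by `attachmentId` produces. It is an illustration only, not the SDK's implementation: `CitationLike` and `groupByAttachmentId` are hypothetical stand-ins, and placing citations without an `attachmentId` under the empty-string key is an assumption.

```typescript
// Minimal stand-in for the SDK's Citation type; only the field used here.
interface CitationLike {
  attachmentId?: string;
}

function groupByAttachmentId<T extends CitationLike>(
  citations: Record<string, T>,
): Map<string, Record<string, T>> {
  const groups = new Map<string, Record<string, T>>();
  for (const key of Object.keys(citations)) {
    const citation = citations[key];
    // Assumption: citations with no attachmentId share the "" group.
    const id = citation.attachmentId ?? "";
    const group = groups.get(id) ?? {};
    group[key] = citation; // citation keys are preserved inside each group
    groups.set(id, group);
  }
  return groups;
}

const grouped = groupByAttachmentId({
  a: { attachmentId: "att-1" },
  b: { attachmentId: "att-2" },
  c: { attachmentId: "att-1" },
});
```

Each entry of the resulting map has the shape that `verifyAttachment()` takes as its `citations` argument, which is why grouping is the natural step before per-attachment verification.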

Text Manipulation

import {
  stripCitations,
  replaceCitationMarkers,
  extractVisibleText,
  hasCitationData,
  getCitationMarkerIds,
} from "deepcitation";

Function	Signature	Description
`stripCitations`	`(llmResponse: string) => string`	Remove all citation artifacts from LLM output — strips both `[N]` markers and the `<<<CITATION_DATA>>>` block. Returns clean readable text.
`replaceCitationMarkers`	`(text: string, options?) => string`	Replace `[N]` markers with custom content. Pass a `replacer` function, `showSourceMatch: true` to substitute anchor text, or `showVerificationStatus: true` to append status indicators. Default behavior strips markers.
`extractVisibleText`	`(llmResponse: string) => string`	Split the LLM output at the `<<<CITATION_DATA>>>` delimiter and return only the visible portion above it. Does not strip `[N]` markers.
`hasCitationData`	`(text: string) => boolean`	Check whether a string contains a `<<<CITATION_DATA>>>` block.
`getCitationMarkerIds`	`(text: string) => number[]`	Return all citation marker IDs found in text, in order of appearance. Handles both `[N]` and `[anchor](cite:N)` formats.
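
The difference between `extractVisibleText` and `stripCitations` is easiest to see side by side. The sketch below re-creates the documented behavior with local stand-ins (`visibleTextSketch`, `stripCitationsSketch`, `markerIdsSketch`); these are hypothetical helpers written for illustration, the whitespace handling is an assumption, and the `[]` inside the data block is a placeholder rather than the real payload format.

```typescript
// Start delimiter as listed in the Constants section.
const DATA_START = "<<<CITATION_DATA>>>";

function visibleTextSketch(llmResponse: string): string {
  const idx = llmResponse.indexOf(DATA_START);
  const visible = idx === -1 ? llmResponse : llmResponse.slice(0, idx);
  // Keep only the text above the delimiter; [N] markers stay in place.
  return visible.replace(/\s+$/, "");
}

function stripCitationsSketch(llmResponse: string): string {
  // Drop the data block first, then remove [N] markers and the space before them.
  return visibleTextSketch(llmResponse).replace(/\s*\[\d+\]/g, "");
}

function markerIdsSketch(text: string): number[] {
  const ids: number[] = [];
  const pattern = /\[(\d+)\]/g;
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(text)) !== null) {
    ids.push(Number(match[1])); // IDs are collected in order of appearance
  }
  return ids;
}

const sample =
  "Revenue grew 12% [1] while costs fell [2].\n\n" +
  "<<<CITATION_DATA>>>\n[]\n<<<END_CITATION_DATA>>>";

const visible = visibleTextSketch(sample); // markers kept
const clean = stripCitationsSketch(sample); // markers removed
```

Use the visible-text form when you still need marker positions for rendering, and the stripped form when the text goes somewhere citations cannot be displayed.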

Type Guards

import {
  isDocumentCitation,
  isUrlCitation,
  isAudioVideoCitation,
} from "deepcitation";

Function	Signature	Description
`isDocumentCitation`	`(c: Citation) => c is DocumentCitation`	Narrow a `Citation` union to `DocumentCitation` (type `"document"`).
`isUrlCitation`	`(c: Citation) => c is UrlCitation`	Narrow a `Citation` union to `UrlCitation` (type `"url"`).
`isAudioVideoCitation`	`(c: Citation) => c is AudioVideoCitation`	Narrow a `Citation` union to `AudioVideoCitation` (type `"audio"` or `"video"`).
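
Type guards like these rely on TypeScript's discriminated-union narrowing. The sketch below shows the pattern with simplified local types; apart from the `type` discriminant described in the table above, every field name (`pageNumber`, `url`, `startTime`) and every identifier ending in `Sketch` is a hypothetical stand-in, not the SDK's actual shape.

```typescript
// Simplified stand-ins for the SDK's citation union.
interface DocCitationSketch { type: "document"; pageNumber: number }
interface UrlCitationSketch { type: "url"; url: string }
interface AvCitationSketch { type: "audio" | "video"; startTime: string }
type CitationSketch = DocCitationSketch | UrlCitationSketch | AvCitationSketch;

function isDocSketch(c: CitationSketch): c is DocCitationSketch {
  return c.type === "document";
}

function isUrlSketch(c: CitationSketch): c is UrlCitationSketch {
  return c.type === "url";
}

function describeCitation(c: CitationSketch): string {
  // Inside each branch the compiler knows the narrowed type.
  if (isDocSketch(c)) return `page ${c.pageNumber}`;
  if (isUrlSketch(c)) return c.url;
  // Only the audio/video variant remains here.
  return `${c.type} at ${c.startTime}`;
}
```

With the real guards, the same branching lets a renderer access document-only or URL-only fields without casts.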

Prompt Wrapping

import { wrapCitationPrompt, wrapSystemCitationPrompt } from "deepcitation/prompts";

Function	Signature	Description
`wrapCitationPrompt`	`(options: WrapCitationPromptOptions) => WrapCitationPromptResult`	Wrap both system and user prompts with citation instructions. Returns `{ enhancedSystemPrompt, enhancedUserPrompt }`.
`wrapSystemCitationPrompt`	`(options: WrapSystemPromptOptions) => string`	Wrap only the system prompt. Use when you manage user prompt construction yourself.

See Prompts for details on what these functions inject.

Verification Helpers

import { getCitationStatus, validateUploadFile } from "deepcitation";

Function	Signature	Description
`getCitationStatus`	`(verification: Verification) => CitationStatus`	Derive UI status (`isVerified`, `isPartialMatch`, `isMiss`, `isPending`) from a verification result.
`validateUploadFile`	`(file: unknown) => { valid: boolean, error?: string }`	Validate a file before uploading (checks size, type).

Rendering

`prepareCitations(input, options?)`

Convert raw LLM output into a normalized intermediate representation (IR) that any rendering adapter can consume. This is the formal boundary between parsing/verification logic and custom renderers — prepare once, render to multiple formats.

import { prepareCitations, type CitationAdapter } from "deepcitation";

const ir = prepareCitations(llmOutput, {
  verifications,
  sourceLabels: { [attachmentId]: "Annual Report 2024" },
});

// Walk the IR to build custom output.
// renderCitationBadge is your own rendering function, not an SDK export.
let output = "";
for (const seg of ir.segments) {
  if (seg.type === "text") {
    output += seg.text;
  } else {
    // seg.type === "citation": look up the resolved citation by number
    const citation = ir.citations.find(c => c.citationNumber === seg.citationNumber);
    if (citation) output += renderCitationBadge(citation);
  }
}

Parameter	Type	Description
`input`	`string | ParsedCitationResult`	Raw LLM output string, or a pre-parsed result from `parseCitationResponse()`.
`options.verifications`	`VerificationRecord`	Verification results keyed by citation key. Populates `isVerified`, `isPartialMatch`, etc. on each citation in the IR.
`options.sourceLabels`	`Record<string, string>`	Display labels keyed by `attachmentId` (use `""` for URL citations). Pre-resolved onto each `ResolvedCitation` in the IR.

Returns: CitationIR — { segments: ReadonlyArray<TextSegment | CitationSegment>, citations: ReadonlyArray<ResolvedCitation> }

Related types

Type	Description
`CitationIR`	The normalized IR: `{ segments, citations }`. Adapters consume this — they never call `parseCitationResponse` directly.
`CitationAdapter<TOptions, TOutput>`	A pure function `(ir: CitationIR, options?: TOptions) => TOutput`. Implement this to target a new render format (email, PDF, Notion, etc.).
`ResolvedCitation`	A citation with verification status and a pre-computed `sourceLabel` from the fallback chain.
`PrepareCitationsOptions`	Input options for `prepareCitations()`: `{ verifications?, sourceLabels? }`.
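
As an illustration of the adapter contract, the sketch below implements a plain-text adapter over locally declared mirrors of the IR shapes described above. All type names ending in `Sketch`, the `PlainTextOptions` type, and the `unverifiedMark` option are hypothetical; a real adapter would import `CitationIR` and `CitationAdapter` from the package, and the actual `ResolvedCitation` type likely carries more fields than shown here.

```typescript
// Local mirrors of the documented IR shapes (assumed minimal fields).
interface TextSegmentSketch { type: "text"; text: string }
interface CitationSegmentSketch { type: "citation"; citationNumber: number }
interface ResolvedCitationSketch {
  citationNumber: number;
  sourceLabel: string;
  isVerified?: boolean;
}
interface CitationIRSketch {
  segments: ReadonlyArray<TextSegmentSketch | CitationSegmentSketch>;
  citations: ReadonlyArray<ResolvedCitationSketch>;
}
type AdapterSketch<TOptions, TOutput> = (ir: CitationIRSketch, options?: TOptions) => TOutput;

interface PlainTextOptions { unverifiedMark?: string }

// A pure function: inline [N] markers in the body, footnotes at the end.
const plainTextAdapter: AdapterSketch<PlainTextOptions, string> = (ir, options) => {
  const mark = options?.unverifiedMark ?? "?";
  const body = ir.segments
    .map((seg) => (seg.type === "text" ? seg.text : `[${seg.citationNumber}]`))
    .join("");
  const footnotes = ir.citations.map(
    (c) => `[${c.citationNumber}] ${c.sourceLabel}${c.isVerified ? "" : ` (${mark})`}`,
  );
  return footnotes.length > 0 ? `${body}\n\n${footnotes.join("\n")}` : body;
};

const irSample: CitationIRSketch = {
  segments: [
    { type: "text", text: "Revenue grew 12%" },
    { type: "citation", citationNumber: 1 },
    { type: "text", text: "." },
  ],
  citations: [{ citationNumber: 1, sourceLabel: "Annual Report 2024", isVerified: true }],
};

const rendered = plainTextAdapter(irSample);
```

Because the adapter only reads the IR, the same prepared result can feed several adapters (for example email and PDF) without re-parsing the LLM output.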

Display Utilities

These helpers are useful when building custom citation renderers.

import {
  toSuperscript,
  getIndicator,
  humanizeLinePosition,
  INDICATOR_SETS,
  SUPERSCRIPT_DIGITS,
} from "deepcitation";

Export	Signature / Type	Description
`toSuperscript`	`(n: number) => string`	Convert a number to Unicode superscript characters (e.g. `3` → `"³"`).
`getIndicator`	`(set: IndicatorSet, index: number) => string`	Get the indicator character at a given index from an `IndicatorSet`. Wraps when index exceeds the set length.
`humanizeLinePosition`	`(pos: LinePosition) => string`	Format a `LinePosition` as a human-readable string (e.g. `"line 5"` or `"lines 5–7"`).
`INDICATOR_SETS`	`Record<IndicatorStyle, IndicatorSet>`	Predefined indicator sets: `"numbers"` (1, 2, 3…), `"letters"` (a, b, c…), `"symbols"` (†, ‡, §…).
`SUPERSCRIPT_DIGITS`	`string[]`	Lookup array mapping digits 0–9 to their Unicode superscript equivalents.
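
The two conversions in this table are simple enough to sketch directly. The helpers below are hypothetical stand-ins that mirror the documented behavior, not the SDK's exports; treating an `IndicatorSet` as a plain string array with a zero-based index is an assumption.

```typescript
// Digits 0-9 mapped to their Unicode superscript forms.
const SUPERSCRIPTS = ["⁰", "¹", "²", "³", "⁴", "⁵", "⁶", "⁷", "⁸", "⁹"];

function toSuperscriptSketch(n: number): string {
  // Convert each decimal digit; leave any other character (such as "-") unchanged.
  return String(n)
    .split("")
    .map((digit) => SUPERSCRIPTS[Number(digit)] ?? digit)
    .join("");
}

function getIndicatorSketch(set: string[], index: number): string {
  // Wrap around when the index runs past the end of the set.
  return set[index % set.length] ?? "";
}

const symbols = ["†", "‡", "§"];
const twelfth = toSuperscriptSketch(12); // two superscript digits
const wrapped = getIndicatorSketch(symbols, 4); // index 4 wraps to position 1
```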

Error Classes

All errors extend DeepCitationError and include code, isRetryable, and statusCode properties.

import {
  AuthenticationError,   // 401/403 — fix the API key
  PaymentRequiredError,  // 402 — add or update payment method
  RateLimitError,        // 429 — rate limit exceeded
  ValidationError,       // 400/404/413 — fix the input
  ServerError,           // 5xx — safe to retry
  NetworkError,          // Network failure — safe to retry
} from "deepcitation";

Class	Code	Status	Retryable	When thrown
`AuthenticationError`	`DC_AUTH_INVALID`	401, 403	No	Missing, invalid, or expired API key
`PaymentRequiredError`	`DC_PAYMENT_REQUIRED`	402	No	Free tier exhausted, spend cap hit, or payment failed. Includes a `billingCode` field with the server-side billing reason.
`RateLimitError`	`DC_RATE_LIMITED`	429	Yes	API rate limit exceeded
`ValidationError`	`DC_VALIDATION_ERROR`	400, 404, 413	No	Bad input — wrong file format, oversized file, or invalid attachment ID
`ServerError`	`DC_SERVER_ERROR`	5xx	Yes	API server error
`NetworkError`	`DC_NETWORK_ERROR`	—	Yes	Timeout, DNS failure, or connection refused

See Error Handling for retry patterns and the isRetryable flag.
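
Since the constructor's `maxRetries` does not cover HTTP error responses, retryable errors such as `RateLimitError` and `ServerError` have to be retried by the caller. The sketch below shows one possible wrapper keyed on the documented `isRetryable` property. `withRetry` and `isRetryableError` are hypothetical helpers, not SDK exports, and the default delay schedule is an assumption borrowed from the client's backoff formula.

```typescript
// Duck-typed check so it works with any error carrying the documented flag.
function isRetryableError(err: unknown): boolean {
  return (
    typeof err === "object" &&
    err !== null &&
    (err as { isRetryable?: unknown }).isRetryable === true
  );
}

async function withRetry<T>(
  operation: () => Promise<T>,
  maxAttempts = 3,
  delayMs: (attempt: number) => number = (attempt) => 2 ** (attempt - 1) * 100,
): Promise<T> {
  for (let attempt = 1; ; attempt++) {
    try {
      return await operation();
    } catch (err) {
      // Give up on non-retryable errors or once attempts are exhausted.
      if (!isRetryableError(err) || attempt >= maxAttempts) throw err;
      await new Promise<void>((resolve) => setTimeout(resolve, delayMs(attempt)));
    }
  }
}
```

A call such as `withRetry(() => dc.verify({ llmOutput }))` would then retry rate-limit and server errors while letting authentication, payment, and validation errors surface immediately.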

Constants

Delimiters

import {
  CITATION_DATA_START_DELIMITER,  // "<<<CITATION_DATA>>>"
  CITATION_DATA_END_DELIMITER,    // "<<<END_CITATION_DATA>>>"
  SDK_VERSION,                    // Current SDK version string
} from "deepcitation";

Prompt Format Constants

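The SDK exports four citation prompts across three formats (standard, audio/video, and compact), along with matching JSON output schemas and reminder strings. Pass these directly to LLM APIs that accept raw system prompt text, or use them to build custom wrapCitationPrompt alternatives.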

import {
  // Standard document format (default)
  CITATION_PROMPT,
  CITATION_JSON_OUTPUT_FORMAT,
  CITATION_REMINDER,

  // Audio/video format (timestamps instead of page/line references)
  AV_CITATION_PROMPT,
  CITATION_AV_JSON_OUTPUT_FORMAT,
  CITATION_AV_REMINDER,

  // Compact format (omits source_context/reasoning — ~80–135 fewer tokens per citation)
  COMPACT_CITATION_PROMPT,
  COMPACT_CITATION_SCENARIO2_PROMPT,
  COMPACT_CITATION_JSON_OUTPUT_FORMAT,
} from "deepcitation";

Constant	Format	When to use
`CITATION_PROMPT`	Standard	Default for document/PDF citations. Includes `source_context` and `reasoning` fields.
`CITATION_JSON_OUTPUT_FORMAT`	Standard	JSON schema object for use with structured output APIs (e.g. OpenAI `response_format`).
`CITATION_REMINDER`	Standard	Short reminder string to append to user prompts.
`AV_CITATION_PROMPT`	Audio/Video	Replaces `page_id`/`line_ids` with `timestamps` (`start_time`/`end_time` in `HH:MM:SS.SSS`).
`CITATION_AV_JSON_OUTPUT_FORMAT`	Audio/Video	JSON schema for AV timestamp citations.
`CITATION_AV_REMINDER`	Audio/Video	Reminder variant that mentions timestamps.
`COMPACT_CITATION_PROMPT`	Compact	Omits `source_context` and `reasoning` from LLM output (reconstructed offline via hydrate). Use for latency-sensitive pipelines. Saves ~80–135 tokens per citation.
`COMPACT_CITATION_SCENARIO2_PROMPT`	Compact	Variant for annotating pre-existing user text (text is frozen; only `[N]` markers are inserted).
`COMPACT_CITATION_JSON_OUTPUT_FORMAT`	Compact	JSON schema for compact citations (`n`, `k`, `p`, `l` only).

See Prompts for when to choose each format.

Prompt ID Compression

When attachment IDs are long (e.g. UUIDs), compressing them reduces token count in large prompts.

import { compressPromptIds, decompressPromptIds, type CompressedResult } from "deepcitation";

// Compress: replace full IDs with minimal unique prefixes
const { compressed, prefixMap } = compressPromptIds(promptObject, attachmentIds);

// Decompress: restore full IDs from prefixes
const restored = decompressPromptIds(compressed, prefixMap);

Export	Signature	Description
`compressPromptIds`	`<T>(obj: T, ids: string[] | undefined) => CompressedResult<T>`	Replace all occurrences of each ID in `obj` with its minimal unique prefix. Returns the compressed object and a `prefixMap` for decompression. Throws if a safe prefix cannot be found.
`decompressPromptIds`	`<T>(compressed: T | string, prefixMap: Record<string, string>) => T | string`	Restore full IDs from a compressed object or string using the `prefixMap` from `compressPromptIds`.
`CompressedResult<T>`	`{ compressed: T, prefixMap: Record<string, string> }`	Return type of `compressPromptIds`. Pass `prefixMap` to `decompressPromptIds` to reverse the operation.
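
To show what a "minimal unique prefix" means in practice, the sketch below computes one for each ID in a list. It illustrates the idea only and is not the SDK's algorithm: `minimalUniquePrefixes`, the `minLength` floor of 4, and the prefix-to-ID direction of the returned map are all assumptions, and it skips the safety checks that make `compressPromptIds` throw.

```typescript
// Returns a map from each ID's shortest distinguishing prefix to the full ID.
function minimalUniquePrefixes(ids: string[], minLength = 4): Record<string, string> {
  const prefixMap: Record<string, string> = {};
  for (const id of ids) {
    let length = Math.min(minLength, id.length);
    // Grow the prefix until no other ID starts with it.
    while (
      length < id.length &&
      ids.some((other) => other !== id && other.slice(0, length) === id.slice(0, length))
    ) {
      length++;
    }
    // Note: if one ID is a full prefix of another, this sketch cannot separate them.
    prefixMap[id.slice(0, length)] = id;
  }
  return prefixMap;
}

const prefixes = minimalUniquePrefixes([
  "a1b2c3d4-aaaa",
  "a1b2ffff-bbbb",
  "9z8y7x6w-cccc",
]);
```

In this example the first two IDs share their first four characters, so each needs a fifth character to become unique, while the third is already distinct at four.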
