“Private AI” because the LLM call originates from inside Ntropii Tenant, under the customer’s own provider credentials, against the customer’s own data — never as a hosted service we run for you.
Nor does extract() surface whether the underlying provider is Anthropic, Azure OpenAI, a self-hosted model, or anything else routed through your tenant’s provider configuration.
Install
The API
ai.extract() returns an ExtractionResult with:
| Field | Type | What’s in it |
|---|---|---|
| result.fields | dict[str, Any] | The extracted typed values, keyed by the schema’s field names |
| result.confidence_scores | dict[str, float] | 0.0–1.0 per field, used to drive HITL routing |
| result.line_items | list[dict] | Repeating rows (invoice line items, journal lines) when the schema expects them |
| result.summary | str | A human-readable summary the UI surfaces alongside the structured fields |
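The confidence scores are what downstream routing keys on. As a minimal sketch of that pattern — the route_by_confidence helper, the min() aggregate, and the 0.9 threshold are all hypothetical, not SDK API:

```python
# Hypothetical HITL router driven by per-field confidence scores.
# The aggregate (weakest field) and threshold are illustrative choices.
def route_by_confidence(confidence_scores: dict[str, float],
                        auto_threshold: float = 0.9) -> str:
    """Route a document to straight-through processing or human review."""
    if not confidence_scores:
        return "human_review"
    aggregate = min(confidence_scores.values())  # the weakest field decides
    return "auto_post" if aggregate >= auto_threshold else "human_review"

confident = route_by_confidence({"vendor": 0.98, "total": 0.95})
shaky = route_by_confidence({"vendor": 0.98, "total": 0.62})
```

One weak field is enough to pull the whole document into review under this aggregate; a mean-based aggregate would be more forgiving.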
schema_slug is the contract between the runbook and the provider — it routes the call to the right prompt template and output schema. New extraction schemas live in your provider configuration, not in the SDK.
Canonical example
Lifted from the document-ingest runbook:
Call extract() on the plain text from files.parse(), optionally pass the cell grid as structured_context, and stash the typed result in your runbook’s domain model.
Why structured_context matters
The extractor’s prompt template can choose to use or ignore structured_context, but for tabular sources it’s the difference between getting GL line allocations right and getting them wrong. A trial balance flattened to plain text loses column boundaries. The cell grid preserves them, so the model can disambiguate “Account: 200 | Debit: 1,500.00 | Credit: 0.00” from a flattened “200 1,500.00 0.00”.
The convention is: always pass cell_grid if you have one. The provider decides whether to use it.
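A quick illustration of what flattening destroys, in plain Python (the grid values are invented for the example):

```python
grid = [["Account", "Debit", "Credit"], ["200", "1,500.00", "0.00"]]

# Flattened to plain text, the column boundaries vanish:
flattened = " ".join(cell for row in grid for cell in row)
# which of the trailing figures is the debit?

# With the grid preserved, each value keeps its header:
labeled = " | ".join(f"{h}: {v}" for h, v in zip(grid[0], grid[1]))
```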
Confidence scores feed HITL routing
Every field comes back with a confidence score. The downstream HITL step typically routes the document by aggregate confidence.

What the schema_slug routes to
ai.extract() doesn’t know what an invoice is. It knows that schema_slug="invoice-v1" means “use the prompt template registered under that slug in the configured provider”. The provider holds:
- The prompt template (with your tenant’s specific tone, jurisdiction, accounting framework)
- The output JSON schema the LLM is constrained to
- The model selection (Claude Opus for hard cases, Haiku for cheap ones)
- Any fine-tuning, retrieval, or routing logic
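Conceptually, then, the slug is an opaque key into a provider-side registry. The sketch below is an illustration of that idea only — the registry, its field names, and its values are all hypothetical, not SDK or provider code:

```python
# Hypothetical provider-side registry: the SDK never inspects these values,
# it only forwards schema_slug with the call.
PROVIDER_REGISTRY = {
    "invoice-v1": {
        "prompt_template": "invoice_extraction_prompt",
        "output_schema": "invoice_v1.schema.json",
        "model": "claude-opus",      # hard cases
    },
    "trial-balance-v1": {
        "prompt_template": "trial_balance_prompt",
        "output_schema": "trial_balance_v1.schema.json",
        "model": "claude-haiku",     # cheap cases
    },
}

def resolve(schema_slug: str) -> dict:
    # Everything behind the slug lives in provider configuration.
    return PROVIDER_REGISTRY[schema_slug]
```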
Related
Collect files
Produces the input to ai.extract().

Quality checks

The natural follow-up: verify what the AI extracted.