The unified KnowQL Query API is the way to use your curated knowledge programmatically. You send a question scoped to one or more contexts; Nexus plans its own retrieval over them and returns a grounded answer with citations. This is the same behaviour as the query console, available over HTTP. Multi-turn conversations are sessions.
Before you start
- Create and curate a context (upload sources, let the first curate finish), and note its slug.
- Get an API key and send it as Authorization: Bearer <key>.
Endpoint
https://prod.nexus.pinecone.io/api/v0/query
One turn
Send ask and a scope of contexts to start a new session. The response is a query object: read the answer from output[].content[].text (the output_text parts), along with the citations it used, token usage, and a session_id you can continue from.
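As a minimal sketch of reading the answer out of a query object, the Python helper below walks output[].content[] and joins the text parts. It assumes each part carries a "type" field whose value is "output_text" for answer text; that field name, the helper name answer_text, and the sample response are illustrative assumptions, not taken from the API reference.

```python
from typing import Any


def answer_text(query: dict[str, Any]) -> str:
    """Join every output_text part found under output[].content[]."""
    parts: list[str] = []
    for item in query.get("output", []):
        for part in item.get("content", []):
            # Assumption: parts are tagged with a "type" field and answer
            # text uses the value "output_text".
            if part.get("type") == "output_text":
                parts.append(part.get("text", ""))
    return "".join(parts)


# Hypothetical response, trimmed to the fields this helper reads.
sample = {
    "session_id": "sess_example",
    "output": [
        {
            "content": [
                {"type": "output_text", "text": "Two risks were raised: "},
                {"type": "output_text", "text": "supplier delays and FX exposure."},
            ]
        }
    ],
}

print(answer_text(sample))
```

A query with no output yet (for example one that is still in progress) simply yields an empty string.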
curl https://prod.nexus.pinecone.io/api/v0/query \
  -H "Authorization: Bearer $NEXUS_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "ask": "What were the key risks in the Q3 board minutes?",
    "scope": ["my-context"]
  }'
Latency & async
A query plans and runs its own multi-step retrieval, so an answer
typically takes 20–60 seconds (longer for hard
questions; the hard cap is 15 minutes). A plain synchronous POST
holds the connection that whole time and will often hit an
intermediate HTTP timeout. Unless your client streams or has a long
(2+ minute) timeout, use the asynchronous flow: submit with background: true, get a 202 back
immediately with an in_progress query, then poll GET /queries/{id} until status is completed or failed. There is no webhook —
you poll; results persist on the query/session rows, so you can poll
as long as you need.
# 1) Submit: returns 202 immediately with an in_progress query id
curl https://prod.nexus.pinecone.io/api/v0/query \
  -H "Authorization: Bearer $NEXUS_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "ask": "What were the key risks in the Q3 board minutes?",
    "scope": ["my-context"],
    "background": true
  }'
# 2) Poll the query id every few seconds until status is terminal
curl https://prod.nexus.pinecone.io/api/v0/queries/qry_... \
  -H "Authorization: Bearer $NEXUS_API_KEY"
# -> status: "in_progress" | "completed" | "failed"
The synchronous call shown above (no background / stream) is simplest but only safe for fast queries or long-timeout clients. background and stream are mutually exclusive.
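The polling half of the asynchronous flow could look roughly like the Python sketch below. It assumes the query object returned by GET /queries/{id} is JSON with a top-level status field, as the text describes; the function names, the injectable fetch and sleep parameters, and the 3-second interval are choices made for this sketch, not part of the API.

```python
import json
import os
import time
import urllib.request
from typing import Any, Callable

BASE_URL = "https://prod.nexus.pinecone.io/api/v0"
TERMINAL = {"completed", "failed"}


def http_get_query(query_id: str) -> dict[str, Any]:
    """GET /queries/{id}, authenticating with the key in NEXUS_API_KEY."""
    req = urllib.request.Request(
        f"{BASE_URL}/queries/{query_id}",
        headers={"Authorization": f"Bearer {os.environ['NEXUS_API_KEY']}"},
    )
    with urllib.request.urlopen(req, timeout=30) as resp:
        return json.load(resp)


def poll_query(
    query_id: str,
    fetch: Callable[[str], dict[str, Any]] = http_get_query,
    interval: float = 3.0,
    max_wait: float = 900.0,  # mirrors the documented 15-minute hard cap
    sleep: Callable[[float], None] = time.sleep,
) -> dict[str, Any]:
    """Fetch the query repeatedly until its status is completed or failed."""
    waited = 0.0
    while True:
        query = fetch(query_id)
        if query.get("status") in TERMINAL:
            return query
        if waited >= max_wait:
            raise TimeoutError(f"{query_id} still in progress after {waited:.0f}s")
        sleep(interval)
        waited += interval
```

Passing fetch and sleep as parameters keeps the loop testable without network access; in normal use you would call poll_query with only the id returned by the 202 response.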
Continue the session
Pass session_id (or previous_query_id) to keep
the conversation going — Nexus remembers earlier turns and the session's
scope, so you don't repeat it.
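One way to keep first turns and follow-up turns straight on the client is a small body builder, sketched here in Python. The function name build_query_body is hypothetical, and rejecting a scope on a continuing turn, or both identifiers at once, is a client-side choice made for this sketch; the document only says to omit scope when continuing and that previous_query_id is an alternative to session_id.

```python
from typing import Any, Optional


def build_query_body(
    ask: str,
    scope: Optional[list[str]] = None,
    session_id: Optional[str] = None,
    previous_query_id: Optional[str] = None,
) -> dict[str, Any]:
    """Build the JSON body for a new session or a follow-up turn."""
    if not ask:
        raise ValueError("ask is required")
    if session_id and previous_query_id:
        # Sketch-level choice: the two fields are alternatives, so pick one.
        raise ValueError("pass session_id or previous_query_id, not both")

    body: dict[str, Any] = {"ask": ask}
    if session_id or previous_query_id:
        if scope:
            # The session already remembers its scope.
            raise ValueError("omit scope when continuing a session")
        if session_id:
            body["session_id"] = session_id
        else:
            body["previous_query_id"] = previous_query_id
        return body

    # New session: scope is required and limited to 1-10 contexts.
    if not scope or not 1 <= len(scope) <= 10:
        raise ValueError("a new session needs a scope of 1 to 10 contexts")
    body["scope"] = list(scope)
    return body


first = build_query_body("What were the key risks?", scope=["my-context"])
follow_up = build_query_body("And who raised them?", session_id="sess_example")
```

Here first carries the scope and follow_up carries only the session identifier, matching the two request shapes used in this document.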
curl https://prod.nexus.pinecone.io/api/v0/query \
  -H "Authorization: Bearer $NEXUS_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "ask": "And who raised them?",
    "session_id": "sess_..."
  }'
Streaming
If your client can consume Server-Sent Events, set stream: true to receive the turn live — each retrieval step as Nexus works, the answer text streamed as deltas, then
the final query object. This is what powers the live steps in the
console. Clients that can't hold an SSE connection should prefer background + poll above.
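For clients without an SSE library, the stream can be decoded by hand. The Python sketch below follows the standard Server-Sent Events framing (event: and data: lines, with a blank line ending each event) and assumes every data payload is JSON, which this document does not state explicitly. The sample frames are hypothetical apart from the event names and the response.created fields named in this section.

```python
import json
from typing import Any, Iterable, Iterator


def _field_value(line: str, prefix: str) -> str:
    """Return the text after 'prefix', dropping one leading space if present."""
    value = line[len(prefix):]
    return value[1:] if value.startswith(" ") else value


def parse_sse(lines: Iterable[str]) -> Iterator[tuple[str, Any]]:
    """Yield (event_name, decoded_json) for each complete SSE event."""
    event, data = "message", []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line dispatches the event collected so far.
            if data:
                yield event, json.loads("\n".join(data))
            event, data = "message", []
        elif line.startswith(":"):
            continue  # SSE comment / keep-alive line
        elif line.startswith("event:"):
            event = _field_value(line, "event:")
        elif line.startswith("data:"):
            data.append(_field_value(line, "data:"))
    # An event not terminated by a blank line is dropped, as in the SSE spec.


# Hypothetical frames; the empty payload for response.completed is a placeholder.
sample = [
    "event: response.created\n",
    'data: {"query_id": "qry_example", "session_id": "sess_example"}\n',
    "\n",
    ": keep-alive\n",
    "event: response.completed\n",
    "data: {}\n",
    "\n",
]
events = list(parse_sse(sample))
```

In a real client the lines iterable would be the decoded lines of the HTTP response body, and a dispatcher would switch on the event name.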
{ "ask": "...", "scope": ["my-context"], "stream": true }
# SSE events, in order:
#   response.created            ({ query_id, session_id })
#   response.step               (one per retrieval step, status running → completed)
#   response.output_text.delta  (answer text streamed as it is written)
#   response.completed | response.failed
#   query                       (final query object: output + citations + usage)
Structured output
Add a shape (a JSON Schema) to get back exact fields instead
of prose. output_json returns JSON matching it.
curl https://prod.nexus.pinecone.io/api/v0/query \
  -H "Authorization: Bearer $NEXUS_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "ask": "List each vendor and its contract value.",
    "scope": ["my-context"],
    "shape": {
      "type": "object",
      "properties": {
        "items": {
          "type": "array",
          "items": {
            "type": "object",
            "properties": {
              "vendor": { "type": "string" },
              "value": { "type": "number" }
            },
            "required": ["vendor"]
          }
        }
      },
      "required": ["items"]
    }
  }'
Shape is a JSON Schema subset. Allowed: type, properties, required, items, enum, format, description, nullable, minimum/maximum, minLength/maxLength, minItems/maxItems. Not allowed: $ref, oneOf/anyOf/allOf, not, additionalProperties, patternProperties.
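A shape can be checked against this subset before it is sent. The Python sketch below walks a schema and reports any keyword that is not on the allowed list; treating every unlisted keyword as unsupported is a strict assumption of this sketch, since the document only names some disallowed keywords explicitly. The function name and path format are illustrative.

```python
from typing import Any

ALLOWED_KEYWORDS = {
    "type", "properties", "required", "items", "enum", "format",
    "description", "nullable", "minimum", "maximum",
    "minLength", "maxLength", "minItems", "maxItems",
}


def unsupported_keywords(schema: dict[str, Any], path: str = "shape") -> list[str]:
    """Return dotted paths of keywords outside the allowed subset."""
    problems = [f"{path}.{key}" for key in schema if key not in ALLOWED_KEYWORDS]

    # Recurse into nested object properties.
    properties = schema.get("properties")
    if isinstance(properties, dict):
        for name, sub in properties.items():
            if isinstance(sub, dict):
                problems += unsupported_keywords(sub, f"{path}.properties.{name}")

    # Recurse into the array item schema.
    items = schema.get("items")
    if isinstance(items, dict):
        problems += unsupported_keywords(items, f"{path}.items")
    return problems


vendor_shape = {
    "type": "object",
    "properties": {
        "items": {
            "type": "array",
            "items": {
                "type": "object",
                "properties": {
                    "vendor": {"type": "string"},
                    "value": {"type": "number"},
                },
                "required": ["vendor"],
            },
        }
    },
    "required": ["items"],
}
bad_shape = {
    "type": "object",
    "properties": {"v": {"oneOf": [{"type": "string"}]}},
    "additionalProperties": False,
}

print(unsupported_keywords(vendor_shape))
print(unsupported_keywords(bad_shape))
```

The vendor shape from the example request passes cleanly, while the second schema is flagged for additionalProperties and oneOf.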
Request fields
- ask (required): The natural-language question for this turn.
- scope: Context slugs / UUIDs to query across (1–10). Required to start a new session; omit when continuing.
- session_id: Continue an existing session.
- previous_query_id: Continue the session this query belongs to (alternative to session_id).
- system_prompt: Instructions pinned to a new session.
- guardrails: Guardrails / constraints pinned to a new session.
- shape: JSON Schema for structured output; output_json returns JSON matching it.
- model / models: Provider/model selection; models is an ordered fallback list.
- tools: Restrict the tool subset for this turn.
- background: true returns 202 immediately with an in_progress query; poll GET /queries/{id} for the result. Good for non-streaming clients. Mutually exclusive with stream.
- stream: true streams the turn as SSE; otherwise a synchronous JSON query object. Mutually exclusive with background.
- timeout_seconds: Lower the 15-minute deadline (it can only be lowered, never raised).
Good to know
- Every answer includes citations back to the source files Nexus used.
- Nexus only reads contexts in the session's scope — it never indexes anything itself.
- A new session requires its contexts to be curated; an uncurated context returns a clear error.
- Fetch a single turn with GET /api/v0/queries/{id}, and a whole conversation with GET /api/v0/sessions/{id}.