How Nexus works

Most RAG systems chop documents into chunks and hope the right ones surface at query time. Nexus does the hard work up front: it distills your sources into curated artifacts so the information an answer needs is already organized and easy to find — then answers questions with Search-as-Code.

The idea: distill, don't just chunk

Raw chunks are noisy: the same fact is scattered across pages, counts require reading everything, and relationships are implicit. Curation reorganizes the corpus into artifacts — condensed, typed units of knowledge — so retrieval gets a head start.

Diagram: Raw chunks → curate → Curated artifacts (summary, entities, events (table), topics, doc / page) + relationships
Pre-computed, typed knowledge
Curation produces artifacts by kind — summaries, topics, entities, events, per-doc and per-page units. Each is small, focused, and describes itself, so the right one is easy to retrieve.
Structure for exact answers
Structured artifact types become queryable tables, so "how many" and "list every…" questions get exact answers instead of a guess from prose.
Relationships made explicit
Edge types connect artifacts (who mentions what, what happened when), turning a flat pile of text into a graph you can walk.
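
To make the three properties above concrete, here is a minimal sketch in Python. It is not Nexus's actual data model: the `Artifact` class, the `events` table schema, the `mentions` edge type, and all sample data are assumptions invented for illustration, with an in-memory SQLite database standing in for whatever store holds the structured artifacts.

```python
import sqlite3
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Artifact:
    """Hypothetical shape of a curated artifact (not the real schema)."""
    id: str
    kind: str          # e.g. "summary", "entity", "event", "topic"
    description: str   # short self-description that retrieval can match on


# Illustrative artifacts; real ones would be produced during curation.
artifacts = [
    Artifact("ent:acme", "entity", "Acme Corp, a supplier"),
    Artifact("evt:1", "event", "Contract signed with Acme"),
    Artifact("evt:2", "event", "Contract renewed with Acme"),
]

# A structured artifact type exposed as a queryable table.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE events (id TEXT PRIMARY KEY, name TEXT, year INTEGER)")
db.executemany(
    "INSERT INTO events VALUES (?, ?, ?)",
    [("evt:1", "Contract signed", 2021), ("evt:2", "Contract renewed", 2023)],
)


def count_events_since(year: int) -> int:
    """Answer a "how many" question exactly, with a query instead of prose."""
    row = db.execute(
        "SELECT COUNT(*) FROM events WHERE year >= ?", (year,)
    ).fetchone()
    return row[0]


# Typed edges as (source, edge_type, target) triples.
edges = [
    ("evt:1", "mentions", "ent:acme"),
    ("evt:2", "mentions", "ent:acme"),
]

# Index edges by (target, edge_type) so the graph can be walked backwards.
incoming = defaultdict(list)
for src, edge_type, dst in edges:
    incoming[(dst, edge_type)].append(src)


def mentioned_by(entity_id: str) -> list[str]:
    """Return the ids of artifacts that mention the given entity."""
    return sorted(incoming.get((entity_id, "mentions"), []))
```

With this toy data, `count_events_since(2022)` is computed by the database rather than estimated from text, and `mentioned_by("ent:acme")` follows explicit edges instead of re-reading every chunk.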

How answers happen: Search-as-Code

Curated artifacts are only half the story. At query time the agent doesn't just stuff chunks into a prompt — it writes Python against a retrieval SDK to gather exactly the evidence it needs, then answers from that.
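
As a rough illustration of what such agent-written retrieval code might look like, the sketch below composes a keyword search with a graph walk. `ToySDK`, its method names (`keyword_search`, `neighbors`), the `involved_in` edge type, and the sample data are all hypothetical stand-ins; the real retrieval SDK's interface is not shown in this document.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hit:
    artifact_id: str
    text: str


class ToySDK:
    """Stand-in for a retrieval SDK; class and method names are hypothetical."""

    def __init__(self, artifacts: dict[str, str],
                 edges: list[tuple[str, str, str]]):
        self.artifacts = artifacts
        self.edges = edges  # (source, edge_type, target) triples

    def keyword_search(self, term: str) -> list[Hit]:
        """Case-insensitive substring match over artifact text."""
        term = term.lower()
        return [Hit(aid, text)
                for aid, text in sorted(self.artifacts.items())
                if term in text.lower()]

    def neighbors(self, artifact_id: str, edge_type: str) -> list[Hit]:
        """Follow outgoing edges of one type from an artifact."""
        return [Hit(dst, self.artifacts[dst])
                for src, etype, dst in self.edges
                if src == artifact_id and etype == edge_type]


sdk = ToySDK(
    artifacts={
        "ent:acme": "Acme Corp is a supplier of industrial valves.",
        "evt:1": "Contract signed in 2021.",
        "evt:2": "Contract renewed in 2023.",
    },
    edges=[
        ("ent:acme", "involved_in", "evt:1"),
        ("ent:acme", "involved_in", "evt:2"),
    ],
)


def gather_evidence(term: str) -> list[Hit]:
    """The kind of code an agent might write: search, then walk the graph."""
    evidence: list[Hit] = []
    for hit in sdk.keyword_search(term):
        evidence.append(hit)
        evidence.extend(sdk.neighbors(hit.artifact_id, "involved_in"))
    return evidence
```

Calling `gather_evidence("acme")` returns the entity artifact followed by the two events linked to it, which a single top-k lookup on the word "acme" would miss because neither event text contains that word.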

Diagram: Question → Agent writes retrieval code → Artifacts + index → Evidence → answer
The model orchestrates retrieval
One turn can read artifacts off disk, run semantic and keyword search, walk the relationship graph, and query structured tables — composing several operations instead of a single top-k lookup.
Only evidence enters the prompt
Raw tool output stays in the code sandbox; just the distilled evidence the agent selected reaches the model. That keeps context small and answers grounded — every claim traces to a source.
It can iterate
If the first pass isn't enough, the agent refines its code and searches again — so hard questions get more work, not a worse answer.
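
The evidence-only and iteration ideas above can be sketched as a small loop. This is a simplified illustration, not the actual agent: the corpus, the `search` and `distill` helpers, and the pre-listed refinement passes are assumptions, and a real agent would rewrite its code between passes rather than pick from a fixed list of search terms.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Evidence:
    source_id: str   # keeps every snippet traceable to its source
    snippet: str


# Hypothetical (source_id, text) pairs standing in for artifacts.
CORPUS = [
    ("doc1:p3", "The pump failed in March after a seal leak."),
    ("doc2:p1", "Quarterly revenue rose 4 percent."),
    ("doc3:p7", "A second pump outage occurred in June."),
]


def search(terms: list[str]) -> list[tuple[str, str]]:
    """Raw tool output: every row matching any term. Stays in the sandbox."""
    return [(sid, text) for sid, text in CORPUS
            if any(t in text.lower() for t in terms)]


def distill(rows: list[tuple[str, str]], limit: int = 5) -> list[Evidence]:
    """Keep a few short, cited snippets; only these would reach the model."""
    return [Evidence(sid, text[:80]) for sid, text in rows[:limit]]


def gather_with_retries(term_passes: list[list[str]],
                        needed: int) -> list[Evidence]:
    """Run successively broader passes until enough evidence is found."""
    evidence: list[Evidence] = []
    for terms in term_passes:
        evidence = distill(search(terms))
        if len(evidence) >= needed:
            break  # enough evidence; stop refining
    return evidence


# First pass searches "failed" only; the second pass adds "outage".
evidence = gather_with_retries([["failed"], ["failed", "outage"]], needed=2)
```

Here the first pass finds only one matching source, so the loop broadens the search and the second pass returns two cited snippets. The unrelated revenue row never leaves `search`, which mirrors how raw tool output stays out of the prompt.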

Why it matters

Distilling artifacts up front plus Search-as-Code at query time means higher-quality answers on large, messy corpora: exact counts, real relationships, and citations you can trust — without dumping everything into the model's context.
