OpenKnowledge

Agentic search

How an agent finds things without a vector database: it searches, greps, and follows backlinks in a loop over live files, and each file comes back with its graph context attached rather than as a vector copy.

OpenKnowledge answers questions across thousands of files with no vector database. Two techniques do the work: retrieval runs as a loop, and that loop runs over virtualized files that hand the agent a briefing on every read.

Retrieval is a loop

Classic RAG embeds your question once and pastes in the nearest chunks; if they are wrong, so is the answer. Here each step is an MCP tool call and the model picks the next one: search, read, follow a backlink, reformulate. A wrong first hit costs one more step, not the whole answer.
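To make the loop concrete, here is a minimal sketch in Python. It is not OpenKnowledge's implementation: the corpus, the `[[name]]` link syntax, and the `search`, `read`, and `retrieve` helpers are all hypothetical, and a fixed policy (read the next candidate, follow its links if it does not answer) stands in for the model's choice of the next tool call.

```python
import re
from typing import Callable, Optional

# Hypothetical corpus: doc name -> text. [[name]] marks a link (assumed syntax).
DOCS = {
    "billing/overview": "Billing overview. Refund rules live in [[billing/refunds]].",
    "billing/refunds": "Refunds are issued within 14 days of purchase.",
    "hr/holidays": "Office holidays for the year.",
}
LINK = re.compile(r"\[\[([^\]]+)\]\]")

def search(query: str) -> list[str]:
    """Toy lexical search: names of docs containing any query word."""
    words = query.lower().split()
    return [name for name, text in sorted(DOCS.items())
            if any(w in text.lower() for w in words)]

def read(name: str) -> dict:
    """Return the text plus the links the agent could follow next."""
    text = DOCS[name]
    return {"name": name, "text": text, "links": LINK.findall(text)}

def retrieve(query: str, is_answer: Callable[[str], bool],
             max_steps: int = 6) -> tuple[Optional[str], list[tuple[str, str]]]:
    """Search once, then read and follow links until a doc answers."""
    trace = [("search", query)]
    queue = search(query)
    seen: set[str] = set()
    while queue and len(trace) < max_steps:
        name = queue.pop(0)
        if name in seen:
            continue
        seen.add(name)
        doc = read(name)
        trace.append(("read", name))
        if is_answer(doc["text"]):
            return name, trace
        # A wrong hit costs one more step: follow its links first.
        queue = [link for link in doc["links"] if link in DOCS] + queue
    return None, trace

found, trace = retrieve("billing", lambda text: "14 days" in text)
```

The first hit (`billing/overview`) does not contain the answer, so the loop spends one extra step following its link to `billing/refunds` instead of failing.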

Every read is a briefing

A cat through OpenKnowledge returns more than the bytes on disk. You get the file, frontmatter and all, plus its place in the graph: the backlinks pointing at it, its outbound links, and its version history.
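As an illustration only, the graph part of such a briefing could be computed from the files themselves. The sketch below assumes wiki-style `[[target]]` links and a hypothetical `briefing` function; the field names are invented for the example, and version history is left out.

```python
import re

# Assumed link syntax: [[target]], [[target|label]] or [[target#heading]].
LINK = re.compile(r"\[\[([^\]|#]+)")

def briefing(name: str, docs: dict[str, str]) -> dict:
    """Return a doc's content together with its place in the link graph."""
    outbound = sorted(set(LINK.findall(docs[name])))
    backlinks = sorted(
        other for other, text in docs.items()
        if other != name and name in LINK.findall(text)
    )
    return {
        "name": name,
        "content": docs[name],
        "outbound": outbound,
        "backlinks": backlinks,
    }

# Hypothetical three-file base.
docs = {
    "a": "See [[b]] and [[c|the C doc]].",
    "b": "Back to [[a]].",
    "c": "No links here.",
}
info = briefing("a", docs)
```

One read of `a` tells the agent that `b` and `c` are reachable from it and that `b` points back, without opening either file.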

The same holds for a whole folder. A raw ls is a list of filenames; through OpenKnowledge it comes back as a map, so the agent knows where to look next without opening anything.

And a grep returns more than matching lines: each hit carries its file's title, status, and backlink count, so the agent can tell a well-connected hub from a stray mention before opening either.
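A rough sketch of what an enriched grep hit might look like follows. The `Doc` record, the `enriched_grep` function, and the status values are assumptions made for the example, not the tool's actual output format; sorting by backlink count is one way an agent could put hubs ahead of stray mentions.

```python
import re
from dataclasses import dataclass

@dataclass
class Doc:
    title: str
    status: str          # hypothetical values such as "draft" or "published"
    backlink_count: int
    body: str

def enriched_grep(pattern: str, docs: dict[str, Doc]) -> list[dict]:
    """Return matching lines, each tagged with its file's metadata."""
    rx = re.compile(pattern)
    hits = []
    for path, doc in docs.items():
        for lineno, line in enumerate(doc.body.splitlines(), start=1):
            if rx.search(line):
                hits.append({
                    "path": path,
                    "line": lineno,
                    "text": line,
                    "title": doc.title,
                    "status": doc.status,
                    "backlinks": doc.backlink_count,
                })
    # Well-connected files first, so hubs outrank stray mentions.
    hits.sort(key=lambda h: h["backlinks"], reverse=True)
    return hits

docs = {
    "notes/scratch.md": Doc("Scratch", "draft", 0, "todo: refund flow"),
    "billing/refunds.md": Doc("Refund policy", "published", 12,
                              "# Refunds\nrefund window is 14 days"),
}
hits = enriched_grep("refund", docs)
```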

These briefings are why the loop stays short: the backlinks are what the agent follows, and each folder's purpose and recency tell it where to look next.

Three read tools ride this layer: ranked search (BM25 and recency, the same index as cmd-K), exec (sandboxed grep/ls/cat, straight off disk, works with the server down), and the link graph (dead, orphans, hubs, suggest).
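How ranked search combines BM25 with recency is not specified here, so the following is only a sketch under stated assumptions: standard Okapi BM25 (with the non-negative IDF variant) multiplied by an exponential recency decay. The `half_life_days` parameter, the whitespace tokenizer, and the function names are hypothetical.

```python
import math
from collections import Counter

def bm25_scores(query: str, docs: dict[str, str],
                k1: float = 1.5, b: float = 0.75) -> dict[str, float]:
    """Okapi BM25 over whitespace tokens (a deliberately naive tokenizer)."""
    tokenized = {name: text.lower().split() for name, text in docs.items()}
    n_docs = len(tokenized)
    if n_docs == 0:
        return {}
    avgdl = sum(len(toks) for toks in tokenized.values()) / n_docs or 1.0
    terms = query.lower().split()
    df = {t: sum(1 for toks in tokenized.values() if t in toks) for t in terms}
    scores = {}
    for name, toks in tokenized.items():
        tf = Counter(toks)
        score = 0.0
        for t in terms:
            if tf[t] == 0:
                continue
            idf = math.log((n_docs - df[t] + 0.5) / (df[t] + 0.5) + 1)
            norm = tf[t] + k1 * (1 - b + b * len(toks) / avgdl)
            score += idf * tf[t] * (k1 + 1) / norm
        scores[name] = score
    return scores

def rank(query: str, docs: dict[str, str], age_days: dict[str, float],
         half_life_days: float = 30.0) -> list[str]:
    """Assumed blend: BM25 score halved for every half_life_days of age."""
    lexical = bm25_scores(query, docs)
    blended = {name: s * 0.5 ** (age_days[name] / half_life_days)
               for name, s in lexical.items() if s > 0}
    return sorted(blended, key=blended.get, reverse=True)

docs = {
    "old": "refund policy details",
    "new": "refund policy details",
    "other": "holiday calendar",
}
ages = {"old": 300.0, "new": 3.0, "other": 1.0}
order = rank("refund policy", docs, ages)
```

With identical text, the newer doc ranks first, and a doc with no lexical match never appears no matter how recent it is.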

Writing talks back too

A cat >> or a vim save drops bytes on disk and goes quiet. A write or edit through OpenKnowledge validates what you wrote and hands back what it found, so mistakes surface at write time instead of rotting in the graph.

brokenLinks comes back on every write and edit (empty when they all resolve), so the agent never needs a follow-up dead-link check. A brand-new doc nothing points to comes back flagged as an orphan with a hub to link it from. Every write and edit is versioned and attributed, and the preview updates as it lands. None of that happens when you append to a file by hand. Skills teach the conventions, and the write path enforces the connective tissue.
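The write-time checks described above can be sketched as follows. Only the `brokenLinks` name comes from the text; the `write` function, the `orphan` and `suggestedHub` fields, the `[[target]]` link syntax, and the rule for picking a hub (the doc with the most inbound links) are assumptions for illustration. Versioning and attribution are omitted.

```python
import re
from typing import Optional

LINK = re.compile(r"\[\[([^\]|#]+)")  # assumed wiki-style link syntax

def write(name: str, content: str, docs: dict[str, str]) -> dict:
    """Store the doc, then report what validation found."""
    docs[name] = content
    broken = sorted({t for t in LINK.findall(content) if t not in docs})

    # Count inbound links per doc, ignoring self-links and dead targets.
    inbound = {n: 0 for n in docs}
    for src, text in docs.items():
        for target in set(LINK.findall(text)):
            if target in inbound and target != src:
                inbound[target] += 1

    is_orphan = inbound[name] == 0
    hub: Optional[str] = None
    if is_orphan:
        candidates = [n for n in docs if n != name]
        if candidates:
            # Hypothetical rule: suggest the most linked-to doc.
            hub = max(candidates, key=lambda n: (inbound[n], n))
    return {"brokenLinks": broken, "orphan": is_orphan, "suggestedHub": hub}

docs = {
    "index": "See [[guide]].",
    "guide": "Back to [[index]].",
}
result = write("notes", "Read [[guide]] and [[missing]].", docs)
```

The new doc links to one page that does not exist and nothing links to it yet, so both problems come back in the same response as the write.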

The index is authored, not extracted

No vector store means no second copy to rebuild, mis-chunk, or drift. Your links, folders, titles, and folder descriptions are the index. Other tools extract structure with an embedding model or an entity-guessing LLM; OpenKnowledge reads the structure you already wrote, so the agent traverses the source of truth. A well-linked base retrieves better because good links shorten the loop. Folder templates keep that structure consistent as it grows: a folder can carry a template, so every new doc the agent creates starts with the same frontmatter and shape, and the fields search ranks on stay uniform.
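As a sketch of the folder-template idea, a template can be treated as a set of default frontmatter fields that every new doc in the folder starts from. The `---`-delimited `key: value` frontmatter format, the field names, and `new_doc_from_template` are assumptions here, not OpenKnowledge's actual template mechanism.

```python
def new_doc_from_template(template: dict[str, str],
                          overrides: dict[str, str],
                          body: str = "") -> str:
    """Render a new doc whose frontmatter has exactly the template's fields."""
    unknown = set(overrides) - set(template)
    if unknown:
        # Keeping the field set closed is what keeps the folder uniform.
        raise ValueError(f"fields not in template: {sorted(unknown)}")
    fields = {**template, **overrides}  # template order is preserved
    lines = ["---"]
    lines += [f"{key}: {value}" for key, value in fields.items()]
    lines += ["---", body]
    return "\n".join(lines)

# Hypothetical template for a folder of policy docs.
template = {"title": "", "status": "draft", "owner": ""}
doc = new_doc_from_template(template, {"title": "Refund policy"}, "Body text.")
```

Rejecting fields the template does not define is a design choice for the example: it trades flexibility for the uniformity that ranking depends on.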

Semantic search (optional)

Off by default. Enable it per project and an embeddings signal blends into the ranking; it never replaces lexical search. Vectors sit in a local cache. Turning it on sends your query and matching text to your configured provider. See Configuration.
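One way to picture "blends in, never replaces" is a weighted sum in which the lexical score always keeps a nonzero share. This is an assumption about the shape of the blend, not the documented formula; `blend`, `weight`, and the vectors below are hypothetical.

```python
import math

def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity, 0.0 when either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def blend(lexical: dict[str, float], query_vec: list[float],
          doc_vecs: dict[str, list[float]], weight: float = 0.3) -> list[str]:
    """Rank lexical hits by a mix of lexical score and embedding similarity."""
    if not 0.0 <= weight < 1.0:
        raise ValueError("weight must be in [0, 1) so lexical always counts")
    top = max(lexical.values(), default=0.0) or 1.0
    blended = {}
    for name, score in lexical.items():
        semantic = cosine(query_vec, doc_vecs[name]) if name in doc_vecs else 0.0
        blended[name] = (1 - weight) * (score / top) + weight * semantic
    return sorted(blended, key=blended.get, reverse=True)

lexical = {"a": 2.0, "b": 1.9}
doc_vecs = {"a": [1.0, 0.0], "b": [0.0, 1.0]}
query_vec = [0.0, 1.0]
lexical_only = blend(lexical, query_vec, doc_vecs, weight=0.0)
with_semantic = blend(lexical, query_vec, doc_vecs, weight=0.3)
```

With the weight at zero the order is purely lexical; raising it lets a close semantic match overtake a slightly stronger lexical hit, but only among docs that lexical search already returned.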