Knowledge Base. Your knowledge, ready to talk
Transform scattered organizational knowledge into a private, conversational AI platform with full data sovereignty. No cloud dependencies. No third-party access.

Most organizations have plenty of documentation. GitHub repositories, Confluence spaces, internal wikis, uploaded PDFs, product manuals, support guides. The knowledge exists. But you're still struggling to find it. It's not you. It's the retrieval that fails.
A customer success manager fields the same integration question for the tenth time because the answer is buried three levels deep in a Confluence page nobody bookmarks. A sales representative spends an afternoon building an RFP response that should have taken an hour. An engineer searches for an API endpoint and finds a page last updated two years ago.
This isn't a knowledge problem. It's an access problem.
Search was supposed to solve it. Keyword search helped, but it requires you to already know what you're looking for: the right term, the right phrasing, the right document. It has no understanding of intent. It returns links, not answers.
The shift happening now isn't about making search faster. It's about making it conversational and contextually aware.
And that's the gap Knowledge Base is built to close.
What Knowledge Base actually does
Knowledge Base turns your existing documentation into a conversational search interface. Connect your GitHub repositories, Confluence spaces, PDFs, and other sources. Ask questions in plain language. Get structured, referenced answers; not a list of links to go investigate yourself.
The experience is closer to asking a well-informed colleague than running a search query. For example, go to the motive.co site and ask the Knowledge Base: "What steps does a customer need to take if Motive isn't appearing on their Magento 2 site?" It returns a usable, step-by-step breakdown with direct references to the relevant documentation. You can read the source, share it with your customer, or ask a follow-up. Sharp and simple.
What it isn't: a black box that generates plausible-sounding text. Every answer surfaces its sources. The quality of the output is tied directly to the quality of the documentation you've indexed. That's a feature, not a limitation: the system is honest about what it knows and where it learned it.
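The answer-with-sources contract described above can be sketched as a tiny data structure. This is an illustrative assumption about the answer shape, not Knowledge Base's actual API; the field names are hypothetical.

```python
from dataclasses import dataclass, field

@dataclass
class Answer:
    """A conversational answer that always carries its references."""
    text: str
    sources: list[str] = field(default_factory=list)

    def is_grounded(self) -> bool:
        # An answer with no sources should be treated as unreliable,
        # mirroring the "no black box" principle: every claim points
        # back to indexed documentation.
        return bool(self.sources)
```

The design choice is simply that sources travel with the answer, so a consumer can always check whether a response is grounded before trusting it.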
The retrieval problem (and how we address it)
Traditional document search, including earlier RAG (Retrieval-Augmented Generation) approaches, has a known weakness. When you split large documents into smaller chunks for indexing, you often strip away the context that makes a chunk meaningful.
A chunk that reads "the previous quarter's revenue grew by 3%" is nearly useless on its own. Which company? Which quarter? Without that context, even a sophisticated AI system will struggle to retrieve the right information at the right moment.
Knowledge Base addresses this with contextual retrieval: before a document chunk is indexed, the system uses an AI model to add a short, precise summary of where that chunk fits within the broader document. The chunk about quarterly revenue now carries the context (which company, which filing, which period) so it can be retrieved accurately even when a user's question doesn't use the exact phrasing from the source.
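The indexing step described above can be sketched in a few lines. Everything here is illustrative: `llm` stands in for whatever chat-completion function generates the situating summary, and the prompt wording is an assumption, not the prompt Knowledge Base actually uses.

```python
def contextualize_chunk(document: str, chunk: str, llm) -> str:
    """Prepend an AI-generated situating summary to a chunk before indexing.

    `llm` is any callable that takes a prompt string and returns text;
    in contextual retrieval, it produces one or two sentences locating
    the chunk within the full document (which company, which filing,
    which period) so the enriched chunk embeds and retrieves accurately.
    """
    prompt = (
        "Here is a document:\n"
        f"{document}\n\n"
        "Here is a chunk from that document:\n"
        f"{chunk}\n\n"
        "Write one or two sentences situating this chunk within the "
        "overall document, to improve search retrieval of the chunk."
    )
    context = llm(prompt)
    # The context plus the original chunk is what gets embedded and indexed.
    return f"{context}\n{chunk}"
```

Applied to the revenue example, the chunk "the previous quarter's revenue grew by 3%" would be stored with a prefix like "From ACME Corp's Q2 filing", so a question mentioning the company or the period can still find it.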
This, combined with a reranking step that scores and filters retrieved chunks by relevance before they're used to generate an answer, significantly reduces retrieval failures. The practical effect: fewer hallucinations, more accurate answers, better references.
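The reranking step can be sketched as score, filter, and truncate. The scoring function is a stand-in here (in practice it would be a cross-encoder-style relevance model); the threshold and `top_k` values are illustrative assumptions.

```python
def rerank(question: str, chunks: list[str], score,
           top_k: int = 5, min_score: float = 0.2) -> list[str]:
    """Score retrieved chunks against the question, drop weak matches,
    and keep only the most relevant ones for answer generation.

    `score` is any callable (question, chunk) -> float; higher is more
    relevant. Filtering before generation is what cuts hallucinations:
    the model never sees chunks that barely match the question.
    """
    scored = [(score(question, chunk), chunk) for chunk in chunks]
    scored = [(s, c) for s, c in scored if s >= min_score]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [chunk for _, chunk in scored[:top_k]]
```

In a full pipeline this sits between retrieval and generation: a broad first-pass retrieval pulls candidates, and only the reranked survivors are handed to the model as context.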
None of this requires you to restructure your documentation. You connect your sources. The system handles the rest.
This approach draws directly from Anthropic's published research on contextual retrieval, which demonstrated that combining contextual embeddings with lexical matching and reranking can reduce retrieval failure rates by up to 67%.
What this looks like in practice
We've been using Knowledge Base ourselves. Here's what that looks like.
Our growth team used Knowledge Base to respond to a 30-item RFP from one large bookseller, a prospective customer doing serious technical due diligence on every product feature. Roughly 20 of the 30 questions received accurate, well-structured answers on the first pass, with precise references. The team estimated it saved at least six hours on that document alone, while delivering higher-quality responses than a manual search-and-edit workflow would have produced.
The gaps were real and acknowledged: pricing information isn't indexed, and some personalization content is scattered across sources that haven't been connected yet. Those are solvable documentation problems, not system failures.
It works the same way across different teams and contexts. The same platform can, for instance, serve as an engineering tool if the right technical information is indexed. Our developers have queried service components, explored code paths, retrieved configuration settings, and surfaced release history. The questions change. The infrastructure doesn't.
The same knowledge, different lenses
Knowledge Base is configurable by design. The same indexed knowledge can power different conversations depending on context and audience, shaped by prompt configurations that adapt to different roles and define the tone, scope, and depth of each interaction.
A good example is Empathy.co’s Playboard, our own dashboard that brings together analytics and configuration settings for search and discovery products in ecommerce. It's a complex platform with a broad user base: customers exploring their data, support teams diagnosing issues, and engineers working at the configuration level.
Each of those audiences has different needs from the same knowledge base. A customer asking about a feature gets an explanation of what it does, how it helps their business, and how to use it. A support technician asking about a specific instance gets structured configuration data. An engineer gets a technical breakdown with code-level detail. Same tool. Same indexed knowledge. Different conversations.
For organizations running multiple products or brands, the same logic applies across separate knowledge bases, each with its own configuration and content.
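The role-based configuration described above can be sketched as a mapping from audience to prompt. The role names and prompt text here are illustrative assumptions, not the actual Playboard configuration.

```python
# Hypothetical per-audience prompt lenses over one shared knowledge base.
PROMPT_CONFIGS = {
    "customer": "Explain features in plain language: what they do, "
                "the business benefit, and how to use them.",
    "support":  "Return structured configuration data for the specific "
                "instance being asked about.",
    "engineer": "Give a technical breakdown with code-level detail.",
}

def build_system_prompt(
    role: str,
    base: str = "Answer only from the indexed documentation and cite sources.",
) -> str:
    """Combine the shared grounding rule with a role-specific lens.

    Unknown roles fall back to the customer-facing configuration, the
    safest default depth for an unauthenticated audience.
    """
    lens = PROMPT_CONFIGS.get(role, PROMPT_CONFIGS["customer"])
    return f"{base}\n{lens}"
```

The grounding instruction stays constant across roles; only the lens changes, which is what lets one indexed corpus hold different conversations with different audiences.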
Independence, privacy, and data governance
Knowledge Base is built to run without routing your data through third-party AI APIs. No OpenAI. No AWS. No subscriptions to external model providers. The open-weight models that power ingestion, embedding, and generation run on your infrastructure or on Empathy.ai's, depending on your deployment model.
This matters for two reasons that are becoming harder to ignore.
The first is compliance. Organizations operating under strict data residency requirements, for example, in financial services, legal, healthcare, or public sector contexts, can't afford to route sensitive documentation through cloud AI providers without careful scrutiny. A self-hosted deployment on hardware like Empathy.ai's NVIDIA DGX Spark keeps everything local: embeddings, retrieval, generation, and storage.
The second is dependency. Building core workflows on top of third-party API providers means your access, your pricing, and your capabilities are subject to someone else's roadmap and rate limits. Open-weight models, which are capable, well-maintained, and deployable on-premise, make it reasonable to build an AI infrastructure you actually own.
Your data doesn't need to leave your infrastructure to power a capable AI-based knowledge search system. That's the point.
Built for knowledge sharing
The shift toward conversational, AI-assisted information retrieval is already underway. It's showing up in how customers research products, how teams respond to commercial requests, and how organizations are discovered by the models powering mainstream AI tools. Companies with well-structured, accessible knowledge are increasingly findable in ways that paid advertising alone can't achieve.
Knowledge Base is designed for organizations that want to participate in that reality on their own terms, without handing their data to big tech providers, without building brittle workflows on top of external APIs, and without waiting for AI to become approachable enough to deploy independently.
The knowledge you've built over the years is already there. Empathy.ai's Knowledge Base is what it looks like when your knowledge can finally speak for itself.
The best way to understand it is to try it. Knowledge Base is live on empathy.ai, empathy.co, and motive.co. Go ahead, ask it anything.
A note on what it isn't
Knowledge Base is not a replacement for good documentation. If your sources are incomplete, inconsistent, or out of date, the system will reflect that. And it will tell you, because the answers reference their sources. That transparency is deliberate.
It's also not a general-purpose AI assistant. It's scoped to what you've indexed, configured for your needs, and grounded in documents you control. The value isn't novelty; it's reliability.
Organizations that treat knowledge as infrastructure—something worth maintaining, structuring, and keeping current—will get the most out of it. That knowledge was worth having before AI search existed, and is worth more now.
Frequently Asked Questions
What is Knowledge Base?
Knowledge Base is Empathy.ai's enterprise AI knowledge management platform. It centralizes documentation from GitHub, Confluence, PDFs, and APIs into a unified conversational system, running entirely on private infrastructure with no cloud dependencies.
How does Knowledge Base differ from ChatGPT Enterprise or Microsoft Copilot?
Unlike ChatGPT Enterprise or Copilot, Knowledge Base processes all data on Empathy.ai's self-hosted GPU infrastructure. Your documents are never transmitted to external servers, never used to train third-party models, and remain under your complete control.
What is contextual retrieval?
Contextual retrieval is a preprocessing technique that enriches each document chunk with surrounding context before indexing. This preserves meaning and significantly improves answer accuracy, reducing retrieval failures by up to 67% compared to standard approaches.
What data sources does Knowledge Base support?
Knowledge Base ingests from GitHub repositories, Confluence spaces, uploaded documents (PDF, DOCX, Markdown), and external APIs. Additional source integrations are actively being developed.
Is Knowledge Base suitable for regulated industries?
Yes. With all processing happening on dedicated infrastructure in Asturias, Spain, Knowledge Base meets strict data residency and sovereignty requirements for finance, legal, healthcare, and government sectors.