Getting Started with Knowledge Bases

Sema4.ai Knowledge Bases (KBs) let agents work with enterprise knowledge that lives outside traditional databases, such as documents, internal guides, FAQs, and chat transcripts. This page gives a high-level view of how to go from zero to a fully functional agent that can reason over that knowledge.

How It Works: KB Lifecycle

The lifecycle of a Sema4.ai KB guides you from initial setup to full agent integration and deployment. Here’s a detailed look at each stage:

Create a vector storage target

Set up a vector-capable database, such as PostgreSQL with the pgvector extension or ChromaDB, to store embeddings. This database is the persistent storage layer for your KB and is what makes fast, scalable semantic search possible.

Define your KB (using the SDK)

Select embedding and reranking models: Choose models from a provider (e.g., OpenAI, Azure OpenAI) for generating embeddings and reranking results.
Map columns: Specify which columns from your data source will be used as content (for semantic search), metadata (for filtering), and unique IDs.
Connect to storage: Link your KB definition to the vector storage target you created.
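
The three decisions above can be pictured as a single definition object. The following is a minimal sketch only: `KnowledgeBaseSpec`, its field names, and the model and storage names are hypothetical placeholders chosen for illustration, not the Sema4.ai SDK's actual API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class KnowledgeBaseSpec:
    """Hypothetical container for the choices made when defining a KB."""

    name: str
    embedding_model: str         # model that turns content into vectors
    reranking_model: str         # model that reorders search results
    content_columns: list[str]   # text that is embedded for semantic search
    metadata_columns: list[str]  # structured fields used for filtering
    id_column: str               # unique identifier of each record
    storage: str                 # name of the vector storage target

    def validate(self) -> None:
        """Check the minimum needed for the definition to make sense."""
        if not self.content_columns:
            raise ValueError("at least one content column is required")
        if not self.id_column:
            raise ValueError("an ID column is required")


# All values below are placeholders, not real model or connection names.
spec = KnowledgeBaseSpec(
    name="support_kb",
    embedding_model="example-embedding-model",
    reranking_model="example-reranking-model",
    content_columns=["body"],
    metadata_columns=["author", "created_at"],
    id_column="doc_id",
    storage="example_vector_store",
)
spec.validate()
```

Keeping the column mapping explicit in one place makes it easier to see which fields are searchable by meaning and which are only available as filters.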

Insert data into the KB

Populate your KB by inserting documents, records, or files.

As data is ingested, the system automatically generates embeddings for the specified content columns.
The embedded content is then indexed, making your KB ready for semantic search and retrieval.
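
To make the ingestion step concrete, here is a toy sketch of what "generate embeddings for the content columns" means. It assumes a stand-in embedding function (`toy_embed`, which just hashes words into a small vector) and an in-memory dictionary in place of the vector database; both are illustrative inventions, and the real system uses the embedding model and storage target you configured.

```python
import hashlib
import math

DIM = 16  # toy dimensionality; real embedding models use far more


def toy_embed(text: str) -> list[float]:
    """Stand-in for an embedding model: hash words into a normalized vector."""
    vec = [0.0] * DIM
    for token in text.lower().split():
        digest = hashlib.sha256(token.encode("utf-8")).hexdigest()
        vec[int(digest, 16) % DIM] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]


def ingest(rows, content_column, id_column, index=None):
    """Embed each row's content column and upsert it into the index by ID."""
    index = {} if index is None else index
    for row in rows:
        index[row[id_column]] = {
            "embedding": toy_embed(row[content_column]),
            "row": row,
        }
    return index


rows = [
    {"doc_id": "faq-1", "body": "How do I reset my password", "author": "it"},
    {"doc_id": "faq-2", "body": "Where can I find the expense policy", "author": "finance"},
]
index = ingest(rows, content_column="body", id_column="doc_id")
print(sorted(index))  # ['faq-1', 'faq-2']
```

Because the index is keyed by the ID column, inserting a record with an existing ID replaces the earlier entry in this sketch rather than duplicating it.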

Query your KB

Retrieve information using flexible, SQL-like queries.

Semantic search: Find relevant content based on meaning, not just keywords.
Metadata filtering: Narrow results by tags, authors, dates, or other structured fields.
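
The two query styles combine naturally: filter on metadata first, then rank what remains by similarity to the query. The sketch below illustrates that flow in plain Python rather than in the KB's SQL-like syntax. It assumes a toy word-count "embedding", which only matches shared words; a real embedding model matches on meaning. All function and field names here are hypothetical.

```python
import math
from collections import Counter


def toy_embed(text):
    """Stand-in embedding: a sparse word-count vector."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def search(docs, query, filters=None, limit=3):
    """Apply metadata filters, then rank the remaining docs by similarity."""
    filters = filters or {}
    q = toy_embed(query)
    candidates = [
        d for d in docs
        if all(d["metadata"].get(k) == v for k, v in filters.items())
    ]
    scored = [(d["id"], cosine(q, toy_embed(d["content"]))) for d in candidates]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:limit]


docs = [
    {"id": "guide-1", "content": "reset your password from the login page",
     "metadata": {"author": "it"}},
    {"id": "guide-2", "content": "submit expense reports before month end",
     "metadata": {"author": "finance"}},
    {"id": "faq-3", "content": "password rules for contractor accounts",
     "metadata": {"author": "security"}},
]

# Semantic-style search only: every document is a candidate.
print([doc_id for doc_id, _ in search(docs, "how to reset password")])
# ['guide-1', 'faq-3', 'guide-2']

# Same search, narrowed by a metadata filter.
print([doc_id for doc_id, _ in search(docs, "how to reset password", {"author": "it"})])
# ['guide-1']
```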

Package your queries into an Action Package

Turn your most useful and frequent queries into Actions.

Action Packages make it easy to reuse and share queries across agents.
Publish these packages to Sema4.ai Studio for integration.
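
Conceptually, packaging a query means giving it a name, typed parameters, and a description so that an agent can call it without knowing the query details. The sketch below shows only that shape as a plain Python function. It deliberately omits any SDK-specific registration, and `_FAKE_KB` and `search_team_docs` are hypothetical stand-ins for a real KB and a real action.

```python
import json

# Hypothetical stand-in for a KB; a real action would query the KB instead.
_FAKE_KB = [
    {"id": "hr-1", "content": "Employees accrue vacation monthly", "team": "hr"},
    {"id": "it-1", "content": "Laptops are refreshed every three years", "team": "it"},
]


def search_team_docs(team: str, limit: int = 5) -> str:
    """Return KB entries owned by a team, as a JSON string.

    Args:
        team: Team tag to filter on.
        limit: Maximum number of entries to return.
    """
    matches = [doc for doc in _FAKE_KB if doc["team"] == team][:limit]
    return json.dumps(matches)


print(search_team_docs("hr"))
```

The type hints and docstring carry the information an agent needs to decide when and how to call the function, which is why they are worth writing carefully.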

Use the KB in an agent

Integrate your KB-powered actions into an agent’s runbook or workflow.

Add the Action Package to your agent.
Use KB queries as steps in your agent’s logic, enabling retrieval-augmented generation and grounded responses.
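
Retrieval-augmented generation comes down to placing the retrieved passages in front of the model alongside the question. The following sketch shows one way such a prompt might be assembled. The function name, prompt wording, and passage format are assumptions for illustration; the agent runtime handles this step in practice.

```python
def build_grounded_prompt(question, retrieved, max_chunks=3):
    """Assemble a prompt that asks the model to answer only from KB passages."""
    chunks = retrieved[:max_chunks]
    context = "\n".join(
        f"[{i}] ({chunk['id']}) {chunk['content']}"
        for i, chunk in enumerate(chunks, start=1)
    )
    return (
        "Answer the question using only the numbered passages below. "
        "If they do not contain the answer, say so.\n\n"
        f"Passages:\n{context}\n\n"
        f"Question: {question}"
    )


# Hypothetical results returned by a KB query step.
retrieved = [
    {"id": "guide-1", "content": "Reset your password from the login page."},
    {"id": "faq-3", "content": "Contractor passwords expire every 90 days."},
]
prompt = build_grounded_prompt("How do I reset my password?", retrieved)
print(prompt)
```

Including each passage's ID in the prompt makes it possible for the response to point back to the source record.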

Deploy your agent with the KB

Publish and deploy your agent (with KB integration) to the Control Room.

The agent is now fully equipped to leverage the KB for intelligent, context-rich responses.
Your team can interact with the agent, benefiting from accurate, up-to-date knowledge retrieval.

How It All Comes Together

This documentation is organized around this lifecycle:

Build a Knowledge Base: Define, add data, and test your Knowledge Base using the Sema4.ai SDK.
Use Knowledge Base in an Agent: Connect the Knowledge Base to your agent and test it.
Deploy Agent with Knowledge Base: Make agents that use the Knowledge Base accessible to your team.

Prerequisites

Before you start, make sure the following extensions are installed in VS Code or Cursor:
- Sema4.ai SDK
- Sema4.ai Data Access
You also need access to embedding models and API keys for those models.
A vector-capable database (e.g., PostgreSQL with the pgvector extension) must be set up and accessible.
