Getting Started with Knowledge Bases
Sema4.ai Knowledge Bases (KBs) allow agents to work with enterprise knowledge that lives outside of traditional databases, such as documents, internal guides, FAQs, and chat transcripts. This page gives you a high-level view of how to go from zero to a fully functional agent that can reason over that knowledge.
How It Works: KB Lifecycle
The lifecycle of a Sema4.ai KB guides you from initial setup to full agent integration and deployment. Here’s a detailed look at each stage:
Create a vector storage target
Set up a vector-capable database, such as PostgreSQL with the pgvector extension or ChromaDB, to store embeddings. This database is the persistent storage layer for your KB and enables fast, scalable semantic search.
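As a rough illustration of what a vector-capable store looks like, the sketch below builds the SQL that enables pgvector and creates a table with an embedding column. The table name, column layout, and the dimension 1536 are assumptions made for this example; the KB tooling may create and manage its own tables, so treat this as background rather than a required step.

```python
import re


def pgvector_setup_sql(table: str, dimensions: int) -> list[str]:
    """Return SQL statements that prepare a pgvector-backed table.

    The table layout is hypothetical; only the pgvector syntax
    (CREATE EXTENSION vector, the vector(n) column type) is standard.
    """
    # Allow plain identifiers only, so the name is safe to interpolate.
    if not re.fullmatch(r"[A-Za-z_][A-Za-z0-9_]*", table):
        raise ValueError(f"invalid table name: {table!r}")
    if dimensions <= 0:
        raise ValueError("dimensions must be positive")
    return [
        "CREATE EXTENSION IF NOT EXISTS vector;",
        f"CREATE TABLE IF NOT EXISTS {table} ("
        "id TEXT PRIMARY KEY, "
        "content TEXT NOT NULL, "
        "metadata JSONB, "
        f"embedding vector({dimensions}));",
    ]


# Example dimension only: use the size your embedding model produces.
for statement in pgvector_setup_sql("kb_documents", 1536):
    print(statement)
```

The dimension of the embedding column has to match the output size of the embedding model you select later, which is why it is a parameter here.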
Define your KB (using the SDK)
- Select embedding and reranking models: Choose models from a provider (e.g., OpenAI, Azure OpenAI) for generating embeddings and ranking results.
- Map columns: Specify which columns from your data source will be used as content (for semantic search), metadata (for filtering), and unique IDs.
- Connect to storage: Link your KB definition to the vector storage target you created.
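The three choices above (models, column mapping, storage) can be pictured as a single definition object. The following is a minimal sketch and not the SDK's actual API: the class name, its fields, and the model and storage names are all hypothetical and exist only to show how the pieces relate and what can be validated up front.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class KnowledgeBaseDefinition:
    """Hypothetical container for a KB definition (not the SDK's type)."""

    name: str
    embedding_model: str
    reranking_model: str
    storage: str  # name of the vector storage target
    id_column: str
    content_columns: tuple[str, ...]
    metadata_columns: tuple[str, ...] = ()

    def validate(self, source_columns: set[str]) -> None:
        """Check the column mapping against the data source's columns."""
        if not self.content_columns:
            raise ValueError("at least one content column is required")
        overlap = set(self.content_columns) & set(self.metadata_columns)
        if overlap:
            raise ValueError(f"columns mapped twice: {sorted(overlap)}")
        mapped = {self.id_column, *self.content_columns, *self.metadata_columns}
        missing = mapped - source_columns
        if missing:
            raise ValueError(f"columns not in data source: {sorted(missing)}")


# Placeholder names throughout; substitute your own models and storage.
support_kb = KnowledgeBaseDefinition(
    name="support_kb",
    embedding_model="example-embedding-model",
    reranking_model="example-reranking-model",
    storage="kb_vector_store",
    id_column="ticket_id",
    content_columns=("body",),
    metadata_columns=("author", "created_at"),
)
support_kb.validate({"ticket_id", "body", "author", "created_at", "status"})
```

Validating the mapping before ingestion catches misspelled or missing columns early, before any embeddings are generated.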
Insert data into the KB
Populate your KB by inserting documents, records, or files.
- As data is ingested, the system automatically generates embeddings for the specified content columns.
- Semantic indexing is performed, making your KB ready for advanced search and retrieval.
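To make the ingestion step concrete, here is a small in-memory sketch of what "generate an embedding per content value and index it" means. The `toy_embed` function is a deliberately crude stand-in (it hashes words into buckets) and is an assumption for illustration; a real KB calls the embedding model you configured and writes to the vector store.

```python
import hashlib
import math


def toy_embed(text: str, dimensions: int = 16) -> list[float]:
    """Stand-in for a real embedding model: hash words into buckets."""
    vector = [0.0] * dimensions
    for word in text.lower().split():
        digest = hashlib.sha256(word.encode("utf-8")).digest()
        vector[digest[0] % dimensions] += 1.0
    # L2-normalise so vectors are comparable by cosine similarity.
    norm = math.sqrt(sum(v * v for v in vector))
    return [v / norm for v in vector] if norm else vector


def ingest(rows: list[dict], content_column: str, id_column: str) -> dict:
    """Embed the content column of each row and index it by ID."""
    index = {}
    for row in rows:
        index[row[id_column]] = {
            "row": row,
            "embedding": toy_embed(row[content_column]),
        }
    return index


# Sample rows invented for this example.
rows = [
    {"doc_id": "a", "text": "How to reset your password"},
    {"doc_id": "b", "text": "Travel expense policy"},
]
index = ingest(rows, content_column="text", id_column="doc_id")
print(sorted(index))
```

Because entries are keyed by ID, inserting a row with an existing ID overwrites the earlier entry in this sketch.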
Query your KB
Retrieve information using flexible, SQL-like queries.
- Semantic search: Find relevant content based on meaning, not just keywords.
- Metadata filtering: Narrow results by tags, authors, dates, or other structured fields.
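The combination of semantic search and metadata filtering can be sketched in a few lines: filter on the structured fields first, then rank what remains by vector similarity. The records, their three-dimensional embeddings, and the query vector below are hand-made for illustration; in a real KB the embeddings come from the configured model and the query is expressed in the KB's SQL-like syntax.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity of two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def search(records, query_embedding, filters=None, limit=3):
    """Apply exact-match metadata filters, then rank by similarity."""
    filters = filters or {}
    candidates = [
        r for r in records
        if all(r["metadata"].get(key) == value for key, value in filters.items())
    ]
    ranked = sorted(
        candidates,
        key=lambda r: cosine(r["embedding"], query_embedding),
        reverse=True,
    )
    return ranked[:limit]


# Hand-made embeddings: the first axis loosely means "account access".
records = [
    {"id": "faq-1", "content": "How to reset your password",
     "metadata": {"author": "it"}, "embedding": [0.9, 0.1, 0.0]},
    {"id": "guide-7", "content": "Expense policy for travel",
     "metadata": {"author": "finance"}, "embedding": [0.0, 0.2, 0.9]},
    {"id": "faq-2", "content": "Recovering a locked account",
     "metadata": {"author": "it"}, "embedding": [0.8, 0.3, 0.1]},
]
query_embedding = [1.0, 0.2, 0.0]  # stands for "I can't log in"

for hit in search(records, query_embedding, filters={"author": "it"}):
    print(hit["id"], hit["content"])
```

Note that "Recovering a locked account" ranks highly even though it shares no keywords with a login question, which is the point of searching by meaning.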
Package your queries into an Action Package
Wrap your most useful and frequent queries as Actions in an Action Package.
- Action Packages make it easy to reuse and share queries across agents.
- Publish these packages to Sema4.ai Studio for integration.
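Packaging a query as an Action essentially means wrapping it in a typed, documented function with a narrow interface. The sketch below shows that shape in plain Python. The function name, the `QueryRunner` signature, and the sample data are hypothetical, and the wiring that registers the function with the Sema4.ai actions tooling is intentionally left out.

```python
from typing import Callable, Dict, List

# Hypothetical signature for whatever executes a KB query in your project:
# (question, metadata filters, limit) -> matching rows.
QueryRunner = Callable[[str, Dict[str, str], int], List[dict]]


def find_policy_answers(
    question: str,
    department: str,
    run_query: QueryRunner,
    limit: int = 3,
) -> str:
    """Return the top KB passages for a question within one department."""
    if not question.strip():
        raise ValueError("question must not be empty")
    rows = run_query(question, {"department": department}, limit)
    if not rows:
        return "No matching knowledge base entries found."
    return "\n".join(f"[{row['id']}] {row['content']}" for row in rows)


def fake_runner(question: str, filters: Dict[str, str], limit: int) -> List[dict]:
    """Stand-in for a real KB query; filters sample rows by department."""
    data = [
        {"id": "hr-12", "department": "hr", "content": "Parental leave is 16 weeks."},
        {"id": "it-3", "department": "it", "content": "Rotate passwords every 90 days."},
    ]
    return [r for r in data if r["department"] == filters["department"]][:limit]


print(find_policy_answers("How long is parental leave?", "hr", fake_runner))
```

Passing the query runner in as a parameter keeps the function testable without a live KB; in a real package the runner would be the actual KB query.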
Use the KB in an agent
Integrate your KB-powered actions into an agent’s runbook or workflow.
- Add the Action Package to your agent.
- Use KB queries as steps in your agent’s logic, enabling retrieval-augmented generation and grounded responses.
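Retrieval-augmented generation comes down to placing retrieved passages in front of the model together with the question. A minimal sketch of that assembly step follows; the prompt wording and the passage structure (`id`, `content`) are assumptions for illustration, and the agent runtime may handle this differently.

```python
def build_grounded_prompt(question: str, passages: list[dict]) -> str:
    """Combine retrieved passages and the question into one prompt."""
    if passages:
        # Number each passage so the answer can cite its sources.
        context = "\n".join(
            f"[{i}] ({p['id']}) {p['content']}"
            for i, p in enumerate(passages, start=1)
        )
    else:
        context = "(no passages retrieved)"
    return (
        "Answer using only the context below. "
        "If the context is insufficient, say so.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )


# Passages as they might come back from a KB query (sample data).
passages = [
    {"id": "faq-1", "content": "Passwords can be reset from the login page."},
    {"id": "faq-2", "content": "Locked accounts unlock after 30 minutes."},
]
print(build_grounded_prompt("How do I get back into my account?", passages))
```

Instructing the model to rely only on the supplied context, and to say when it is insufficient, is what keeps responses grounded in the KB rather than in the model's general knowledge.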
Deploy your agent with the KB
Publish and deploy your agent (with KB integration) to the Control Room.
- The agent is now fully equipped to leverage the KB for intelligent, context-rich responses.
- Your team can interact with the agent, benefiting from accurate, up-to-date knowledge retrieval.
How It All Comes Together
The stages above make up the lifecycle of working with a KB, and this documentation is organized around that lifecycle.
Prerequisites
- Before you start, make sure the following extensions are installed in VS Code or Cursor:
  - Sema4.ai SDK
  - Sema4.ai Data Access
- You must have access to embedding models and API keys for those models.
- A vector-capable database (e.g., PostgreSQL with the pgvector extension) must be set up and accessible.
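If you keep the model API key and database connection details in environment variables, a small preflight check can catch a missing prerequisite before you start. This is only a sketch: the variable names `EMBEDDING_API_KEY` and `VECTOR_DB_URL` are hypothetical, so substitute whatever names your setup actually uses.

```python
import os
from typing import Mapping, Optional

# Hypothetical setting names, invented for this example.
REQUIRED_SETTINGS = {
    "EMBEDDING_API_KEY": "API key for the embedding model provider",
    "VECTOR_DB_URL": "connection string for the vector-capable database",
}


def missing_prerequisites(env: Optional[Mapping[str, str]] = None) -> list[str]:
    """Return a description of each required setting that is unset or blank."""
    env = os.environ if env is None else env
    return [
        f"{name}: {description}"
        for name, description in REQUIRED_SETTINGS.items()
        if not env.get(name, "").strip()
    ]


problems = missing_prerequisites()
if problems:
    print("Missing prerequisites:")
    for problem in problems:
        print(f"  - {problem}")
else:
    print("All prerequisite settings are present.")
```

The check only confirms that the settings exist; it does not verify that the key is valid or that the database is reachable.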