Define Named Queries for Knowledge Base

Once your Knowledge Base is created and populated, the next step is defining how your agent can interact with it. You do this by writing named queries using the @query decorator inside your SDK action package.

This page walks through building named queries that search and retrieve data from your Knowledge Base. It also covers other operations, such as inserting new data, updating existing records, and deleting entries. Finally, you will learn how to test these queries locally before publishing them to Sema4.ai Studio.

You’ll write these in data_actions.py and publish them as part of your Knowledge Base action package.

If you are starting from the Data Access/Knowledge Base template, it includes stub code for these queries. You can rename the functions and parameters, or change the logic, to match your use case.

Using a Semantic Search Query

@query
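def search(
    data_source: KnowledgeBaseDataSource,
    req: SearchRequest
) -> Response[List[SearchResult]]:
    # "content = $query" runs a semantic (vector) search against the
    # knowledge base; it is not an exact string comparison.
    sql = f"""
    SELECT *
    FROM sema4ai.{data_source.datasource_name}
    WHERE content = $query
    AND relevance >= $relevance_threshold
    ORDER BY relevance DESC
    LIMIT $top_k
    """

    # Values are bound by name to the $placeholders in the SQL above
    params = {
        "query": req.query,
        "relevance_threshold": req.relevance_threshold,
        "top_k": req.top_k
    }
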
    result = get_connection().query(sql, params=params)
    # Convert the raw results to SearchResult objects
    search_results = [SearchResult(**row) for row in result.to_dict_list()]
    return Response(result=search_results)

How it works

The SQL query runs a vector search over your knowledge base table, using the search text that the agent passes in the request. The query text, relevance threshold, and top-k limit are all parameters, so the same query can serve different needs. You can define these parameters in the Runbook of your agent. Use a higher threshold to filter out less relevant results, or a lower threshold to return more results and let the agent decide which ones to use.

This action returns a list of SearchResult objects, which contain the document chunks and their metadata including the relevance score.
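To make the effect of the threshold and top-k parameters concrete, here is a minimal in-memory sketch that mirrors the `WHERE relevance >= ... ORDER BY relevance DESC LIMIT ...` part of the query. The `Chunk` class, the `select_chunks` helper, and the sample scores are hypothetical stand-ins for illustration; they are not part of the SDK or the template.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Chunk:
    # Hypothetical stand-in for one SearchResult row
    content: str
    relevance: float


def select_chunks(chunks: List[Chunk], relevance_threshold: float, top_k: int) -> List[Chunk]:
    """Mirror WHERE relevance >= threshold ORDER BY relevance DESC LIMIT top_k."""
    kept = [c for c in chunks if c.relevance >= relevance_threshold]
    kept.sort(key=lambda c: c.relevance, reverse=True)
    return kept[:top_k]


# Made-up sample data, for illustration only
chunks = [
    Chunk("refund policy", 0.91),
    Chunk("shipping times", 0.62),
    Chunk("office hours", 0.35),
    Chunk("return labels", 0.78),
]

strict = select_chunks(chunks, relevance_threshold=0.75, top_k=3)
loose = select_chunks(chunks, relevance_threshold=0.30, top_k=3)

print([c.content for c in strict])  # ['refund policy', 'return labels']
print([c.content for c in loose])   # ['refund policy', 'return labels', 'shipping times']
```

With the stricter threshold the agent sees only the two strongest matches; with the looser one, top-k becomes the limiting factor instead.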

Using Advanced Search Queries

You can also define more complex queries that combine multiple filters or conditions. For example, you might want to search for documents that match a specific topic and also filter by date or author. The advanced_search function in data_actions.py follows this pattern. The excerpt below shows how it builds its filter conditions:

    # Add record IDs filter if specified (example: single-record RAG use case).
    # The IDs are interpolated directly into the SQL text, so validate them
    # before they reach this point.
    if req.filter.record_ids:
        placeholders = ", ".join([
            f"'{rid}'" for rid in req.filter.record_ids
        ])
        where_conditions.append(f"id IN ({placeholders})")

    # Add metadata filters if specified (example: searching across multiple records).
    # Values are bound as parameters; column names go into the SQL text as-is.
    if req.filter.metadata_filters:
        for column, value in req.filter.metadata_filters.items():
            where_conditions.append(f"{column} = ${column}")
            params[column] = value

This function allows you to filter results based on specific metadata fields, such as record IDs or other custom metadata. You can extend this to include any additional filters that your Knowledge Base supports.
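If the filter values come from an agent or an end user, you may prefer to bind record IDs as parameters and to check column names against a known list. The following is a sketch under stated assumptions, not the template's code: `build_filter_conditions` and `ALLOWED_METADATA_COLUMNS` are hypothetical names, and it assumes the connection accepts `$name` parameters inside an `IN (...)` list, which is worth confirming with a test query first.

```python
from typing import Any, Dict, List, Optional, Tuple

# Hypothetical allowlist: replace with the metadata columns your Knowledge Base defines
ALLOWED_METADATA_COLUMNS = {"author", "topic", "published_date"}


def build_filter_conditions(
    record_ids: Optional[List[str]] = None,
    metadata_filters: Optional[Dict[str, Any]] = None,
) -> Tuple[List[str], Dict[str, Any]]:
    """Return (where_conditions, params) without interpolating values into SQL."""
    where_conditions: List[str] = []
    params: Dict[str, Any] = {}

    if record_ids:
        names = []
        for i, rid in enumerate(record_ids):
            # One named parameter per ID: $record_id_0, $record_id_1, ...
            name = f"record_id_{i}"
            params[name] = rid
            names.append(f"${name}")
        where_conditions.append(f"id IN ({', '.join(names)})")

    for column, value in (metadata_filters or {}).items():
        # Column names cannot be bound as parameters, so reject unknown ones
        if column not in ALLOWED_METADATA_COLUMNS:
            raise ValueError(f"unsupported metadata column: {column!r}")
        where_conditions.append(f"{column} = ${column}")
        params[column] = value

    return where_conditions, params


conditions, params = build_filter_conditions(["doc-1", "doc-2"], {"author": "Ada"})
print(" AND ".join(conditions))  # id IN ($record_id_0, $record_id_1) AND author = $author
print(params)
```

The returned conditions can be joined with `AND` and appended to the base search query, with `params` merged into the dictionary passed to the connection.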

Use the SDK terminal panel or scratchpad.sql to run test queries before publishing.

  • Run each query in scratchpad.sql before using it in an action. This lets you validate the SQL syntax and logic and iterate faster.
  • Keep queries small and composable. Runbooks can sequence them for more complex behavior.
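When you copy a parameterized query into scratchpad.sql, you need concrete values in place of the `$name` placeholders. One way to do that is a small helper like the sketch below. `render_for_scratchpad` and the table name `my_kb` are hypothetical and not part of the SDK. Treat it as a local testing aid only: published actions should keep passing values through `params`.

```python
import re
from typing import Any, Mapping


def render_for_scratchpad(sql: str, params: Mapping[str, Any]) -> str:
    """Inline $name parameters as SQL literals for manual testing."""

    def literal(value: Any) -> str:
        if isinstance(value, bool):
            return "TRUE" if value else "FALSE"
        if isinstance(value, (int, float)):
            return str(value)
        # Double any single quotes so the text stays a valid SQL string literal
        return "'" + str(value).replace("'", "''") + "'"

    def substitute(match: re.Match) -> str:
        name = match.group(1)
        if name not in params:
            raise KeyError(f"no value supplied for ${name}")
        return literal(params[name])

    return re.sub(r"\$([A-Za-z_][A-Za-z0-9_]*)", substitute, sql)


# Hypothetical knowledge base name and sample values
sql = """
SELECT *
FROM sema4ai.my_kb
WHERE content = $query
AND relevance >= $relevance_threshold
ORDER BY relevance DESC
LIMIT $top_k
"""

rendered = render_for_scratchpad(
    sql, {"query": "what's new", "relevance_threshold": 0.7, "top_k": 5}
)
print(rendered)
```

The printed query contains `content = 'what''s new'`, `relevance >= 0.7`, and `LIMIT 5`, and can be pasted into scratchpad.sql as-is.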

Publish to Studio

Once all your queries are ready and tested, it's time to ship them.

  • To publish to your own Sema4.ai Studio, choose the Sema4.ai: Publish Action Package to Sema4.ai Studio command from the Command Palette (Cmd/Ctrl + Shift + P).
  • To share the action package with someone else, use the Sema4.ai: Build Action Package (zip) command, which produces a zip file that you can send to them.
  • Alternatively, you can publish visually using the SDK interface: click Publish to Sema4.ai Studio in the commands section of the SDK.

What's Next?

Now that you have defined your queries, the next step is to use them in your agents. To do this, attach the action package to your agent in Sema4.ai Studio and use the queries from the agent's natural language runbook.