Using DataFrames with Named Queries

When building Named Queries that work with DataFrames, using Response[Table] as the return type enables automatic DataFrame conversion and unlocks the full power of agent data analysis. This guide explains how to structure your Named Queries to seamlessly integrate with DataFrames.

Why Response[Table] matters

Using Response[Table] as your return type enables the automatic DataFrame conversion that makes agent data analysis possible. When agents execute your Named Queries, the results instantly become DataFrames—giving agents an intelligent workspace to transform, join, and analyze data with mathematical precision.

This automatic conversion enables:

Multi-source analysis: Agents can join your query results with uploaded spreadsheets or other Named Queries
SQL-powered accuracy: All calculations use SQL, not error-prone LLM math
Natural language interface: Business users can query your data without writing SQL

When to use each response type

Return Type	Use Case	DataFrame Conversion	Best For
`Response[Table]`	Tabular query results	✅ Automatic	Data analysis, multi-source joining, agent exploration
`Response[str]`	Formatted text	❌ Manual needed	Status messages, formatted reports
`Response[bool]`	Validation checks	❌ Not applicable	Status checks, existence validation

Implementation

This section explains how to implement Named Queries that return tabular data using Response[Table].

Basic structure

Every Named Query that returns tabular data should follow this pattern:

from sema4ai.actions import Response, Table
from sema4ai.data import query, get_connection
 
@query
def your_named_query(param1: str, param2: int) -> Response[Table]:
    """
    Brief description of what this query returns.
    
    Args:
        param1: Description of first parameter
        param2: Description of second parameter
    
    Returns:
        Table: Description of the returned columns and data
    """
    sql = """
        SELECT column1, column2, column3
        FROM your_table
        WHERE condition = $param1
        AND other_condition = $param2
        LIMIT 100;
    """
    params = {'param1': param1, 'param2': param2}
    
    result = get_connection().query(query=sql, params=params)
    return Response(result=result.to_table())

Required elements

Import the necessary types

from sema4ai.actions import Response, Table
from sema4ai.data import query, get_connection

Add the @query decorator

This decorator registers your function as a Named Query that agents can discover and use.

@query
def your_function() -> Response[Table]:

Return Response with to_table()

This converts the query results into the Table format that enables automatic DataFrame conversion.

return Response(result=result.to_table())

Common query patterns

Simple list queries

Return a straightforward list of values for agent exploration:

@query
def get_customer_names() -> Response[Table]:
    """
    Get all active customer names.
    
    Returns:
        Table: Customer names and IDs
    """
    sql = """
        SELECT DISTINCT customer_id, customer_name
        FROM customers
        WHERE status = 'active'
        ORDER BY customer_name ASC;
    """
    result = get_connection().query(sql)
    return Response(result=result.to_table())

Aggregation queries

Provide summarized data that agents can analyze and compare:

@query
def get_sales_by_region(year: int) -> Response[Table]:
    """
    Get total sales by region for a specific year.
    
    Args:
        year: The year to analyze (e.g., 2024)
    
    Returns:
        Table: Columns include region, total_sales, order_count
    """
    query = """
        SELECT 
            region,
            SUM(sale_amount) as total_sales,
            COUNT(order_id) as order_count,
            AVG(sale_amount) as avg_sale
        FROM sales
        WHERE EXTRACT(YEAR FROM sale_date) = $year
        GROUP BY region
        ORDER BY total_sales DESC;
    """
    params = {'year': year}
    result = get_connection().query(query=query, params=params)
    return Response(result=result.to_table())

Time-series queries

Enable trend analysis by returning data organized by time periods:

@query
def get_monthly_revenue_trends(year: int) -> Response[Table]:
    """
    Get monthly revenue trends for a specific year.
    
    Args:
        year: The year to analyze
    
    Returns:
        Table: Monthly revenue with columns: month, revenue, transaction_count
    """
    query = """
        SELECT 
            DATE_TRUNC('month', transaction_date) as month,
            SUM(amount) as revenue,
            COUNT(*) as transaction_count
        FROM transactions
        WHERE EXTRACT(YEAR FROM transaction_date) = $year
        GROUP BY DATE_TRUNC('month', transaction_date)
        ORDER BY month;
    """
    params = {'year': year}
    result = get_connection().query(query=query, params=params)
    return Response(result=result.to_table())

Filtered queries with parameters

Enable flexible data access while maintaining security through parameterized queries:

@query
def get_orders_by_customer(customer_id: str, start_date: str, end_date: str) -> Response[Table]:
    """
    Get all orders for a specific customer within a date range.
    
    Args:
        customer_id: The customer identifier
        start_date: Start date in YYYY-MM-DD format
        end_date: End date in YYYY-MM-DD format
    
    Returns:
        Table: Order details including order_id, date, amount, status
    """
    query = """
        SELECT 
            order_id,
            order_date,
            order_amount,
            order_status,
            product_count
        FROM orders
        WHERE customer_id = $customer_id
        AND order_date BETWEEN $start_date AND $end_date
        ORDER BY order_date DESC;
    """
    params = {
        'customer_id': customer_id,
        'start_date': start_date,
        'end_date': end_date
    }
    result = get_connection().query(query=query, params=params)
    return Response(result=result.to_table())

How agents use your Named Queries

When you structure Named Queries with Response[Table], agents can:

Execute queries through natural language

Business users ask questions, agents execute your queries:

User: "Show me sales by region for 2024"
Agent: Executes get_sales_by_region(2024) → Creates DataFrame → Answers question

Combine with other data sources

Agents automatically join your query results with other DataFrames:

User: "Compare the sales data against our target spreadsheet"
Agent: 
1. Executes get_sales_by_region(2024)
2. Reads uploaded targets.xlsx
3. Joins both DataFrames
4. Performs comparison with SQL accuracy

Perform multi-step analysis

Agents use your queries as building blocks for complex analysis:

User: "Which customers in the Northeast had declining orders this quarter?"
Agent:
1. Executes get_orders_by_region("Northeast")
2. Executes get_customer_trends(quarter=1)
3. Joins results in DataFrame workspace
4. Identifies declining customers with SQL calculations

Best practices

Write descriptive documentation

Clear docstrings help agents understand when and how to use your queries:

@query
def get_product_inventory(warehouse_id: str) -> Response[Table]:
    """
    Get current inventory levels for all products in a specific warehouse.
    
    Use this query when users ask about:
    - Stock levels or inventory counts
    - Product availability at a location
    - Warehouse-specific product data
    
    Args:
        warehouse_id: The warehouse identifier (e.g., "WH-001")
    
    Returns:
        Table: Columns include product_id, product_name, quantity, 
               last_updated, reorder_level
    """

Troubleshooting

Import errors

Ensure all required types are imported:

from sema4ai.actions import Response, Table
from sema4ai.data import query, get_connection

Missing decorator

The @query decorator is required for agent discovery:

# ✅ Correct
@query
def your_function() -> Response[Table]:
    ...
 
# ❌ Missing decorator - agents won't find this query
def your_function() -> Response[Table]:
    ...

Incorrect return format

Always use Response(result=result.to_table()):

# ✅ Correct
return Response(result=result.to_table())
 
# ❌ Incorrect - missing Response wrapper
return result.to_table()
 
# ❌ Incorrect - missing to_table() conversion
return Response(result=result)

What's next

Now that you understand how to structure Named Queries for DataFrames:

Create Named Queries that connect agents to the data they need
Test them with agents using natural language questions
Build runbooks that combine multiple Named Queries for complex workflows
Monitor how business users leverage your queries through agent conversations

DataFrames Knowledge Base