Semantic Data Models from Databases
This tutorial will guide you through the process of creating a Semantic Data Model from a database. We will use the North Dakota Department of Mineral Resources (DMR) database as an example. Read more about the dataset here. The database connection to the demo dataset is already configured in your Studio!
Step-by-step guide
Follow these easy steps to create the Semantic Data Model based on a database!
Connect your database
During the preview phase, only PostgreSQL, Redshift and Snowflake are supported. More databases will be supported shortly.
First, you'll be asked to connect to your database. You can choose to connect to a new database or to an existing one already configured in your Studio. The list of connections will be populated based on the data connections you have configured in your Studio's Settings page.
You can click Create New to create a new connection, which takes you to a form where you fill in the connection details.

Once you choose your connection, it gets validated. This might take a few moments depending on the number of tables and views your credentials have access to.
For the rest of the steps, we'll show how the Semantic Data Model creation happens with the pre-configured database connection called Sema4.ai Public Demo DB Credentials.
Once validated, you can click Continue to proceed to the next step.
Business context
Next, you'll be asked to provide business context.

You can provide a freeform text description of the data you want to include in the Semantic Data Model. Studio uses this description to help generate the Semantic Data Model automatically. The more detailed, the better. Look for existing descriptions of your dataset in Word docs, Google Docs or elsewhere. Just copy and paste the text from your sources into the text area, and don't sweat the formatting - this is only for the AI's eyes!
With the example data, you can copy and paste the following text:
Operational and analytical model of North Dakota oil and gas well production, sourced from the NDIC Oil & Gas Division monthly and annual production reports. Contains state-reported well-level monthly production totals in gallons (oil, gas, water), days on production, cumulative volumes, and regulatory identifiers (API and file numbers) linked to operator, field/pool, county, and TRS legal location, supporting regulatory reporting, production trending, operator benchmarking, flaring analysis, and geospatial queries by well and time period.
If asked for the well scout data for a specific well go read this url and return the data: https://www.dmr.nd.gov/oilgas/basic/getscoutticket.asp?FileNumber=${well file number}
Select data
In the same view as the business context, you'll be able to select the data you want to include in the Semantic Data Model. The view shows all tables and views in the database along with the columns in each table and view.
Only select the data that is REALLY needed for the model and agent. More data means more chances of going wrong!
You can select the data by clicking the checkbox next to the table or view name. You can also select individual columns by clicking the checkbox next to the column name.
With the example data, choosing all columns from the og_production_reports table is enough.
Once you've selected the data, you can click Continue to proceed to the next step.
Generate the Semantic Data Model
Time to watch Sai working! Our AI looks at your database tables and columns, samples some data, and then uses the provided business context to generate the Semantic Data Model. This might take a few minutes depending on the amount of data you've selected.
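Studio's internals are not described in this tutorial, so the following is only a rough sketch of the general idea: sample a few values per selected column and combine them with the business context into a text prompt. The helper names `sample_columns` and `build_prompt`, the column names, and the use of an in-memory SQLite database are all assumptions made for illustration; they are not Studio's actual implementation.

```python
import sqlite3


def sample_columns(conn, table, columns, limit=3):
    """Return up to `limit` distinct non-null sample values per column."""
    samples = {}
    for col in columns:
        # Identifiers cannot be bound as parameters, so they are quoted inline.
        # In real code they should come from a trusted list of selected columns.
        rows = conn.execute(
            f'SELECT DISTINCT "{col}" FROM "{table}" '
            f'WHERE "{col}" IS NOT NULL LIMIT ?',
            (limit,),
        ).fetchall()
        samples[col] = [row[0] for row in rows]
    return samples


def build_prompt(business_context, table, samples):
    """Combine business context and sampled values into one prompt string."""
    lines = [f"Business context: {business_context}", f"Table: {table}"]
    for col, values in samples.items():
        lines.append(f"- {col}: e.g. {values}")
    return "\n".join(lines)


# Hypothetical stand-in for the demo table; the real schema may differ.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE og_production_reports (file_no INTEGER, county TEXT, oil REAL)"
)
conn.executemany(
    "INSERT INTO og_production_reports VALUES (?, ?, ?)",
    [(101, "McKenzie", 1200.0), (102, "Williams", 950.5), (103, "McKenzie", None)],
)

samples = sample_columns(conn, "og_production_reports", ["file_no", "county", "oil"])
prompt = build_prompt(
    "North Dakota oil and gas well production", "og_production_reports", samples
)
print(prompt)
```

The sketch also shows why a detailed business context matters: sampled values alone rarely reveal what a column means.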

Once the Semantic Data Model is generated, you have two options:
- Review - Review the generated Semantic Data Model and make improvements if needed.
- Go to Agent - Go straight to the agent and see the Semantic Data Model in action. You can always come back to make edits later!
You may notice a grayed-out "Data Understanding" score. We are still working on this feature, and it will be available shortly. The score will tell you how well we think the model represents your data and business context, and will help you decide whether the model is good enough for your use case.
In this tutorial, we'll now go straight to the agent and see the Semantic Data Model in action! Feel free to review the Editing Semantic Data Model section later.
Try the model in an agent
Once you click Go to Agent, you'll be taken to the agent view. You can start asking questions to the agent right away!
Ask your agent a question about the data, something like "Which were the top 5 wells by production of oil?". The agent will use the Semantic Data Model to generate a SQL query, show it to you in tool call envelopes, and run it against the database.
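To give a feel for what such a generated query might look like, here is a minimal sketch that runs a "top 5 wells by oil production" query against a tiny in-memory SQLite table. The column names `well_name`, `report_month` and `oil` and the sample rows are assumptions made for illustration; the real og_production_reports schema and the SQL your agent produces will likely differ.

```python
import sqlite3

# Hypothetical query of the kind an agent might generate for
# "Which were the top 5 wells by production of oil?"
TOP_WELLS_SQL = """
SELECT well_name, SUM(oil) AS total_oil
FROM og_production_reports
GROUP BY well_name
ORDER BY total_oil DESC
LIMIT 5
"""

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE og_production_reports "
    "(well_name TEXT, report_month TEXT, oil REAL)"
)
# Made-up monthly rows, only to make the example runnable.
conn.executemany(
    "INSERT INTO og_production_reports VALUES (?, ?, ?)",
    [
        ("A", "2024-01", 100),
        ("A", "2024-02", 150),
        ("B", "2024-01", 300),
        ("C", "2024-01", 50),
        ("D", "2024-01", 75),
        ("E", "2024-01", 20),
        ("F", "2024-01", 10),
    ],
)

top_wells = conn.execute(TOP_WELLS_SQL).fetchall()
for well_name, total_oil in top_wells:
    print(well_name, total_oil)
```

Because the source table holds monthly rows, the query aggregates per well before ranking. Checking for this kind of detail in the tool call envelope is a good way to verify the agent understood the question.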
The resulting dataset goes to a DataFrame, so it will not fill up the agent's context window even with large results.
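The benefit of keeping results out of the context window can be sketched as follows: the full result stays in memory, and only a short summary is turned into text. This is an illustration of the idea using plain Python lists rather than a real DataFrame, and the `summarize_result` helper is hypothetical, not part of Studio.

```python
def summarize_result(columns, rows, preview_rows=3):
    """Build a compact text summary of a query result.

    Only the shape, the column names, and the first few rows are included,
    so the summary stays small no matter how many rows the result has.
    """
    lines = [
        f"{len(rows)} rows x {len(columns)} columns",
        ", ".join(columns),
    ]
    for row in rows[:preview_rows]:
        lines.append(", ".join(str(value) for value in row))
    if len(rows) > preview_rows:
        lines.append(f"... {len(rows) - preview_rows} more rows not shown")
    return "\n".join(lines)


# Made-up result set with 10,000 rows.
columns = ["well_name", "total_oil"]
rows = [(f"Well {i}", 1000 - i) for i in range(10_000)]

summary = summarize_result(columns, rows)
print(summary)
```

The summary is about a hundred characters long, while the full result remains available for follow-up filtering or aggregation.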