How MindsDB Agents Work
MindsDB Agents are built on the Pydantic framework and follow a structured workflow to interpret and answer user questions over the connected data.
Step 1. Input Processing
When a query is received, the agent:
- Builds a real-time data catalog, generated dynamically from a sample of 5 rows taken from each connected data object
- Extracts prompts and user messages
- Prepares structured input for reasoning

This lightweight catalog enables the agent to understand available schemas and data types.
Step 2. Planning
Using the processed input, the agent:
- Determines which connected data sources are relevant
- Plans query execution steps
- Prepares SQL queries as needed

This stage ensures that the agent selects appropriate tables and avoids unnecessary exploration.
Step 3. Exploration Loop
The agent enters an iterative execution cycle:
- Executes queries against connected tables or knowledge bases
- Collects and evaluates results
- Adjusts queries if needed

This loop continues until sufficient relevant data is collected, up to a maximum of 20 queries.
Step 4. Error Handling & Learning
If execution errors occur:
- Errors are analyzed
- The agent attempts corrective adjustments
- Up to three accumulated errors are retained for context

This iterative correction improves answer accuracy within session constraints.
Step 5. Output Synthesis
Finally, the agent:
- Aggregates collected data
- Synthesizes a natural language or structured response
- Returns the answer based on the query format
Recommended Usage
To ensure optimal performance and accuracy, follow these guidelines.
Data Preparation
High-quality input significantly improves agent performance.
Clean your data:
- Remove irrelevant columns
- Filter unnecessary rows
- Normalize inconsistent formats
Create views:
- Filtered views: Keep only relevant tables, columns, and rows
- Aggregated views: Provide summary tables for frequent analytical queries
- Joined views: Combine tables that are commonly queried together
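Views of all three kinds can be defined directly in MindsDB SQL. A minimal sketch of a filtered view (the datasource, table, and column names below are illustrative):

```sql
-- Filtered view: keep only the columns and rows the agent needs.
-- my_postgres.orders is an illustrative connected datasource table.
CREATE VIEW recent_orders AS (
    SELECT order_id, customer_id, amount, order_date
    FROM my_postgres.orders
    WHERE order_date >= '2024-01-01'
);
```

Aggregated and joined views follow the same pattern, with GROUP BY or JOIN clauses inside the inner SELECT.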
Agent Data Setup
To maintain performance:
- Limit connected objects (tables + knowledge bases) to 10 or fewer
- Ensure objects are relevant to the use case
- Provide descriptive context in the prompt:
  - Describe data stored in connected tables or knowledge bases
  - Clarify relationships between available data objects
  - Specify expected output format
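As a sketch, this descriptive context can be supplied through the agent's prompt at creation time. The parameter names below follow one common form of the CREATE AGENT syntax and all object names are illustrative; check the agent reference for your MindsDB version:

```sql
CREATE AGENT sales_agent
USING
    model = 'gpt-4o',  -- underlying LLM (illustrative choice)
    include_tables = ['my_postgres.orders', 'my_postgres.customers'],
    prompt_template = '
        orders holds one row per purchase; customers holds one row per
        customer. The two tables join on customer_id.
        Answer in concise English.';
```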
Querying an Agent
Agent outputs are non-deterministic due to the nature of large language models. However, MindsDB provides mechanisms for limited control over the output.
To receive a natural language answer:
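A minimal sketch of such a query (the agent name and question are illustrative):

```sql
SELECT answer
FROM my_agent
WHERE question = 'What were total sales last month?';
```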
This ensures the agent returns a human-readable response contained in the answer column.

To enforce a specific output structure, define the expected columns when querying the agent. The agent will format its response to match the defined columns. This enables:
- Programmatic consumption
- Dashboard integration
- Downstream automation workflows
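As a sketch, a structured result can be requested by naming the desired columns in the SELECT list (the agent name, columns, and question are illustrative, and behavior may vary across MindsDB versions):

```sql
-- The agent shapes its answer to the requested columns.
SELECT product_name, total_sales
FROM my_agent
WHERE question = 'What are the top 3 products by sales?';
```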
Usage Example
Agents enable conversation with data, including structured and unstructured data connected to MindsDB. Connect your data to MindsDB by connecting databases or applications, or by uploading files. Users can opt to use knowledge bases to store and retrieve data efficiently. Then create an agent, passing the connected data and defining the underlying model.

MindsDB Agents vs Minds: Feature Comparison
Both MindsDB Agents (open-source) and Minds (enterprise) are designed to answer questions over data connected to MindsDB. The key difference lies in scope, scale, and advanced functionality:

| Feature | MindsDB Agents (Open-Source) | Minds (Enterprise) |
|---|---|---|
| Data catalog | Built dynamically from sample data (5 rows per object) | Full data catalog with complete metadata |
| Context window | Limited | Extended |
| Error memory | Up to 3 accumulated errors | Extended memory and learning |
| Message history | Cleared with thread reset | Persistent across threads |
| Production controls | Basic | Advanced governance and controls |
| Scalability | Recommended ≤10 connected objects | Designed for complex, large-scale environments |
Choose MindsDB Agents (open-source) for:
- Prototyping
- Small-to-medium complexity use cases
- Controlled data environments
Choose Minds (enterprise) when:
- You require full data catalog visibility
- You need persistent conversation memory
- You are connecting many tables or knowledge bases
- Your workflows involve complex multi-step reasoning
- You require production-level governance and control