Text Summarization with MindsDB and OpenAI using SQL
Introduction
In this blog post, we present how to create OpenAI models within MindsDB. In this example, we ask a model to provide a summary of a text. The input data is taken from our sample MySQL database.
Prerequisites
To follow along, install MindsDB locally via Docker or Docker Desktop.
Tutorial
In this tutorial, we create a predictive model to summarize an article.
We use a table from our MySQL public demo database, so let’s start by connecting MindsDB to it:
CREATE DATABASE mysql_demo_db
WITH ENGINE = 'mysql',
PARAMETERS = {
"user": "user",
"password": "MindsDBUser123!",
"host": "samples.mindsdb.com",
"port": "3306",
"database": "public"
};
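To confirm that the connection works before querying it, you can list what MindsDB sees in the new data source. This is a minimal sketch; it assumes your MindsDB version supports the SHOW DATABASES and SHOW TABLES FROM statements for connected data sources.

```sql
-- List connected data sources; mysql_demo_db should appear in the result.
SHOW DATABASES;

-- List the tables available in the demo database (the articles table is used below).
SHOW TABLES FROM mysql_demo_db;
```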
Now that we’ve connected our database to MindsDB, let’s query the data to be used in the example:
SELECT *
FROM mysql_demo_db.articles
LIMIT 3;
Here is the output:
+----------------------------------------------------------------+--------------------------------------------------------------+
| article | highlights |
+----------------------------------------------------------------+--------------------------------------------------------------+
| Video footage has emerged of a law enforcement officer… | The 53-second video features… |
| A new restaurant is offering a five-course drink-paired menu… | The Curious Canine Kitchen is… |
| Mother-of-two Anna Tilley survived after spending four days… | Experts have warned hospitals not using standard treatment… |
+----------------------------------------------------------------+--------------------------------------------------------------+
Next, we create an OpenAI engine, providing your OpenAI API key, and then a model that summarizes the articles from the input dataset:
CREATE ML_ENGINE openai_engine
FROM openai
USING
openai_api_key = 'your-openai-api-key';
CREATE MODEL text_summarization_model
PREDICT highlights
USING
engine = 'openai_engine',
prompt_template = 'provide an informative summary of the text text:{{article}} using full sentences';
In practice, the CREATE MODEL statement triggers MindsDB to generate an AI table called text_summarization_model that uses the OpenAI integration to predict a column named highlights. The model lives inside the default mindsdb project. In MindsDB, projects are a natural way to keep artifacts, such as models or views, separate according to the predictive task they solve. You can learn more about projects in the MindsDB documentation.
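The USING clause specifies the parameters that this handler requires.
- The engine parameter specifies the engine to use, here openai_engine, which we created from the openai handler.
- The prompt_template parameter conveys the structure of a message that is to be completed with additional text generated by the model. The {{article}} placeholder is replaced with the value of the article column for each input row.
For details on setting up the OpenAI integration in MindsDB, follow the instructions in the MindsDB documentation.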
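The summaries in the sample outputs further down end mid-sentence, which suggests that the response length is capped. As a sketch of how the USING clause could be extended, the statement below creates a second model with explicit generation settings. The model name text_summarization_model_long is hypothetical, and the max_tokens and temperature parameters, along with their values, are assumptions about what the OpenAI handler in your MindsDB version accepts; check the handler documentation before relying on them.

```sql
-- Hypothetical variant of the model with explicit generation settings.
-- max_tokens and temperature are assumed to be supported by the OpenAI
-- handler; the values are illustrative only.
CREATE MODEL text_summarization_model_long
PREDICT highlights
USING
    engine = 'openai_engine',
    max_tokens = 300,
    temperature = 0.0,
    prompt_template = 'provide an informative summary of the text text:{{article}} using full sentences';
```

The prompt template is kept identical to the original model so that any difference in output can be attributed to the added parameters.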
Once the CREATE MODEL statement has started execution, we can check the status of the creation process with the following query:
DESCRIBE text_summarization_model;
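As an alternative sketch, the status can also be read from the models table of the project. This assumes the model was created in the default mindsdb project and that your MindsDB version exposes a models table there; the exact set of columns returned may differ between versions.

```sql
-- Inspect the model's row in the default project's models table.
-- Look for the status value; if creation failed, an error column
-- (when available in your version) should describe the cause.
SELECT *
FROM mindsdb.models
WHERE name = 'text_summarization_model';
```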
Depending on the internet connection, it may take a while for the status to change to complete. Once the model is created, it behaves like any other AI table. You can query it by providing the input text directly in the query:
SELECT article, highlights
FROM text_summarization_model
WHERE article = "Apple's Watch hits stores this Friday when customers and employees
alike will be able to pre-order the timepiece. And boss Tim Cook is
rewarding his staff by offering them a 50 per cent discount on the device.";
Here is the output data:
+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+------------------------------------------------------------------------------------+
| article | highlights |
+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+------------------------------------------------------------------------------------+
| Apple's Watch hits stores this Friday when customers and employees alike will be able to pre-order the timepiece. And boss Tim Cook is rewarding his staff by offering them a 50 per cent discount on the device. | Apple's Watch hits stores this Friday, and employees will be able to pre-order the |
+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+------------------------------------------------------------------------------------+
Alternatively, join the model with another table to make batch predictions:
SELECT input.article, output.highlights
FROM mysql_demo_db.articles AS input
JOIN text_summarization_model AS output
LIMIT 3;
Here is the output data:
+----------------------------------------------------------------+------------------------------------------------------------------------------------------------+
| article | highlights |
+----------------------------------------------------------------+------------------------------------------------------------------------------------------------+
| Video footage has emerged of a law enforcement officer… | A video has emerged of a law enforcement officer grabbing a cell phone from a woman who was |
| A new restaurant is offering a five-course drink-paired menu… | A new restaurant in London is offering a five-course drink-paired menu for dogs |
| Mother-of-two Anna Tilley survived after spending four days… | Sepsis is a potentially life-threatening condition that occurs when the body's response to an |
+----------------------------------------------------------------+------------------------------------------------------------------------------------------------+
The articles table is used to make batch predictions. Upon joining the text_summarization_model model with the articles table, the model uses all values from the article column.
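If you run the same batch prediction often, one option is to wrap the join in a view inside the project. The sketch below assumes the CREATE VIEW statement is available in your MindsDB version; the view name article_summaries and the summary column alias are hypothetical.

```sql
-- Hypothetical view that pairs each article with its generated summary.
CREATE VIEW mindsdb.article_summaries AS (
    SELECT input.article, output.highlights AS summary
    FROM mysql_demo_db.articles AS input
    JOIN text_summarization_model AS output
    LIMIT 3
);

-- Query the view like a regular table.
SELECT *
FROM mindsdb.article_summaries;
```

The LIMIT clause is kept from the original query so that only a few articles are sent to the model while experimenting.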
Leverage the NLP Capabilities with MindsDB
By integrating databases and OpenAI using MindsDB, developers can easily extract insights from text data with just a few SQL commands. These powerful natural language processing (NLP) models are capable of answering questions with or without context and completing general prompts.
Furthermore, these models are powered by large pre-trained language models from OpenAI, so there is no need for manual development work. Ultimately, this provides developers with an easy way to incorporate powerful NLP capabilities into their applications while saving time and resources compared to traditional ML development pipelines and methods. All in all, MindsDB makes it possible for developers to harness the power of OpenAI efficiently!
MindsDB is now the fastest-growing open-source applied machine-learning platform in the world. Its community continues to contribute to more than 70 data-source and ML-framework integrations. Stay tuned for upcoming features, including more control over the interface parameters and fine-tuning models directly from MindsDB!
Experiment with OpenAI models within MindsDB and unlock the ML capability over your data in minutes.
Finally, if MindsDB’s vision to democratize ML sounds exciting, head to our community Slack, where you can get help and find people to chat about using other available data sources, ML frameworks, or writing a handler to bring your own!
Follow our introduction to MindsDB’s OpenAI integration here. Also, we’ve got a variety of tutorials that use MySQL and MongoDB:
- Sentiment Analysis in MySQL
- Question Answering in MySQL
- Sentiment Analysis in MongoDB
- Question Answering in MongoDB
- Text Summarization in MongoDB
What’s Next?
Have fun while trying it out yourself!
- Bookmark the MindsDB repository on GitHub.
- Engage with the MindsDB community on Slack or GitHub to ask questions and share your ideas and thoughts.
If this tutorial was helpful, please give us a GitHub star here.