Amazon S3
This documentation describes the integration of MindsDB with Amazon S3, an object storage service that offers industry-leading scalability, data availability, security, and performance.
Prerequisites
Before proceeding, ensure that MindsDB is installed locally via Docker or Docker Desktop.
Connection
Establish a connection to your Amazon S3 bucket from MindsDB by executing the following SQL command:
CREATE DATABASE s3_datasource
WITH
    engine = 's3',
    parameters = {
        "aws_access_key_id": "AQAXEQK89OX07YS34OP",
        "aws_secret_access_key": "wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY",
        "region_name": "us-east-2",
        "bucket": "my-bucket"
    };
Required connection parameters include the following:
- aws_access_key_id: The AWS access key that identifies the user or IAM role.
- aws_secret_access_key: The AWS secret access key that identifies the user or IAM role.
- bucket: The name of the Amazon S3 bucket.
Optional connection parameters include the following:
- aws_session_token: The AWS session token that identifies the user or IAM role. This becomes necessary when using temporary security credentials.
- region_name: The AWS region to connect to. Default is us-east-1.
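When temporary security credentials are in use, the optional parameters are passed alongside the required ones. The following is a minimal sketch rather than a definitive configuration: the datasource name s3_temp_datasource and every credential value are placeholders to be replaced with your own. The aws_session_token entry is only needed with temporary credentials, and region_name can be omitted to fall back to the default region.

```sql
-- Sketch only: the datasource name and all credential values are placeholders.
CREATE DATABASE s3_temp_datasource
WITH
    engine = 's3',
    parameters = {
        "aws_access_key_id": "<your-access-key-id>",
        "aws_secret_access_key": "<your-secret-access-key>",
        "aws_session_token": "<your-session-token>",
        "region_name": "us-east-2",
        "bucket": "my-bucket"
    };
```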
Usage
Retrieve data from a specified object (file) in the S3 bucket by providing the integration name and the object key:
SELECT *
FROM s3_datasource.`my-file.csv`
LIMIT 10;
Wrap the object key in backticks (`) to avoid any issues parsing the SQL statements provided. This is especially important when the object key contains spaces, special characters or prefixes, such as my-folder/my-file.csv.
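For instance, an object stored under a prefix can be queried as sketched here. The key my-folder/my-file.csv is only the illustrative name used on this page, so the query assumes such an object exists in your bucket.

```sql
-- The backticks keep the slash and the hyphens inside a single identifier.
SELECT *
FROM s3_datasource.`my-folder/my-file.csv`
LIMIT 10;
```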
At the moment, the supported file formats are CSV, TSV, JSON, and Parquet.
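The same query shape should apply to the other supported formats. In this sketch the object key my-data.parquet is hypothetical and stands in for any Parquet object in your bucket.

```sql
-- Hypothetical object key; replace it with a Parquet, TSV, or JSON object from your bucket.
SELECT *
FROM s3_datasource.`my-data.parquet`
LIMIT 10;
```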
The above examples utilize s3_datasource as the datasource name, which is defined in the CREATE DATABASE command.
Troubleshooting Guide
Database Connection Error
- Symptoms: Failure to connect MindsDB with the Amazon S3 bucket.
- Checklist:
  - Make sure the Amazon S3 bucket exists.
  - Confirm that provided AWS credentials are correct. Try making a direct connection to the S3 bucket using the AWS CLI.
  - Ensure a stable network between MindsDB and AWS.
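Part of this checklist can also be run from MindsDB itself. The sketch below assumes the datasource was created as s3_datasource and that an object named my-file.csv exists in the bucket. Both names come from the examples on this page and should be replaced with your own.

```sql
-- List the connected datasources; s3_datasource should appear if CREATE DATABASE succeeded.
SHOW DATABASES;

-- Fetch a single row to confirm that the credentials and the bucket are usable.
SELECT *
FROM s3_datasource.`my-file.csv`
LIMIT 1;
```

If the first statement lists the datasource but the second one fails, the cause is more likely the credentials, the bucket name, or the object key than the connection definition.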
SQL statement cannot be parsed by mindsdb_sql
- Symptoms: SQL queries failing or not recognizing object names containing spaces, special characters or prefixes.
- Checklist:
  - Ensure object names with spaces, special characters or prefixes are enclosed in backticks.
  - Examples:
    - Incorrect: SELECT * FROM integration.travel/travel_data.csv
    - Incorrect: SELECT * FROM integration.'travel/travel_data.csv'
    - Correct: SELECT * FROM integration.`travel/travel_data.csv`