Apache Hive
This documentation describes the integration of MindsDB with Apache Hive, a data warehouse software project built on top of Apache Hadoop for providing data query and analysis. Hive gives an SQL-like interface to query data stored in various databases and file systems that integrate with Hadoop. The integration allows MindsDB to access data from Apache Hive and enhance Apache Hive with AI capabilities.
Prerequisites
Before proceeding, ensure the following prerequisites are met:
- Install MindsDB locally via Docker or Docker Desktop.
- To connect Apache Hive to MindsDB, install the required dependencies by following the MindsDB instructions for installing dependencies.
Connection
Establish a connection to Apache Hive from MindsDB by executing the following SQL command and providing its handler name as an engine.
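A minimal sketch of such a command is shown here. It assumes the handler name is hive and uses hive_datasource as the datasource name, matching the Usage section. The host, credentials, and database values are placeholders, not real defaults; replace them with your own.

```sql
-- Sketch only: all parameter values below are placeholders.
CREATE DATABASE hive_datasource
WITH
    engine = 'hive',                   -- handler name passed as the engine
    parameters = {
        "host": "127.0.0.1",           -- placeholder: Hive server hostname, IP address, or URL
        "database": "default",         -- placeholder: Hive database to connect to
        "username": "demo_user",       -- optional, placeholder
        "password": "demo_password",   -- optional, placeholder
        "port": 10000,                 -- optional, documented default
        "auth": "CUSTOM"               -- optional, documented default
    };
```

The optional parameters can be omitted when the documented defaults apply.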
Required connection parameters include the following:
- host: The hostname, IP address, or URL of the Apache Hive server.
- database: The name of the Apache Hive database to connect to.
Optional connection parameters include the following:
- username: The username for the Apache Hive database.
- password: The password for the Apache Hive database.
- port: The port number for connecting to the Apache Hive server. Default is 10000.
- auth: The authentication mechanism to use. Default is CUSTOM. Other options are NONE, NOSASL, KERBEROS, and LDAP.
Usage
Retrieve data from a specified table by providing the integration and table names:
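For instance, assuming a table called table_name (a placeholder) exists in the connected Hive database, a query of this kind might look like:

```sql
-- table_name is a placeholder; substitute a table from your Hive database.
SELECT *
FROM hive_datasource.table_name
LIMIT 10;
```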
Run HiveQL queries directly on the connected Apache Hive database:
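As a sketch, MindsDB passes the statement wrapped in parentheses after the datasource name to the underlying engine. The table and column names here (table_name, col_a) are hypothetical:

```sql
-- The inner statement is sent to Apache Hive as HiveQL.
-- table_name and col_a are hypothetical names used for illustration.
SELECT * FROM hive_datasource (
    SELECT col_a, COUNT(*) AS row_count
    FROM table_name
    GROUP BY col_a
);
```

This form is useful when a query relies on HiveQL-specific syntax or functions.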
The above examples utilize hive_datasource as the datasource name, which is defined in the CREATE DATABASE command.
Troubleshooting
Database Connection Error
- Symptoms: Failure to connect MindsDB with the Apache Hive database.
- Checklist:
  - Ensure that the Apache Hive server is running and accessible.
  - Confirm that host, port, user, and password are correct. Try a direct Apache Hive connection using a client like DBeaver.
  - Test the network connection between the MindsDB host and the Apache Hive server.