Ollama is a project that enables easy local deployment of Large Language Models (LLMs).
All models supported by Ollama are available in MindsDB through this integration.
For now, this integration works only on macOS, with Linux and Windows support to come later.
Locally deployed LLMs can be desirable for a wide variety of reasons. In this case, data privacy, a faster developer feedback loop, and reduced inference cost are powerful reasons to opt for a local LLM.
Ideal predictive use cases, as with other LLM-focused integrations (e.g. OpenAI, Anthropic, Cohere), include anything involving language understanding and generation, including but not limited to:
- zero-shot text classification
- sentiment analysis
- question answering
- A macOS machine with an Apple Silicon chip (M1 or newer).
- A working Ollama installation. For instructions, refer to their webpage; the setup is straightforward.
- For 7B models, at least 8GB RAM is recommended.
- For 13B models, at least 16GB RAM is recommended.
- For 70B models, at least 64GB RAM is recommended.
Minimum specs can vary depending on the model; refer to the Ollama documentation for details.
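Before creating a model in MindsDB, it can help to confirm that Ollama itself is working and that the desired model weights are available locally. A minimal check from the terminal (assuming Ollama is installed and `llama2` is the model you intend to use):

```shell
# Download the llama2 weights locally (several GB; one-time step)
ollama pull llama2

# List locally available models to confirm the download succeeded
ollama list
```

Any model shown by `ollama list` can then be referenced by name from MindsDB.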
Before creating a model, you need to create an AI engine based on the provided handler.
You can create an Ollama engine using this command:
CREATE ML_ENGINE ollama FROM ollama;
The name of the engine (here, ollama) should be used as the value for the engine parameter in the USING clause of the CREATE MODEL statement.
The CREATE MODEL statement is used to create, train, and deploy models within MindsDB.
CREATE MODEL mindsdb.my_llama2
PREDICT completion
USING
    engine = 'ollama',
    model_name = 'llama2';
| Parameter | Description |
| --- | --- |
| `engine` | It defines the Ollama engine. |
| `model_name` | It provides the name of the model to be used. |
Supported commands for describing Ollama models are:
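As an illustrative sketch (assuming the model created above is named `my_llama2`; the exact attributes exposed can vary by handler version), a model can be inspected with MindsDB's `DESCRIBE` syntax:

```sql
-- Show general metadata for the model (status, engine, and so on)
DESCRIBE my_llama2;
```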
Once you have created an Ollama model, you can use it to make predictions.
SELECT text, completion
FROM mindsdb.my_llama2
WHERE text = 'hi there!';
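Single-row queries like the one above are useful for quick testing. For batch predictions, MindsDB's usual pattern of joining the model with a data table also applies; a sketch, assuming a hypothetical table `example_db.reviews` with a `text` column:

```sql
-- Generate a completion for every row of the input table
SELECT input.text, model.completion
FROM example_db.reviews AS input
JOIN mindsdb.my_llama2 AS model;
```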