This handler is implemented using the replicate library provided by Replicate.

The required arguments to establish a connection are:

  • model_name: Name of the Replicate model you want to access in MindsDB, e.g. ‘air-forever/kandinsky-2’
  • version: Version hash/ID of the model you want to use in MindsDB.
  • api_key: API key from the Replicate platform, available from your Replicate account.

Before you can use Replicate, it’s essential to authenticate by setting your API token in an environment variable named REPLICATE_API_TOKEN. This token acts as a key to enable access to Replicate’s features.

  1. Using pip:

If you’re working in a standard Python environment (using pip for package management), set your token as an environment variable by running the following command in your terminal:

On Linux or macOS:

export REPLICATE_API_TOKEN='YOUR_TOKEN'

On Windows:

set REPLICATE_API_TOKEN=YOUR_TOKEN

  2. Using Docker:

For Docker users, the process differs slightly. You need to pass the environment variable directly to the Docker container when running it. Use this command:

docker run -e REPLICATE_API_TOKEN='YOUR_TOKEN' -p 47334:47334 -p 47335:47335 mindsdb/mindsdb

In each of the commands above, replace YOUR_TOKEN with your actual Replicate API token.
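Before starting MindsDB, it can help to confirm that the variable is actually visible to Python. The following is a minimal sketch of such a check; the helper name `read_replicate_token` is hypothetical and is not part of MindsDB or the replicate library.

```python
import os


def read_replicate_token(environ=os.environ):
    """Return the Replicate token from the environment, or raise if it is missing.

    Hypothetical helper for illustration only.
    """
    # Treat an unset variable and an empty or whitespace-only value the same way.
    token = environ.get("REPLICATE_API_TOKEN", "").strip()
    if not token:
        raise RuntimeError("REPLICATE_API_TOKEN is not set or is empty")
    return token
```

Calling `read_replicate_token()` in the same terminal session where you ran `export` or `set` should return your token; a `RuntimeError` suggests the variable was set in a different session or not at all.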

Usage

To use this handler and connect to Replicate in MindsDB, you need an account on Replicate. If you do not have one yet, create it on the Replicate website first.

To establish the connection and create a model in MindsDB, use the following syntax:


CREATE MODEL audio_ai
PREDICT audio
USING
    engine = 'replicate',
    model_name = 'afiaka87/tortoise-tts',
    version = 'e9658de4b325863c4fcdc12d94bb7c9b54cbfe351b7ca1b36860008172b91c71',
    api_key = 'r8_BpO.........................';

You can use the DESCRIBE PREDICTOR query to see the available parameters that you can specify to customize your predictions:

DESCRIBE PREDICTOR mindsdb.audio_ai.features;

Output

+--------------+---------+-----------------------------------------------------------------------------------------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| inputs       | type    | default                                                                                       | description                                                                                                                                                                             |
+--------------+---------+-----------------------------------------------------------------------------------------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| seed         | integer | 0                                                                                             | Random seed which can be used to reproduce results.                                                                                                                                     |
| text         | string  | The expressiveness of autoregressive transformers is literally nuts! I absolutely adore them. | Text to speak.                                                                                                                                                                          |
| preset       | -       | fast                                                                                          | Which voice preset to use. See the documentation for more information.                                                                                                                  |
| voice_a      | -       | random                                                                                        | Selects the voice to use for generation. Use `random` to select a random voice. Use `custom_voice` to use a custom voice.                                                               |
| voice_b      | -       | disabled                                                                                      | (Optional) Create new voice from averaging the latents for `voice_a`, `voice_b` and `voice_c`. Use `disabled` to disable voice mixing.                                                  |
| voice_c      | -       | disabled                                                                                      | (Optional) Create new voice from averaging the latents for `voice_a`, `voice_b` and `voice_c`. Use `disabled` to disable voice mixing.                                                  |
| cvvp_amount  | number  | 0                                                                                             | How much the CVVP model should influence the output. Increasing this can in some cases reduce the likelyhood of multiple speakers. Defaults to 0 (disabled)                             |
| custom_voice | string  | -                                                                                             | (Optional) Create a custom voice based on an mp3 file of a speaker. Audio should be at least 15 seconds, only contain one speaker, and be in mp3 format. Overrides the `voice_a` input. |
+--------------+---------+-----------------------------------------------------------------------------------------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
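One way to read the table: each input has a listed default, and a value you supply for a prediction replaces that default. The sketch below illustrates the idea in Python and can also catch misspelled input names before you write a query. It is an illustration only, not the handler's actual code; `DEFAULTS` and `resolve_inputs` are hypothetical names, and the values are copied from the output above.

```python
# Input names and defaults copied from the DESCRIBE output above.
# "-" in the table means no default is listed; it is represented here as None.
DEFAULTS = {
    "seed": 0,
    "text": "The expressiveness of autoregressive transformers is literally nuts! "
            "I absolutely adore them.",
    "preset": "fast",
    "voice_a": "random",
    "voice_b": "disabled",
    "voice_c": "disabled",
    "cvvp_amount": 0,
    "custom_voice": None,
}


def resolve_inputs(**overrides):
    """Merge user-supplied values over the listed defaults.

    Raises ValueError if a name does not appear in the table.
    """
    unknown = sorted(set(overrides) - set(DEFAULTS))
    if unknown:
        raise ValueError(f"Unknown input(s): {', '.join(unknown)}")
    return {**DEFAULTS, **overrides}
```

For example, `resolve_inputs(voice_a="custom_voice")` keeps `preset` at `fast` while switching the voice, whereas `resolve_inputs(voice="random")` raises an error because `voice` is not one of the listed inputs.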

Now, you can use the established connection to query your ML Model as follows:

Audio Generation

Custom Audio Cloning

SELECT * FROM audio_ai
WHERE 
    text = "This is breaking news that first humans have landed on Mars, and they have found something very unusual there. By the way, this is the future."
USING 
    voice_a = 'custom_voice',
    custom_voice = 'https://123bien.com/wp-content/uploads/2019/05/i-want-to-work-2.mp3';

Output

+------------------------------------------------------------------------------------------------+------------------------------------------------------------------------------------------------------------------------------------------------------+
| audio                                                                                          | text                                                                                                                                                 |
+------------------------------------------------------------------------------------------------+------------------------------------------------------------------------------------------------------------------------------------------------------+
| https://replicate.delivery/pbxt/ffOCXeL4fa5yekAL4ybfFiJBbqENSEjhSLpA2zp1ElsBxxhSE/tortoise.mp3 | This is breaking news that first human are landed on mars and they find something very unusal their ehich is not yet out, by the way this is future  |
+------------------------------------------------------------------------------------------------+------------------------------------------------------------------------------------------------------------------------------------------------------+

If the predicted URL above doesn’t work, then use this.

Audio Generation

SELECT * FROM audio_ai
WHERE 
    text = "An image captured by NASA's Mars Curiosity Rover shows a faint figure of a woman against the desert landscape of Mars. If you take a closer look, it will seem that the lady is standing on a cliff overlooking the vast undulating expanse. She seems to wear a long cloak and has long hair."
USING 
    voice_a = 'random';

Output

+---------------------------------------------------------------------------------------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| audio                                                                                       | text                                                                                                                                                                                                                                                                                         |
+---------------------------------------------------------------------------------------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| https://replicate.delivery/pbxt/EQj2DtBn5fxVA6P97GfPYthmgd0I3VaOEGFnweE4hvl5BPUiA/audio.wav | An image captured by NASA's Mars Curiosity Rover shows a faint figure of a woman against the desert landscape of Mars. If you take a closer look, it will seem that the lady is standing on a cliff overlooking the vast undulating expanse. She seems to wear a long cloak and has long hair. |
+---------------------------------------------------------------------------------------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+

The predicted URL above will work, so use this one.

This example uses just one model; there are many more, covering a wide variety of use cases. There is no limit to how you can use them.

  • IMPORTANT NOTE: The predicted URL will only work for 24 hours after the prediction is made.
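Because the URL expires, you may want to save the generated audio locally soon after the prediction. The following is a minimal sketch using only the Python standard library; the helper names `local_name`, `expires_at`, and `save_prediction` are hypothetical, and the 24-hour lifetime is taken from the note above.

```python
import os
import urllib.request
from datetime import datetime, timedelta
from urllib.parse import urlparse

URL_LIFETIME = timedelta(hours=24)  # lifetime stated in the note above


def local_name(url):
    """Derive a local file name from the last path segment of the URL."""
    name = os.path.basename(urlparse(url).path)
    return name or "prediction_output"


def expires_at(predicted_at):
    """Return the time after which the predicted URL should be treated as expired."""
    return predicted_at + URL_LIFETIME


def save_prediction(url, dest_dir="."):
    """Download the file behind a predicted URL and return the local path."""
    path = os.path.join(dest_dir, local_name(url))
    urllib.request.urlretrieve(url, path)
    return path
```

Passing the value of the `audio` column to `save_prediction` writes the file (for example `audio.wav`) to the current directory, and `expires_at(datetime.now())` gives a rough deadline if you record it right after running the query.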

Note: Replicate provides only a few free predictions, so choose your predictions wisely. Don’t let the machines have all the fun, save some for yourself! 😉