Mistral7B Model Fine-Tuning
Follow this blog post for a comprehensive tutorial on how to fine-tune a Mistral 7B model.
All Anyscale models belong to the group of Large Language Models (LLMs).
These are some of the supported models:
- Mistral7B
- Llama-2-7b
- Llama-2-13b
- Llama-2-70b
- Code Llama
Let’s create a model to answer questions about MindsDB’s custom SQL syntax.
First, create an AnyScale engine, passing your Anyscale API key:
CREATE ML_ENGINE anyscale_engine
FROM anyscale_endpoints
USING
anyscale_endpoints_api_key = 'your-anyscale-api-key';
Then, create a model using this engine:
CREATE MODEL mymistral7b
PREDICT completion
USING
engine = 'anyscale_engine',
model_name = 'mistralai/Mistral-7B-Instruct-v0.1',
prompt_template = 'Return a valid SQL string for the following question about MindsDB in-database machine learning: {{prompt}}';
You can check model status with this command:
DESCRIBE mymistral7b;
Once the status is complete, we can query for predictions:
SELECT prompt, completion
FROM mymistral7b as m
WHERE prompt = 'What is the SQL syntax to join input data with predictions from a MindsDB machine learning model?'
USING max_tokens=400;
On execution, we get:
+---------------------------------------------------------------------------------------------------+------------------------------------------------------------------------------------------------------+
| prompt | completion |
+---------------------------------------------------------------------------------------------------+------------------------------------------------------------------------------------------------------+
| What is the SQL syntax to join input data with predictions from a MindsDB machine learning model? | The SQL syntax is: SELECT * FROM input_data INNER JOIN predictions ON input_data.id = predictions.id |
+---------------------------------------------------------------------------------------------------+------------------------------------------------------------------------------------------------------+
If you followed one of the MindsDB tutorials before, you’ll see that the syntax provided by the model is not exactly as expected.
Now, we’ll fine-tune our model using a table that stores details about MindsDB’s custom SQL syntax.
Let’s connect to a DB that hosts a table we’ll use to fine-tune our model:
CREATE DATABASE example_db
WITH ENGINE = "postgres",
PARAMETERS = {
"user": "demo_user",
"password": "demo_password",
"host": "3.220.66.106",
"port": "5432",
"database": "demo"
};
Now we can take a look at the fine-tuning data:
SELECT prompt, completion
FROM example_db.demo_data.chat_llm_mindsdb_docs
LIMIT 5;
And here are the first few rows:
message_id | role | content | ||
---|---|---|---|---|
0 | system | You are a helpful assistant. Your task is to answer a user’s question regarding the SQL syntax supported by MindsDB, a machine learning product for training models and seamlessly deploying them where your data lives. | ||
1 | user | In the context of MindsDB: 1. Testing CREATE DATABASE | ||
2 | assistant | CREATE DATABASE example_db WITH ENGINE = "postgres", PARAMETERS = { "user": "demo_user", "password": "demo_password", "host": "3.220.66.106", ... }; | ||
Output: | ||||
status | ||||
------ | ||||
Query successfully completed | ||||
3 | system | You are a helpful assistant. Your task is to answer a user’s question regarding the SQL syntax supported by MindsDB, a machine learning product for… | ||
4 | user | In the context of MindsDB: 2. Testing Preview the Available Data Using SELECT |
Notice it is formatted as a series of chats that conform to the standard OpenAI chat format. Every message has a “role” and some “content”. By chaining together a series of messages, we can create a conversation.
Now, you can fine-tune a Mistral model with this data like so:
FINETUNE mymistral7b
FROM example_db
(SELECT * FROM demo_data.chat_llm_mindsdb_docs);
The FINETUNE
command creates a new version of the mistralai/Mistral-7B-Instruct-v0.1
model. You can query all available versions as below:
SELECT *
FROM models_versions
WHERE name = 'mymistral7b';
Once the new version status is complete and active, we can query the model again, expecting a more accurate output.
SELECT prompt, completion
FROM mymistral7b as m
WHERE prompt = 'What is the SQL syntax to join input data with predictions from a MindsDB machine learning model?'
USING max_tokens=400;
On execution, we get:
+---------------------------------------------------------------------------------------------------+------------------------------------------------------------------------------------------------------+
| prompt | completion |
+---------------------------------------------------------------------------------------------------+------------------------------------------------------------------------------------------------------+
| What is the SQL syntax to join input data with predictions from a MindsDB machine learning model? | SELECT * FROM mindsdb.models.my_model JOIN mindsdb.input_data_name; |
+---------------------------------------------------------------------------------------------------+------------------------------------------------------------------------------------------------------+