In today's blog, we will see how to create an LLM using Donut-LLM-Tools.
The first thing we need in order to train a model is a valid dataset that contains all the training data.
To get a dataset, you can either download one from Hugging Face or use Donut-LLM-Tools to create one by scraping Wikipedia.
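# Import the Tools class from Donut-LLM-Tools
from donutllmtools import Tools
# Scrape Wikipedia and write the result to a DoDS (Donut DataSet) file
Tools.DatasetCreator()
# Open the menu for creating a model or loading one to prompt
Tools.LLMCreator()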
The code above imports the Tools class from the Donut-LLM-Tools module. This class provides two functions, DatasetCreator() and LLMCreator(). Once called, DatasetCreator() automatically starts scraping Wikipedia and writes the results to a DoDS (Donut DataSet) file. LLMCreator() is menu-driven: once called, it presents the user with three options: 1. Create a model, 2. Load a model and send prompts to it, and 3. Exit.
When the user chooses option 1, the menu asks for the dataset's directory and filename, as well as the number of iterations to train the model for. When the user chooses option 2, the menu asks for the model's directory and filename, then asks the user to enter a prompt, and keeps looping until the user types 'exit' or 'quit'.
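To make the menu flow described above more concrete, here is a minimal sketch of what such a loop could look like in plain Python. This is not the actual Donut-LLM-Tools source: the names run_menu, prompt_loop and scripted, and the train and load callbacks, are all made up for illustration, and the real library may be organised quite differently.

```python
from typing import Callable, List


def prompt_loop(answer: Callable[[str], str], read: Callable[[str], str]) -> List[str]:
    """Keep asking for prompts until the user types 'exit' or 'quit'."""
    replies = []
    while True:
        text = read("Prompt: ").strip()
        if text.lower() in ("exit", "quit"):
            break
        replies.append(answer(text))
    return replies


def run_menu(
    read: Callable[[str], str],
    train: Callable[[str, int], None],
    load: Callable[[str], Callable[[str], str]],
) -> List[str]:
    """Hypothetical three-option menu: create a model, load a model, exit.

    `train` and `load` are placeholders for whatever really trains or loads
    a model; they are assumptions, not Donut-LLM-Tools functions.
    """
    log = []
    while True:
        choice = read("1. Create a model  2. Load a model  3. Exit\n> ").strip()
        if choice == "1":
            dataset_path = read("Dataset directory and filename: ")
            iterations = int(read("Number of training iterations: "))
            train(dataset_path, iterations)
            log.append(f"trained on {dataset_path} for {iterations} iterations")
        elif choice == "2":
            model_path = read("Model directory and filename: ")
            answer = load(model_path)
            log.extend(prompt_loop(answer, read))
        elif choice == "3":
            break
        else:
            log.append(f"unknown option: {choice}")
    return log


def scripted(lines: List[str]) -> Callable[[str], str]:
    """Return a stand-in for input() that replays the given answers in order."""
    it = iter(lines)
    return lambda _prompt: next(it)
```

For interactive use you could pass Python's built-in input as read. The scripted helper is only there so the menu can be tried out without typing, for example with a fake load callback that simply echoes the prompt back.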
Hooray! You have now learnt how to use Donut-LLM-Tools to create your own datasets and models and run them locally.