
Using Donut-LLM-Tools to create a model

In today's post, we will look at how to create an LLM using Donut-LLM-Tools.

The first thing we need in order to train a model is a valid dataset that contains all the required training data.

To get a dataset, you can either download one from Hugging Face or use Donut-LLM-Tools to create one by scraping Wikipedia.



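from donutllmtools import Tools

# Scrape Wikipedia and write the result to a DoDS (Donut DataSet) file.
Tools.DatasetCreator()
# Open the interactive menu for creating or loading a model.
Tools.LLMCreator()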

The code above imports the Tools class from the Donut-LLM-Tools module. This class provides the functions DatasetCreator() and LLMCreator(). Once called, DatasetCreator() automatically starts scraping Wikipedia and writes the result to a DoDS (Donut DataSet) file. LLMCreator() is menu-driven: once called, it presents the user with a menu offering 1. Creating a model, 2. Loading a model to ask it prompts, and 3. Exit.
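To make the menu-driven idea concrete, here is a minimal sketch of how such a three-option menu could be written in plain Python. This is not the Donut-LLM-Tools source code: run_menu, actions, read and write are hypothetical names chosen for illustration, and the real LLMCreator() may be organised quite differently.

```python
from typing import Callable, Dict


def run_menu(
    actions: Dict[str, Callable[[], None]],
    read: Callable[[str], str] = input,
    write: Callable[[str], None] = print,
) -> None:
    """Show a numbered menu repeatedly until the user picks option 3 (Exit).

    `actions` maps a menu choice ("1" or "2") to a function to run.
    `read` and `write` default to input/print but can be swapped for testing.
    """
    while True:
        write("1. Create a model")
        write("2. Load a model and ask it prompts")
        write("3. Exit")
        choice = read("Choose an option: ").strip()
        if choice == "3":
            break
        action = actions.get(choice)
        if action is None:
            # Anything other than 1, 2 or 3 is reported and the menu is shown again.
            write(f"Unknown option: {choice!r}")
        else:
            action()


# Scripted, non-interactive demonstration: the "user" picks 1 and then 3.
scripted_choices = iter(["1", "3"])
run_menu(
    {"1": lambda: print("(a model would be trained here)")},
    read=lambda _prompt: next(scripted_choices),
)
```

Passing read and write as parameters keeps the loop easy to try out with scripted input, without having to type at the keyboard.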

When the user chooses 1, the menu asks for the dataset directory and filename, as well as the number of iterations to train the model for. When the user chooses 2, the menu asks for the model's directory and filename, then asks the user to enter a prompt, and keeps looping until the user types 'exit' or 'quit'.

Hooray! You have now learned how to use Donut-LLM-Tools to create your own datasets and models and run them locally.
