
Using Donut-LLM-Tools to create a model

In today's post, we will see how to create an LLM using Donut-LLM-Tools.

The first thing we need in order to train a model is a valid dataset that contains all the training data.

To get a dataset, you can either download one from Hugging Face or use Donut-LLM-Tools to create one by scraping Wikipedia.



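from donutllmtools import Tools

# Scrape Wikipedia and write the result to a DoDS (Donut DataSet) file.
Tools.DatasetCreator()
# Open the menu for creating a model or loading one to prompt it.
Tools.LLMCreator()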

The above code imports the Tools class from the Donut-LLM-Tools module. This class has two functions, DatasetCreator() and LLMCreator(). Once called, DatasetCreator() automatically starts scraping Wikipedia and writes the result to a DoDS (Donut DataSet) file. LLMCreator() is menu driven: once called, it presents the user with three options: 1. Create a model, 2. Load a model and send prompts to it, and 3. Exit.

When the user chooses 1, the menu asks for the dataset's directory and filename, as well as the number of iterations to train the model for. When the user chooses 2, the menu asks for the model's directory and filename, then asks for a prompt and keeps looping until the user types 'exit' or 'quit'.
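To make the flow above concrete, here is a minimal sketch of what a menu loop of this kind might look like. It is not the Donut-LLM-Tools source: the names run_menu, read, and answer are hypothetical, the training and loading steps are only recorded as log entries, and it assumes that typing 'exit' or 'quit' returns to the menu rather than ending the program.

```python
from typing import Callable, List


def run_menu(read: Callable[[str], str], answer: Callable[[str], str]) -> List[str]:
    """Illustrative menu loop (hypothetical, not the library's code).

    read   -- asks the user a question and returns the typed text
    answer -- stand-in for a loaded model that replies to a prompt
    """
    log: List[str] = []
    while True:
        choice = read("1. Create a model  2. Load a model  3. Exit: ").strip()
        if choice == "1":
            # Ask for the dataset location and the number of iterations.
            dataset = read("Dataset directory and filename: ").strip()
            iterations = int(read("Number of iterations: "))
            log.append(f"train {dataset} for {iterations} iterations")
        elif choice == "2":
            # Ask for the model location, then loop over prompts.
            model = read("Model directory and filename: ").strip()
            log.append(f"load {model}")
            while True:
                prompt = read("Prompt: ").strip()
                if prompt.lower() in ("exit", "quit"):
                    break  # assumption: go back to the menu
                log.append(answer(prompt))
        elif choice == "3":
            break
        # Any other input simply shows the menu again.
    return log
```

Passing Python's built-in input as read makes the sketch interactive, for example run_menu(input, lambda p: "reply to " + p). Taking read and answer as parameters keeps the loop easy to try out with scripted input.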

Hooray! You have now learnt how to use Donut-LLM-Tools to create your own datasets and models and run them locally.
