Fine-tuning LLaMA to Recreate Eminescu's Literary Style
As artificial intelligence continues to evolve, we embarked on a project that bridges classical literature and modern technology. Our goal was to develop an AI model that can generate text in the distinctive style of Mihai Eminescu, Romania's preeminent poet and a defining voice of Romanian poetry.
We used Google Colab with Python for this project.
Understanding the Challenge
Before diving into the technical details, it's important to understand what makes this project unique. Teaching an AI model to write like Eminescu isn't just about vocabulary and grammar — it's about capturing the essence of his romantic style, his philosophical depth, and his masterful use of Romanian language. This presents several interesting challenges, from handling Romanian diacritics to understanding the complex structures of 19th-century literary Romanian.
Technical Foundation: The LLaMA Model
We chose to build upon Meta's LLaMA model for several reasons. LLaMA is particularly well-suited for fine-tuning tasks due to its efficient architecture and strong multilingual capabilities. We experimented with three versions:
- LLaMA-3.2-3B: Our initial implementation
- LLaMA-3.1-8B: An alternative version for comparison
- LLaMA-3.3-70B-Instruct: Our latest iteration
The Fine-Tuning Process
The first crucial step was preparing our training data by splitting it into small chunks for tokenization. We broke Eminescu's works into manageable chunks while maintaining context through overlap between consecutive chunks; this overlap is crucial because it helps the model learn longer-range dependencies in the text.
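The chunking step described above can be sketched as a small helper. The chunk size and overlap below are illustrative assumptions, not the values we actually used:

```python
def chunk_text(text: str, chunk_size: int = 512, overlap: int = 64) -> list[str]:
    """Split text into chunks of at most chunk_size characters,
    where each chunk repeats the last `overlap` characters of the
    previous one so the model sees context across chunk boundaries."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    step = chunk_size - overlap
    return [text[i:i + chunk_size] for i in range(0, len(text), step)]
```

In practice you would chunk by tokens rather than characters, but the overlapping-window idea is the same.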
One of our biggest challenges was handling the larger models efficiently. We implemented several memory optimization techniques: 4-bit quantization significantly reduces GPU memory usage, while CPU offloading helps manage the remaining GPU memory constraints.
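A minimal sketch of how 4-bit loading with CPU offload might be configured using Hugging Face transformers with bitsandbytes; the model name and specific quantization settings here are assumptions for illustration, not our exact configuration:

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

# 4-bit NF4 quantization with bfloat16 compute (illustrative values)
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
    bnb_4bit_use_double_quant=True,
    llm_int8_enable_fp32_cpu_offload=True,  # allow overflow layers to sit on CPU
)

model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-3.1-8B",   # assumption: any of the variants listed above
    quantization_config=bnb_config,
    device_map="auto",           # lets accelerate place layers across GPU and CPU
)
```

`device_map="auto"` is what makes the CPU offloading work: layers that do not fit in GPU memory are kept on the CPU and moved over only when needed.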
Low-Rank Adaptation (LoRA)
We used LoRA to efficiently fine-tune the model without updating all parameters. This approach allows us to fine-tune the model with significantly fewer parameters, making the process more efficient while maintaining performance. The target modules focus on the attention mechanisms, which are crucial for capturing the stylistic elements of Eminescu's writing.
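The parameter savings LoRA provides can be made concrete with a bit of arithmetic. Instead of updating a full d × d attention projection, LoRA freezes it and learns a low-rank update B @ A with 2·d·r trainable numbers. The hidden size and rank below are assumptions chosen to be typical, not our exact settings:

```python
# LoRA freezes the base weight W and learns a low-rank update B @ A,
# so the effective weight is W + (alpha / r) * B @ A.
D_MODEL = 4096   # assumption: hidden size of an ~8B LLaMA variant
RANK = 16        # assumption: a commonly used LoRA rank

full_params = D_MODEL * D_MODEL   # updating the whole d x d projection
lora_params = 2 * D_MODEL * RANK  # A is r x d, B is d x r

print(full_params, lora_params, full_params // lora_params)
# 16777216 131072 128 -> LoRA trains under 1% of the parameters per matrix
```

Applying this only to the attention projections (e.g. the query/key/value/output matrices) is what lets the adapter capture stylistic patterns cheaply.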
Real-World Applications
The implications of this project extend beyond just generating Eminescu-style text. This technology can be applied to:
- Educational tools for studying Romanian literature
- Creative writing assistance
- Cultural heritage preservation
- Literary style analysis and research
We're continuing to improve the model by expanding the training dataset with more of Eminescu's works, experimenting with larger model variants, and developing better evaluation metrics for Romanian poetry.
The complete codebase is available on our GitHub repository, and we've made our models available on HuggingFace for the broader AI and literary communities to use and build upon.