Nvidia Parakeet in Basque

Nvidia Parakeet in Basque: fast and CPU-friendly

Over the last few days, I have been fine-tuning Nvidia Parakeet for Basque, and I have published the result here. The goal was simple: a lightweight Basque speech-to-text model that runs fast and is practical on modest hardware. Accuracy vs speed: to be clear, this model is not as accurate as my best Basque Whisper model, xezpeleta/whisper-large-v3-eu. However, it has a major advantage: it is very fast and can run on CPU-only setups. ...

March 15, 2026 · Xabi Ezpeleta
Latxa VL model

New Latxa VL Models

Recently, the HiTZ center published the Latxa VL models. These models are based on Qwen3-VL and have vision capabilities. Currently, two sizes are available, both very small: 2B and 4B parameters. The tests presented below were performed with the 4B model. Models that know Basque: generally, models this small can only answer decently in English. These Latxa VL models, however, have been trained to perform in Basque. ...

February 7, 2026 · Xabi Ezpeleta
Kimu model

Kimu

Orai has just released a language model called Kimu. Building on Gemma 2, they have created models with 2B and 9B parameters, successfully injecting into the base model the knowledge required to speak and understand Basque. I have converted both Kimu models to the GGUF format and published them on Hugging Face. This makes it possible to use Kimu with applications like llama.cpp or Ollama. ...

October 13, 2024 · Xabi Ezpeleta
Real-time transcription

Real-time Transcriptions in Basque

At international conferences and events, it is increasingly common to have automatic real-time transcriptions available during live presentations. These technologies are very valuable as accessibility measures, not only for people with hearing problems or cognitive difficulties but also for those who have not fully mastered the language. Is something like this possible in Basque? So far, the examples I’ve seen are in English. Since I recently published new Whisper models for Basque, I decided to conduct a small experiment: using a small and lightweight model like whisper-tiny-eu, would we be able to create automatic real-time transcriptions in Basque? ...

March 4, 2024 · Xabi Ezpeleta
Speech-to-text

Basque from speech to text: improving transcription models

Speech-to-text (STT) technology converts spoken language into written text using natural language processing. These systems are becoming increasingly important in digital interfaces, accessibility solutions, and various communication platforms. Since 2022, I’ve been fine-tuning the original Whisper speech recognition model for the Basque language, using the Mozilla Common Voice dataset. The fine-tuned models perform significantly better than the original ones. As the Mozilla Common Voice initiative has grown, the model’s accuracy has continued to improve. ...

February 27, 2024 · Xabi Ezpeleta