In recent months, Local LLMs have significantly improved, and some of them perform surprisingly well with Basque language. Among them, I want to highlight Tülu 3 70B, which shows good results in Basque when using the quantized version (q4_K_M). Until the Latxa instruction model becomes available, this is probably the best option for having conversations or generating text in Basque.

Conversation in basque with Tulu3 model, using Open-WebUI

An external resource has been used to answer a question, using RAG

Components for setting up the RAG system

The components needed to set up a RAG (Retrieval Augmented Generation) system:

  1. Local LLM: Tülu-3 70B q4_K_M
  2. Embedding model: Snowflake/snowflake-arctic-embed-l-v2.0
  3. User interface: Open-WebUI
  4. Ollama: For managing LLMs and embedding models

Advantages

The main advantages of this system:

  • Privacy: Everything runs locally, data stays on your machine
  • Basque support: Tülu-3 model performs well with Basque language
  • Free: All components used are open source and free
  • Easy: Thanks to Open-WebUI, everything can be managed from a graphical interface

I will update this article when the Latxa instruction model is released, but for now, Tülu-3 is an excellent choice for setting up a RAG system in Basque.