
Tiny Aya

Cohere released Tiny Aya, a new family of multilingual models designed to democratise access to AI for researchers in low-resource nations.


Most large language models are not evenly multilingual. Even among non-English languages, some are higher-resourced than others, like Mandarin or French. What if you speak Thai, Nepali, Basque or Wolof? Most major models don't work as well.

Compounding the problem, speakers of low-resource languages often lack access to computing resources and infrastructure. Major LLMs are massive, and they typically need many more tokens to represent other languages, especially those written in non-Latin scripts. They also tend to be less well trained on lower-resourced languages.

Cohere Labs' Tiny Aya, a family of open-weight multilingual models released today, addresses some of these issues.

Tiny Aya is small and efficient enough to run locally, even on consumer hardware and mobile phones, all while delivering state-of-the-art translation quality and broad language coverage. For communities without access to large-scale infrastructure, this changes what's possible.

Tiny Aya is built on a 3.35 billion parameter architecture — small enough to run on a laptop or a mobile device, but covering over 70 languages, including many that are drastically underrepresented in standard pretraining corpora: languages like Hausa, Khmer, Shona, and Wolof.

The release includes several model variants, each tuned for different deployment contexts:

  • TinyAya-Base — the pretrained foundation, 70+ languages
  • TinyAya-Global — instruction-tuned across 67 languages, the strongest general-purpose option
  • TinyAya-Earth — deeper coverage of African and West Asian languages
  • TinyAya-Fire — specialised for South Asian languages
  • TinyAya-Water — strongest across Asia-Pacific and European languages

All weights are open. The release also includes a massively multilingual fine-tuning dataset and a detailed technical report documenting the training strategy, evaluation methodology, and design decisions.

Why It's Worth Paying Attention To

We've spent a lot of time thinking about who AI is actually for, and this release touches on something we've been watching.

Most models' multilingual efforts feel like an afterthought: bolted on top of fundamentally English- or Western-centric architectures. Tiny Aya appears to take a different approach. The team built multilingualism into the model from the ground up, starting with the tokenizer. They designed it specifically to reduce fragmentation across diverse scripts and linguistic structures, meaning a sentence in Gujarati or Tamil doesn't get inflated into twice as many tokens as the equivalent in English.

That matters more than it might sound: higher token counts translate directly into greater memory consumption, slower inference, and degraded user experience.
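To see why script matters, here is a toy sketch. This is not Tiny Aya's actual tokenizer (its design isn't reproduced here): it uses a naive one-token-per-byte scheme, the worst-case fallback many tokenizers hit for scripts they haven't learned merges for. Latin characters take one UTF-8 byte each, while Tamil characters take three, so the same sentence fragments into several times as many tokens.

```python
# A naive byte-level tokenizer: one token per UTF-8 byte.
# Real tokenizers merge frequent byte sequences, but scripts that are
# rare in the training data get few merges and stay near this worst case.

def byte_token_count(text: str) -> int:
    """Count tokens under a one-token-per-byte scheme."""
    return len(text.encode("utf-8"))

english = "Hello, how are you?"
tamil = "வணக்கம், எப்படி இருக்கிறீர்கள்?"  # a rough Tamil equivalent

print(byte_token_count(english))  # 19 tokens (1 byte per character)
print(byte_token_count(tamil))    # far more: Tamil is 3 bytes per character
```

A tokenizer trained to cover these scripts directly, as Tiny Aya's reportedly is, closes most of this gap, which is exactly the memory and latency cost the paragraph above describes.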

The benchmarks are encouraging. Across translation, language understanding, mathematical reasoning, and open-ended generation tasks, Tiny Aya advances the state of the art for its parameter class in languages from West Asia and Africa. What stands out isn't just peak performance but consistency: the model appears to maintain stable quality even for languages with minimal representation in CommonCrawl, the web corpus from which most pretraining pipelines draw their data. Independent evaluation will tell us more, but the initial results are promising.

The open-weight release is significant too. Open weights mean researchers can inspect, fine-tune, and extend these models for their own languages and domains without depending on a corporate API.

How to Try It

If you want to explore, the Hugging Face Space lets you interact with the model directly in your browser.

If you want to run it locally, download the weights from the Hugging Face collection. At 3.35B parameters, the model fits comfortably on most modern machines with 8GB+ of RAM using quantized formats (GGUF, GPTQ, or similar). If you've deployed Llama or Mistral locally, the workflow is essentially the same.

The Bigger Picture

The future of multilingual AI probably won't be one enormous model that handles everything adequately. It's more likely to be a diverse ecosystem of smaller, specialized models shaped by the communities they serve, running on the hardware those communities actually have access to.

Tiny Aya is a step in that direction. How large a step depends on how the research community picks it up, stress-tests it, and builds on it. But the ingredients are there: open weights, efficient architecture, serious multilingual research, and a model that can actually run where it's needed most.


Tiny Aya models, datasets, and the full technical report are available on Hugging Face. Cohere's blog post also contains important information on training, accessibility and model size.
