Cohere presents Aya, a revolutionary LLM covering over 100 languages

In the last two years, large language models (LLMs) have revolutionized the field of artificial intelligence. Most of them, however, were developed primarily on corpora of information available in English. This has led to accuracy problems when generating text in other languages, including European ones.

The Canadian startup Cohere, which recently secured $1 billion in funding, has just unveiled Aya, an LLM that covers more than twice as many languages as existing open source models. We recently highlighted the main open source alternatives to OpenAI's ChatGPT chatbot and its underlying Generative Pre-trained Transformer (GPT) models. The idea of the transformer, moreover, originated in work developed and published in 2017 by a Google team.

What Cohere Aya is and how it works: here is how it behaves in European languages

Cohere for AI, a non-profit research organization linked to the startup Cohere, developed Aya with the involvement of 3,000 researchers from 119 countries. The new LLM, which can now be tried in a public demo, is designed to cover over 100 languages, going well beyond the limitations of pre-existing models.

The dataset used by Cohere for AI is extremely broad and includes data automatically translated into over 100 languages. Half of these are languages that cannot be learned adequately from existing textual datasets. In short, the idea was to give a strong push to the use of generative models based on AI and natural language among people who speak and write less widely used languages. The initiative promises to reduce linguistic disparities and make the technology more accessible to previously overlooked communities.

According to the software engineers at Cohere for AI, Aya far surpasses models such as mT0 and BLOOMZ in the main benchmarks, consistently achieving 75% in human evaluations and an 80-90% simulated win rate compared to other market-leading open source models.

An enormous language model that also impresses with the types of input used

Sara Hooker, vice president of research at Cohere, reveals that in the project's early days its supporters had no idea how far it would evolve. In the end, however, the dataset became enormously extensive and has proven accurate and reliable for generating relevant, well-argued output.

Aya models are hosted in the Hugging Face repository under the Apache 2.0 license, and a dataset containing 513 million input types in 114 languages is also available on the same platform under the same license. The term "input types" refers to the various categories or forms of data used to train or interact with a language model. The 513 million input types indicate the vast set of information used to train and evaluate Aya.

Each "input type" can be thought of as a sentence, a question, a description, or any other piece of text that the model must understand and process, or that it can use to generate responses.

Cohere Aya: how to try it

To put Aya to the test right away, you can visit the official project page and click on "Try Aya in the playground". After logging in to the Cohere dashboard (with a Google account, a GitHub account or via email), select Go to playground. To try Aya directly with support for European languages, you can paste this address into the address bar of your web browser.

Using the parameters in the right-hand column, you can choose the model to use, set the maximum number of tokens in the output, decide whether the LLM should be more or less creative (Temperature), and more.
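To give an idea of why the Temperature setting makes output more or less "creative", here is a minimal sketch of temperature-scaled softmax, the technique commonly used when sampling tokens. The article does not describe how Cohere implements the parameter internally, so this is only an illustration; the scores in `logits` are hypothetical values invented for the example.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Turn raw token scores into probabilities, scaled by a temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [score / temperature for score in logits]
    # Subtract the maximum before exponentiating, for numerical stability.
    peak = max(scaled)
    exps = [math.exp(value - peak) for value in scaled]
    total = sum(exps)
    return [value / total for value in exps]


# Hypothetical scores for three candidate next tokens (not real model output).
logits = [2.0, 1.0, 0.5]

for temperature in (0.3, 1.0, 2.0):
    probs = softmax_with_temperature(logits, temperature)
    print(temperature, [round(p, 3) for p in probs])
```

With a low temperature almost all the probability goes to the top-scoring token, so the text is more predictable; with a high temperature the distribution flattens and less likely tokens are picked more often.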

With one click on the Run button, you can start generating text from the prompt you have provided.

Cohere Aya API

Thanks to the API that Cohere makes available, a click on View code gives you ready-to-use code examples in Python and JavaScript. There is also a syntax for use from a terminal window, sending the request with the curl command.
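As a rough idea of what such a request might look like, the sketch below builds an HTTP call using only the Python standard library. It is not the code that View code generates: the endpoint URL, the model identifier `c4ai-aya`, the JSON field names and the `COHERE_API_KEY` environment variable are all assumptions made for illustration and should be checked against the snippet shown in the playground.

```python
import json
import os
import urllib.request

# Assumptions, not taken from the article: verify the endpoint, the model
# identifier and the field names against the playground's "View code" output.
API_URL = "https://api.cohere.com/v1/chat"
MODEL_NAME = "c4ai-aya"  # hypothetical placeholder for the Aya model id


def build_request(prompt, api_key, max_tokens=300, temperature=0.3):
    """Prepare (but do not send) a POST request carrying the prompt."""
    payload = {
        "model": MODEL_NAME,
        "message": prompt,
        "max_tokens": max_tokens,    # cap on the length of the output
        "temperature": temperature,  # lower values give more predictable text
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send(request):
    """Send the prepared request and decode the JSON response."""
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


if __name__ == "__main__":
    # COHERE_API_KEY is a hypothetical variable name chosen for this sketch.
    key = os.environ.get("COHERE_API_KEY")
    if key:
        print(send(build_request("Write a short greeting in Turkish.", key)))
    else:
        print("Set COHERE_API_KEY to send a real request.")
```

Keeping the request construction separate from the network call makes it easy to inspect the payload before anything is sent, which helps when comparing it with the official examples.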
