Cool Secret Llama: How to Chat with Generative Models from Web Browsers

When using a chatbot available on the Web, there is always a data transfer between the user's client and the service provider's servers. Think, for example, of ChatGPT: not only does the generative model (GPT, Generative Pre-trained Transformer) run on OpenAI's servers, but the company led by Sam Altman can also use the prompts, i.e. the information provided by users as input, to improve the model. The authorization to use user-provided data for the continuous training of the model can be revoked from the ChatGPT settings.

Secret Llama, an intelligent chatbot that lets you use the main generative models without installing anything

These days saw the presentation of Secret Llama, an innovative project that allows the use of generative models such as TinyLlama, Llama 3, Mistral 7B and Phi 1.5 directly from your web browser. Its main advantage is that no server or machine dedicated to the purpose is needed: the Secret Llama chatbot relies entirely on the user's browser, although it requires, as essential prerequisites, Chrome/Edge and a reasonably capable video card (the GPU may even be one integrated into the processor).

An open source project shared on GitHub, Secret Llama was born following the publication of Web LLM. Web LLM is a modular, customizable JavaScript package that brings chat language models directly to web browsers, leveraging hardware acceleration. Everything happens within the browser, and the workloads are supported by WebGPU, a low-level API that allows applications running in the browser to "talk" to the GPU on your system.
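As a minimal sketch of that browser/GPU handshake, this is roughly how an application can detect WebGPU support before trying to load a model. The function name is hypothetical, and the `navigator` object is passed in as a parameter purely so the logic can also be exercised outside a real browser:

```javascript
// Hedged sketch: detect WebGPU availability the way a browser app might.
// `navigator.gpu` is the standard WebGPU entry point in Chrome/Edge.
// The navigator object is injected as a parameter so this check can run
// outside a real browser as well.
function webGpuStatus(nav) {
  return nav && nav.gpu ? "webgpu-available" : "webgpu-missing";
}

// In a real page you would call: webGpuStatus(navigator)
```

If the check fails, a Web LLM-based app cannot use hardware acceleration and typically refuses to load the model, which is why Secret Llama lists Chrome/Edge among its requirements.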

Web LLM is fully compatible with the OpenAI API. This means that, to communicate locally with any open source model, you can use the same "syntax" adopted to send requests to OpenAI's GPT models.
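To illustrate that compatibility, here is a hedged sketch of a request body in the OpenAI chat-completions shape. The helper function is hypothetical; the point is that the same payload structure can target either OpenAI's GPT models or an open source model running in the browser via Web LLM:

```javascript
// Hedged sketch: build a chat request in the OpenAI chat-completions shape.
// Web LLM accepts the same message format, so this payload works unchanged
// whether it is sent to OpenAI or to a local in-browser model.
function buildChatRequest(userPrompt) {
  return {
    messages: [
      { role: "system", content: "You are a helpful assistant." },
      { role: "user", content: userPrompt },
    ],
    temperature: 0.7, // sampling temperature, as in the OpenAI API
    stream: false,    // set to true to receive tokens incrementally
  };
}
```

The `messages` array with `system`/`user`/`assistant` roles is the core of the OpenAI-style interface that Web LLM mirrors.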

Secret Llama, the local AI chatbot in your web browser

The main features of Secret Llama

In addition to the fact that no dedicated machines are needed, Secret Llama boasts another important feather in its cap: no user data and no information about conversations ever leaves the computer. This is because Secret Llama downloads the chosen generative model once, then always uses the copy stored locally to answer the questions entered. There is therefore no need to send a single bit to remote servers.

The "litmus test" is very simple: after letting Secret Llama download the model (the messages Loading model and Fetching param cache will be shown), you can disconnect your computer from the Internet. You will see that, even in offline mode, the artificial intelligence continues to respond to requests, precisely because the generative model is stored locally and queried in real time via Web LLM.
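The download-once behavior behind this offline test can be sketched as a cache-first loader. This is a deliberate simplification of what Web LLM actually does, and the function name and injected cache/fetch parameters are hypothetical:

```javascript
// Hedged sketch: cache-first model loading.
// The model is fetched from the network only the first time; every later
// request, including offline ones, is served from the local copy.
async function loadModel(modelId, cache, fetchRemote) {
  if (cache.has(modelId)) {
    return cache.get(modelId); // offline-friendly path: no network needed
  }
  const weights = await fetchRemote(modelId); // one-off download
  cache.set(modelId, weights);
  return weights;
}
```

Once the weights are cached, `fetchRemote` is never called again for that model, which is exactly why pulling the network cable does not stop the chatbot from answering.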

Finally, Secret Llama features a very simple interface: at the bottom there is the classic box for entering the input to send to the chatbot; the button at the top right starts a new chat; and the drop-down menu on the right lets you choose your favorite LLM (Large Language Model).

Obviously, the "leaner" models such as TinyLlama and Phi 1.5, especially when handling chats in European languages, tend to offer less reliable, relevant and well-reasoned results. You can compare their output with that produced by models with a greater number of parameters, such as Mistral 7B and Llama 3.

How to use the chatbot in your web browser

Starting to use Secret Llama is as simple as connecting to the project's home page and typing a question, in natural language, into the Message field at the bottom. As noted above, Secret Llama will first download the model and place it in the browser's local cache.

For some models, it may be necessary to launch the web browser executable with an additional option: Secret Llama clearly states the syntax to use.
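As an illustration only (Secret Llama shows the exact command to use for each model; the switch below is a WebGPU-related Chrome flag that has existed, and may not be the one your model requires), launching the browser with an extra option looks like this:

```shell
# Hedged example: starting Chrome with an extra WebGPU-related switch.
# Always use the exact flag that Secret Llama displays for your model.
google-chrome --enable-unsafe-webgpu
```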

Downloading and caching the larger LLMs is obviously quite demanding and can take a while. However, even if you close the Secret Llama tab in the browser and reopen it later, you will not have to download the model again.

At this point, once the initialization phase has been completed, you can converse with the chatbot as you would with ChatGPT or any similar product.

Developers can also rebuild the Secret Llama project from its React source code. The instructions for proceeding in this direction are given in this guide.
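As a hedged sketch, a typical checkout-and-build sequence for a React project looks like the following. The repository URL is an assumption on my part; follow the linked guide for the exact, up-to-date steps:

```shell
# Hedged sketch: typical steps to rebuild a React project like Secret Llama.
# The repository URL is assumed; the linked guide has the authoritative steps.
git clone https://github.com/abi/secret-llama.git
cd secret-llama
npm install    # fetch the project's dependencies
npm run dev    # start a local development server
```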

Opening image credit: iStock.com – BlackJack3D
