Google I/O 2024, lots of AI news: the most important announcements
As expected, Google I/O 2024 turned out to be focused entirely on artificial intelligence, with many (and some truly surprising) announcements from Big G. The Mountain View company presented new features for all of its services, such as Photos, Workspace, and Search, though some will only become available in the coming months and/or initially only in the USA.

Google I/O 2024: the most important announcements

Ask Photos

Over 6 billion photos are uploaded to Google Photos every day, and for many users it is often difficult to find the exact photo they need at a given moment. With the new Ask Photos feature, you can make a voice request to Gemini, asking the AI to perform a search and find the right result.

Google Photos - Ask Photos with Gemini

Thanks to Gemini's multimodal capabilities, you can get relevant results even when photos are missing some information: the AI is able to understand the context and subject of photos in order to extract details.

Generative AI in Google Search

With the AI Overviews feature, searching on Google becomes even smarter, simpler, and more complete. Already used billions of times through Search Labs, it provides a quick overview of a topic, with useful links for further reading. As of today, the feature is available to all users in the United States.

As mentioned, search becomes smarter, so users will no longer have to split a search into multiple questions. In other words, they can now ask complex questions without worrying about stumping the search engine.

Another particularly interesting piece of news, which struck almost everyone during today's conference, is the ability to record a video to ask about anything, such as a newly purchased turntable that doesn't seem to work (the example Google proposed on stage).

Thanks to Google Lens, searching with a video translates into time saved: with a video, it is even easier to describe a problem and get an overview of all the steps needed to solve it.

News for creators, including Veo

Veo is Google's most advanced model for creating videos from text prompts. It can generate Full HD (1080p) videos in a wide range of cinematic and visual styles, with durations of over a minute. After showing some absolutely convincing examples, the company specified that the model creates consistent videos in which people, animals, and objects move realistically throughout the footage.

Among the other new features there is also Imagen 3, a model that generates images "with an incredible level of detail" and a drastically reduced number of artifacts compared to the previous version. There are also new AI tools for musicians, all included in a suite called Music AI Sandbox.

Project Astra

An important piece of news related to the future of AI assistants is Project Astra. An agent, the company explains, must be able to understand and respond to the complexity and dynamism of the world around people, just as people themselves do. It must take in and understand what it sees in order to grasp the context and, consequently, act. Finally, it must be proactive, teachable, and personal, so that the user can communicate with it in a natural way.

All of this was shown in the Project Astra demo during the conference, which used both a smartphone and a pair of smart glasses that we will almost certainly hear more about in the future.

Gemini Flash

Gemini 1.5 Flash is the latest model in the Gemini family and is also the fastest one available via API. It is optimized for large-scale, high-volume, high-frequency tasks, and despite being less powerful than 1.5 Pro, it has excellent multimodal reasoning capabilities over large amounts of information.
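Since the article notes that Gemini 1.5 Flash is available via API, here is a minimal, offline sketch of what a request to it could look like. This was not shown at I/O: the endpoint path and payload shape follow Google's published generativelanguage REST API, the API key is a placeholder, and the request is only constructed, never sent.

```python
import json
from urllib import request

API_KEY = "YOUR_API_KEY"  # placeholder -- a real key comes from Google AI Studio
MODEL = "gemini-1.5-flash"
ENDPOINT = (
    "https://generativelanguage.googleapis.com/v1beta/models/"
    f"{MODEL}:generateContent?key={API_KEY}"
)

# Request body per the generativelanguage REST API: a list of "contents",
# each with "parts" holding the prompt text.
payload = {
    "contents": [
        {"parts": [{"text": "Summarize Google I/O 2024 in one sentence."}]}
    ]
}

req = request.Request(
    ENDPOINT,
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
    method="POST",
)
# Actually sending it would be `request.urlopen(req)` -- omitted here to keep
# the sketch offline and key-free.
print(req.get_method(), req.full_url.split("?")[0])
```

Swapping `MODEL` for `gemini-1.5-pro` targets the larger model through the same endpoint, which is the trade-off the article describes: Flash for high-volume, latency-sensitive tasks, Pro for more complex reasoning.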

How Gemini 1.5 Pro improves

In addition to extending the context window to 2 million tokens, the Sundar Pichai-led company has improved code generation, logical reasoning and planning, multi-turn conversation, and image and audio understanding. Thanks to these upgrades, Gemini 1.5 Pro can follow more complex and nuanced instructions.

Google’s AI built into Android

After debuting on Samsung's latest top-of-the-range smartphones, the Circle to Search feature can now also be used by students to complete homework. «Imagine a student struggling with a math or physics problem. By circling the question, they will receive detailed instructions to solve it, without leaving the app or digital worksheet they are working in», explains Big G.

In the coming months, it will be possible to move the Gemini overlay over open apps to make the most of the AI (for example, dragging and dropping generated images into Gmail, or asking questions about a YouTube video). Advanced subscribers will also be able to request clarifications on a PDF document.

Gemini AI on Android

As for Gemini Nano, an integrated on-device foundation model, it will gain full multimodal capabilities starting in the second half of 2024. In TalkBack, for example, it will help blind or visually impaired people receive clearer descriptions of image content. It will also be able to warn of suspected fraud during calls (for example, by detecting conversation patterns commonly associated with scams).

Gemini Live

In the coming months, Gemini Advanced subscribers will have access to Live, a new mobile conversational experience that uses the company's most advanced voice technology to make conversations with the AI more intuitive.

You will be able to speak at your own pace and even interrupt to ask for clarification, just as you would in a normal conversation with a real person.
