Introducing Gemini Omni, our new AI model that can create anything from any input, starting with video | Google | 33 comments

Google

41,682,015 followers

2d

Gemini Omni is our new AI model that can create anything from any input, starting with video. 🪄 Hear from Demis Hassabis on how you can mix text, audio, and images to generate and edit high-quality videos just by having a conversation.

33 Comments

Transcript

I'm excited to announce Gemini Omni, our new model that can create anything from any input. It combines Gemini's intelligence with the best of our generative media models for a new level of world understanding, multimodality, and editing.

Models like Veo, Nano Banana, and Genie are able to create extremely realistic videos, images, and interactive simulations. Although not perfect, they already demonstrate some impressive notions of intuitive physics. And with Omni, we've now made even more progress. It's a step change in simulating things like kinetic energy and gravity. Previous systems would have found these concepts difficult.

Gemini's world knowledge and reasoning really shine in Omni. It can translate complex ideas into highly accurate videos. So, for example, you can give it a simple prompt like "make a claymation explainer of protein folding" and get this:

"Proteins start as chains of amino acids. They fold into patterns like the alpha helix and flat sections called beta sheets, forming a perfect three-dimensional shape."

But the initial generation is just the start. The creative process is rarely a single step; it's usually iterative. Just like Nano Banana redefined image editing, Omni gives you a more natural way to edit video with conversational language. What's really cool is you can give it your own videos, for example this selfie, and change reality in a really fun way. You can easily adjust the details and style or even add elements, and the whole scene [adapts] to reflect your new idea.

Today, we're launching the first model in the Omni family, Gemini Omni Flash. It's now available across our products, and you'll hear more about this later. We're excited with the progress we're making, and we'll be able to share more about Omni Pro soon. We can't wait to see what you create.

Byounghee Lee 2d

The Omni capability people will actually use is not creating from a blank prompt. It is editing by asking. That is where multimodal stops being a generation problem and becomes a feedback loop. Feedback loops are bounded by context persistence and how fast the runtime can act before the user's attention moves on. The model creates the media. The runtime will decide where multimodal creation lives.

Vicky Price 2d

What stands out is how quickly AI is evolving from a prompt-response tool into a continuous interaction layer. The shift toward agentic systems, multimodal reasoning, and autonomous execution feels like a much bigger transition than incremental model upgrades alone.

BraivIQ AI Agency 1d

The multimodal angle that matters most for businesses isn't generating from scratch. It's the ability to iterate on existing assets through conversation rather than rebuilding them. That changes the economics of content production, training material creation, and customer-facing media. The question for most operators isn't "can we create video with AI?" It's "how does this fit into the workflow we already have, and who owns the output?"

Sadeem Mueen 2d

Google This is the true transformation we've been waiting for: from an "AI tool" to a "real-life creative dialogue partner." Integrating text, audio, and images into a single model not only simplifies production but also redefines the very meaning of "editing"—we're no longer dealing with a timeline, but with a live conversation. What's most interesting is that starting with video isn't arbitrary; video is the most complex and information-rich medium. If the model masters video, it means it's beginning to understand the world in its natural form: time, space, relationships, and the logical sequence of events. The fundamental question now isn't "What can we create?" but "How will this change the very nature of human creativity?"

Masud Parvez 1d

The nuance people miss is that "any input to any output" sounds like a feature announcement, but it's actually a platform strategy. Google is positioning Gemini as the connective tissue across modalities - the same way APIs became infrastructure. The question worth sitting with is whether creative professionals will adopt conversational workflows or resist them the way designers initially resisted templates. Behavior change is always the harder problem than the technology itself.

Atoyebi Kayode 2d

Google Most people are focusing on the quality of the AI outputs. The deeper shift is that Google is collapsing the entire creative stack into one conversational interface. When text, image, audio and video stop being separate workflows, the real competitive advantage moves from “production skill” to “idea velocity.” That changes creative industries more than the model itself.

Idan Furda 2d

The idea that AI is becoming so much more realistic by incorporating actual physics is amazing and terrifying at the same time.

Gerald Enriquez 2d

Hello! I am Gerald Grey Enriquez. I'm sorry for asking, but I'm a financially struggling working student in the Philippines seeking support for my education. I'm a first-year student balancing a 6PM-6AM call center job and 7AM classes to keep up with expenses. Recently my girlfriend got an infection in her uterus, so I'm trying to help her out as well. My parents are already at retirement age, which does not mean they have money; they are too old to continue working. I really want to graduate before they die. I love them very much, and I'm trying my best to finish school and support them someday. Any help means a lot. Thank you. I really hope you can help me out and have the time. I can send all the details, like student enrollments and ID. https://gogetfunding.com/support-my-education-22/

Social Monk Digital 1d

Gemini Omni is our new model that creates anything from any input, starting with video. Hear Demis Hassabis explain how text, audio, and images can be used to generate and edit video through conversation.
