Google I/O: Gemini Omni Model Introduced for Multimodal Video Generation

Google launched Gemini Omni, a new family of multimodal AI models capable of processing various inputs like text, images, and audio to generate video. The initial version, Omni Flash, allows users to create short videos and edit them with text commands. The technology is also being integrated into YouTube Shorts to allow users to remix clips with AI.
Google is turning its flagship AI model into a full-blown video engine, raising fresh excitement and concern over how easily realistic clips, including deepfakes, can now be made.

At Google I/O, the company unveiled Gemini Omni, described as a “new family of multimodal models” designed to “create anything from any input,” with an initial focus on video. The first release, Omni Flash, lets users combine images, audio, video and text so the system can reason across them and generate a single, coherent output, including short clips and photo edits from plain-language prompts.

In public messaging, Google executives framed Omni as a major technical leap. CEO Sundar Pichai said the model “doesn’t just build scenes that look real, it reasons about what should happen next,” combining “an intuitive understanding of physics” with broader knowledge of “history, science, and cultural context.” DeepMind chief Demis Hassabis called Omni “a major leap in world understanding & multimodal editing,” claiming it can take “photos, video & audio and build entirely new scenes” and will “over time” handle “any input & any output.” He added that Omni Flash is “super fun to play with” and is available in the Gemini app and Google’s Flow video platform.

Early hands-on tests highlight both the promise and the flaws. TechCrunch reported that Gemini Omni “turns images, audio, and text into video — and that’s just the start,” emphasizing its ability to reason across modalities to produce high‑quality, consistent clips. The Verge, however, found the results “such a mixed bag they’re baffling,” noting more consistency than Google’s previous Veo model but also “AI jump scares,” like characters suddenly flipping orientation mid‑scene.

Omni is also moving quickly into consumer platforms. On YouTube, a new Shorts Remix feature powered by Gemini Omni lets users “restyle clips or even insert themselves into other people’s videos,” turning footage into pixel art, anime, or horror, or inflating heads and adding costumes. Creators can disable remixing, and Google says all Omni‑remixed Shorts will carry a digital watermark and a link back to the original video in an attempt to curb abuse.


[1] TechCrunch – “Google’s Gemini Omni turns images, audio, and text into video — and that’s just the start”
https://techcrunch.com/2026/05/19/googles-gemini-omni-turns-images-audio-and-text-into-video-and-thats-just-the-start/

[2] @sundarpichai on X – “Gemini Omni doesn’t just build scenes that look real, it reasons about what should happen next. It combines an intuitive understanding of physics with Gemini’s knowledge of history, science, and cultural context. Rolling out today starting with video outputs to Google AI Plus,…”
https://twitter.com/sundarpichai/status/2056816915717443862

[3] @demishassabis on X – “Gemini Omni is a major leap in world understanding & multimodal editing! It can take photos, video & audio and build entirely new scenes. Over time it’ll be able to handle any input & any output - starting w/ video You can even give it your own videos & iterate…”
https://twitter.com/demishassabis/status/2056831486251380783

[4] @demishassabis on X – “It’s super fun to play with - can’t wait to see what people create! Try Gemini Omni Flash in @GeminiApp and @FlowByGoogle More info in the blog: https://t.co/NLvsxWySMS https://t.co/vAml9DM2yJ”
https://twitter.com/demishassabis/status/2056831487836889178

[5] The Verge – “Google’s new anything-to-anything AI model is wild”
https://www.theverge.com/tech/936507/gemini-omni-hands-on-deepfake-ai-video

[6] The Verge – “You can now remix other people’s YouTube Shorts with AI”
https://www.theverge.com/tech/934704/google-gemini-omni-youtub-shorts-remix-ai
