
Gemini Omni may be the most significant AI model Google has ever built — and it hasn't even been officially announced yet. Ten days before Google I/O 2026 (May 19–20), a UI string surfaced inside the Gemini interface pointing to a new unified model capable of generating text, images, and video in a single pipeline. Here's everything we know: the discovery, what it means, how it stacks up against today's best models, and what to expect at the keynote.

How Gemini Omni Was Discovered

On May 2, 2026, X user @Thomas16937378 spotted an unusual string inside Gemini's video generation tab: "Start with an idea or try a template. Powered by Omni." The phrase appeared right next to "Toucan" — Google's internal codename for Veo 3.1 — making it immediately clear this was a separate, newer system.

This marks the first time Google has changed the public-facing name of its video generation tool. Every previous version has kept the "Veo" branding. Switching to "Omni" is a deliberate signal that something architecturally different is coming, as Android Authority first reported.

The discovery spread within hours, and purported demo videos began circulating shortly after. If genuine, they suggest Omni already outperforms Veo 3.1 on prompt adherence and voice generation quality.

Three Possible Interpretations

The AI community has converged on three explanations for what Gemini Omni actually is:

1. A Veo Rebrand — Omni is simply a new consumer-facing name for the same Veo-powered pathway inside Gemini. Technically the least exciting, but the most conservative read of the leak.

2. A New Parallel Model — Omni is an entirely new model trained within the Gemini ecosystem, running alongside Veo 3.1 rather than replacing it. Developers would choose between the two depending on use case.

3. A True Unified Omni-Model — Omni is a single model capable of handling text, image, and video generation in one pipeline. This would make it the first top-tier omni-model with native video output, and represents a genuinely new product category. Leaked demos and the "Omni" naming both point toward this interpretation being the most likely.

Why a True Omni-Model Changes Everything

Today's leading video models (Seedance, Kling, Hailuo) specialize exclusively in video, and even Runway's Gen-4.5 extends only as far as images. A unified system would offer advantages that none of them can match:

  • A single prompt generating a coherent image and video sequence with visual consistency
  • Simplified developer workflows (one API, one model, one context window)
  • Better cross-modal understanding: the model knows what it drew before it animates it
  • Potential infrastructure cost reductions for teams running multiple specialized models
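The workflow difference behind those bullets can be sketched in a few lines. The sketch below is purely illustrative: every class and function name is invented for this example, and no real Google API or SDK is implied.

```python
# Toy stand-ins for modality-specific models. The string outputs just
# let us trace what each stage saw.
def generate_text(prompt):
    return f"script({prompt})"

def generate_image(prompt):
    return f"frame({prompt})"

def generate_video(prompt, reference_frame):
    return f"clip({prompt}, ref={reference_frame})"

def specialized_workflow(prompt):
    """Today's pattern: chain three separate models by hand.
    Each hop passes only the previous model's serialized output;
    internal state (attention, layout, intent) is lost at every step."""
    script = generate_text(prompt)
    frame = generate_image(script)
    return generate_video(script, reference_frame=frame)

class OmniModel:
    """Hypothetical unified model: one object, one shared context,
    so later modalities can condition on everything generated so far."""
    def __init__(self):
        self.context = []  # the single shared context window

    def generate(self, prompt, modality):
        # A real omni-model would condition on self.context natively;
        # here we just record it to show the accumulation.
        prior = len(self.context)
        self.context.append((modality, prompt))
        return f"{modality}({prompt}, context={prior} prior items)"

def unified_workflow(prompt):
    model = OmniModel()
    model.generate(prompt, "text")
    model.generate(prompt, "image")
    return model.generate(prompt, "video")
```

The point of the toy OmniModel is the single shared context list: in a specialized pipeline the video model only ever sees the image model's final output, while a unified model animates a scene it has already "drawn" inside the same context window. That is the cross-modal consistency advantage the list above describes.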

If Interpretation 3 is correct, Gemini Omni doesn't just compete with Veo 3.1 — it makes the entire single-modality video model category look narrow.

Gemini Omni vs Current AI Video and Image Models

| Model | Developer | Type | Status | Best For | Max Resolution | Notable Edge |
| --- | --- | --- | --- | --- | --- | --- |
| Gemini Omni | Google | Text + Image + Video | Upcoming (I/O 2026) | Unified multimodal generation | TBA | First omni-model with video |
| Veo 3.1 (Toucan) | Google DeepMind | Video | Available | Cinematic quality, native audio | 4K | Best character consistency |
| Seedance 2.0 | ByteDance | Video + Audio | Available | Lip-sync, multi-shot storytelling | 4K | 90%+ commercial usability score |
| HappyHorse-1.0 | Alibaba | Video | Available | Benchmark-leading quality | 4K | #1 on ELO rankings (May 2026) |
| Kling 3.0 | Kuaishou | Video | Available | 4K/60fps, multi-shot sequences | 4K | Most natural motion physics |
| Runway Gen-4.5 | Runway | Image + Video | Available | Reference image, camera control | 4K | Best visual fidelity all-rounder |
| Midjourney V8.1 | Midjourney | Image | Available | Photorealism, 2K output | 2K | Fastest rendering in class |

Other Google Leaks Ahead of I/O 2026

Gemini Omni isn't the only thing Google has quietly revealed before the keynote. Several other codenames and features have surfaced alongside it:

  • Gemini 3.2 and 3.5 — Performance-focused language model versions currently in internal testing
  • Gemini 3.1 Flash-Lite — A lightweight, speed-optimized variant that already launched on May 8, 2026
  • Teamfood — A long-term persistent chat memory feature coming to Gemini
  • Spark Robin — A visual model codename, possibly a companion image generation system to Omni

The volume of leaks suggests Google is running a broader AI platform update at I/O, not just a single model announcement. Gemini Omni appears to be the headline feature of a larger ecosystem refresh.

Expected Release Timeline

Based on Google's historical I/O patterns and the current leak density, here's what to expect:

  • May 19 (Keynote) — Official Gemini Omni announcement with live demo
  • May 19–20 — Developer documentation released alongside or immediately after keynote
  • Late May – Early June — Third-party platform integration (expect API access for Gemini Ultra subscribers)
  • June 2026 — Broader rollout including possible free-tier access with usage limits

Access will almost certainly be tied to Gemini subscription tiers, mirroring how Veo 3.1 is currently distributed: limited free access, with higher resolution and longer durations behind Gemini Advanced.

Start Exploring Multi-Model AI Now

Gemini Omni isn't live yet — but the smartest move is to build your multi-model workflow before it drops. iMini AI already puts Kling, Seedance, Runway, Seedream, and more in one canvas, so you can compare outputs, iterate across models, and find what works for your specific content style.

When Gemini Omni launches, you'll be able to benchmark it against every current model without switching tools. No vendor lock-in, no wasted time. Start on iMini AI today and hit the ground running the moment Omni goes live.

FAQ

What is Gemini Omni?

Gemini Omni is an unreleased Google AI model leaked in early May 2026. It is expected to unify text, image, and video generation into a single pipeline inside the Gemini interface — a first for any top-tier AI model.

When will Gemini Omni be released?

Google I/O 2026 (May 19–20) is the most likely announcement date. A broader public release is expected within 2–4 weeks of the keynote, tied to Gemini subscription tiers.

How is Gemini Omni different from Veo 3.1?

Veo 3.1 (internally codenamed Toucan) handles video generation only. Gemini Omni is expected to cover text, image, and video from a single model, and early leaked demos already show improvements in prompt adherence and voice generation over Veo 3.1.

Will Gemini Omni be free?

Free-tier access is possible but will likely come with strict daily usage limits. Full resolution and longer video outputs are expected to require a Gemini Advanced subscription, consistent with how Veo 3.1 is currently priced.

Can Gemini Omni generate images as well as video?

Yes — if the "true omni-model" interpretation is correct, Gemini Omni will handle image generation alongside video, potentially replacing both Veo and Google's current Nano Banana image models in one unified system.

Is Gemini Omni the same as Spark Robin?

Not necessarily. Spark Robin is a separate visual model codename that leaked alongside Omni. They may be companion systems — Omni handling the unified pipeline, Spark Robin serving as a specialized image generation layer — but Google has not confirmed either product officially.