The real product is a pipeline, not two models

Read as separate launches, Nano Banana 2 Lite and Gemini Omni Flash look like a routine tier expansion and a video model finally reaching the API. Read together, they describe a single chained workflow: generate a still with the fast, cheap image model, then hand that image to Omni Flash as a reference to animate into a high-quality clip [1]. The connective tissue is the Interactions API, which preserves session history and context across turns, so a creator can iterate on a scene without re-establishing the whole request each time [1]. Google has effectively packaged an image-to-video assembly line and exposed it as an API primitive.
That framing matters because it changes what the pricing means. An image that costs $0.034 and takes about four seconds to generate is not just a cheap picture; it is the low-cost front end of a video pipeline whose back end runs at $0.10 per second of output [3]. Developers can afford to generate and discard many candidate frames before committing to the expensive animation step. The multi-turn design, which allows up to three consecutive edits with preserved context, is what makes that iterate-then-commit loop practical rather than a series of disconnected one-shot calls [3].
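
To make the iterate-then-commit economics concrete, here is a minimal sketch of the arithmetic, using only the two prices quoted above. The function name `pipeline_cost`, the candidate count, and the clip length are illustrative assumptions, as is the simplification that every candidate frame (including edited ones) is billed at the flat per-image price; actual billing may differ.

```python
from decimal import Decimal

# Prices quoted in the text above [3]; everything else is hypothetical.
IMAGE_PRICE = Decimal("0.034")         # USD per generated image
VIDEO_PRICE_PER_SEC = Decimal("0.10")  # USD per second of video output


def pipeline_cost(candidate_frames: int, clip_seconds: int) -> Decimal:
    """Cost of generating candidate stills, then animating one clip.

    Assumes each candidate is billed at the flat image price and that
    only one clip of `clip_seconds` seconds is rendered at the end.
    """
    if candidate_frames < 1:
        raise ValueError("need at least one candidate frame")
    if clip_seconds < 0:
        raise ValueError("clip length cannot be negative")
    image_cost = IMAGE_PRICE * candidate_frames
    video_cost = VIDEO_PRICE_PER_SEC * clip_seconds
    return image_cost + video_cost


if __name__ == "__main__":
    # Hypothetical session: 10 candidate stills, one 8-second clip.
    total = pipeline_cost(candidate_frames=10, clip_seconds=8)
    print(f"total: ${total:.2f}")
```

Under those assumptions, ten discarded-or-kept candidates cost $0.34 in total, less than half of the $0.80 spent animating a single eight-second clip, which is why exploring at the image stage is the cheaper place to iterate.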



