Google’s Gemini AI has introduced a photo-to-video feature that allows users to transform still photos into dynamic, eight-second video clips complete with synchronized audio, including dialogue, sound effects, and ambient noise. This capability is powered by Google’s latest video generation model, Veo 3.
How it works:
- Select the “Videos” option from the tool menu in the Gemini app or web interface.
- Upload a photo, then describe the desired movement and audio in a text prompt.
- Gemini generates an 8-second video in MP4 format, 720p resolution, and 16:9 aspect ratio.
- Each video carries a visible watermark marking it as AI-generated, plus an invisible SynthID digital watermark embedded to help identify where the content came from.
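The output format above gives only a resolution label (720p) and an aspect ratio (16:9), so the frame size in pixels has to be derived. As a minimal sketch, the arithmetic can be checked like this; the helper name `frame_size` is purely illustrative and is not part of any Google tool or API:

```python
from fractions import Fraction


def frame_size(height: int, aspect: Fraction) -> tuple[int, int]:
    """Return (width, height) in pixels for a frame height and aspect ratio."""
    # Exact rational arithmetic avoids floating-point rounding surprises.
    width = height * aspect
    if width.denominator != 1:
        raise ValueError("aspect ratio does not give a whole-pixel width")
    return int(width), height


# 720p means 720 pixels of vertical resolution; 16:9 is width:height.
width, height = frame_size(720, Fraction(16, 9))
print(f"{width}x{height}")  # prints "1280x720"
```

So a 720p clip at 16:9 corresponds to a 1280 by 720 pixel frame.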
Availability:
- The feature is rolling out to Google AI Pro ($19.99/month) and Ultra ($249.99/month) subscribers in select countries.
- Initially available on the Gemini web platform, with mobile app support coming shortly.
- Not yet available in the European Economic Area, Switzerland, or the United Kingdom.
Example use cases:
- Animate everyday objects, illustrations, artworks, or nature scenes.
- Add creative audio layers such as spoken dialogue or environmental sounds to bring photos to life.
Safety and quality:
- Google employs extensive red teaming and policy enforcement to prevent misuse and unsafe content.
- User feedback via thumbs up/down buttons helps improve the experience.
- All videos are clearly marked as AI-generated for transparency.
This feature builds on Google’s existing Flow AI filmmaking tool, integrating video generation directly into Gemini for a more seamless user experience. For paying subscribers in supported countries, Gemini’s photo-to-video feature offers a creative way to turn static images into short videos with sound.