It’s a beautiful, balmy afternoon at Dolores Park in San Francisco, and I’m singing a birthday song to a prehistoric dinosaur. A cupcake with a pink candle magically appears in my empty hand as I finish my serenade. When I blow out the flame, a calm look of contentment washes over the CGI-esque creature.
While the man in this AI video looks and sounds just like me, the clip was actually generated using one of the new features available in Google’s Gemini app: avatars. An avatar is a digital clone of you that can be inserted into AI videos, similar to the core feature of OpenAI’s now-defunct Sora app. Avatars are powered by Google’s new Omni video model, and the feature is only available to subscribers.
I pay $20 a month for Google’s AI Pro plan and quickly maxed out Gemini’s usage limits, which reset every five hours. I asked a few questions and generated two 10-second clips featuring my avatar before I was told to wait until later.
Video: Reece Rogers
My first two glimpses of what Omni can do with my likeness were of me singing to a dino in San Francisco and surfing under the Golden Gate Bridge. I was simultaneously impressed and freaked out. The content was cringeworthy, with some jumbled moments and nonsensical outfits, but that man in the video was me. I used my fingers to zoom in on its face and really watch the mouth move. The teeth were a bit off, but otherwise that’s Reece, right on down to the chin fat.
Unlike OpenAI, which previously let users decide whether they wanted others to generate AI videos using their likeness, Google only lets adult users make videos with their own avatar.
It took me about five minutes to set up my avatar through the Gemini app. The process involved sitting in a well-lit room with my phone’s camera pointed at my face and reading a string of two-digit numbers. Then I slowly looked to the right and swiveled my head to the left, and it was all over. Reece 2.0 was born and ready to be my deepfake star. (Be mindful of what you’re wearing during this process, since your fit will likely show up in the AI generations, but more on that later.)
Let’s break down the birthday clip frame by frame to really unpack my feelings here. Full prompt: Generate a video of me singing the happy birthday song to an aging dinosaur at the top of the hill at Dolores Park.
AI-generated clip by Reece Rogers
The first second starts with a millennial pause because even AI Reece has some ingrained habits. What’s most striking initially is the photorealistic setting. Rather than placing my avatar on some oversized hill at a random park, the background of Google’s AI video is remarkably similar to the actual location. From the palm tree-lined sidewalks to the looming Salesforce Tower in the distance, it’s immediately evident which park is depicted here, even though the output isn’t perfect. It makes sense that a company known for mapping the planet could pull this off.
As AI me starts to sing, with a less pitchy baritone than I can actually pull off, the first few bars seem natural. I bounce my hands up and down on the beat, like a mini conductor. Then I stutter on the word “to,” and Gemini cuts to a wider-angle shot as the real chaos begins. A vanilla cupcake appears randomly, and I exhale a cloud of smoke to blow out the celebration candle. (Honestly, how rude of AI Reece. It’s not your special day.)
