
Google is Leveling Up With AI Videos

By Darren Wilden
8 min read

Google Gemini Veo 2 is simply amazing, even though it's "only" a Text-to-Video generator, at least for now, and there is still room for improvement. After some initial trial and error in commercializing its AI products, Google has found its confidence and is picking up speed. It's worth remembering that Google has long been at the forefront of AI development, even though its advancements weren't released to the public until Bard arrived in March 2023. Bard didn't achieve significant success on its own, and in February 2024 it was rebranded and relaunched as Gemini, the AI tool we now use within the Google ecosystem. This history shows how the release of ChatGPT forced every major tech company out of its comfort zone, which may ultimately be a positive development.

Google Gemini Veo 2 AI Video Generation Interface
Google's Gemini Veo 2 represents a significant advancement in AI-powered video creation, allowing users to generate high-quality videos from text prompts with impressive speed and visual fidelity.

Google also appears to be launching updates more proactively. From a user's perspective, though, there is no clear indication of when updates will arrive, whether an event will announce them, or even whether they will eventually be integrated into the Google ecosystem. We receive small teasers but never any firm guarantees. The pace of these releases has undoubtedly shaken up the traditional yearly cycle of big tech announcements. Nevertheless, you can often get a sneak peek at Google's current projects by visiting Google Labs. Together with Google AI Studio, it serves as a small testing ground where you can try out some of Google's products. Keep in mind that access to some of these playgrounds may be restricted to certain countries, which you might be able to bypass using a VPN connection.

March 2023

Google Bard Launch

Initial foray into the conversational AI space following the success of ChatGPT, but with limited features and capabilities.

February 2024

Gemini Rollout

Significant upgrade from Bard, with deeper integration into Google's ecosystem and enhanced capabilities.

2025

Gemini Veo 2 Release

The latest advancement bringing text-to-video generation capabilities to Google One subscribers, creating high-quality videos from text prompts.

The Acceleration of AI Development

Today, updates are released much more frequently, and from a user's perspective this is genuinely good news. We get improved products and systems at a much faster rate, which can open up numerous new opportunities. However, the pace is sometimes so rapid that this article might become outdated before I even manage to post it. That illustrates how fast AI development is moving, and it isn't likely to slow down. If anything, it's poised to accelerate.

Video Resolution: 720p
Video Duration: 8s
Rendering Time: <2min

Welcome Google Gemini Veo 2

Currently, Google Gemini Veo 2 is limited to Text-to-Video generation, meaning it doesn't support Image-to-Video the way free local generators such as Wan 2.1 and FramePack do. Kling and Runway are also excellent paid alternatives. They have been around considerably longer, which has given them time to learn from user feedback and identify areas for improvement.

When you generate a video using Text-to-Video, you need to know precisely what you're aiming for, which means preparing your instructions (the prompt) down to the finest detail. The significant advantage here is that you're not restricted to any specific photo or frame. Once you've entered your prompt and started the process, you can simply sit back and let Veo 2 do the work. In my testing, an 8-second sequence in 720p took under two minutes, which is quite impressive.
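
To put those figures in perspective, here is a rough back-of-the-envelope calculation. It is only a sketch under stated assumptions: the 24 fps frame rate is the figure mentioned in the conclusion, the 1280x720 frame size assumes a standard 16:9 720p output, and 120 seconds is the upper bound on rendering time, not an official Google number.

```python
# Back-of-the-envelope numbers for one Veo 2 clip.
# Assumptions (not official Google figures): 24 fps, 16:9 frames at 1280x720,
# and a worst-case render time of 120 seconds.
DURATION_S = 8
FPS = 24
WIDTH, HEIGHT = 1280, 720
MAX_RENDER_S = 120

frames = DURATION_S * FPS                         # total frames in the clip
pixels_per_frame = WIDTH * HEIGHT                 # pixels in a single frame
min_frames_per_render_second = frames / MAX_RENDER_S

print(f"Frames per clip: {frames}")
print(f"Pixels per frame: {pixels_per_frame:,}")
print(f"Frames produced per second of rendering: at least {min_frames_per_render_second:.1f}")
```

Under these assumptions, each clip contains 192 frames, so Veo 2 is producing at least 1.6 finished frames for every second you wait.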

Text-Driven Creation

Generate videos solely from text descriptions without needing image references or video editing skills.

Rapid Rendering

Create 8-second videos in under 2 minutes, powered by Google's vast server infrastructure for maximum efficiency.

Cinematic Quality

Enjoy professional-looking output with impressive color grading, visual effects, and cinematic lighting.

It will be interesting to follow what Veo 2 will evolve into and how this tool will influence social media platforms. Although currently limited to text prompts, I am sure it's only a matter of time before we see an Image-to-Video version as well. The upload icon is certainly present but currently disabled.

Subscription and Limits

To access Google Gemini Veo 2, I am using the Google One subscription plan. However, even with a subscription, it seems we don't have unlimited prompts when working with Veo 2. In my testing, I was limited to approximately 10 video generations per day. I'm uncertain if Google is exploring another payment plan for Veo 2. What I do know is that rendering a video demands significantly more server power than a typical chat conversation. This obviously increases Google's server farm operating costs, and I suspect they are gathering as much data as possible from these limited video generations to assess the impact and determine the average cost, but this is just speculation. Additionally, they need to consider the free tools emerging from China. I will certainly keep you updated as more news becomes available.

Official Usage Limits

This is what Google has announced regarding its limits: "There are limits for how many videos you can create. You'll get a notification when you're close to the limit." This tells us little about how close we are to the limit at any given moment, and nothing about what to expect in the future. For now, anyone's guess is as good as another's.
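
Since Google only promises a notification when you're close to the limit, one option is to keep your own count. The following is a minimal sketch, not a Google API: the class name `DailyQuotaTracker` is hypothetical, and the default of 10 per day is simply the approximate limit I ran into during testing, not an official quota.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class DailyQuotaTracker:
    """Counts video generations per calendar day (hypothetical helper)."""

    # Assumption: ~10 per day as observed in testing, not an official figure.
    daily_limit: int = 10
    _counts: dict = field(default_factory=dict)

    def record_generation(self, day: date) -> None:
        """Register one generated video for the given day."""
        self._counts[day] = self._counts.get(day, 0) + 1

    def remaining(self, day: date) -> int:
        """Generations left for the given day, never below zero."""
        return max(self.daily_limit - self._counts.get(day, 0), 0)


tracker = DailyQuotaTracker()
today = date(2025, 5, 1)  # fixed example date so the result is reproducible
for _ in range(3):
    tracker.record_generation(today)

print(tracker.remaining(today))  # prints 7
```

If the real limit turns out to be different, only `daily_limit` needs to change.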

Feature        | Google Gemini Veo 2     | Local AI Tools | Paid Services
Cost           | Google One Subscription | Free           | Monthly Subscription
Daily Limits   | ~10 videos              | Unlimited      | High Limits
Data Privacy   | Cloud Processing        | Fully Local    | Cloud Processing
Image-to-Video | Not Available           | Available      | Available
Output Quality | High (720p)             | Variable       | High

Getting Started

If you are already subscribed to Google One, here's how to get started:

How to Access and Use Google Gemini Veo 2

  • Visit Gemini Website

    Go to gemini.google.com.

  • Access Veo 2

    At the top, click Gemini Advanced and then Veo 2.

  • Enter Your Prompt

    In the text box, enter a prompt describing the video you want to generate.

  • Generate Your Video

    Click Submit and wait for your video to be generated.

Pro Tip

Once the video is generated, you can hover your mouse over it to download it. It doesn't include any watermarks, giving you clean footage for your projects.

Amazing Results

The videos that Google Gemini Veo 2 creates are quite stunning, and the visual quality is remarkably high. There's no doubt that Google has an advantage in owning its own server farms and being able to scale them as needed. For users, this translates to fast rendering speeds and superior quality from the AI tool. Although it currently outputs videos in 720p HD, I have no doubt that AI upscalers can help elevate the quality further.

For creative and non-creative individuals who love to explore, this platform allows your thoughts to be visualized simply by typing in a prompt. All you have to do is sit back and watch it unfold before your eyes. I must also acknowledge that the speed at which it transforms these prompts into videos is truly fascinating.

Sample outputs from Google Gemini Veo 2 showing various scenes and styles
Google Gemini Veo 2 produces impressively cinematic results with lifelike motion, professional color grading, and atmospheric lighting effects - all from a simple text prompt.

Don't Expect Consistency with Text-to-Video

The nature of AI Text-to-Video generators is that they only have text as a reference. This means you need to be very precise to achieve your desired results, which might require one or several attempts before you reach your intended outcome, and there's no guarantee you'll ever fully achieve it. If you're determined to realize that one great idea you're stuck on, Text-to-Video generation shouldn't be your first choice, simply because of the inherent randomness in the video generator. Don't expect 100% accuracy when you choose this path, and don't blindly trust that repeatedly changing your prompt will eventually get you there. However, if your focus is more on style and visualization rather than strict accuracy, then Text-to-Video can work wonders.

Consistency is almost impossible to achieve with a Text-to-Video generator, and I don't believe it will ever be truly possible unless you have a model trained specifically on particular individuals or scenes. Commercial AI products are trained on vast amounts of general data, so be aware that the man and woman you generate walking in the park with one prompt will most likely look different in your next prompt. But again, consistency might not be what you're looking for in some videos, and in those cases, Google Gemini Veo 2 will work like a charm.

Consistency Challenges

When using text-to-video generation, expect variation between attempts even with identical prompts. For brand consistency or character continuity, Image-to-Video solutions (when available) will provide better results by anchoring the visuals to reference images.

Waiting for Image-to-Video

There's no doubt that the time-saving aspect from a business and practical standpoint will arrive with the release of an Image-to-Video version. When using an image or photo as a reference, we minimize the level of randomness, making it significantly more cost-effective for both businesses and Google. By reducing the number of attempts needed to achieve the desired results, we save valuable time that can be allocated elsewhere. From a server perspective, we also minimize the unnecessary load caused by repeatedly submitting prompts.
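
The effect of reduced randomness on time and quota can be illustrated with a simple model. This is a sketch under strong assumptions: each generation is treated as an independent attempt with a fixed chance of being acceptable, and the success rates below are purely illustrative, not measured values for Veo 2 or any other tool.

```python
def expected_attempts(p_success: float) -> float:
    """Mean number of generations until the first acceptable one (geometric distribution)."""
    if not 0 < p_success <= 1:
        raise ValueError("p_success must be in (0, 1]")
    return 1 / p_success


def chance_within_limit(p_success: float, daily_limit: int) -> float:
    """Probability of at least one acceptable result within the daily limit."""
    return 1 - (1 - p_success) ** daily_limit


# Illustrative success rates only; these are NOT measured values.
scenarios = {"text-to-video": 0.25, "image-to-video": 0.5}
DAILY_LIMIT = 10  # approximate limit observed in testing, not an official figure
RENDER_MIN = 2    # upper bound on render time per attempt, in minutes

for name, p in scenarios.items():
    attempts = expected_attempts(p)
    print(
        f"{name}: ~{attempts:.0f} attempts, ~{attempts * RENDER_MIN:.0f} min of rendering, "
        f"{chance_within_limit(p, DAILY_LIMIT):.0%} chance of success within the daily limit"
    )
```

In this model, doubling the chance that a single attempt is usable halves the expected number of attempts, which is the saving in time and server load described above.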

Reference-Based Creation

Start with an existing image and animate it into a cohesive video that maintains visual consistency.

Reduced Randomness

Achieve more predictable results by providing visual anchors for the AI to follow rather than relying solely on text interpretation.

Brand Consistency

Maintain visual identity and styling across multiple videos by starting with branded imagery.

There Is Room for Both Versions

Sometimes Text-to-Video will work perfectly well. You might not always have a reference image to work from, and in such situations, having the Text-to-Video option is invaluable. Also, when you simply need a quick visualization, Text-to-Video is definitely a great tool to have at your disposal.

Image-to-Video is ideal when you want to work more specifically with a subject. Instead of relying on a text prompt, you already have the concept visualized, allowing you to focus your energy on the desired actions within the scene.

There's no doubt that in the long run, Image-to-Video is more cost-effective. Businesses will require a certain level of consistency to maintain their corporate identity and brand awareness policies. Having an image or photo to work from will provide AI with a specific "rulebook" rather than requiring it to determine every detail, which is more efficient for servers and less time-consuming for users. When Google Gemini Veo 2 begins to support this feature, it will become even more appealing for businesses to have a subscription plan.

Text-to-Video Advantages

  • Completely open-ended creativity with no starting image constraints
  • Perfect for ideation and visualization of concepts
  • No need to create or source reference images
  • Can generate scenes that would be difficult to photograph
  • Great for quick exploration of visual styles

Image-to-Video Advantages

  • Greater consistency in character appearance and scene details
  • More precise control over visual elements
  • Fewer iterations needed to achieve desired results
  • Better for brand consistency and corporate identity
  • Time and resource efficient for specific visualization needs

GDPR and Sharing Data

Within the EU, the GDPR restricts how employees' personal data, including their images, may be stored, processed, and shared. HR departments may need to include specific sections in their policies regarding the generation of AI content that could include staff members. Google is presumably considering this as well, and it could be one reason the Image-to-Video option has not yet been released: its policies may first need to align with EU law. This is just speculation, but given the sensitivity of the issue, it's definitely something every company must consider.

When sharing data with a public chatbot, you are providing data that isn't stored by the company itself but by a third-party supplier. If Gemini Veo 2 is trained on this data, it could become problematic unless there's a legal loophole. However, this is a topic I will leave to the legal experts.

GDPR Compliance

Using cloud-based AI services requires careful consideration of data protection regulations, especially when processing images of identifiable individuals.

Employee Consent

Companies may need to obtain explicit consent from employees before using their likenesses in AI-generated content, even for internal purposes.

Policy Updates

Organizations should review and update their data protection policies to specifically address AI content generation and the handling of visual data.

Third-Party Processing

Using external AI services creates complexity around data ownership and processing agreements that must be carefully documented.

Business Considerations You Need to Make

If and when the Image-to-Video option is added to Google Gemini Veo 2, I believe it would be prudent for any company to re-evaluate and potentially update any GDPR policies that might cover AI-generated content featuring staff members. It's always better to be safe than sorry in this regard. This could very well mean prohibiting employees from uploading photos that show one or more staff members to a public AI tool (this would be my recommendation). If you want to create videos involving staff members, consider using a local AI tool such as FramePack or ComfyUI. Both of these tools operate locally and do not share data with any third-party vendors. Perhaps this is also the time to consider data ownership. Should only the marketing department be allowed to generate AI content? Who governs the generated data? These are definitely topics worth discussing. From years of experience, I know it's always good practice to keep these policies up to date and to seek appropriate legal advice when updating them.

Recommended Corporate Policy Updates

  • Review Current GDPR Policies

    Examine your existing data protection frameworks to identify gaps related to AI content generation.

  • Define Access Controls

    Determine which departments or individuals should have authorization to generate AI content using company or employee data.

  • Establish Content Guidelines

    Create clear rules about what types of imagery and videos can be generated and for what purposes.

  • Implement Training

    Ensure all staff understand the implications of using cloud-based AI tools and their responsibilities regarding data protection.
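
To make the policy points above more concrete, here is a small sketch of how such rules could be written down as a checklist in code. Everything in it is hypothetical: the `GenerationRequest` fields, the `AUTHORIZED_DEPARTMENTS` set, and the rules themselves only mirror my recommendations in this article and are no substitute for legal advice.

```python
from dataclasses import dataclass

# Hypothetical policy value; each organization must define its own.
AUTHORIZED_DEPARTMENTS = {"marketing"}


@dataclass
class GenerationRequest:
    department: str
    tool_runs_locally: bool          # e.g. FramePack or ComfyUI on company hardware
    includes_staff_images: bool
    staff_consent_documented: bool = False


def is_allowed(req: GenerationRequest) -> tuple[bool, str]:
    """Return (decision, reason) for a request under the example policy."""
    if req.department not in AUTHORIZED_DEPARTMENTS:
        return False, "department is not authorized to generate AI content"
    if req.includes_staff_images and not req.tool_runs_locally:
        return False, "staff images must not be uploaded to a public AI tool"
    if req.includes_staff_images and not req.staff_consent_documented:
        return False, "documented consent from the staff members is required"
    return True, "allowed"


request = GenerationRequest(
    department="marketing",
    tool_runs_locally=False,
    includes_staff_images=True,
)
print(is_allowed(request))
```

The example request is rejected because it would send staff images to a cloud service. The same request with a local tool and documented consent would pass.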

Conclusion

Google Gemini Veo 2 is truly remarkable and a very powerful video generator. Despite its current limitations, I must say it's quite astonishing what it can create. The 8-second footage is very professional, with impressive color grading, visual effects, and cinematic lighting. The speed is simply amazing: it generates an 8-second, 24 fps video in under two minutes. However, I should be transparent and mention that this conclusion is based on only 20 prompts, so there might be issues I haven't encountered. Nevertheless, the outputs I've seen are truly stunning. I also appreciate how it enables anyone to realize their creative ideas without prior video experience. To produce high-quality content, though, some video expertise is still beneficial, so I would definitely recommend involving the marketing team.

The limits become somewhat frustrating in the long run. Having to wait 24 hours before you can experiment again makes it quite impractical for businesses. And the lack of clarity regarding the specific limit doesn't help either. It would be beneficial if Google could devise a plan that accommodates different usage levels. Typically, when limits are implemented, there's a way to extend them. This doesn't seem to be the case here, which leads me to believe it's still more suited for a sandbox environment like Google AI Studio. While there's nothing wrong with showcasing the latest tools and being proud of them, I think Google might have moved too quickly without a clear and transparent business plan that users can easily understand. Of course, I recognize the intense competition and the necessity for trial and error, but with Gemini fully integrated into their ecosystem, most users will expect a certain level of transparency, especially when dealing with tools from one of the biggest tech giants. But we will simply have to wait and see what the future holds.

Conceptual image showing the potential future of Google Gemini Veo 2
As Google continues to develop Gemini Veo 2, we anticipate expanded capabilities, possibly including Image-to-Video conversion, higher resolution outputs, and increased usage allowances to better serve business and creative users alike.

In terms of following prompt instructions, there is definitely room for improvement. Veo 2 sometimes struggles to grasp all the nuances you're trying to achieve, so keep your prompts concise and focus on the essentials: highlight only what you need, without unnecessary explanations. Generally speaking, though, I think it's fair to say it does an amazing job most of the time. If you find that it doesn't follow your instructions, be sure to give the result a thumbs down so Google can refine the tool, and likewise give a thumbs up when Veo 2 does an excellent job of interpreting your prompt.
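
One way to keep prompts concise is to assemble them from a few fixed building blocks. The sketch below is purely illustrative: the `VideoPrompt` class, its field names, and the 60-word budget are my own assumptions, not a format that Veo 2 requires. Veo 2 only ever receives the final text.

```python
from dataclasses import dataclass


@dataclass
class VideoPrompt:
    """Hypothetical prompt structure; Veo 2 itself only receives the final text."""

    subject: str
    action: str
    setting: str = ""
    camera: str = ""
    lighting: str = ""

    def build(self, max_words: int = 60) -> str:
        # Keep only the fields that were filled in, in a fixed order.
        parts = [f"{self.subject} {self.action}", self.setting, self.camera, self.lighting]
        text = ", ".join(p.strip() for p in parts if p.strip())
        # The word budget is an arbitrary guard against rambling prompts.
        if len(text.split()) > max_words:
            raise ValueError(f"Prompt has more than {max_words} words; trim it to the essentials")
        return text


prompt = VideoPrompt(
    subject="A red vintage car",
    action="drives along a coastal road",
    setting="at sunset",
    camera="slow aerial tracking shot",
)
print(prompt.build())
# prints: A red vintage car drives along a coastal road, at sunset, slow aerial tracking shot
```

Leaving a field empty simply drops it from the prompt, which makes it easy to change one element at a time and see how the result responds.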

I do miss the option to adjust settings like creating vertical or horizontal videos, but based on experience, this functionality works best with Image-to-Video. However, this is a new release from Google, and they have now entered the AI video rendering arena within their ecosystem. I am confident we will see more updates in the near future.

Having said all of the above, does it justify a Google One subscription plan? Well, as a Google One subscriber, you gain access to many other AI features beyond just a Text-to-Video generator, so I would say it's a fair value proposition. You also receive more Google Drive storage. There's no doubt that Google is working diligently to maintain its position as a leading tech giant, and I must say they are doing an excellent job. They had a slow start, but lately, their pace has accelerated, and their achievements are moving beyond the traditional chatbot. In terms of advanced coding, Gemini still has some ground to cover, but from a workflow perspective, they have successfully integrated Gemini into their ecosystem, seamlessly working with Gmail, Docs, Sheets, Drive, etc.

Gemini Veo 2 Strengths

  • Impressive video quality with professional color grading and lighting
  • Extremely fast rendering (under 2 minutes for 8-second videos)
  • No watermarks on downloaded videos
  • Seamless integration with Google's ecosystem
  • Straightforward interface requiring minimal technical knowledge

Areas for Improvement

  • Daily generation limits are too restrictive for business use
  • Lack of Image-to-Video functionality
  • No clear pricing or upgrade path for increased usage
  • Limited output customization options (resolution, aspect ratio)
  • Inconsistent results when reusing similar prompts

Happy rendering, and let's make those creative ideas come to life!
