Google Expands AI Portfolio with Gemma 4 Open Models and Vids Creator Tools

By: TechVerseNow Editorial | Published: Fri Apr 03 2026

TL;DR / Summary

Google has significantly upgraded its Vids app with high-end video and music models alongside a new open-source Gemma 4 model, while ElevenLabs has officially expanded from voice synthesis into full-scale AI music production.

Introduction

The landscape of generative AI is shifting from experimental chatbots to specialized, high-fidelity creative tools. This week, Google announced a major overhaul of Google Vids, its AI-powered video creation app for work, integrating its flagship Veo and Lyria models to give users granular control over digital avatars. Simultaneously, the search giant refreshed its open-source offerings with Gemma 4, moving to a more developer-friendly Apache 2.0 license. Not to be outdone in the audio space, ElevenLabs, previously known for its industry-leading voice cloning, has launched ElevenMusic, a dedicated app for generating and remixing full musical compositions. These releases signal a new era where "AI-generated" is no longer a novelty but a professional standard.

[Image: Google Vids interface featuring directable AI avatars and video editing tools]

Heart of the Story

The centerpiece of Google’s productivity update is the evolution of Google Vids. Originally launched as a basic video assistant for Workspace, "Vids 2.0" now leverages Veo, Google’s most advanced video generation model, and Lyria, its specialized audio model. The most striking addition is "directable AI avatars." Unlike previous iterations, where avatars were static or followed rigid templates, the new version lets users instruct these digital presenters with natural language prompts. For instance, a user can command an avatar to "look more persuasive" or "gesture toward the data on the right." This effectively turns a spreadsheet or slide deck into a professional-grade presentation without a camera or microphone.

Parallel to this, Google has updated its open-source ecosystem with Gemma 4. This model represents the first major update to the Gemma line in a year, focusing on efficiency and multimodal capabilities. By switching to an Apache 2.0 license, Google is making a strategic play to lure developers away from Meta’s Llama ecosystem, offering more freedom for commercial integration without restrictive "acceptable use" policies often found in other "open" models.

In the audio domain, ElevenLabs is undergoing a fundamental transformation. Known primarily for its "Speech to Speech" and voice-cloning technology, the company has released ElevenMusic. This new app allows users to create entire songs, complete with vocals, instrumentation, and structure, using only text prompts. It also includes a "remix" feature, where users can upload audio snippets and have the AI transform the genre or tempo. This move places ElevenLabs in direct competition with AI music pioneers like Suno and Udio, suggesting that the company aims to own the entire "audio stack" rather than just the human voice.

Quick Facts / Comparison Section


Feature         | Google Vids (Veo/Lyria)    | ElevenLabs ElevenMusic     | Gemma 4
Primary Output  | Enterprise Video & Avatars | Full Music Tracks/Remixes  | Large Language Model
Target Audience | Corporate/Business Users   | Content Creators & Artists | Developers & Researchers
Model Type      | Closed (Workspace)         | Proprietary (App-based)    | Open (Apache 2.0 License)
Key Innovation  | Directable text-to-avatar  | Text-to-song with remixing | Multimodal open weights

Quick Facts:
  • Google Vids: Now supports text-based commands for avatar body language and tone.
  • Gemma 4: Switched from a custom "Gemma Terms of Use" to the standard Apache 2.0 license.
  • ElevenLabs: Expansion signifies a move beyond voice into the multi-billion dollar music tech industry.
  • Timeline:
      • Early 2024: Google Vids enters limited beta.
      • Mid 2024: ElevenLabs previews music capabilities.
      • Today: Google Vids 2.0 launches with Veo/Lyria; Gemma 4 released; ElevenMusic app goes live.

Analysis Section

These developments indicate a pivot toward "controllability" in generative AI. The industry is moving past the "surprise me" phase of AI generation toward precise creative control. Google’s directable avatars solve a major pain point for enterprise users: the need for professional video content without the high cost of production. By integrating Veo and Lyria directly into Workspace, Google is making AI an invisible layer of the modern office suite, rather than a separate destination like ChatGPT or Midjourney.

The release of Gemma 4 under the Apache 2.0 license is equally significant. It suggests Google is feeling the heat from Meta’s dominance in the open-source community. By adopting a truly open license, Google is lowering the barrier to entry for startups to build on its architecture, potentially creating a wider ecosystem of "Google-native" applications.

Meanwhile, ElevenLabs’ expansion into music is both a defensive and an offensive maneuver. As voice cloning becomes commoditized, owning the "soundtrack" of digital content allows ElevenLabs to offer a comprehensive suite for YouTubers, filmmakers, and advertisers. The next frontier will likely be the seamless integration of these tools, where a user could use Gemma to write a script, ElevenLabs to generate the score, and Google Vids to produce the final visual.

FAQs

Q: Can I use the new Google Vids avatars for personal use?
A: Google Vids is primarily targeted at Workspace (Enterprise and Education) users. While features often trickle down to personal accounts, the advanced Veo-powered avatars are currently optimized for business presentations.

Q: Is Gemma 4 better than GPT-4?
A: Gemma 4 is an "open" model designed for efficiency and local deployment. While it is highly capable for its size, it is generally compared to models like Llama 3 or Mistral rather than the massive, closed-source GPT-4.

Q: Does ElevenMusic own the rights to the songs I create?
A: Ownership typically depends on your subscription tier. Most generative AI audio platforms grant commercial rights to paid subscribers, but users should check the specific terms of service regarding copyright and training data.

Q: What is the "Apache 2.0 license" and why does it matter for Gemma 4?
A: It is a highly permissive open-source license. It allows developers to use, modify, and distribute the software (or model) for any purpose, including commercial ones, without paying royalties to Google.