Meta AI has significantly expanded its capabilities, moving beyond text-based interactions to include both visual recognition and voice features with celebrity personalities. CEO Mark Zuckerberg unveiled these enhancements at Meta’s Connect event on Wednesday, marking a major evolution in the company’s AI assistant strategy.
The upgraded Meta AI chatbot can now analyze and respond to photos shared in conversations, enabling users to identify objects, animals, and other visual elements. For instance, users can snap a picture of an unfamiliar bird and ask Meta AI to identify the species. Beyond recognition, the AI assistant now offers photo editing capabilities, including background changes, object removal, and adding accessories to images.
Perhaps most notably, Meta AI now features celebrity voices from high-profile personalities including Awkwafina, Judi Dench, John Cena, Keegan-Michael Key, and Kristen Bell. Users can select their preferred celebrity voice to deliver AI responses across Meta’s platform ecosystem. This represents Meta’s second attempt at celebrity voice integration—the first iteration, which included voices from Kendall Jenner, Tom Brady, and Snoop Dogg with contracts worth up to $5 million for two years, was discontinued less than a year after launch. Meta subsequently pivoted to focus on its AI Studio platform, which Zuckerberg described as central to the company’s vision of enabling users to create personalized AI assistants.
The timing of these announcements coincides with OpenAI’s rollout of Advanced Voice Mode, creating direct competition in the voice AI space. Both platforms feature similar blue circle indicators for voice interactions, though OpenAI has faced its own celebrity voice controversies—notably when it used a voice resembling Scarlett Johansson, prompting legal action from the actress.
Meta AI’s voice features will be accessible across WhatsApp, Facebook, Instagram, and Messenger, with a phased rollout planned over the next month to users in the United States, Canada, Australia, and New Zealand. Additionally, Meta is experimenting with automatic video dubbing and lip-syncing features for Reels, designed to help users consume content in their preferred languages while enabling creators to expand their global reach. The company is also testing AI-generated “imagined content” on Facebook and Instagram, which will create personalized images based on user interests and potentially feature users themselves.
Key Quotes
“Meta AI can reply to photos shared in the chat and answer questions about what’s in them.”
This statement from Meta describes the new visual recognition capability, highlighting how the AI assistant has evolved beyond text to understand and analyze images, enabling practical use cases like object and animal identification.
Zuckerberg described AI Studio as “an important part of Meta’s vision to help people create their own AIs.”
This quote refers to Meta’s AI Studio platform and reveals the company’s broader strategic direction—not just providing a single AI assistant, but enabling users to create personalized AI tools, representing a shift toward democratized AI creation.
Our Take
Meta’s aggressive push into multimodal AI represents a strategic necessity rather than innovation for innovation’s sake. With OpenAI, Google, and Anthropic advancing rapidly, Meta must leverage its unique advantage: distribution across billions of active users on WhatsApp, Instagram, and Facebook. The celebrity voice feature, despite its previous failure, shows Meta is willing to iterate and learn, though the $5 million price tag for the first attempt suggests expensive lessons. The most intriguing aspect is the “imagined content” feature, which could either enhance user engagement or trigger a backlash if users feel overwhelmed by AI-generated content in their feeds. Meta is essentially betting that AI integration will become as fundamental to social media as the news feed algorithm, but success depends on whether users embrace or resist this AI-everywhere approach. The timing against OpenAI’s voice rollout suggests this is now a race for voice AI dominance.
Why This Matters
This announcement represents a critical escalation in the AI assistant wars between tech giants, particularly Meta and OpenAI. By integrating multimodal capabilities—vision, voice, and image editing—into its widely used social platforms, Meta is positioning AI as a core feature rather than a standalone product, potentially reaching billions of users across its ecosystem.
The celebrity voice feature signals an important trend in humanizing AI interactions and making them more engaging and personalized. However, Meta’s previous failed attempt and OpenAI’s Scarlett Johansson controversy highlight the legal and ethical complexities surrounding AI voice replication and celebrity likeness rights.
For businesses and creators, the automatic dubbing and translation features could democratize global content distribution, removing language barriers that have traditionally limited audience reach. The “imagined content” feature also suggests Meta is moving toward AI-generated personalized content, which could fundamentally change how users interact with social media—though it raises questions about authenticity and the blurring of real versus AI-generated content. This development underscores how AI is rapidly transforming from a backend technology to a consumer-facing feature that will reshape digital communication and content creation.
Recommended Reading
Related Stories
- Mistral AI Launches Le Chat Assistant for Consumers and Enterprise
- Google’s Gemini: A Potential Game-Changer in the AI Race
- Video game voice actors vote to allow use of AI voices
- Meta’s AI advisory council is overwhelmingly white and male, raising concerns about bias
- Tech Tip: How to Spot AI-Generated Deepfake Images
Source: https://www.businessinsider.com/meta-ai-can-see-and-speak-with-celebrity-voices-2024-9