Opera, the company behind the fifth most popular desktop browser, will allow users of its built-in AI assistant, Aria, to easily generate images with Google’s latest image generation model, Imagen 2.
The browser maker announced an agreement on Tuesday that will also bring Google Gemini to Aria’s text-based chatbot.
“We are excited to be announcing the deepening of this collaboration into the field of generative AI to further power our suite of browser AI services,” Per Wetterdal, head of partnerships at Opera, said in an official statement, citing two decades of past collaborations.
The feature was made immediately available in Opera Developer, the pre-release but publicly accessible version of Opera where users can preview and test upcoming features. The Developer release typically migrates to Opera Beta (formerly known as Opera Next) before finally becoming the next main, stable release.
Additionally, the update enables Opera’s AI to speak responses in a conversational manner, thanks to Google’s text-to-audio model.
While there is access to an older version of Aria on the Opera mobile browser, the Gemini-powered update is not yet available on smartphones. Opera did not respond to a request for comment from Decrypt.
Aria, Opera’s AI assistant first introduced in May 2023, used to rely on OpenAI’s ChatGPT but later integrated a fine-tuned version of Google’s PaLM 2 model. This model is now outdated as Google has shifted its AI development to Gemini, a brand new foundational model that powers its own AI services. Opera’s Aria processes the user’s commands and decides which model to use for each task: Gemini or Imagen 2.
With the integration of Google’s Gemini model, Opera will now be able to provide its users with higher-quality responses on par with those obtained via GPT-4. Right now, GPT-4o and Gemini 1.5 Pro rank first and second in the Chatbot Arena leaderboard, a ranking based on blind user ratings.
“We believe the future of AI will be open, so we’re providing access to the best of Google’s infrastructure, AI products, platforms, and foundation models to empower organizations to chart their course with generative AI,” said Eva Fors, managing director of Google Cloud for the Nordic region. “We’re happy to elevate our long-standing cooperation with Opera by powering its AI innovation within the browser space.”
Opera has retained its pre-existing integration with ChatGPT (now powered by GPT-4 and DALL-E 3) from OpenAI. Users just need to click on a different icon displayed just below the Aria button.
Opera has been tapping into the potential of browser AI for more than a year now with all of its flagship browsers, including its Opera GX gaming browser. The company also recently opened a green energy-powered AI data cluster in Iceland powered by NVIDIA DGX technology to quickly expand its AI program.
The AI browser wars
The battle to dominate the browser market has been significantly impacted by the integration of AI. While Google’s Chrome remains the undisputed leader, its lack of default, built-in AI integration has left room for other players to innovate and gain ground.
Microsoft, in particular, has bet big on AI, overhauling its once-maligned Edge browser with a host of AI-powered features. Once a subject of ridicule, with users joking that its best use was to download Chrome, Edge has seen a remarkable resurgence in recent months.
By shifting to a Chromium-based engine and embedding AI capabilities, Microsoft’s browser has climbed the ranks, surpassing Apple’s Safari to claim the number two spot by late 2023, per Statcounter data. The turning point aligns with Microsoft’s announcement of its unified Copilot experience on September 26, 2023.
This business move from Opera and Google Cloud is significant as it offers a Google-based alternative in the market of AI-powered browsers. With Microsoft’s Edge rising thanks to its AI integrations, the collaboration between Opera and Google provides users with a solid option to leverage Google’s advanced models instead of those from OpenAI.
The joy of choice
If the convenience of tapping AI tools within a web browser is appealing, you now have three solid choices: Opera with Aria, powered by Google; Edge with Copilot, from Microsoft; and Brave with Leo, powered by Mistral and Anthropic.
Brave, the renowned crypto-browser, may appeal to privacy-conscious users. Brave has integrated its own AI assistant, Leo, directly into its browser, and it can answer questions, provide summaries, generate new content, and more. It cannot generate images yet, however.
Leo is powered by large language models like Mixtral 8x7B, Claude Instant, and Llama 2 13B. Unlike the companies behind other AI assistants, Brave hosts these models on its own servers, ensuring that user inputs and conversations with Leo are not retained or used for model training.
Between Opera and Edge, Copilot may have an advantage in text-based responses, but Google’s Imagen 2 beats DALL-E 3 in terms of realism and coherence. In tests conducted by Decrypt, Aria also proved to be more versatile and creative, and better at understanding shorter prompts.
For example, Aria’s interpretation of a simple request for a dog eating a hamburger was realistic. Copilot asked for a more descriptive prompt, creating a less realistic image with a 3D render aesthetic only after we asked it to create a dog with sunglasses eating a hamburger.
If you use Copilot in Edge, Aria in Opera is worth checking out, and the image generation capabilities of either may be worth stepping outside the private bubble provided by Leo in Brave.
Edited by Ryan Ozawa.