Alibaba's Qwen3-Omni: A Game Changer in Multimodal AI Technology

Alibaba's launch of Qwen3-Omni marks a significant advance in multimodal artificial intelligence. The model processes text, images, audio, and video inputs in real time and responds with text and audio.

With 30 billion parameters, Qwen3-Omni reportedly outperforms established models such as GPT-4o and Gemini 2.5 Flash on 32 of 36 benchmarks. It supports 119 languages for text processing and 19 spoken languages, and can respond in 10 languages.

Alibaba also introduced Qwen3-Next, a language model built on a customized MoE architecture that runs faster than its predecessors without sacrificing performance. Together, the two releases position Alibaba as a serious competitor to OpenAI and Google.

Big Tech - South China Morning Post

September 23, 2025, 10:30

Alibaba challenges OpenAI’s GPT-4o and Google’s Nano Banana with new multimodal AI model

Alibaba has unveiled a new multimodal AI model called Qwen3-Omni, rivaling OpenAI's GPT-4o and Google's Gemini 2.5 Flash ('Nano Banana'). The model can process text, audio, image, and video inputs, and it responds with text and audio outputs. In benchmark tests, two variants of Qwen3-Omni outperformed their predecessor and competing models in tasks such as audio recognition and image understanding.

Webrazzi

September 23, 2025, 13:00

Alibaba's Open Source AI Model for Real-Time Multimodal Input Processing: Qwen3-Omni

Alibaba has announced Qwen3-Omni, a new open-source artificial intelligence model. It is positioned as the first 'local end-to-end omni-modal AI' that can process text, images, audio, and video inputs all at once. The model can take in input and produce output in real time, and it supports 119 languages.

THE DECODER

September 23, 2025, 11:15

Alibaba's Qwen3-Next builds on a faster MoE architecture

Alibaba has released Qwen3-Next, a new language model built on a customized MoE architecture that runs faster than its predecessors without sacrificing performance. The model includes several tweaks to stabilize training and is available in two specialized versions: Instruct for general-purpose tasks and Thinking for reasoning-heavy problems. The Thinking model reportedly outperforms Google's Gemini 2.5 Flash Thinking on certain benchmarks and comes close to Alibaba's own top-tier Qwen3-235B-A..

THE DECODER

September 23, 2025, 11:01

Alibaba unveils Qwen3-Omni, an AI model that processes text, images, audio, and video

Alibaba has introduced Qwen3-Omni, a native multimodal AI model that can process text, images, audio, and video in real time. The 30-billion-parameter model outperforms established models like Gemini 2.5 Flash and GPT-4o on 32 out of 36 benchmarks. Qwen3-Omni is capable of seamless streaming, supports 119 languages for text processing and 19 spoken languages, and can respond in 10 languages.
