Advancements in AI Reasoning Models

DeepSeek-AI has unveiled its first-generation reasoning models, DeepSeek-R1-Zero and DeepSeek-R1, which use reinforcement learning to improve the reasoning capabilities of large language models (LLMs). DeepSeek-R1-Zero optimizes reasoning performance without relying on supervised data, reaching a pass@1 score of 71.0% on the AIME 2024 benchmark.

Meanwhile, DeepSeek-R1 incorporates cold-start data to ensure coherent and user-friendly outputs. Among the distilled models, DeepSeek-R1-Distill-Qwen-32B reaches a 72.6% pass@1 score on AIME 2024. These models, available under the MIT License, promise enhanced multilingual support and efficient software engineering capabilities.

As competition in the AI landscape intensifies, DeepSeek's offerings suggest a growing ability for Chinese AI labs to match and potentially surpass established players like OpenAI.

TechCrunch

January 20, 2025 at 17:44

DeepSeek claims its reasoning model beats OpenAI’s o1 on certain benchmarks

Technology

Politics

DeepSeek is not the only Chinese lab to produce models claimed to rival OpenAI's o1, with Alibaba and Kimi (owned by Moonshot AI) also releasing similar models, suggesting Chinese AI labs will continue to be "fast followers" in this space.

THE DECODER

January 20, 2025 at 18:00

DeepSeek's latest R1-Zero model matches OpenAI's o1 in reasoning benchmarks

Technology

Economy

Finance

DeepSeek-R1 uses the MIT license, allowing free use of the model weights and outputs. All model variants and documentation are available on GitHub and Hugging Face, and the model can be accessed through DeepSeek's API. Pricing is set at $0.14 per million input tokens for cache hits, $0.55 per million input tokens for cache misses, and $2.19 per million output tokens.
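To make the quoted pricing concrete, the following is a minimal sketch of a cost estimate for a single workload. The three prices come from the report above; the function name `estimate_cost_usd` and the example token counts are hypothetical and chosen only for illustration, and actual billing may differ.

```python
# Prices in USD per one million tokens, as quoted in the report above.
PRICE_INPUT_CACHE_HIT = 0.14
PRICE_INPUT_CACHE_MISS = 0.55
PRICE_OUTPUT = 2.19
TOKENS_PER_PRICE_UNIT = 1_000_000


def estimate_cost_usd(cache_hit_tokens: int,
                      cache_miss_tokens: int,
                      output_tokens: int) -> float:
    """Estimate the cost of a workload (hypothetical helper, not an official API)."""
    if min(cache_hit_tokens, cache_miss_tokens, output_tokens) < 0:
        raise ValueError("token counts must be non-negative")
    total = (
        cache_hit_tokens * PRICE_INPUT_CACHE_HIT
        + cache_miss_tokens * PRICE_INPUT_CACHE_MISS
        + output_tokens * PRICE_OUTPUT
    )
    return total / TOKENS_PER_PRICE_UNIT


if __name__ == "__main__":
    # Illustrative workload: mostly uncached input plus long reasoning output.
    cost = estimate_cost_usd(cache_hit_tokens=200_000,
                             cache_miss_tokens=800_000,
                             output_tokens=500_000)
    print(f"Estimated cost: ${cost:.3f}")
```

Because reasoning models tend to emit long outputs, the output price is likely to dominate the total for most workloads under these rates.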

marktechpost.com

January 21, 2025 at 04:27

DeepSeek-AI Releases DeepSeek-R1-Zero and DeepSeek-R1: First-Generation Reasoning Models that Incentivize Reasoning Capability in LLMs via Reinforcement Learning - MarkTechPost

Technology

DeepSeek-AI's DeepSeek-R1-Zero and DeepSeek-R1 leverage reinforcement learning to enhance LLM reasoning. DeepSeek-R1-Zero uses GRPO to optimize reasoning without supervised data, improving AIME 2024 pass@1 from 15.6% to 71.0%. DeepSeek-R1 combines cold-start data and reasoning-focused RL to produce coherent, user-friendly outputs. Distilled Qwen and Llama models retain strong reasoning, with DeepSeek-R1-Distill-Qwen-32B achieving 72.6% pass@1 on AIME 2024. The models address computatio..
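The summary names GRPO (Group Relative Policy Optimization) without describing it. As a rough sketch of its central idea only: several answers are sampled for the same prompt, each is scored, and every answer's advantage is its reward normalized against the group's mean and standard deviation, so no separate value model is needed. The function name, the use of the population standard deviation, and the zero-variance handling below are assumptions made for illustration; this is not DeepSeek's implementation and it omits the policy update itself.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards: list[float]) -> list[float]:
    """Normalize each reward against its own sampling group.

    Hypothetical helper: advantage_i = (r_i - mean(r)) / std(r).
    """
    if not rewards:
        raise ValueError("rewards must not be empty")
    group_mean = mean(rewards)
    group_std = pstdev(rewards)  # assumption: population standard deviation
    if group_std == 0:
        # Assumption: identical rewards carry no learning signal.
        return [0.0 for _ in rewards]
    return [(r - group_mean) / group_std for r in rewards]


if __name__ == "__main__":
    # Four sampled answers to one prompt, scored 1.0 if correct, else 0.0.
    print(group_relative_advantages([1.0, 0.0, 0.0, 1.0]))
```

Answers that score above the group average receive a positive advantage and are reinforced, while below-average answers are discouraged.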
