2025-01-21 05:40:10
Artificial Intelligence
Technology
Science
Advancements in AI Reasoning Models
The press radar on this topic:
THE DECODER
DeepSeek's latest R1-Zero model matches OpenAI's o1 in reasoning benchmarks
Technology
Economy
Finance
DeepSeek-R1 uses the MIT license, allowing free use of the model weights and outputs. All model variants and documentation are available on GitHub and HuggingFace, and the model can be accessed through DeepSeek's API, with pricing set at $0.14 per million input tokens for cache hits, $0.55 for cache misses, and $2.19 per million output tokens.
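As a rough illustration of how the quoted per-token prices add up, here is a minimal sketch of a cost estimate. The function name and the token counts are hypothetical. Only the three prices come from the summary above, and they may have changed since publication.

```python
# Prices in USD per million tokens, as quoted in the summary above.
PRICE_INPUT_CACHE_HIT = 0.14
PRICE_INPUT_CACHE_MISS = 0.55
PRICE_OUTPUT = 2.19

TOKENS_PER_MILLION = 1_000_000


def estimate_cost_usd(cache_hit_tokens, cache_miss_tokens, output_tokens):
    """Estimate the cost of a workload in USD (hypothetical helper)."""
    total = (
        cache_hit_tokens * PRICE_INPUT_CACHE_HIT
        + cache_miss_tokens * PRICE_INPUT_CACHE_MISS
        + output_tokens * PRICE_OUTPUT
    )
    return total / TOKENS_PER_MILLION


if __name__ == "__main__":
    # Illustrative workload: 2M cached input, 1M uncached input, 0.5M output tokens.
    print(f"${estimate_cost_usd(2_000_000, 1_000_000, 500_000):.3f}")
```

Under these prices, output tokens account for most of the bill in reasoning-heavy workloads, since long chains of thought are billed at the output rate.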
marktechpost.com
DeepSeek-AI Releases DeepSeek-R1-Zero and DeepSeek-R1: First-Generation Reasoning Models that Incentivize Reasoning Capability in LLMs via Reinforcement Learning - MarkTechPost
Technology
DeepSeek-AI's DeepSeek-R1-Zero and DeepSeek-R1 leverage reinforcement learning to enhance LLM reasoning. DeepSeek-R1-Zero uses GRPO to optimize reasoning without supervised data, improving AIME 2024 pass@1 from 15.6% to 71.0%. DeepSeek-R1 combines cold-start data and reasoning-focused RL to produce coherent, user-friendly outputs. Distilled Qwen and Llama models retain strong reasoning, with DeepSeek-R1-Distill-Qwen-32B achieving 72.6% pass@1 on AIME 2024. The models address computatio..
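For readers unfamiliar with the pass@1 metric quoted above, the sketch below shows one common way to estimate it: sample several answers per problem, take the fraction that are correct, and average that fraction over all problems. The function name and the toy data are hypothetical, and the exact evaluation protocol behind the reported figures is an assumption here.

```python
def pass_at_1(correct_flags_per_problem):
    """Average per-problem fraction of correct samples (hypothetical helper).

    correct_flags_per_problem: one list per problem, each holding a
    True/False flag for every sampled answer to that problem.
    """
    per_problem = [
        sum(flags) / len(flags) for flags in correct_flags_per_problem
    ]
    return sum(per_problem) / len(per_problem)


if __name__ == "__main__":
    # Toy data: two problems, two sampled answers each.
    samples = [[True, False], [True, True]]
    print(f"pass@1 = {pass_at_1(samples):.1%}")
```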