SambaNova Unveils Breakthrough for DeepSeek-R1: 198 Tokens per Second with 16-Chip Rack Solution

by Archynetys Economy Desk

SambaNova Launches Efficient AI Solution for DeepSeek-R1, Setting New Standards in Speed and Inference

Palo Alto, CA, February 13, 2025 – SambaNova, a pioneering generative AI company renowned for its cutting-edge AI chips and leading models, has unveiled a transformative cloud solution for DeepSeek-R1. This new offering not only accelerates DeepSeek-R1 to an unprecedented 198 tokens per second (t/s) but also significantly reduces inference costs, making advanced AI more accessible and efficient for developers and enterprises.

DeepSeek-R1, previously hindered by high inference costs and inefficiencies, has now been optimized to run on SambaNova Cloud, overcoming its adoption barriers. This development paves the way for real-time, cost-effective inference at scale, leveraging SambaNova’s proprietary dataflow architecture and three-tier memory design.

Revolutionizing AI with Enhanced Speed and Efficiency

Unlike traditional GPU-based systems, SambaNova’s solution shrinks the hardware requirements to just 16 chips in a single rack, delivering 3X the speed and 5X the efficiency compared to the latest GPUs. By year-end, SambaNova plans to scale its capacity to offer 100X the current global capacity for DeepSeek-R1, ensuring it remains ahead in the race for efficiency and performance.

CEO and co-founder of SambaNova, Rodrigo Liang, emphasized the transformative impact of his company’s technology, stating, “SambaNova is the first to offer inference at scale with DeepSeek-R1, the full-size model, fast and efficiently to developers. This will increase to 5X faster than the latest GPU speed on a single rack — and by year end, we will offer 100X the capacity for DeepSeek-R1.”

Dr. Andrew Ng, a prominent figure in the AI community and founder of DeepLearning.AI, also praised SambaNova’s achievement, saying, “Being able to run the full DeepSeek-R1 671B model — not a distilled version — at SambaNova’s blazingly fast speed is a game changer for developers. Reasoning models like R1 need to generate a lot of reasoning tokens to come up with a superior output, which makes them take longer than traditional LLMs. This makes speeding them up especially important.”

George Cameron, co-founder of Artificial Analysis, a prominent AI benchmarking firm, corroborated the impressive performance, observing, “Artificial Analysis has independently benchmarked SambaNova’s cloud deployment of the full 671 billion parameter DeepSeek-R1 Mixture of Experts model at over 195 output token/s, the fastest output speed we have ever measured for DeepSeek-R1. High output speeds are particularly important for reasoning models, as these models use reasoning output tokens to improve the quality of their responses. SambaNova’s high output speeds will support the use of reasoning models in latency-sensitive use cases.”

Breaking Down Barriers for AI Adoption

DeepSeek-R1 has already made a significant impact by reducing AI training costs by 10X, but its practical use has been limited by high inference costs. SambaNova’s innovative solution addresses this bottleneck by drastically reducing hardware requirements and increasing efficiency, making DeepSeek-R1 accessible to a wider audience.

By employing its unique dataflow architecture and three-tier memory design, SambaNova has managed to shrink the hardware needed to run DeepSeek-R1 from 40 racks to just 1 rack, representing a massive improvement in both cost and performance. This advancement not only accelerates adoption but also enhances the user experience by providing real-time inference capabilities.

Liang elaborated on the significance of this achievement, noting, “DeepSeek-R1 is one of the most advanced frontier AI models available, but its full potential has been limited by the inefficiency of GPUs. That changes today. We’re bringing the next major breakthrough — collapsing inference costs and reducing hardware requirements from 40 racks to just one — to offer DeepSeek-R1 at the fastest speeds, efficiently.”

Empowering Next-Gen AI Workflows with Blackbox AI

The collaboration between SambaNova and Blackbox AI exemplifies the transformative potential of this new technology. Robert Rizk, CEO of Blackbox AI, highlighted the synergy between the two companies, stating, “More than 10 million users and engineering teams at Fortune 500 companies rely on Blackbox AI to transform how they write code and build products. Our partnership with SambaNova plays a critical role in accelerating our autonomous coding agent workflows. SambaNova’s chip capabilities are unmatched for serving the full DeepSeek-R1 671B model, which provides much better accuracy than any of the distilled versions. We couldn’t ask for a better partner to work with to serve millions of users.”

Preparing for the Future with Scalable AI Solutions

Looking ahead, SambaNova is committed to continuing its innovation in AI technology. Sumti Jairath, Chief Architect at SambaNova, foresaw a future where the company’s technology would become even more efficient, explaining, “DeepSeek-R1 is the perfect match for SambaNova’s three-tier memory architecture. With 671 billion parameters, R1 is the largest open-source large language model released to date, which means it needs a lot of memory to run. GPUs are memory constrained, but SambaNova’s unique dataflow architecture means we can run the model efficiently to achieve 20,000 tokens/s of total rack throughput in the near future — unprecedented efficiency when compared to GPUs due to their inherent memory and data communication bottlenecks.”

By the end of 2025, SambaNova aims to offer more than 100X the current global capacity for DeepSeek-R1, cementing its position as the leading provider of AI inference solutions for reasoning models. This advancement positions SambaNova as a key player in the rapidly evolving landscape of AI technology.

Try DeepSeek-R1 on SambaNova Cloud Today

The full DeepSeek-R1 671B model is now available for all users on SambaNova Cloud, allowing developers to experience its power and potential firsthand. Additionally, select users can access the model via an API, enabling seamless integration into existing workflows. To learn more and try it out, visit cloud.sambanova.ai.

About SambaNova Systems

SambaNova Systems empowers enterprises to deploy state-of-the-art generative AI capabilities efficiently. Founded in 2017 by industry leaders from Sun/Oracle and Stanford University, SambaNova aims to revolutionize AI computing with its custom-built platform. With investments from top-tier venture capital firms and a global presence, SambaNova continues to drive innovation in the AI sector.

Headquartered in Palo Alto, California, SambaNova Systems believes in transforming industries through advanced AI solutions. To learn more, visit sambanova.ai or contact the team at info@sambanova.ai. Follow SambaNova Systems on LinkedIn and X.

To stay updated on the latest in AI technology and SambaNova’s advancements, subscribe to our newsletter or join the conversation on social media. Your feedback is invaluable to us.

Related Posts

Leave a Comment