DeepSeek

Overall Score: 85
Series A · Foundation Models

Chinese AI research company developing cost-efficient foundation models with strong reasoning capabilities.

Total Raised: N/A
Valuation: N/A
Employees: 101-250
Founded: 2023

Company Info

HQ: Hangzhou, China
Website: deepseek.com

Score Breakdown

Team Quality: 85
Market Position: 78
Funding Strength: 75
Growth Trajectory: 95
Technical Leadership: 90

Related Signals (8)

DeepSeek R1 Leads Open-Source Math Benchmarks

Models · Dec 16 · Score: 85

DeepSeek R1 has emerged as the leading open-source model for mathematical reasoning, outperforming many closed-source alternatives on MATH and GSM8K benchmarks.
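
As a rough illustration of how accuracy on a benchmark like GSM8K is typically scored, the sketch below extracts a final answer using the dataset's "#### <answer>" convention and computes exact-match accuracy. It is a minimal, self-contained example with made-up outputs, not DeepSeek's or any benchmark harness's actual evaluation code.

```python
# Minimal sketch of GSM8K-style exact-match scoring.
# Assumes both the model output and the reference end with the dataset's
# "#### <answer>" marker; real harnesses handle messier output formats.
import re

def extract_final_answer(text: str) -> str | None:
    """Return the number following the last '####' marker, if any."""
    matches = re.findall(r"####\s*([-+]?[\d,]*\.?\d+)", text)
    return matches[-1].replace(",", "") if matches else None

def exact_match_accuracy(predictions: list[str], references: list[str]) -> float:
    """Fraction of items whose extracted final answers match exactly."""
    hits = sum(
        extract_final_answer(p) == extract_final_answer(r)
        for p, r in zip(predictions, references)
    )
    return hits / len(references)

# Toy usage with invented outputs.
preds = ["Adding the two groups gives 42. #### 42", "She has 7 left. #### 8"]
refs = ["#### 42", "#### 7"]
print(exact_match_accuracy(preds, refs))  # 0.5
```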

Open Source Models Reach 30% Market Share Equilibrium

Models · Dec 1 · Score: 90

Open source models have stabilized at ~30% market share, with DeepSeek leading at 14.37T tokens. Chinese OSS models grew from 1.2% to nearly 30% of total usage, reshaping competitive dynamics.

GB200 NVL72 Delivers 4x Better TCO for DeepSeek R1 Inference

Accelerators · Oct 1 · Score: 88

NVIDIA GB200 NVL72 with TRT-LLM Dynamo achieves 4x better TCO per million tokens than single-node servers for DeepSeek R1 at 30 tok/s/user. Rack-scale inference with disaggregated prefill, wide expert parallelism, and multi-token prediction (MTP) delivers 2-3x throughput gains.
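
The "TCO per million tokens" comparison reduces to simple arithmetic: system cost per hour divided by millions of tokens generated per hour. The sketch below shows that calculation with arbitrary placeholder costs and concurrency figures, chosen only to illustrate the metric, not drawn from NVIDIA's or DeepSeek's published numbers.

```python
# Hedged illustration of the cost-per-million-tokens (TCO) metric.
# Hourly costs and concurrent-user counts below are arbitrary placeholders,
# not measured GB200 NVL72 or single-node figures.

def usd_per_million_tokens(hourly_cost_usd: float, aggregate_tokens_per_sec: float) -> float:
    """Cost to generate one million tokens at the given aggregate throughput."""
    tokens_per_hour = aggregate_tokens_per_sec * 3600
    return hourly_cost_usd / (tokens_per_hour / 1_000_000)

PER_USER_TOK_S = 30  # per-user decode speed cited in the signal

# Placeholder scenario: a rack serving many concurrent users vs. one node serving few.
rack_cost = usd_per_million_tokens(hourly_cost_usd=300.0, aggregate_tokens_per_sec=PER_USER_TOK_S * 2000)
node_cost = usd_per_million_tokens(hourly_cost_usd=40.0, aggregate_tokens_per_sec=PER_USER_TOK_S * 65)

print(f"rack: ${rack_cost:.2f}/M tok, node: ${node_cost:.2f}/M tok, ratio {node_cost / rack_cost:.1f}x")
```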

Grok Code Fast 1 Dominates OpenRouter Usage at 572B Tokens

Models · Dec 29 · Score: 93

xAI's Grok Code Fast 1 has surged to the #1 position on OpenRouter with 572.7B tokens processed weekly, more than 3x the second-place model. This dethroned mimo-v2-flash, which dropped from #1 (170.9B) to #9 (77.6B), signaling a major shift toward specialized coding models.

Code-Specialized Models Capture 4 of Top 10 OpenRouter Positions

Models · Dec 1 · Score: 89

Coding-optimized models now dominate OpenRouter's top 10: Grok Code Fast (#1), Claude Sonnet (#2), Claude Opus (#3), and Kwaipilot's kat-coder-pro-v1 (#4, 118.4B tokens). Mistral's Devstral 2512 (#8, 81.3B) adds to the coding focus. This reflects the broader industry shift where programming surpassed roleplay as the dominant LLM use case.

Reasoning Model Scaling Approaching Compute Infrastructure Limits

Models · Dec 31 · Score: 85

Labs like OpenAI and Anthropic claim RL reasoning scaling cannot be sustained beyond 1-2 years due to compute infrastructure limits, suggesting the exceptional 2024-2025 capability growth could slow.

DeepSeek Achieves 10x Training Efficiency via Architecture Innovations

Models · Dec 31 · Score: 90

DeepSeek V3 used 10x less compute than Llama 3 through MLA (multi-head latent attention), MoE innovations, and multi-token prediction, demonstrating 3x yearly algorithmic efficiency gains.
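
A minimal numpy sketch of the low-rank latent idea behind MLA, assuming made-up dimensions: instead of caching full per-head keys and values, each token's hidden state is compressed to a small latent that is cached and expanded back to K and V at attention time. It omits RoPE handling, query compression, and everything DeepSeek-specific.

```python
# Sketch of the low-rank KV-cache compression idea behind multi-head latent
# attention (MLA). All dimensions are illustrative, not DeepSeek V3's config.
import numpy as np

d_model, d_latent, n_heads, d_head = 1024, 128, 8, 64
rng = np.random.default_rng(0)

W_down = rng.standard_normal((d_model, d_latent)) * 0.02             # hidden state -> latent
W_up_k = rng.standard_normal((d_latent, n_heads * d_head)) * 0.02    # latent -> per-head keys
W_up_v = rng.standard_normal((d_latent, n_heads * d_head)) * 0.02    # latent -> per-head values

seq_len = 16
h = rng.standard_normal((seq_len, d_model))   # token hidden states

# Cache only the small latent per token instead of full per-head K and V.
latent_cache = h @ W_down                                    # (seq_len, d_latent)

# At attention time, reconstruct K and V from the cached latent.
K = (latent_cache @ W_up_k).reshape(seq_len, n_heads, d_head)
V = (latent_cache @ W_up_v).reshape(seq_len, n_heads, d_head)

standard_cache = seq_len * n_heads * d_head * 2   # floats cached by vanilla attention (K and V)
latent_only = seq_len * d_latent                  # floats cached by the latent scheme
print(f"KV-cache size reduction: {standard_cache / latent_only:.0f}x")  # 8x with these dims
```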

Frontier AI Capabilities Reach Consumer GPUs Within 12 Months

Models · Dec 31 · Score: 89

The best open models runnable on consumer GPUs lag frontier AI by only ~1 year on GPQA, MMLU, and LMArena benchmarks, pointing to rapid capability democratization and carrying regulatory implications.
