Component Browser

Browse and compare AI components

Components (153 of 153)
models

Gemini 3 Pro

99

Google's frontier model (ECI 154). Exceptional across reasoning, coding, and math benchmarks. Exceptional intelligence. API access.

ELO: 1K | country: United Kin | provider: Google
models

GPT-5.2

98

OpenAI's frontier model (ECI 152). Exceptional across reasoning, coding, and math benchmarks. Exceptional math.

ELO: 1K | country: United Sta | provider: OpenAI
models

Claude Opus 4.5

96

Anthropic flagship model with 200K context. Record ARC-AGI performance. Exceptional reasoning (98/100) and intelligence (97/100). Best for research and complex tasks.

country: United Sta | provider: Anthropic | eci score: 149.86957925424835
accelerators

NVIDIA B200

96

Blackwell GPU with massive 192GB HBM. 4.5K TFLOPS FP16 for next-generation performance. 52K tok/s MLPerf.

tdp: 1000W | Memory: 192GB | tflops fp16: 5K
models

o3

95

OpenAI next-gen reasoning model with 200K context. Record ARC-AGI scores. Exceptional reasoning (99/100) and math (98/100). Best for research and complex problem-solving.

country: United Sta | provider: OpenAI | eci score: 148.78913668659862
frameworks

PyTorch

95

Training optimization framework. ThoughtWorks Radar: Adopt. 85K GitHub stars, 5.5M weekly downloads.

Stars: 85K | radar status: adopt | framework type: training
cloud

CoreWeave

94

Regional GPU cloud with excellent GPU availability. 4 regions, offering H100/H200 and 3 more. H100 at $6.16/hr. Spot instances available.

regions: 4 | pricing tier: competitiv | price updated: 2025-12-28
models

Grok 4

94

xAI latest-gen reasoning model with 128K context. Exceptional reasoning (95/100) and intelligence (94/100). Record agentic benchmark performance.

country: United Sta | provider: xAI | eci score: 147.41867399925533
accelerators

NVIDIA H200

94

Hopper GPU with extended 141GB HBM. 2K TFLOPS FP16 for high-performance compute. 87K tok/s MLPerf. Available from 5 providers starting at $4.50/hr.

tdp: 700W | Memory: 141GB | tflops fp16: 2K
models

Claude Sonnet 4.5

93

Anthropic's frontier model (ECI 146). Strong performance across major benchmarks. Excellent intelligence. API access.

ELO: 1K | country: United Sta | provider: Anthropic
frameworks

vLLM

93

Inference serving framework. ThoughtWorks Radar: Adopt. 38K GitHub stars, 850K weekly downloads.

Stars: 38K | radar status: adopt | framework type: inference
models

Gemini 2.5 Pro (Jun 2025)

93

Google's frontier model (ECI 146). Strong performance across major benchmarks. Excellent intelligence. Released Jun 2025. API access.

country: United Kin | provider: Google | eci score: 146.2161106724776
models

DeepSeek V3

92

DeepSeek efficient MoE with 128K context. 671B parameters. Exceptional code (94/100) and math (93/100). State-of-the-art performance at low inference cost.

country: China | provider: DeepSeek | eci score: 145.11291603938145
models

kimi-k2-thinking (official)

92

Moonshot's frontier model (ECI 145). Strong performance across major benchmarks. Excellent intelligence. Released Nov 2025. Open weights available.

country: China | provider: Moonshot | eci score: 145.09977073063365
frameworks

HuggingFace Transformers

92

Model hub and library. ThoughtWorks Radar: Adopt. 135K GitHub stars, 12M weekly downloads.

Stars: 135K | radar status: adopt | framework type: model_hub
frameworks

MCP Filesystem Server

92

Official MCP server for filesystem operations. Enables AI agents to read, write, search, and manage files. Sandbox-safe with configurable permissions.

provider: Anthropic | capabilities: read,write | radar status: adopt
models

Qwen 3 235B

92

Alibaba flagship MoE model with 128K context. 235B parameters (22B active). Exceptional math (94/100) and reasoning (94/100). Competes with closed-source frontier models.

country: China | provider: Alibaba | eci score: 145.28065460309054
accelerators

NVIDIA H100

92

Hopper GPU with large 80GB HBM. 2K TFLOPS FP16 for high-performance compute. 125K tok/s MLPerf. Available from 12 providers starting at $1.49/hr.

tdp: 700W | Memory: 80GB | tflops fp16: 2K
agents

Claude Code

92

Anthropic-powered autonomous coding agent. Strong tool use, planning, memory, and self-correction capabilities.

license: commercial | provider: Anthropic | run rate: 1000.0M
models

kimi-k2-thinking (turbo official)

92

Moonshot's frontier model (ECI 145). Strong performance across major benchmarks. Excellent intelligence. Released Nov 2025. Open weights available.

country: China | provider: Moonshot | eci score: 145.09977073063365
models

Qwen3-Max-Instruct

92

Alibaba's frontier model (ECI 145). Strong performance across major benchmarks. Excellent intelligence. Released Sep 2025. API access.

country: China | provider: Alibaba | eci score: 145.2734899249661
models

o4-mini (high)

92

OpenAI's top-tier model (ECI 145). Competitive on reasoning and coding tasks. Excellent intelligence. Released Apr 2025. API access.

country: United Sta | provider: OpenAI | eci score: 144.9937806547
models

Llama 3.1 405B

91

Meta's largest open-weights model with 128K context. 405B parameters. Excellent instruction following (92/100) and code (90/100). Industry benchmark for open models.

provider: Meta | Params: 405B | open source: Yes
models

Grok Code Fast 1

91

xAI's coding-optimized model with 128K context. Exceptional code (96/100). Best for software development and code generation.

provider: xAI | model type: coding | price input: $0.30
agents

OpenAI CUA

91

OpenAI Computer-Using Agent. SOTA benchmark results across computer control tasks. OSWorld 38.1% (human: 72.4%), WebArena 58.1%, WebVoyager 87%.

license: commercial | provider: OpenAI | agent type: computer_u
models

o1

90

OpenAI reasoning-first model with 200K context. Exceptional reasoning (98/100) and math (96/100). Uses extended thinking for complex multi-step problems.

country: United Sta | provider: OpenAI | eci score: 142.37147467669178
agents

Cursor

90

Anysphere-powered autonomous coding agent. Strong tool use, planning, memory, and self-correction capabilities.

license: commercial | provider: Anysphere | agent type: coding
models

Qwen3-Coder-480B-A35B

90

Alibaba's top-tier model (ECI 143). Competitive on reasoning and coding tasks. Excellent intelligence. Released Jul 2025. Open weights available.

country: China | provider: Alibaba | eci score: 142.96146590080863
models

Grok 4.1

90

xAI's general-purpose model with 256K context. Excellent intelligence (91/100). Best for general knowledge tasks.

ELO: 1K | provider: xAI | price input: $2.00
frameworks

Model Context Protocol (MCP)

90

Open protocol for connecting AI agents to external tools, data sources, and resources. Enables standardized tool use across Claude Code, Cline, and other MCP-compatible agents.

Stars: 35K | organization: Anthropic | radar status: adopt
agents

Devin

89

Cognition Labs' AI software engineer. SWE-bench Verified 53.8%. First autonomous coding agent to demonstrate end-to-end software development.

license: commercial | provider: Cognition | valuation: 2000.0M
agents

Writer Action Agent

89

Writer's enterprise AI agent. GAIA Level 3 leader (61%), highest difficulty multi-step reasoning. Surpassed OpenAI Deep Research (~47.6%).

license: commercial | provider: Writer | agent type: general_as
models

Grok-3 mini

89

xAI's top-tier model (ECI 141). Competitive on reasoning and coding tasks. Strong intelligence. Released Apr 2025. API access.

country: United Sta | provider: xAI | eci score: 140.80473869724742
cloud

Google Cloud

89

Global hyperscaler with strong GPU availability. 35 regions, offering H100/A100 and 2 more. H100 at $3/hr. Spot instances available.

regions: 35 | pricing tier: premium | price updated: 2025-12-28
models

Claude 3.7 Sonnet

89

Anthropic hybrid reasoning model with 200K context. First Claude with extended thinking. Exceptional code (95/100) and reasoning (96/100). Best for agentic tasks.

country: United Sta | provider: Anthropic | eci score: 141.67851910714836
agents

Perplexity

88

AI-native search agent from Perplexity AI. Strong tool use, planning, memory, and self-correction capabilities.

license: commercial | provider: Perplexity | run rate: 100.0M
accelerators

Google TPU v5p

88

TPU accelerator with large 95GB HBM.

tdp: 450W | Memory: 95GB | tflops bf16: 459
agents

Manus AI

88

General AI assistant acquired by Meta (Dec 2025). GAIA Level 3 score 57.7%. Now integrated into Meta's AI platform.

license: commercial | provider: Meta (acqu | agent type: general_as
models

Claude Haiku 4.5

88

Anthropic's top-tier model (ECI 141). Competitive on reasoning and coding tasks. Strong intelligence. Released Oct 2025. API access.

country: United Sta | provider: Anthropic | eci score: 140.55933670602477
models

Kimi K2 0905 (Novita)

88

Moonshot's top-tier model (ECI 140). Competitive on reasoning and coding tasks. Strong intelligence. Released Sep 2025. Open weights available.

country: China | provider: Moonshot | eci score: 140.48407017520103
models

Kimi K2 Instruct

88

Moonshot's top-tier model (ECI 140). Competitive on reasoning and coding tasks. Strong intelligence. Released Jul 2025. Open weights available.

country: China | provider: Moonshot | eci score: 140.48407017520103
models

DeepSeek R1

88

DeepSeek reasoning model with 128K context. Uses chain-of-thought. Exceptional math (96/100) and reasoning (96/100). Open-source competitor to o1.

ELO: 1K | country: China | provider: DeepSeek
models

GPT-OSS 120B

88

OpenAI's top-tier model (ECI 140). Solid benchmark performance. Strong intelligence. Open weights available.

country: United Sta | provider: OpenAI | eci score: 139.76129702790502
models

Mistral Large

88

Mistral flagship model with 128K context. Strong instruction following (90/100) and code (88/100). European leader in frontier AI.

provider: Mistral AI | open source: No | Open: No
frameworks

OpenAI Function Calling

88

OpenAI's structured output format for model-tool interaction. JSON schema-based function definitions enable reliable tool use. Adopted by most LLM API providers.

organization: OpenAI | radar status: adopt | framework type: protocol
agents

Cline

88

Autonomous coding agent. Strong tool use, planning, memory, and self-correction capabilities. 28K GitHub stars.

license: Apache-2.0 | provider: Open Sourc | agent type: coding
frameworks

MCP Git Server

88

Official MCP server for Git operations. Clone, commit, branch, merge, and push without leaving the agent context. Full Git workflow support.

provider: Anthropic | capabilities: clone,comm | radar status: adopt
frameworks

Ollama

88

Local LLM inference with one-line setup. CPU-optimized, edge deployment focus. 148K GitHub stars, 2.5M weekly downloads.

Stars: 148K | radar status: adopt | framework type: inference
frameworks

LangChain

88

LLM orchestration framework. ThoughtWorks Radar: Adopt. 95K GitHub stars, 2.1M weekly downloads.

Stars: 95K | radar status: adopt | framework type: orchestrat
frameworks

TensorRT-LLM

87

Inference serving framework. ThoughtWorks Radar: Trial. 9.5K GitHub stars, 180K weekly downloads.

Stars: 10K | radar status: trial | framework type: inference
agents

Abridge

87

Abridge-powered healthcare automation agent. Strong tool use, planning, memory, and self-correction capabilities.

license: commercial | provider: Abridge | use case: clinical-n
agents

IBM CUGA

87

IBM Computer Use General Agent. WebArena SOTA (61.7%). Uses modular Planner-Executor-Memory architecture for web automation.

license: research | provider: IBM | agent type: browser
agents

Claude Cowork

87

Claude Code for non-technical work. Cowork lets you complete tasks like file organization, document creation, and data compilation using natural language. Features parallel task queuing, sandboxed folder access, and connector integrations. Built entirely with Claude Code itself.

license: commercial | pricing: $100-200/m | platform: macOS Desk
cloud

AWS

87

Global hyperscaler with moderate GPU availability. 32 regions, offering H100/A100 and 2 more. H100 at $3.90/hr. Spot instances available.

regions: 32 | pricing tier: premium | price updated: 2025-12-28
agents

Replit Agent

87

Replit's AI coding agent. Top-3 revenue in coding agents. 35M MAUs with integrated cloud development environment.

license: commercial | provider: Replit | agent type: coding
agents

Windsurf

86

Codeium's AI-native code editor. 2M MAUs. Cascade model for agentic coding with Flows feature for multi-file edits.

license: freemium | provider: Codeium | agent type: coding
frameworks

LangSmith

86

LLM observability platform by LangChain. Native integration for tracing, debugging, and monitoring LLM applications. 5K free traces, $39/user/month cloud.

radar status: adopt | framework type: observabil | weekly downloads: 450K
models

GPT-4.1

86

Top-tier model from OpenAI (ECI 138). Solid benchmark performance. Strong intelligence. Released Apr 2025. API access.

country: United Sta | provider: OpenAI | eci score: 137.5095823764539
models

Codestral

86

Mistral code-specialized model with 32K context. Exceptional code generation (94/100). Optimized for software engineering and code completion.

provider: Mistral AI | open source: No | Open: No
models

Devstral 2512

86

Mistral AI's coding-optimized model with 128K context. Excellent code (94/100). Best for software development and code generation. Open weights.

provider: Mistral AI | model type: coding | price input: $0.10
models

Gemini 1.5 Flash

86

Google fast model with 1M context. Good instruction following (88/100) and reasoning (85/100). Cost-effective for production workloads.

provider: Google | open source: No | Open: No
cloud

Azure

86

Global hyperscaler with moderate GPU availability. 60 regions, offering H100/A100/ND H100 v5. H100 at $6.98/hr. Spot instances available.

regions: 60 | pricing tier: premium | price updated: 2025-12-28
frameworks

LangGraph

86

LLM orchestration framework. ThoughtWorks Radar: Adopt. 8.5K GitHub stars, 450K weekly downloads.

Stars: 9K | radar status: adopt | framework type: orchestrat
frameworks

SGLang

86

Inference serving framework with RadixAttention for KV-cache reuse. Stable latency (4-21ms), optimized for multi-turn chat and RAG.

Stars: 12K | radar status: trial | framework type: inference
frameworks

Weaviate

86

AI-native vector database with HNSW indexing. Open-source, supports hybrid search, product quantization, and multi-tenancy. Cloud and self-hosted options. 13K GitHub stars.

indexing: HNSW | deployment: self-hoste | Stars: 13K
agents

CrewAI

86

CrewAI Inc-powered multi-agent orchestration framework. Strong tool use, planning, memory, and self-correction capabilities. 24K GitHub stars.

license: MIT | provider: CrewAI Inc | agent type: multi_agen
agents

Lovable

86

Lovable-powered AI app building platform. Strong tool use, planning, memory, and self-correction capabilities.

license: commercial | provider: Lovable | agent type: app-builde
accelerators

AMD MI300X

85

CDNA3 GPU with massive 192GB HBM. 1.3K TFLOPS FP16 for production-grade compute. 169K tok/s MLPerf.

tdp: 750W | Memory: 192GB | tflops fp16: 1K
models

Command R+

85

Cohere enterprise model with 128K context. Strong instruction following (88/100). Optimized for RAG and tool use in enterprise deployments.

provider: Cohere | open source: No | Open: No
models

kat-coder-pro-v1

85

Kwaipilot's coding-optimized model with 64K context. Exceptional code (95/100). Best for software development and code generation.

provider: Kwaipilot | model type: coding | price input: $0.20
models

Claude 3.5 Haiku

85

Anthropic fast model with 200K context. Strong instruction following (88/100) and code (85/100). Best for high-throughput production workloads.

provider: Anthropic | open source: No | Open: No
agents

NotebookLM

85

Google-powered research and analysis agent. Strong tool use, planning, memory, and self-correction capabilities.

license: commercial | provider: Google | agent type: research
agents

AutoGen

85

Microsoft-powered multi-agent orchestration framework. Strong tool use, planning, memory, and self-correction capabilities. 35K GitHub stars.

license: MIT | provider: Microsoft | agent type: multi_agen
agents

Aider

85

Autonomous coding agent. Strong tool use, planning, memory, and self-correction capabilities. 22K GitHub stars.

license: Apache-2.0 | provider: Open Sourc | agent type: coding
frameworks

CrewAI Framework

85

Role-playing autonomous agent orchestration. Cutting-edge multi-agent collaboration. 42.5K GitHub stars, 350K weekly downloads.

Stars: 43K | radar status: trial | framework type: multi_agen
frameworks

MCP Brave Search

85

Official MCP server for Brave Search API. Enables web, news, and image search with privacy-respecting results. Requires Brave API key.

provider: Anthropic | capabilities: web_search | radar status: trial
frameworks

LlamaIndex

85

Data pipeline framework. ThoughtWorks Radar: Trial. 38K GitHub stars, 680K weekly downloads.

Stars: 38K | radar status: trial | framework type: data
cloud

Together AI

85

GPU cloud provider with strong GPU availability. 2 regions, offering H100/A100.

regions: 2 | pricing tier: value | gpus available: H100,A100
models

Gemini 2.0 Flash Thinking Exp

85

Top-tier model from Google DeepMind, Google (ECI 136). Solid benchmark performance. Strong intelligence. Released Jan 2025. API access.

country: United Kin | provider: Google Dee | eci score: 136.09543782218225
models

ERNIE 5.0 Preview

84

Baidu's reasoning-capable model with 128K context. Strong reasoning (85/100). Best for complex multi-step problems.

provider: Baidu | Open: No | Context: 128K
frameworks

DeepSpeed

84

Training optimization framework. ThoughtWorks Radar: Assess. 35K GitHub stars, 320K weekly downloads.

Stars: 35K | radar status: assess | framework type: training
frameworks

AutoGen Framework

84

Microsoft multi-agent conversational framework. Customizable agent behaviors, enterprise-ready. 35K GitHub stars.

Stars: 35K | radar status: trial | framework type: multi_agen
frameworks

Haystack

84

Deepset RAG framework. Lowest token usage in benchmarks, production-ready pipelines. 17.5K GitHub stars.

Stars: 18K | radar status: trial | framework type: data
models

Gemini 2.0 Pro Exp (Feb 2025)

84

Top-tier model from Google (ECI 136). Solid benchmark performance. Solid intelligence. Released Feb 2025. Hosted access (no API).

country: United Kin | provider: Google | eci score: 135.51597856798938
agents

OpenHands

84

All Hands AI-powered autonomous coding agent. Strong tool use, planning, memory, and self-correction capabilities. 42K GitHub stars.

license: MIT | provider: All Hands | agent type: coding
frameworks

Arize Phoenix

84

Open-source LLM observability platform. OpenTelemetry-based tracing, evaluation, and monitoring. Free self-hosted, $50/mo managed cloud.

Stars: 12K | pricing free: Yes | radar status: trial
models

GPT-4.1 mini

84

Top-tier model from OpenAI (ECI 135). Solid benchmark performance. Solid intelligence. Released Apr 2025. API access.

country: United Sta | provider: OpenAI | eci score: 135.4435967525866
models

Qwen2.5-Max

83

Competitive model from Alibaba (ECI 133). Solid benchmark performance. Solid intelligence. Released Jan 2025. API access.

country: China | provider: Alibaba | eci score: 133.17425841782563
frameworks

DSPy

83

Stanford programmatic prompting framework. Lowest framework overhead, algorithmic optimization. 18K GitHub stars.

Stars: 18K | radar status: trial | framework type: orchestrat
cloud

Lambda Labs

83

Regional GPU cloud with strong GPU availability. 3 regions, offering H100/A100/A10. H100 at $2.49/hr.

regions: 3 | pricing tier: value | price updated: 2025-12-28
frameworks

Langfuse

83

Open-source LLM engineering platform. Tracing, evaluation, prompt management, and metrics. 50K events free, self-host available.

Stars: 8K | radar status: trial | framework type: observabil
accelerators

AWS Trainium2

82

Trainium accelerator with large 96GB HBM.

tdp: 400W | Memory: 96GB | tflops bf16: 380
frameworks

MetaGPT

82

Multi-agent framework simulating a software company. Agents take roles: PM, architect, engineer. 45K GitHub stars.

Stars: 45K | radar status: trial | framework type: multi_agen
agents

GPT Researcher

82

Tavily-powered research and analysis agent. Strong tool use, planning, memory, and self-correction capabilities. 16K GitHub stars.

license: MIT | provider: Tavily | agent type: research
frameworks

Context7

82

Community MCP server for documentation lookup. Retrieves up-to-date docs and code examples for any library. Resolves library IDs and queries documentation.

provider: Community | capabilities: documentat | radar status: trial
models

Minimax M2

82

Minimax's code-capable model with 128K context. Solid code (84/100). Best for software development and code generation.

provider: Minimax | price input: $0.15 | Open: No
agents

Kortix Suna

82

Kortix open-source general agent framework. Self-hosted deployment for enterprise privacy. Flexible LLM backend support.

license: open-sourc | provider: Kortix | agent type: general_as
models

Llama 4 Maverick (FP8)

82

Competitive model from Meta (ECI 133). Solid benchmark performance. Solid intelligence. Released Apr 2025. Open weights available.

country: United Sta | provider: Meta | eci score: 132.8850524536027
frameworks

Promptfoo

82

Open-source LLM evaluation framework with YAML configuration. Supports prompt testing, red-teaming, and CI/CD integration. Lightweight alternative to heavier eval platforms.

Stars: 6K | radar status: trial | config format: yaml
frameworks

Helicone

82

Open-source AI gateway with observability. Proxy-based setup for instant LLM monitoring. 100K requests free, $20/seat/month paid.

Stars: 3K | radar status: trial | framework type: observabil
models

Gemma 3 27B

81

Competitive model from Google (ECI 131). Solid benchmark performance. Solid intelligence. Released Mar 2025. Open weights available.

country: United Kin | provider: Google | eci score: 130.9457032062782
models

Phi-4

81

Competitive model from Microsoft Research (ECI 131). Solid benchmark performance. Solid intelligence. Released Dec 2024. Open weights available.

country: United Sta | provider: Microsoft | eci score: 130.98080665701002
models

Qwen Plus

81

Competitive model from Alibaba (ECI 131). Solid benchmark performance. Solid intelligence. Released Apr 2025. API access.

country: China | provider: Alibaba | eci score: 131.0447378036277
models

mimo-v2-flash

80

Xiaomi's code-capable model with 32K context. Solid code (82/100). Best for software development and code generation. Open weights.

provider: Xiaomi | price input: $0.07 | Open: Yes
models

Qwen 2.5 72B

80

Alibaba open-weights model with 128K context. 72B parameters. Strong math (90/100) and instruction following (90/100). Top performer in open-weights category.

country: China | provider: Alibaba | eci score: 129.43221393758552
models

Llama 4 Scout

80

Competitive model from Meta (ECI 130). Solid benchmark performance. Solid intelligence. Released Apr 2025. Open weights available.

country: United Sta | provider: Meta | eci score: 130.0579178010959
models

GPT-4o

80

OpenAI flagship multimodal model with 128K context. Strong code (92/100) and instruction following (94/100). Handles vision, audio, and text in unified architecture.

country: United Sta | provider: OpenAI | eci score: 130.02827257640743
frameworks

Braintrust

80

End-to-end AI evaluation and observability platform. Combines eval frameworks with production logging, experiments, and model comparison. Enterprise-focused alternative to LangSmith.

capabilities: evals,prod | radar status: trial | framework type: evaluation
frameworks

Playwright MCP

80

Community MCP server for browser automation via Playwright. Navigate pages, fill forms, take screenshots, and interact with web applications from AI agents.

provider: Community | capabilities: browser_au | Stars: 3K
agents

Browser Use

80

Browser automation agent. Strong tool use, planning, memory, and self-correction capabilities. 8.5K GitHub stars.

license: MIT | provider: Open Sourc | agent type: browser
models

GPT-4 Turbo

78

OpenAI GPT-4 Turbo with 128K context window. Balanced intelligence (88/100) and code generation (90/100). Being superseded by GPT-4o and o-series models.

country: United Sta | provider: OpenAI | eci score: 127.51320951492868
models

Llama 3.3 70B

78

Meta's latest-generation open-weights model with 128K context. 70B parameters. Improved instruction following (90/100) and reasoning (87/100). Best Llama 70B variant.

country: United Sta | provider: Meta | eci score: 127.28520683172442
frameworks

VoltAgent

78

TypeScript agent framework with built-in observability. Multi-provider support, workflow orchestration. 8.5K GitHub stars.

Stars: 9K | radar status: assess | framework type: orchestrat
frameworks

Harbor

78

Containerized agent evaluation platform with task registry. Enables pre/post execution checks, sandbox environments, and reproducible agent testing. Used by Anthropic for internal evals.

capabilities: containeri | radar status: trial | framework type: evaluation
models

Claude 3 Opus

77

Anthropic flagship Claude 3 model with 200K context. Excellent reasoning (94/100) and instruction following (95/100). Premium tier for complex analysis tasks.

country: United Sta | provider: Anthropic | eci score: 126.85188232987468
frameworks

Agent Protocol

75

Open standard for agent-to-agent communication. REST API spec enabling agents to list tasks, execute steps, and share artifacts. Supported by AutoGPT and agent frameworks.

Stars: 2K | organization: AI Foundat | radar status: assess
agents

Blink

75

AI-powered app builder for non-coders. Build websites, SaaS, and mobile apps by chatting with AI. Includes database, auth, hosting, and payment integrations.

license: freemium | features: database,a | provider: Blink
models

Qwen3-32B

70

Emerging model from Alibaba. Released Apr 2025. Benchmark data.

country: China | provider: Alibaba | open source: No
models

gpt-4o-mini-2024-07-18

70

Emerging model from OpenAI. Released Jul 2024. Benchmark data.

country: United Sta | provider: OpenAI | open source: No
models

c4ai-command-a-03-2025

70

Emerging model from Cohere. Released Mar 2025. Benchmark data.

country: Canada | provider: Cohere | open source: No
frameworks

Agent2Agent (A2A)

70

Google's emerging protocol for cross-platform agent interoperability. Enables agents from different platforms to discover, negotiate, and collaborate on tasks.

organization: Google | radar status: assess | framework type: protocol
models

yi-lightning

70

Emerging model from 01.AI. Released Dec 2024. Benchmark data.

country: China | provider: 01.AI | open source: No
models

Qwen3-235B-A22B

70

Emerging model from Alibaba. Released Apr 2025. Benchmark data.

country: China | provider: Alibaba | open source: No
models

Llama-4-Maverick-17B-128E-Instruct

70

Emerging model from Meta AI. Released Apr 2025. Benchmark data.

country: United Sta | provider: Meta AI | open source: No
models

Phi-3-medium-128k-instruct

70

Emerging model from Microsoft. Released Apr 2024. Benchmark data.

country: United Sta | provider: Microsoft | open source: No
models

Qwen2.5-Coder-32B-Instruct

70

Emerging model from Alibaba. Released Nov 2024. Benchmark data.

country: China | provider: Alibaba | open source: No
models

Mistral-7B-v0.1

70

Emerging model from Mistral AI. Released Sep 2023. Benchmark data.

country: France | provider: Mistral AI | open source: No
models

gpt-3.5-turbo-1106

70

Emerging model from OpenAI. Released Nov 2023. Benchmark data.

country: United Sta | provider: OpenAI | open source: No
models

chatgpt-4o-01-29-2025

70

Emerging model from OpenAI. Released Jan 2025. Benchmark data.

country: United Sta | provider: OpenAI | open source: No
models

o4-mini-2025-04-16 medium

70

Emerging model from OpenAI. Released Apr 2025. Benchmark data.

country: United Sta | provider: OpenAI | open source: No
models

chatgpt-4o-03-27-2025

70

Emerging model from OpenAI. Released Mar 2025. Benchmark data.

country: United Sta | provider: OpenAI | open source: No
models

Yi-6B

70

Emerging model from 01.AI. Released Nov 2023. Benchmark data.

country: China | provider: 01.AI | open source: No
models

falcon-180B

70

Emerging model from Technology Innovation Institute. Released Sep 2023. Benchmark data.

country: United Ara | provider: Technology | open source: No
models

Llama-2-7b

70

Emerging model from Meta AI. Released Jul 2023. Benchmark data.

country: United Sta | provider: Meta AI | open source: No
models

Llama-2-70b-hf

70

Emerging model from Meta AI. Released Jul 2023. Benchmark data.

country: United Sta | provider: Meta AI | open source: No
models

Phi-3-small-8k-instruct

70

Emerging model from Microsoft. Released Apr 2024. Benchmark data.

country: United Sta | provider: Microsoft | open source: No
models

Phi-3-mini-4k-instruct

70

Emerging model from Microsoft. Released Apr 2024. Benchmark data.

country: United Sta | provider: Microsoft | open source: No
models

claude-opus-4-20250514 32K

70

Emerging model from Anthropic. Released May 2025. Benchmark data.

country: United Sta | provider: Anthropic | open source: No
models

gemma-7b

70

Emerging model from Google DeepMind. Released Feb 2024. Benchmark data.

country: United Kin | provider: Google Dee | open source: No
models

DeepSeek-V2.5

70

Emerging model from DeepSeek. Released Sep 2024. Benchmark data.

country: China | provider: DeepSeek | open source: No
models

Meta-Llama-3-8B-Instruct

70

Emerging model from Meta AI. Released Apr 2024. Benchmark data.

country: United Sta | provider: Meta AI | open source: No
models

Mixtral-8x7B-v0.1

70

Emerging model from Mistral AI. Released Dec 2023. Benchmark data.

country: France | provider: Mistral AI | open source: No
models

QwQ-32B

70

Emerging model from Alibaba. Released Mar 2025. Benchmark data.

country: China | provider: Alibaba | open source: No
agents

GPT-4 Agent

44

OpenAI-powered autonomous agent. Top-tier AgentBench performer at 44.1% overall. Excels at household tasks (78%), knowledge graphs (58%), OS tasks (42%). Strong tool use, planning, and self-correction capabilities. Powered by GPT-4.

provider: OpenAI | base model: gpt-4 | bfcl simple: 92.8
agents

Claude 3.5 Sonnet Agent

42

Anthropic-powered autonomous agent. Top-tier AgentBench performer at 42.3% overall. Excels at household tasks (72%), knowledge graphs (55%), OS tasks (40%). Strong tool use and self-correction capabilities. Powered by Claude 3.5 Sonnet.

provider: Anthropic | base model: claude-3.5 | bfcl simple: 94.5
agents

GPT-4-Turbo Agent

40

OpenAI-powered autonomous agent. Top-tier AgentBench performer at 40.2% overall. Excels at household tasks (68%), knowledge graphs (52%), OS tasks (39%). Strong tool use capabilities. Powered by GPT-4 Turbo.

provider: OpenAI | base model: gpt-4-turb | agentbench db: 30
agents

Gemini Pro Agent

38

Google-powered autonomous agent. Strong AgentBench performer at 38.1% overall. Excels at household tasks (62%), knowledge graphs (48%), OS tasks (35%). Powered by Gemini Pro.

provider: Google | base model: gemini-pro | bfcl simple: 93.8
agents

Claude 3 Opus Agent

35

Anthropic-powered autonomous agent. Strong AgentBench performer at 35.4% overall. Excels at household tasks (55%), knowledge graphs (45%), OS tasks (33%). Powered by Claude 3 Opus.

provider: Anthropic | base model: claude-3-o | agentbench db: 26.2
agents

Llama 3 70B Agent

30

Meta-powered autonomous agent. Capable AgentBench performer at 30.2% overall. Excels at household tasks (45%), knowledge graphs (38%). Powered by llama-3-70b.

provider: Meta | base model: llama-3-70 | bfcl simple: 88.5
agents

Mistral Large Agent

28

Mistral-powered autonomous agent. Capable AgentBench performer at 28.5% overall. Excels at household tasks (42%), knowledge graphs (35%). Powered by mistral-large.

provider: Mistral | base model: mistral-la | bfcl simple: 90.2
agents

Qwen-72B Agent

25

Alibaba-powered autonomous agent. Capable AgentBench performer at 25.1% overall. Excels at household tasks (38%), knowledge graphs (32%). Powered by qwen-72b.

provider: Alibaba | base model: qwen-72b | agentbench db: 18.5
agents

DeepSeek-67B Agent

23

DeepSeek-powered autonomous agent. Evaluated AgentBench performer at 23.2% overall. Excels at household tasks (35%), knowledge graphs (30%). Powered by deepseek-67b.

provider: DeepSeek | base model: deepseek-6 | agentbench db: 16.5
agents

Llama 3 8B Agent

18

Meta-powered autonomous agent. Evaluated AgentBench performer at 18.0% overall. Powered by llama-3-8b.

provider: Meta | base model: llama-3-8b | agentbench db: 12