Component Browser
Browse and compare AI components
Components (153 of 153)
Gemini 3 Pro
Google's frontier model (ECI 154). Exceptional across reasoning, coding, and math benchmarks. Exceptional intelligence. API access.
GPT-5.2
OpenAI's frontier model (ECI 152). Exceptional across reasoning, coding, and math benchmarks. Exceptional math.
Claude Opus 4.5
Anthropic flagship model with 200K context. Record ARC-AGI performance. Exceptional reasoning (98/100) and intelligence (97/100). Best for research and complex tasks.
NVIDIA B200
Blackwell GPU with massive 192GB HBM. 4.5K TFLOPS FP16 for next-generation performance. 52K tok/s MLPerf.
o3
OpenAI next-gen reasoning model with 200K context. Record ARC-AGI scores. Exceptional reasoning (99/100) and math (98/100). Best for research and complex problem-solving.
PyTorch
Training optimization framework. ThoughtWorks Radar: Adopt. 85K GitHub stars, 5.5M weekly downloads.
CoreWeave
Regional GPU cloud. Excellent GPU availability. 4 regions, offering H100/H200 and 3 more. H100 at $6.16/hr. Spot instances available.
Grok 4
xAI latest-gen reasoning model with 128K context. Exceptional reasoning (95/100) and intelligence (94/100). Record agentic benchmark performance.
NVIDIA H200
Hopper GPU with extended 141GB HBM. 2K TFLOPS FP16 for high-performance compute. 87K tok/s MLPerf. Available from 5 providers starting at $4.50/hr.
Claude Sonnet 4.5
Anthropic's frontier model (ECI 146). Strong performance across major benchmarks. Excellent intelligence. API access.
vLLM
Inference serving framework. ThoughtWorks Radar: Adopt. 38K GitHub stars, 850K weekly downloads.
Gemini 2.5 Pro (Jun 2025)
Google's frontier model (ECI 146). Strong performance across major benchmarks. Excellent intelligence. Released Jun 2025. API access.
DeepSeek V3
DeepSeek efficient MoE with 128K context. 671B parameters. Exceptional code (94/100) and math (93/100). State-of-the-art performance at low inference cost.
kimi-k2-thinking (official)
Moonshot's frontier model (ECI 145). Strong performance across major benchmarks. Excellent intelligence. Released Nov 2025. Open weights available.
HuggingFace Transformers
Model hub and library. ThoughtWorks Radar: Adopt. 135K GitHub stars, 12M weekly downloads.
MCP Filesystem Server
Official MCP server for filesystem operations. Enables AI agents to read, write, search, and manage files. Sandbox-safe with configurable permissions.
Qwen 3 235B
Alibaba flagship MoE model with 128K context. 235B parameters (22B active). Exceptional math (94/100) and reasoning (94/100). Competes with closed-source frontier models.
NVIDIA H100
Hopper GPU with large 80GB HBM. 2K TFLOPS FP16 for high-performance compute. 125K tok/s MLPerf. Available from 12 providers starting at $1.49/hr.
Claude Code
Anthropic-powered autonomous coding agent. Strong tool use, planning, memory, and self-correction capabilities.
kimi-k2-thinking (turbo official)
Moonshot's frontier model (ECI 145). Strong performance across major benchmarks. Excellent intelligence. Released Nov 2025. Open weights available.
Qwen3-Max-Instruct
Alibaba's frontier model (ECI 145). Strong performance across major benchmarks. Excellent intelligence. Released Sep 2025. API access.
o4-mini (high)
OpenAI's top-tier model (ECI 145). Competitive on reasoning and coding tasks. Excellent intelligence. Released Apr 2025. API access.
Llama 3.1 405B
Meta largest open-weights model with 128K context. 405B parameters. Excellent instruction following (92/100) and code (90/100). Industry benchmark for open models.
Grok Code Fast 1
xAI's coding-optimized model with 128K context. Exceptional code (96/100). Best for software development and code generation.
OpenAI CUA
OpenAI Computer-Using Agent. SOTA benchmark results across computer control tasks. OSWorld 38.1% (human: 72.4%), WebArena 58.1%, WebVoyager 87%.
o1
OpenAI reasoning-first model with 200K context. Exceptional reasoning (98/100) and math (96/100). Uses extended thinking for complex multi-step problems.
Cursor
Anysphere-powered autonomous coding agent. Strong tool use, planning, memory, and self-correction capabilities.
Qwen3-Coder-480B-A35B
Alibaba's top-tier model (ECI 143). Competitive on reasoning and coding tasks. Excellent intelligence. Released Jul 2025. Open weights available.
Grok 4.1
xAI's general-purpose model with 256K context. Excellent intelligence (91/100). Best for general knowledge tasks.
Model Context Protocol (MCP)
Open protocol for connecting AI agents to external tools, data sources, and resources. Enables standardized tool use across Claude Code, Cline, and other MCP-compatible agents.
Devin
Cognition Labs' AI software engineer. SWE-bench Verified 53.8%. First autonomous coding agent to demonstrate end-to-end software development.
Writer Action Agent
Writer's enterprise AI agent. GAIA Level 3 leader (61%), highest difficulty multi-step reasoning. Surpassed OpenAI Deep Research (~47.6%).
Grok-3 mini
xAI's top-tier model (ECI 141). Competitive on reasoning and coding tasks. Strong intelligence. Released Apr 2025. API access.
Google Cloud
Global hyperscaler. Strong GPU availability. 35 regions, offering H100/A100 and 2 more. H100 at $3/hr. Spot instances available.
Claude 3.7 Sonnet
Anthropic hybrid reasoning model with 200K context. First Claude with extended thinking. Exceptional code (95/100) and reasoning (96/100). Best for agentic tasks.
Perplexity
AI-native search agent powered by Perplexity AI. Strong tool use, planning, memory, and self-correction capabilities.
Google TPU v5p
TPU accelerator with large 95GB HBM.
Manus AI
General AI assistant acquired by Meta (Dec 2025). GAIA Level 3 score 57.7%. Now integrated into Meta's AI platform.
Claude Haiku 4.5
Anthropic's top-tier model (ECI 141). Competitive on reasoning and coding tasks. Strong intelligence. Released Oct 2025. API access.
Kimi K2 0905 (Novita)
Moonshot's top-tier model (ECI 140). Competitive on reasoning and coding tasks. Strong intelligence. Released Sep 2025. Open weights available.
Kimi K2 Instruct
Moonshot's top-tier model (ECI 140). Competitive on reasoning and coding tasks. Strong intelligence. Released Jul 2025. Open weights available.
DeepSeek R1
DeepSeek reasoning model with 128K context. Uses chain-of-thought. Exceptional math (96/100) and reasoning (96/100). Open-source competitor to o1.
GPT-OSS 120B
OpenAI's top-tier model (ECI 140). Solid benchmark performance. Strong intelligence. Open weights available.
Mistral Large
Mistral flagship model with 128K context. Strong instruction following (90/100) and code (88/100). European leader in frontier AI.
OpenAI Function Calling
OpenAI's structured output format for model-tool interaction. JSON schema-based function definitions enable reliable tool use. Adopted by most LLM API providers.
Cline
Autonomous coding agent. Strong tool use, planning, memory, and self-correction capabilities. 28K GitHub stars.
MCP Git Server
Official MCP server for Git operations. Clone, commit, branch, merge, and push without leaving the agent context. Full Git workflow support.
Ollama
Local LLM inference with one-line setup. CPU-optimized, edge deployment focus. 148K GitHub stars, 2.5M weekly downloads.
LangChain
LLM orchestration framework. ThoughtWorks Radar: Adopt. 95K GitHub stars, 2.1M weekly downloads.
TensorRT-LLM
Inference serving framework. ThoughtWorks Radar: Trial. 9.5K GitHub stars, 180K weekly downloads.
Abridge
Abridge-powered healthcare automation agent. Strong tool use, planning, memory, and self-correction capabilities.
IBM CUGA
IBM Computer Use General Agent. WebArena SOTA (61.7%). Uses modular Planner-Executor-Memory architecture for web automation.
Claude Cowork
Claude Code for non-technical work. Cowork lets you complete tasks like file organization, document creation, and data compilation using natural language. Features parallel task queuing, sandboxed folder access, and connector integrations. Built entirely with Claude Code itself.
AWS
Global hyperscaler. Moderate GPU availability. 32 regions, offering H100/A100 and 2 more. H100 at $3.90/hr. Spot instances available.
Replit Agent
Replit's AI coding agent. Top-3 revenue in coding agents. 35M MAUs with integrated cloud development environment.
Windsurf
Codeium's AI-native code editor. 2M MAUs. Cascade model for agentic coding with Flows feature for multi-file edits.
LangSmith
LLM observability platform by LangChain. Native integration for tracing, debugging, and monitoring LLM applications. 5K free traces, $39/user/month cloud.
GPT-4.1
Top-tier model from OpenAI (ECI 138). Solid benchmark performance. Strong intelligence. Released Apr 2025. API access.
Codestral
Mistral code-specialized model with 32K context. Exceptional code generation (94/100). Optimized for software engineering and code completion.
Devstral 2512
Mistral AI's coding-optimized model with 128K context. Excellent code (94/100). Best for software development and code generation. Open weights.
Gemini 1.5 Flash
Google fast model with 1M context. Good instruction following (88/100) and reasoning (85/100). Cost-effective for production workloads.
Azure
Global hyperscaler. Moderate GPU availability. 60 regions, offering H100/A100/ND H100 v5. H100 at $6.98/hr. Spot instances available.
LangGraph
LLM orchestration framework. ThoughtWorks Radar: Adopt. 8.5K GitHub stars, 450K weekly downloads.
SGLang
Inference serving framework with RadixAttention for KV-cache reuse. Stable latency (4-21ms), optimized for multi-turn chat and RAG.
Weaviate
AI-native vector database with HNSW indexing. Open-source, supports hybrid search, product quantization, and multi-tenancy. Cloud and self-hosted options. 13K GitHub stars.
CrewAI
Multi-agent orchestration framework from CrewAI Inc. Strong tool use, planning, memory, and self-correction capabilities. 24K GitHub stars.
Lovable
Lovable-powered AI app-building platform. Strong tool use, planning, memory, and self-correction capabilities.
AMD MI300X
CDNA3 GPU with massive 192GB HBM. 1.3K TFLOPS FP16 for production-grade compute. 169K tok/s MLPerf.
Command R+
Cohere enterprise model with 128K context. Strong instruction following (88/100). Optimized for RAG and tool use in enterprise deployments.
kat-coder-pro-v1
Kwaipilot's coding-optimized model with 64K context. Exceptional code (95/100). Best for software development and code generation.
Claude 3.5 Haiku
Anthropic fast model with 200K context. Strong instruction following (88/100) and code (85/100). Best for high-throughput production workloads.
NotebookLM
Google-powered research and analysis agent. Strong tool use, planning, memory, and self-correction capabilities.
AutoGen
Microsoft-powered multi-agent orchestration framework. Strong tool use, planning, memory, and self-correction capabilities. 35K GitHub stars.
Aider
Autonomous coding agent. Strong tool use, planning, memory, and self-correction capabilities. 22K GitHub stars.
CrewAI Framework
Role-playing autonomous agent orchestration. Cutting-edge multi-agent collaboration. 42.5K GitHub stars, 350K weekly downloads.
MCP Brave Search
Official MCP server for Brave Search API. Enables web, news, and image search with privacy-respecting results. Requires Brave API key.
LlamaIndex
Data pipeline framework. ThoughtWorks Radar: Trial. 38K GitHub stars, 680K weekly downloads.
Together AI
GPU cloud provider. Strong GPU availability. 2 regions, offering H100/A100.
Gemini 2.0 Flash Thinking Exp
Top-tier model from Google DeepMind, Google (ECI 136). Solid benchmark performance. Strong intelligence. Released Jan 2025. API access.
ERNIE 5.0 Preview
Baidu's reasoning-capable model with 128K context. Strong reasoning (85/100). Best for complex multi-step problems.
DeepSpeed
Training optimization framework. ThoughtWorks Radar: Assess. 35K GitHub stars, 320K weekly downloads.
AutoGen Framework
Microsoft multi-agent conversational framework. Customizable agent behaviors, enterprise-ready. 35K GitHub stars.
Haystack
Deepset RAG framework. Lowest token usage in benchmarks, production-ready pipelines. 17.5K GitHub stars.
Gemini 2.0 Pro Exp (Feb 2025)
Top-tier model from Google (ECI 136). Solid benchmark performance. Solid intelligence. Released Feb 2025. Hosted access (no API).
OpenHands
Autonomous coding agent powered by All Hands AI. Strong tool use, planning, memory, and self-correction capabilities. 42K GitHub stars.
Arize Phoenix
Open-source LLM observability platform. OpenTelemetry-based tracing, evaluation, and monitoring. Free self-hosted, $50/mo managed cloud.
GPT-4.1 mini
Top-tier model from OpenAI (ECI 135). Solid benchmark performance. Solid intelligence. Released Apr 2025. API access.
Qwen2.5-Max
Competitive model from Alibaba (ECI 133). Solid benchmark performance. Solid intelligence. Released Jan 2025. API access.
DSPy
Stanford programmatic prompting framework. Lowest framework overhead, algorithmic optimization. 18K GitHub stars.
Lambda Labs
Regional GPU cloud. Strong GPU availability. 3 regions, offering H100/A100/A10. H100 at $2.49/hr.
Langfuse
Open-source LLM engineering platform. Tracing, evaluation, prompt management, and metrics. 50K events free, self-host available.
AWS Trainium2
Trainium accelerator with large 96GB HBM.
MetaGPT
Multi-agent framework simulating software company. Agents take roles: PM, architect, engineer. 45K GitHub stars.
GPT Researcher
Tavily-powered research and analysis agent. Strong tool use, planning, memory, and self-correction capabilities. 16K GitHub stars.
Context7
Community MCP server for documentation lookup. Retrieves up-to-date docs and code examples for any library. Resolves library IDs and queries documentation.
Minimax M2
Minimax's code-capable model with 128K context. Solid code (84/100). Best for software development and code generation.
Kortix Suna
Kortix open-source general agent framework. Self-hosted deployment for enterprise privacy. Flexible LLM backend support.
Llama 4 Maverick (FP8)
Competitive model from Meta (ECI 133). Solid benchmark performance. Solid intelligence. Released Apr 2025. Open weights available.
Promptfoo
Open-source LLM evaluation framework with YAML configuration. Supports prompt testing, red-teaming, and CI/CD integration. Lightweight alternative to heavier eval platforms.
Helicone
Open-source AI gateway with observability. Proxy-based setup for instant LLM monitoring. 100K requests free, $20/seat/month paid.
Gemma 3 27B
Competitive model from Google (ECI 131). Solid benchmark performance. Solid intelligence. Released Mar 2025. Open weights available.
Phi-4
Competitive model from Microsoft Research (ECI 131). Solid benchmark performance. Solid intelligence. Released Dec 2024. Open weights available.
Qwen Plus
Competitive model from Alibaba (ECI 131). Solid benchmark performance. Solid intelligence. Released Apr 2025. API access.
mimo-v2-flash
Xiaomi's code-capable model with 32K context. Solid code (82/100). Best for software development and code generation. Open weights.
Qwen 2.5 72B
Alibaba open-weights model with 128K context. 72B parameters. Strong math (90/100) and instruction following (90/100). Top performer in open-weights category.
Llama 4 Scout
Competitive model from Meta (ECI 130). Solid benchmark performance. Solid intelligence. Released Apr 2025. Open weights available.
GPT-4o
OpenAI flagship multimodal model with 128K context. Strong code (92/100) and instruction following (94/100). Handles vision, audio, and text in unified architecture.
Braintrust
End-to-end AI evaluation and observability platform. Combines eval frameworks with production logging, experiments, and model comparison. Enterprise-focused alternative to LangSmith.
Playwright MCP
Community MCP server for browser automation via Playwright. Navigate pages, fill forms, take screenshots, and interact with web applications from AI agents.
Browser Use
Browser automation agent. Strong tool use, planning, memory, and self-correction capabilities. 8.5K GitHub stars.
GPT-4 Turbo
OpenAI GPT-4 Turbo with 128K context window. Balanced intelligence (88/100) and code generation (90/100). Being superseded by GPT-4o and o-series models.
Llama 3.3 70B
Meta latest generation open-weights model with 128K context. 70B parameters. Improved instruction following (90/100) and reasoning (87/100). Best Llama 70B variant.
VoltAgent
TypeScript agent framework with built-in observability. Multi-provider support, workflow orchestration. 8.5K GitHub stars.
Harbor
Containerized agent evaluation platform with task registry. Enables pre/post execution checks, sandbox environments, and reproducible agent testing. Used by Anthropic for internal evals.
Claude 3 Opus
Anthropic flagship Claude 3 model with 200K context. Excellent reasoning (94/100) and instruction following (95/100). Premium tier for complex analysis tasks.
Agent Protocol
Open standard for agent-to-agent communication. REST API spec enabling agents to list tasks, execute steps, and share artifacts. Supported by AutoGPT and agent frameworks.
Blink
AI-powered app builder for non-coders. Build websites, SaaS, and mobile apps by chatting with AI. Includes database, auth, hosting, and payment integrations.
Qwen3-32B
Emerging model from Alibaba. Released Apr 2025. Benchmark data.
gpt-4o-mini-2024-07-18
Emerging model from OpenAI. Released Jul 2024. Benchmark data.
c4ai-command-a-03-2025
Emerging model from Cohere. Released Mar 2025. Benchmark data.
Agent2Agent (A2A)
Google's emerging protocol for cross-platform agent interoperability. Enables agents from different platforms to discover, negotiate, and collaborate on tasks.
yi-lightning
Emerging model from 01.AI. Released Dec 2024. Benchmark data.
Qwen3-235B-A22B
Emerging model from Alibaba. Released Apr 2025. Benchmark data.
Llama-4-Maverick-17B-128E-Instruct
Emerging model from Meta AI. Released Apr 2025. Benchmark data.
Phi-3-medium-128k-instruct
Emerging model from Microsoft. Released Apr 2024. Benchmark data.
Qwen2.5-Coder-32B-Instruct
Emerging model from Alibaba. Released Nov 2024. Benchmark data.
Mistral-7B-v0.1
Emerging model from Mistral AI. Released Sep 2023. Benchmark data.
gpt-3.5-turbo-1106
Emerging model from OpenAI. Released Nov 2023. Benchmark data.
chatgpt-4o-01-29-2025
Emerging model from OpenAI. Released Jan 2025. Benchmark data.
o4-mini-2025-04-16 medium
Emerging model from OpenAI. Released Apr 2025. Benchmark data.
chatgpt-4o-03-27-2025
Emerging model from OpenAI. Released Mar 2025. Benchmark data.
Yi-6B
Emerging model from 01.AI. Released Nov 2023. Benchmark data.
falcon-180B
Emerging model from Technology Innovation Institute. Released Sep 2023. Benchmark data.
Llama-2-7b
Emerging model from Meta AI. Released Jul 2023. Benchmark data.
Llama-2-70b-hf
Emerging model from Meta AI. Released Jul 2023. Benchmark data.
Phi-3-small-8k-instruct
Emerging model from Microsoft. Released Apr 2024. Benchmark data.
Phi-3-mini-4k-instruct
Emerging model from Microsoft. Released Apr 2024. Benchmark data.
claude-opus-4-20250514 32K
Emerging model from Anthropic. Released May 2025. Benchmark data.
gemma-7b
Emerging model from Google DeepMind. Released Feb 2024. Benchmark data.
DeepSeek-V2.5
Emerging model from DeepSeek. Released Sep 2024. Benchmark data.
Meta-Llama-3-8B-Instruct
Emerging model from Meta AI. Released Apr 2024. Benchmark data.
Mixtral-8x7B-v0.1
Emerging model from Mistral AI. Released Dec 2023. Benchmark data.
QwQ-32B
Emerging model from Alibaba. Released Mar 2025. Benchmark data.
GPT-4 Agent
OpenAI-powered autonomous agent. Top-tier AgentBench performer at 44.1% overall. Excels at household tasks (78%), knowledge graphs (58%), OS tasks (42%). Strong tool use, planning, and self-correction capabilities. Powered by GPT-4.
Claude 3.5 Sonnet Agent
Anthropic-powered autonomous agent. Top-tier AgentBench performer at 42.3% overall. Excels at household tasks (72%), knowledge graphs (55%), OS tasks (40%). Strong tool use and self-correction capabilities. Powered by Claude 3.5 Sonnet.
GPT-4-Turbo Agent
OpenAI-powered autonomous agent. Top-tier AgentBench performer at 40.2% overall. Excels at household tasks (68%), knowledge graphs (52%), OS tasks (39%). Strong tool use capabilities. Powered by GPT-4 Turbo.
Gemini Pro Agent
Google-powered autonomous agent. Strong AgentBench performer at 38.1% overall. Excels at household tasks (62%), knowledge graphs (48%), OS tasks (35%). Powered by Gemini Pro.
Claude 3 Opus Agent
Anthropic-powered autonomous agent. Strong AgentBench performer at 35.4% overall. Excels at household tasks (55%), knowledge graphs (45%), OS tasks (33%). Powered by Claude 3 Opus.
Llama 3 70B Agent
Meta-powered autonomous agent. Capable AgentBench performer at 30.2% overall. Excels at household tasks (45%), knowledge graphs (38%). Powered by llama-3-70b.
Mistral Large Agent
Mistral-powered autonomous agent. Capable AgentBench performer at 28.5% overall. Excels at household tasks (42%), knowledge graphs (35%). Powered by mistral-large.
Qwen-72B Agent
Alibaba-powered autonomous agent. Capable AgentBench performer at 25.1% overall. Excels at household tasks (38%), knowledge graphs (32%). Powered by qwen-72b.
DeepSeek-67B Agent
DeepSeek-powered autonomous agent. Scores 23.2% overall on AgentBench. Excels at household tasks (35%), knowledge graphs (30%). Powered by deepseek-67b.
Llama 3 8B Agent
Meta-powered autonomous agent. Scores 18.0% overall on AgentBench. Powered by llama-3-8b.