Cognition Labs' AI software engineer, scoring 53.8% on SWE-bench Verified. First autonomous coding agent to demonstrate end-to-end software development.
Organization that created the model
Software license type
Agent specialization area
Required foundation models to operate
Community usage, market traction, and ecosystem maturity
Claude Opus 4.5 achieved 80.9% on SWE-bench Verified, becoming the first model to exceed the 80% threshold. That is roughly a 21x improvement over GPT-4's initial 3.8% score in October 2023, showing how quickly AI coding capabilities have progressed.
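The "21x" figure is a simple ratio of the two scores quoted above. A minimal sketch of that arithmetic follows; the helper name `fold_improvement` is hypothetical and introduced only for illustration, and the two inputs are the percentages stated in the text.

```python
def fold_improvement(new_score: float, baseline_score: float) -> float:
    """Return how many times larger new_score is than baseline_score."""
    if baseline_score <= 0:
        raise ValueError("baseline_score must be positive")
    return new_score / baseline_score


# Scores as quoted in the text (percent of tasks resolved).
new_score = 80.9
baseline_score = 3.8

ratio = fold_improvement(new_score, baseline_score)
point_gain = new_score - baseline_score  # absolute gain in percentage points

print(f"{ratio:.1f}x improvement, +{point_gain:.1f} percentage points")
```

Reporting the absolute gain in percentage points alongside the ratio can help, since a ratio against a very small baseline tends to look dramatic on its own.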
Devin emerged as the first commercial AI software engineer, reaching a $2B valuation and scoring 53.8% on SWE-bench Verified. It established the market category now contested by Claude Code, Cursor, and others.