GPT-4-Turbo Agent

Agents

Cost-effective alternative to GPT-4 for agentic tasks

Autonomous agent built on OpenAI's GPT-4 Turbo. Top-tier AgentBench performer at 40.2% overall, with its best results on household tasks (ALFWorld, 68%), knowledge graphs (52.1%), and OS tasks (38.5%). Strong tool-use capabilities.

Metrics

AgentBench

AgentBench Overall: 40.2

Composite score across all evaluation environments

AgentBench OS: 38.5

Operating system task completion accuracy

AgentBench Database: 30

Database query and manipulation performance

AgentBench Knowledge Graph: 52.1

Knowledge graph reasoning and retrieval

AgentBench WebShop: 33.2

E-commerce navigation and purchasing

AgentBench ALFWorld: 68

Simulated household task completion

AgentBench Mind2Web: 22.5

Real-world web browsing task success
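As a quick sanity check on how the listed numbers relate, the sketch below ranks the six per-environment scores shown above and compares their plain average with the reported overall score. The unweighted mean is an assumption made for illustration only; this page does not state how the composite is computed, and the composite may use a different weighting or cover environments not listed here.

```python
# Per-environment AgentBench scores as listed on this page.
scores = {
    "OS": 38.5,
    "Database": 30.0,
    "Knowledge Graph": 52.1,
    "WebShop": 33.2,
    "ALFWorld": 68.0,
    "Mind2Web": 22.5,
}
reported_overall = 40.2  # "AgentBench Overall" as listed on this page

# Rank environments from strongest to weakest.
ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)

# Unweighted mean of the six listed scores.
# Assumption for illustration: this is NOT the page's stated formula.
plain_mean = sum(scores.values()) / len(scores)

for name, value in ranked:
    print(f"{name:<16} {value:5.1f}")
print(f"{'plain mean':<16} {plain_mean:5.1f}")
print(f"{'reported':<16} {reported_overall:5.1f}")
```

The plain mean of the six listed scores comes out near 40.7 rather than 40.2, so the overall figure should be read as its own metric, not as a simple average of the rows shown.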

Info

Base Model: gpt-4-turbo

Underlying foundation model powering the agent

Provider: OpenAI

Organization that created the model

Other

SWE-bench Verified: 26.8

Score Breakdown

adoption

Weight 15%, score 88

Community usage, market traction, and ecosystem maturity

tool use

memory context

self reflection

planning reasoning

Sources(1)

github.com

Developed by

OpenAI
