Autonomous agent built on Google's Gemini Pro. Strong AgentBench performer at 38.1% overall, excelling at household tasks (62%), knowledge graphs (48%), and OS tasks (35%).
Composite score across all evaluation environments
Operating system task completion accuracy
Database query and manipulation performance
Knowledge graph reasoning and retrieval
E-commerce navigation and purchasing
Simulated household task completion
Real-world web browsing task success
Underlying foundation model powering the agent
Organization that created the model
Community usage, market traction, and ecosystem maturity
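Taken together, these fields form a simple per-agent record. A minimal Python sketch of that structure follows; the field names and the AgentBenchEntry class are illustrative assumptions, not the dataset's actual schema, and only the scores quoted above are filled in.

    from dataclasses import dataclass
    from typing import Optional

    @dataclass
    class AgentBenchEntry:
        """One per-agent record: per-environment AgentBench scores plus metadata.
        Field names are illustrative; the real dataset may use different keys."""
        name: str                                  # agent name
        base_model: str                            # underlying foundation model powering the agent
        organization: str                          # organization that created the model
        overall: Optional[float] = None            # composite score across all environments
        os_tasks: Optional[float] = None           # operating system task completion accuracy
        database: Optional[float] = None           # database query and manipulation performance
        knowledge_graph: Optional[float] = None    # knowledge graph reasoning and retrieval
        web_shopping: Optional[float] = None       # e-commerce navigation and purchasing
        household: Optional[float] = None          # simulated household task completion
        web_browsing: Optional[float] = None       # real-world web browsing task success
        adoption: Optional[str] = None             # community usage, traction, ecosystem maturity

    # Example record using only the figures quoted above; unreported fields stay None.
    gemini_agent = AgentBenchEntry(
        name="Gemini Pro agent",
        base_model="Gemini Pro",
        organization="Google",
        overall=38.1,
        household=62.0,
        knowledge_graph=48.0,
        os_tasks=35.0,
    )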
The GAIA benchmark reveals a persistent capability gap between humans and AI on tasks that are trivially easy for humans. On Level 1 (the simplest tasks), humans achieve 92% while the best AI systems achieve 75%. The gap widens at higher difficulty levels, signaling fundamental limitations in AI reasoning and tool use.
Despite the agent hype, only 16% of enterprise and 27% of startup deployments qualify as true agents. Most production architectures remain simple, built around fixed-sequence or routing-based workflows.
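To make that architectural distinction concrete, here is a minimal Python sketch under stated assumptions: the classifier, planner, and tool callables are hypothetical placeholders supplied by the caller. A routing workflow classifies the request once and runs a predetermined path, while a true agent loops, letting the model choose the next tool call and re-plan from each observation.

    from dataclasses import dataclass
    from typing import Callable, Optional

    @dataclass
    class Action:
        """Next step chosen by the model: either a tool call or a final answer."""
        is_final: bool
        answer: str = ""
        tool: str = ""
        arguments: Optional[dict] = None

    def routed_workflow(request: str,
                        classify: Callable[[str], str],
                        handlers: dict) -> str:
        # Fixed-sequence / routing pattern: one classification, one predetermined path,
        # no looping and no re-planning.
        route = classify(request)          # e.g. "billing", "search", "faq"
        return handlers[route](request)

    def agent_loop(task: str,
                   plan: Callable[[list], Action],
                   tools: dict,
                   max_steps: int = 10) -> str:
        # Agentic pattern: the model picks the next tool call at every step,
        # observes the result, and keeps iterating until it emits a final answer.
        history: list = [task]
        for _ in range(max_steps):
            action = plan(history)
            if action.is_final:
                return action.answer
            observation = tools[action.tool](**(action.arguments or {}))
            history.append((action, observation))
        return "stopped after max_steps without a final answer"

The survey finding above maps onto this split: most production systems look like routed_workflow, not agent_loop.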