
BFCL

The Berkeley Function Calling Leaderboard (BFCL) evaluates LLMs on tool/function-calling accuracy across simple, multiple, parallel, and multi-turn scenarios.

Category: agents

Methodology

Generated function calls are checked by abstract syntax tree (AST) comparison and by execution verification.
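
To illustrate what an AST comparison of a function call can look like, here is a minimal sketch in Python. It is not BFCL's actual evaluator: the helper names `parse_call` and `ast_match`, the keyword-only restriction, and the `get_weather` example are all assumptions made for illustration. The idea is to parse the model's output into a syntax tree and compare the function name and argument values structurally, so argument order and whitespace do not matter.

```python
import ast


def parse_call(source):
    """Parse a single call such as `f(a=1, b="x")` into (name, kwargs).

    Hypothetical helper: only keyword arguments with literal values are
    supported, which keeps the comparison unambiguous.
    """
    node = ast.parse(source.strip(), mode="eval").body
    if not isinstance(node, ast.Call):
        raise ValueError("expected a single function call")
    if node.args:
        raise ValueError("this sketch only handles keyword arguments")
    name = ast.unparse(node.func)  # also covers dotted names like `math.sqrt`
    kwargs = {}
    for kw in node.keywords:
        if kw.arg is None:  # a `**mapping` argument, not a named keyword
            raise ValueError("`**` unpacking is not supported in this sketch")
        kwargs[kw.arg] = ast.literal_eval(kw.value)  # literals only
    return name, kwargs


def ast_match(model_output, expected_name, expected_args):
    """Return True if the call in `model_output` matches the expected call."""
    try:
        name, kwargs = parse_call(model_output)
    except (SyntaxError, ValueError):
        return False  # unparsable or unsupported output counts as a miss
    return name == expected_name and kwargs == expected_args


expected = {"city": "Berlin", "unit": "celsius"}
print(ast_match('get_weather(unit="celsius", city="Berlin")', "get_weather", expected))
print(ast_match('get_weather(city="Paris")', "get_weather", expected))
```

Comparing parsed structures rather than raw strings means that reordered keyword arguments or different quoting styles still count as the same call. Execution verification, by contrast, would actually run the call and inspect its result; that step is not shown here.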

Leaderboard

(0 models)
No models have been evaluated on this benchmark yet.

Data source: Epoch AI, “Data on AI Benchmarking”. Published at epoch.ai

Licensed under CC-BY 4.0
