BoxPwnr Traces & Benchmarks
AI agent benchmark results across security platforms
Challenges: -
Solved: -
Best Model: -
Traces: -
Model Performance & Cumulative Solves
Model Leaderboard
Model | Solver | pass@N | Solved / Attempted | Completion | Avg Turns
Challenge Status
Date | Replay | Report | Challenge | Difficulty | Status | Turns | Duration | Model | Version