Curunir Evals

Agentic eval results: local vs. cloud models on tool use, planning, memory, and more.

View the project on GitHub: jalemieux/curunir-evals

Agentic Eval Series

This page has moved. Redirecting to the series homepage.