MLE-bench: Evaluating Machine Learning Agents on Machine Learning Engineering

OpenAI · Thu, 10 Oct 2024 10:00:00 GMT

We introduce MLE-bench, a benchmark for measuring how well AI agents perform at machine learning engineering....