MLE-bench: Evaluating Machine Learning Agents on Machine Learning Engineering
OpenAI · Thu, 10 Oct 2024 10:00:00 GMT
We introduce MLE-bench, a benchmark for measuring how well AI agents perform at machine learning engineering....