Scientists design new 'AGI benchmark' that indicates whether any future AI model could cause 'catastrophic harm'
www.livescience.com

OpenAI scientists have designed MLE-bench — a compilation of 75 extremely difficult tests that can assess whether a future advanced AI agent is capable of modifying its own code and improving itself.