
In a test, undergraduates from Peking University outperformed AI!
A Peking University (PKU) team has developed a rigorous chemistry evaluation question bank called SUPERChem and used it to pit 174 top chemistry students against leading AI models such as GPT, Gemini, and DeepSeek in a head-to-head challenge. Built from 500 high-difficulty chemistry questions designed to resist cheating, the test reveals shortcomings of AI in scientific reasoning.
SUPERChem fills a gap in multimodal deep-reasoning evaluation for chemistry.
According to reports, the team released the benchmark not to highlight the weaknesses of AI but to push it further.