Meet Android Agent Arena (A3): A Comprehensive and Autonomous Online Evaluation System for GUI Agents
The development of large language models (LLMs) has significantly advanced artificial intelligence (AI) across many fields. One promising result is the mobile GUI agent, which is designed to perform tasks autonomously on smartphones. However, evaluating these agents poses […]
