Navigating AI Agent Evaluation in Production