If you've ever shipped a GenAI feature wondering “is this actually good enough?”, you're not alone. Traditional pass/fail QA breaks down when outputs are non-deterministic, and teams end up making release decisions based on subjective “vibe checks” rather than data. This session shows how Product Managers can partner with QA to replace intuition with a systematic AI evaluation pipeline. You'll learn how to define quality as measurable dimensions (groundedness, tone, helpfulness, safety), build a representative test set, and design rubrics that align product goals with engineering...
Nixalkumar Patel

Nixalkumar Patel is a Product Manager at LG Electronics, where he works at the messy intersection of “does this AI actually work?” and “how do we prove it?” He partners with engineering and QA teams to operationalize Generative AI features for consumer products, focusing on AI evaluation: defining measurable quality standards, building rubric-based scorecards, and scaling assessment through automated judging. He has led cross-functional initiatives connecting product goals to test coverage, regression gates, and monitoring for LLM-driven behaviors. Previously, he supported data products that integrated an enterprise foundation model into smart-home use cases. He advocates for “evals” as the new unit tests for AI and shares practical frameworks teams can adopt without exposing proprietary data.