STARWEST 2026 - Big Data, Analytics, AI/Machine Learning for Testing
Sunday, September 20
Strategies for Testing AI-Based Systems
Fundamentals of AI—ICAgile Certification (ICP-FAI)
Monday, September 21
Testing from the Inside: AI-Assisted Unit Testing Edition
Want to level up your testing and development skills while harnessing the power of AI? In today’s environments, shifting left is more important than ever to catch bugs early and accelerate delivery. Traditional software testing teaches you to think outside the box from a user’s perspective, but some of the best insights come from looking inside the box, analyzing the code itself, and applying AI to make testing faster and smarter. Join Tariq King as he walks you through the fundamentals of program-based testing, now enhanced with AI assistance. Learn how to apply techniques such as testing...
Become an AI Power User
Impostering a bit in the AI-verse? Overwhelmed by daily AI announcements? Unsure you're using AI most effectively? Tiny bit of FOMO? Jeremiah has you covered! In this workshop, he'll help you become an AI Power User. Become a boss at your job, whatever your role or industry! He'll show you where AI shines and where you'll want to be careful, plus toss you lots of hands-on practice. In the time together, Jeremiah will help you pinpoint YOUR niche, build a custom AI assistant, and develop a comms strategy to show off your new skills. You'll walk out with cutting-edge knowledge, a...
Getting Started with AI-Driven Automation
AI has been rapidly changing the way we approach software testing. Traditional test automation is time-consuming to create and breaks down easily in the presence of change. Thankfully, AI is helping testing teams create less procedural, more resilient tests that are able to self-heal in the presence of modern, rapidly changing, highly dynamic production systems. This sounds great, but you may be asking yourself: How do I get started? What additional skills do I need to learn? What tools are available for me to start using, right now? Join Dionny Santiago as he breaks down different AI...
Stop Guessing and Start Planning with Better Behavior Discovery
Are you tired of working on user stories that seem to be missing vital details for testing? Are you frustrated with being left out of vital design conversations? Or are you fed up with sizing estimates that never turn out to be true to reality? Then it’s time to stop guessing your way through product development and start planning it with better behavior discovery. In this tutorial, we will learn how three vital roles – business, development, testing – can collaborate on what features to build and test through the structured activities of story mapping and example mapping. We will practice...
Cursor and Claude Code for Test Automation Engineers (Basics)
People know Cursor as a vibe-coding tool, but it can be so much more. This hands-on tutorial will equip you with tons of tips on how to make your test automation workflow more efficient and turn you into a real 10x engineer. We'll take a look at how to personalize your experience with Cursor, write tests faster and without hallucinations, use diagrams and visualization tools to plan your test automation strategy, create your own custom Cursor rules and workflows, and enhance your test suite. Suitable for QA Engineers, Test Automation Engineers, Frontend Developers, and DevOps Engineers. By...
Forming Your Agent Team: From LLM to Agent
The conversation around AI has already moved beyond prompts and chatbots. Today's engineering teams are beginning to build agents that can reason, use tools, and perform meaningful work. While many professionals have experimented with large language models (LLMs), far fewer understand how agents actually work or how to build one themselves. In this hands-on tutorial, you'll move beyond simply using AI and begin building with it. Through practical exercises and real-world examples, you'll learn the foundations of modern agentic systems and follow the evolution from LLM to agent. Along the...
Becoming an AI-Native Testing Organization
AI is changing how software is designed, built, and validated. As industries transition to AI-native product development, testing organizations must adapt their practices and skills. Manual testing is no longer enough; traditional automation should be enhanced with AI-driven quality engineering, autonomous agents, and data-powered tactics for faster and more reliable product delivery. Join Adam Auerbach to explore what it means to become an AI-Native Testing Organization. He will outline the AI-native software development lifecycle (SDLC) and highlight necessary changes in quality...
A Quality Engineering Introduction to AI and Machine Learning
Although there are several controversies and misunderstandings surrounding AI and machine learning, one thing is apparent — people have quality concerns about the safety, reliability, and trustworthiness of these types of systems. Not only are ML-based systems shrouded in mystery due to their largely black-box nature, they also tend to be unpredictable since they can adapt and learn new things at runtime. Validating ML systems is challenging and requires a cross-section of knowledge, skills, and experience from areas such as mathematics, data science, software engineering, cyber-security,...
Testing AI Systems That Refuse to Sit Still: Practical Evals, Red Teaming, and Oversight for AI Agents
Modern AI systems don’t behave like traditional software. The same prompt can produce different outputs, models can drift without code changes, and AI agents may hallucinate, misuse tools, leak context, or confidently invent facts while appearing completely functional. In this hands-on tutorial, Jeremiah Marble will show attendees how to test and harden modern AI systems using practical, lightweight techniques teams can apply immediately. Participants will build tiny AI agents, intentionally break them through prompt injection, unsafe outputs, hallucinations, and memory drift, then create...
Cursor and Claude Code for Test Automation Engineers (Advanced)
You have the basics of AI tools covered; now let’s push them to the limits. In this advanced, hands-on workshop, we’ll go beyond vibe-coding and explore how to use AI tooling as a strategic orchestration tool for test automation engineers. This workshop is suitable for QA Engineers, Test Automation Engineers, Frontend Developers, and DevOps Engineers. We’ll dive deep into advanced techniques like multi-agent reasoning for debugging, building robust end-to-end tests, maintaining long-context conversations without drift, and crafting reusable automation patterns. We’ll be building custom tooling and agents...
Leading Your Agent Team: From Agent to Agentic Orchestration
Building an agent is only the beginning. The next wave of AI innovation is being driven by teams of specialized agents working together to solve increasingly complex problems. As organizations move beyond individual agents and toward coordinated AI systems, a new challenge emerges: how do you lead, coordinate, and govern intelligent teams? In this hands-on tutorial, you'll move beyond creating agents and begin learning how to orchestrate them. Through practical exercises and real-world examples, you'll explore how specialized agents can collaborate, share responsibilities, and work...
Building Apps and Tests Together with AI: Agentic Spec-Driven Development
What if you could turn ideas into working software with tests to prove it, using AI as your collaborator? This hands-on half-day tutorial introduces agentic spec-driven development, a practical approach for building web apps and test automation together with AI coding agents like Claude, Cursor, Copilot, or Codex. Designed for testers and anyone involved in software development, this session shows how to define a clear project context and rules. You will write feature specs as structured markdown with user stories, design decisions, and acceptance criteria. Then you will use AI to generate...
Tuesday, September 22
Strategies for Testing Autonomous AI and Multi-Agent Architectures
Testing Artificial Intelligence (AI) agents presents a paradigm shift from traditional software quality assurance. Unlike deterministic, rule-based applications, AI agents exhibit emergent behaviors, learn from their environments, and make autonomous decisions, making conventional test case design and execution insufficient. This tutorial will provide a comprehensive understanding of the unique challenges and advanced strategies required to effectively test single and multi-agent AI systems. Participants will learn how testing agents differ significantly from testing traditional software....
Agentic AI: From Rules to Reasoning
AI agents have existed for decades, but generative AI has fundamentally changed what agents can do and how they are designed and built. Come and explore the evolution of AI agents across two major waves. First, learn the foundations of Agentic AI through agents built using rules, heuristics, and traditional machine learning, examining where these approaches excel and why they struggle with complexity, ambiguity, and scale. Then dive into the second wave of agents powered by generative AI and multimodal large language models. These modern agents can reason, plan, use tools, and interact...
Top-Notch Web Testing with Playwright and AI (Basics)
New to Playwright or only scratched the surface? This hands-on coding tutorial is designed for developers, testers, and SDETs who want to start building real web tests quickly using Playwright with TypeScript. We’ll begin with what makes Playwright stand out from other tools. Then, we’ll walk step-by-step through setting up a project, recording tests, and refining them into clean, maintainable test cases. From there, we’ll introduce AI-powered workflows using the Playwright CLI with skills to generate test plans and test cases. By the end of the session, you’ll have working tests, a clear...
AI-Driven API Test and Automation
In this tutorial, you’ll learn how to use GenAI to implement and test a REST API from scratch. Using Cursor as your development environment, you’ll be guided through a hands-on experience that combines powerful tools like Mocha, Supertest, k6, and GitHub Actions to implement automated testing and continuous integration. Under the guidance of Julio de Lima, you’ll first dive into essential REST API architecture concepts to build a solid foundation. Then, with the support of GenAI, you’ll generate and refine your API, create functional test cases, and automate them to validate behavior and...
Accelerate Quality: A Hands-On Tutorial on AI-Assisted Software Testing
Unlock the next era of Quality Assurance (QA) by moving beyond simple code assistance and embracing the power of AI agents. This half-day, intensive tutorial offers hands-on experience with cutting-edge Generative AI tools, including GitHub Copilot and leading GenAI Chatbots, to integrate AI at every stage of the testing lifecycle. You will master practical techniques to dramatically accelerate quality, learning how to leverage AI to analyze requirements and identify risks, create comprehensive test cases and data, and accelerate test automation by generating scripts and suggesting...
Top-Notch Web Testing with Playwright and AI (Advanced)
Ready to push Playwright to the next level? This hands-on coding tutorial targets developers, testers, and SDETs who already know Playwright basics and want to scale their automation using AI-driven workflows. Using Playwright with TypeScript, we’ll dive straight into coding with tools like Playwright CLI and MCP, alongside AI coding agents such as Claude, Cursor, Copilot, or Codex. You’ll learn how to plan tests with AI, generate new coverage, and heal broken tests efficiently, while structuring your code with page objects and managing test data to avoid collisions. We’ll also explore...
From PRD to Production: Designing a Test Strategy That Actually Works
Most test strategies don’t fail in execution. They fail before testing even begins. They start too late, focus too narrowly on automation, and miss the one thing that actually matters: understanding what we are building and why. Janna and Cara will walk you through building a modern test strategy from the ground up, starting with the product requirements document (PRD) and carrying that intent through test design, execution, and measurement. They will break down a practical, end-to-end approach to quality strategy that connects product intent to engineering reality. You will learn how to...
AiGovOps for Testers: Validating AI Systems You Can Ship — and Defend
AI is in the systems you're testing, and the rules just changed. With the EU AI Act in full enforcement, 250+ U.S. state bills in motion, and audit demands rising, "we tested it" is no longer enough. Regulators, customers, and your own legal team want evidence: who validated what, against which risks, with what controls running when the model shipped. The good news: testers are the natural front line for AI governance. You already own validation, traceability, and the test-evidence chain that auditors will ask for. The opportunity is to make that work load-bearing for AI systems, and to...
Automating Test Design with a Little Help from Generative AI
Rob Sabourin has spent over four decades pioneering automated test design across a wide range of technology stacks. More recently, he’s been exploring the power, promise, and occasional perversity of applying Generative AI to the challenges of test design. In this lively and hands-on tutorial, Rob shares practical lessons from his experience using Generative AI to address real-world testing problems. From success stories and failures to unexpected surprises, he offers a candid look at what works, what doesn’t, and why. You will explore a variety of proven test design techniques, including...
Prompt Engineering for Software Quality Professionals
With the sudden rise of ChatGPT and large language models (LLMs), professionals have been attempting to use these types of tools to improve productivity. Building off prior momentum in AI for testing, software quality professionals are leveraging LLMs for creating tests, generating test scripts, automatically analyzing test results, and more. However, if LLMs are not fed good prompts describing the task that the AI is supposed to perform, their responses can be inaccurate and unreliable, thereby diminishing productivity gains. Join Tariq King as he teaches you how to craft high-quality AI...
AI-Enabled SDLC: Let the Robot Do the Work
Theory is great, but what about getting your hands dirty with some real problem-solving? This tutorial is about building your own AI Operating System that you'll take home and use immediately. Join Melissa Benua and Ryan Lee for a hands-on session where you'll construct a personalized AI toolkit that solves YOUR specific testing problems, regardless of your coding background. We'll start by deconstructing the modern SDLC, the same one you're already working in, and show you exactly where AI accelerates each phase. Then we'll prove a critical insight: code is now cheap. AI can generate...
Wednesday, September 23
Tester 2.0—Becoming Indispensable in the Age of AI
Over the past few months, it has become clear that human testing expertise isn't going anywhere. However, surviving the AI era is not the same as thriving in it; simply "not being replaced yet" isn’t the same as being future-ready. With code being produced faster than ever and teams getting leaner, there seems to be a growing gap between code velocity and code quality. Are we just going to have to live with the fact that software quality is in decline? The answer is "no, but..." While some things stay the same, testers of tomorrow need to prepare for the trials ahead. Advocates for quality...
How Technical Program Management Became the Architecture Layer of Modern AI Execution
As Generative AI moves from prototypes to production, one truth is emerging across every major tech organization: scaling AI responsibly isn’t just a data science or engineering challenge but also an orchestration challenge. Behind every AI system that ships reliably, there’s a hidden architecture of alignment: between model builders, applied scientists, data engineers, security, compliance, and product teams. In that architecture, Technical Program Management (TPM) is becoming the connective tissue, the execution layer of modern AI. In this talk, Raj Karan shares lessons from leading...
Telemetry at Scale: Lessons from Building Observability for Distributed Systems
Modern distributed systems fail in messy, non-obvious ways: a small latency spike in one microservice can cascade through queues, sidecars, gateways, and control planes, yet traditional logging and isolated dashboards rarely reveal the true root cause. In this talk, Sneha will share how Microsoft tackled this while building the telemetry and observability platform behind Azure Container Apps and the Aspire Dashboard, used across thousands of customer environments. They standardized on OpenTelemetry to unify traces, metrics, and logs across heterogeneous workloads, invested in consistent...
How Testers Can Break AI: Practical Techniques to Find Bias, Hallucinations, and Accessibility
As AI-powered features (especially generative AI) are rapidly integrated into modern software, testing teams face a critical challenge. Traditional testing approaches focus on correctness and performance but fail to uncover ethical risks such as bias, hallucinations, and accessibility regressions. In real projects, this has led to AI systems that technically “work” yet exclude users, generate misleading outputs, or erode trust. In this talk, Aditi addresses this gap by reframing AI quality as a testable concern and applying practical, tester-led techniques rather than data science-heavy...
AI-Driven Identity Governance: How Testing Teams Secure Access in Zero Trust Environments
As organizations adopt Zero Trust Architectures, Identity and Access Management has become a critical security control that testing teams can no longer treat as a black box. Traditional role-based access models struggle to keep pace with dynamic cloud environments, non-human identities, and evolving threat patterns. This session explores how AI-driven identity governance transforms access validation into a continuous, testable security practice. Drawing from real enterprise implementations across finance, healthcare, and e-commerce, the presentation demonstrates how behavioral analytics,...
AI-Assisted Accessibility Testing: Generating WCAG-Focused Checks with Playwright MCP and LLM CLI
Accessibility testing is critical, yet often under-tested due to limited expertise, time constraints, and manual effort. While tools exist, teams still struggle to translate WCAG guidelines into actionable, repeatable automated checks. In this session, Sidhartha will explore how AI can responsibly assist accessibility testing — not by replacing standards or human judgment, but by bridging the gap between guidelines and executable tests. Using Playwright MCP together with an LLM CLI, Sidhartha will demonstrate how AI can: interpret WCAG requirements, generate meaningful accessibility test...
Evaluating Agentic LLM Apps: Beyond Vibes
"It seems to work" isn't a deployment strategy. As AI agents move from demos to production, teams discover that traditional software testing falls apart — outputs are non-deterministic, "correct" is subjective, and yesterday's perfect prompt fails mysteriously today. This talk tackles the unique challenges of verifying agentic applications. Rushabh will explore why agent evaluation is fundamentally harder than traditional ML testing: multi-step reasoning chains, tool use side effects, and the compounding uncertainty problem. You'll learn practical approaches to building evaluation datasets...
Beyond Coverage: Governing GenAI-Generated Tests with Metrics Leaders Can Trust
Generative AI has created a new risk for quality leaders: "Coverage Theater." This occurs when AI-generated test suites inflate code coverage metrics to record highs while silently reducing assertion quality, leaving teams with green dashboards but escaping defects. In this session, Niranjan will dismantle this illusion by implementing a Quality Governance Audit using two advanced metrics that reveal what coverage hides. He will introduce the Assertion Strength Index (ASI), a scoring framework that rates tests from generic "existence checks" to rigorous business validation, exposing GenAI’...
Testing AI Systems That Change Over Time
Modern software systems increasingly rely on AI-driven features such as recommendations, copilots, and automated decision-making. Unlike traditional software, these systems evolve over time as data changes and user behavior shifts, making them difficult to test using deterministic test cases alone. Many testing teams struggle with unpredictable outputs, flaky tests, and failures that only appear after deployment. In this session, Dr. Longe will address the challenge of testing AI-enabled systems that change over time and explain how testers can adapt familiar testing principles to these...
The Quality Nervous System
The Quality Nervous System is a biologically inspired network where AI agents and humans operate symbiotically in a single adaptive system. AI agents continuously explore, learn, and execute in real time across software at machine speed, while humans provide the judgment, strategy, and purpose to assure outcomes align with user and business goals. AI partnering fundamentally changes how software is built. Humans now collaborate with systems that generate code, tests, insights, and behavior at unprecedented speed and volume. Continuous real-time results flood teams faster than they can...
The Future of Test Automation: Trends, Challenges, and Your Burning Questions - Panel
Test automation is evolving at an unprecedented pace, driven by AI, continuous testing, and the ever-increasing complexity of modern software development. Join us for an engaging panel discussion where industry experts will explore the latest trends in test automation, including AI-powered testing, low-code/no-code automation, and the role of testers in a rapidly changing landscape. This session will also serve as an open forum for attendees to ask any lingering questions from the conference. Whether it’s about the future of automation, best practices, or the impact of AI on testing roles...
Thursday, September 24
Testing Event-Driven Systems Without Losing Your Sanity: Practical Patterns for AWS Serverless and Asynchronous Workflows
Event-driven architectures promise speed and scale, but they also introduce testing pain: eventual consistency, non-deterministic timing, duplicated events, and failures that only appear in production. In this talk, Parthiban will share a practical, field-tested approach he has used while leading distributed teams building regulated FinTech workloads on AWS serverless components such as Lambda, EventBridge, Step Functions, SQS, and API Gateway. He’ll start with the common failure patterns that make traditional end-to-end testing brittle, slow, and expensive. Next, he will walk through how...
Testing AI Systems That Learn in Production: From Static Test Cases to Continuous Validation
As organizations increasingly deploy AI and machine learning systems into production, testing practices built for static, rule-based software are no longer sufficient. Unlike traditional applications, AI systems learn from data, change behavior over time, and are sensitive to data drift, bias, and feedback loops, making defects harder to detect with conventional test cases. This session presents a practical, experience-driven approach to testing AI systems across the full lifecycle, from model development to live deployment. Drawing on real-world implementations and applied research, the...
Taming the Stochastic Beast: Building AI Evaluation Pipelines for GenAI Releases
If you've ever shipped a GenAI feature wondering “is this actually good enough?”, you're not alone. Traditional pass/fail QA breaks down when outputs are non-deterministic, and teams end up making release decisions based on subjective “vibe checks” rather than data. This session shows how Product Managers can partner with QA to replace intuition with a systematic AI evaluation pipeline. You'll learn how to define quality as measurable dimensions (groundedness, tone, helpfulness, safety), build a representative test set, and design rubrics that align product goals with engineering...
Agentic Quality at Scale—Orchestrating a QA Swarm for Swift Delivery
As delivery cycles compress, single AI agents are not enough. The next leap is a coordinated swarm of specialized QA agents, each owning a slice of the quality lifecycle (requirements, test generation, execution intelligence, defect triage, and release decisions). This session shows how to design an agent operating model that scales across teams, products, and pipelines without losing trust, traceability, or control. This session will introduce a practical blueprint for deploying multiple cooperating AI agents across the SDLC, with clear boundaries, KPIs, and governance that align to...
Building Ethical AI Literacy in Next-Generation Test Automation Leaders
As AI-driven automation becomes the backbone of modern QA—from intelligent test generation and self-healing scripts to risk-based prioritization and autonomous agents—the need for ethical and responsible AI leadership in testing has become critical. While teams rapidly adopt AI tooling, the ethical dimension of how these systems operate, learn, and influence decision-making is often underdeveloped. This session reframes test automation through an ethical leadership lens, walking through the full software delivery lifecycle and identifying where AI-powered testing introduces new risks,...
RAG Testing That Holds Up: Evaluating LLMs for Faithfulness, Boundaries, and Trust
Many teams are adopting RAG to constrain LLMs to internal documents, policies, and knowledge bases, but “using RAG” does not guarantee trustworthy behavior. In practice, models still hallucinate, blend outside knowledge, ignore source boundaries, and produce confident answers that are not supported by retrieved evidence. Traditional test approaches (happy-path assertions, correctness spot checks, performance metrics) often miss these failures because the output reads plausibly correct. Drawing from real evaluation work on document-constrained enterprise systems, this session...
Scaling Quality with AI: How We Built Agent-Based QA and a Secure Internal GPT
As financial systems grow in scale and regulatory complexity, traditional QA approaches struggle to keep pace with the volume of requirements, risks, and test artifacts that must be continuously reviewed and maintained. In regulated fintech environments, QA teams must balance speed, accuracy, and compliance—often relying on manual effort that does not scale. This session presents a real-world case study of how the Acba Bank QA organization evolved from manual, human-heavy processes to an AI-assisted quality ecosystem built around purpose-driven AI agents and a secure, in-house GPT platform...