Thursday, September 24, 2026 - 3:00pm to 4:00pm

LLM-Powered Observability for Modern Cloud Systems: Telemetry Reasoning, Incident Triage, and Faster Root-Cause Analysis

Modern cloud systems generate overwhelming volumes of telemetry (metrics, logs, traces, and events), yet incident response still relies on manual correlation, tribal knowledge, and brittle rule-based alerts. This work presents an approach to LLM-powered observability that augments traditional monitoring with telemetry reasoning to accelerate incident triage and root-cause analysis. Prashanthi proposes a pipeline that structures heterogeneous signals into a unified incident context, enriches that context with service topology, deployment metadata, and SLO/SLA targets, and guides engineers with evidence-backed hypotheses, next-best queries, and remediation suggestions. By turning noisy telemetry into actionable narratives, the approach aims to reduce MTTD and MTTR in real production settings.
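As a rough illustration of the first two pipeline stages described above (structuring signals into one incident context, then enriching it with topology, deployment, and SLO data), here is a minimal Python sketch. It is not the speaker's implementation: the `Signal` and `IncidentContext` types, the `build_incident_context` and `render_prompt` functions, and the dictionary shapes used for topology, deploys, and SLOs are all hypothetical names chosen for this example, and no actual LLM call is made.

```python
from dataclasses import dataclass, field


@dataclass
class Signal:
    """One normalized telemetry record (hypothetical schema)."""
    kind: str         # "metric", "log", "trace", or "event"
    service: str
    timestamp: float  # seconds since epoch
    summary: str


@dataclass
class IncidentContext:
    """Unified view of one incident (hypothetical schema)."""
    service: str
    signals: list = field(default_factory=list)
    dependencies: list = field(default_factory=list)
    recent_deploys: list = field(default_factory=list)
    slo: str = ""


def build_incident_context(service, signals, topology, deploys, slos, start, end):
    """Collect signals for a service and its direct dependencies within [start, end]."""
    deps = topology.get(service, [])
    in_scope = {service, *deps}
    # Keep only in-scope signals inside the time window, oldest first.
    selected = sorted(
        (s for s in signals if s.service in in_scope and start <= s.timestamp <= end),
        key=lambda s: s.timestamp,
    )
    # Deploys are assumed to be dicts with "service", "version", "timestamp" keys.
    recent = [
        d for d in deploys
        if d["service"] in in_scope and start <= d["timestamp"] <= end
    ]
    return IncidentContext(service, selected, list(deps), recent, slos.get(service, ""))


def render_prompt(ctx):
    """Flatten the context into plain text that could be passed to an LLM."""
    lines = [f"Incident on {ctx.service} (SLO: {ctx.slo or 'unknown'})"]
    lines.append("Depends on: " + (", ".join(ctx.dependencies) or "none"))
    for d in ctx.recent_deploys:
        lines.append(f"Deploy: {d['service']} {d['version']} at t={d['timestamp']}")
    for s in ctx.signals:
        lines.append(f"[{s.kind}] {s.service} t={s.timestamp}: {s.summary}")
    lines.append("List likely root causes, citing the evidence lines above.")
    return "\n".join(lines)
```

Keeping every evidence line in the rendered text is one way to let a model's hypotheses point back at specific signals, which is what "evidence-backed" would require; how the session's pipeline actually does this is not stated in the abstract.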

Career & Personal Development

IoT & Embedded Testing

Performance Testing & Monitoring

Testing in DevOps

Test Automation Engineer

Software Tester

Project Manager

Prashanthi Matam

Discover Financial Services

Prashanthi Matam is a Senior Software Engineer specializing in AI, MLOps, and cloud-native systems. She has deep expertise in full-stack development, cloud infrastructure, DevOps, and AI/ML telemetry, building scalable production systems on platforms such as Kubernetes, TensorFlow, and LangChain. Her work spans intelligent observability solutions and predictive maintenance systems. She has strong customer-facing experience, helping enterprise teams deploy, optimize, and operationalize AI models in production and guiding them across the full AI lifecycle, from distributed training and model development to inference optimization, acceleration, and performance tuning. Previously, she focused on improving system reliability, scaling cloud-native workloads, and enhancing the efficiency of data and AI pipelines. Prashanthi holds a master’s degree in computer science from Northeastern University.
