SRE Antipatterns in Everyday Life and What They Teach Us

Jennifer Petoff

(2021)


Abstract

Real-world experience and things that go wrong are two of life’s best teachers. This talk will explore key elements of scalable large-system design and Site Reliability Engineering (SRE) principles* through anti-patterns encountered in real life. Find out what lessons can be gleaned from watching the dynamics in a crowded cafe or dealing with a security issue during a hotel stay. Learn about fundamental SRE principles and practices, including:

- Avoiding cascading failures
- Not feeding the machines with human toil
- Writing blameless postmortems
- Engineering solutions to eliminate classes of errors rather than implementing point fixes

These principles will be framed through the lens of the suboptimal, demonstrating the impact of SRE anti-patterns on user trust.

* SRE is often thought of as a specific implementation of the DevOps interface.

Research Areas

Software Systems
