Security Chaos Engineering by Kelly Shortridge

Author: Kelly Shortridge
Language: eng
Format: epub
Publisher: O'Reilly Media
Published: 2023-04-06


Chapter 5. Operating and Observing

There is nothing staid, nothing settled, in this universe. All is rippling, all is dancing; all is quickness and triumph.

Virginia Woolf, The Waves

The operations phase of software delivery deals with managing and studying the system while it runs in production. Production is where the system interacts with real customers and users. Much like rehearsals of a play, all the other phases build to this one. Once the software we design, build, and deploy is delivered to end users and is prancing in production environments, it can finally deliver value for the organization. For any organization with digitally delivered products and services, this phase is where the money printer is turned on (and goes brrrr!).

The operating and observing phase is where our mental models encounter their challenger: reality. It’s tempting to only observe what you expect to encounter; think back to the Jurassic Park example from Chapter 1 where they only checked for the number of dinosaurs they expected, not anticipating that there could be more than in their mental models. As we shepherd our systems in the tumultuous pastures of the internet, we must continually refine our mental models—keeping an open mind that our assumptions might be wrong and being curious enough to seek out evidence. That is, we need the last two ingredients in the recipe: feedback loops and flexibility. Operating and observing is the phase where we really start to learn about our systems and where feedback is generated for us to incorporate into our mental models, Effort Investment Portfolios, decision trees, and practices and procedures. The goal in this phase is to refine our mental models with evidence as much as possible.

In this chapter, we’ll start by talking about operational goals in a resilience paradigm and the practices that can help us achieve those goals, particularly through the lens of site reliability engineering (SRE). Then we’ll talk about scalability as an important systems property and how observability looks in the SCE paradigm, and we’ll briefly discuss experimenting with failure (which, after all, ideally happens in production).
