Data Engineering on Azure by Vlad Riscutia
Author: Vlad Riscutia
Language: eng
Format: epub
Publisher: Manning Publications
Published: 2021-08-16T16:00:00+00:00
6.3.4 Postmortems
Because we are talking about data movement at scale, it is likely that multiple pipelines will fail during any given period due to transient issues, upstream issues, scaling issues, and so on. The recommendation is to periodically look at the top offenders: the ETL pipelines generating the most incidents, even if those incidents are transient. Aim to understand what is causing the issues and fix them as needed to keep the overall Azure Data Factory healthy.
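As a rough illustration of a periodic top-offenders review, the following Python sketch counts incidents per pipeline and lists the worst ones. The incident records, pipeline names, and the `top_offenders` helper are all hypothetical; in practice the records would come from whatever alerting or incident-tracking system is in use, which this sketch does not model.

```python
from collections import Counter
from typing import Iterable, List, NamedTuple, Tuple


class Incident(NamedTuple):
    pipeline: str    # hypothetical pipeline name
    cause: str       # e.g. "timeout", "upstream", "scaling"
    transient: bool  # True if a retry resolved the failure


def top_offenders(incidents: Iterable[Incident], n: int = 3) -> List[Tuple[str, int]]:
    """Return the n pipelines with the most incidents, transient ones included."""
    counts = Counter(incident.pipeline for incident in incidents)
    # Sort by count (descending), then by name so that ties are deterministic.
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))[:n]


# Made-up sample data for illustration only.
SAMPLE_INCIDENTS = [
    Incident("ingest_website_telemetry", "timeout", True),
    Incident("ingest_website_telemetry", "timeout", True),
    Incident("ingest_website_telemetry", "timeout", False),
    Incident("copy_sales_data", "upstream", True),
    Incident("copy_sales_data", "scaling", True),
    Incident("load_hr_snapshot", "upstream", True),
]

if __name__ == "__main__":
    for pipeline, count in top_offenders(SAMPLE_INCIDENTS, n=2):
        print(f"{pipeline}: {count} incidents")
```

Transient incidents are deliberately counted alongside the rest, since a pipeline that succeeds only after repeated retries is still a candidate for investigation.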
For example, our pipeline ingesting website telemetry might start perfectly fine, but as traffic to the website increases and logs grow in size, the pipeline might start glitching. First, we start getting a few timeouts. Over time, the pipeline more often than not times out. While still somewhat workable, each run now requires multiple retries and sometimes manual intervention. This would be a clear sign that the original implementation no longer scales, and the pipeline needs to be optimized. A core practice for an SRE is a process for incident postmortems.
Definition: An incident postmortem brings teams together to take a deep look at an incident to understand what happened, why it happened, and how the issue can be prevented in the future.
We won't talk too much about how to run postmortems. You'll find many other resources focused on operational best practices. The only thing we want to highlight is that SRE teams usually have a recurring meeting where they do postmortems for major incidents. Rolling up top offenders into the postmortem process ensures that these issues become visible and that these don't keep adding up until our whole ETL becomes unmanageable.
