OpenStack for Architects by Ben Silverman
Author: Ben Silverman
Language: eng
Format: epub
Tags: COM091000 - COMPUTERS / Cloud Computing, COM088000 - COMPUTERS / System Administration / General, COM011000 - COMPUTERS / Systems Architecture / General
Publisher: Packt Publishing
Published: 2018-05-31T11:23:04+00:00
The future of OpenStack troubleshooting and Artificial Intelligence-driven operations
As systems and workloads become increasingly abstracted, the velocity, frequency, and variety of data continue to multiply at exponential rates. Years ago, it was sufficient for administrators to log into an unresponsive server and comb through a handful of log files to perform root cause analysis (RCA).
Today, an OpenStack deployment, for example, produces more than 15 different log files on its control plane servers, as well as multiple unique logs on each of the compute servers. Combined with logs from the operating systems, routers, switches, load balancers, and WAN compressors, they add up to a mountain of data to search in order to find the true root cause of an incident. The variety, velocity, and volume of data that must be searched manually reduce an administrator's ability to complete an RCA and solve issues. This, together with the number of new servers added to enterprises daily, is contributing to a climbing Mean Time To Recovery (MTTR). Today, 3 hours is the average time it takes IT professionals to repair a single problem. That is simply unacceptable when we want our latest purchases delivered to our house via drone in 30 minutes.
One of the recent technological use cases for cloud computing has been machine learning (ML) and Artificial Intelligence (AI). The ability to harness capacity on demand and configure large clusters of systems to power massively parallel processing of data is quickly becoming a reality. Some companies are taking AI one step further and using operational data to enable what is being called Artificial Intelligence Operations (AIOps). These platforms use correlation and resolution data to help administrators arrive at an RCA more quickly by continually scanning log files in real time. As logs stream in, the platform needs to recognize where there may be a problem by correlating entries across multiple log files and applying complex rules across multiple devices, and even multiple environments.
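To make the idea of cross-log correlation concrete, here is a minimal Python sketch. It is not how any particular AIOps product works; it only illustrates the simplest possible rule, which is to group ERROR entries from different sources when they occur close together in time. The log format, the source names, the sample lines, and the 30-second window are all assumptions made for illustration; real OpenStack, operating system, and network device logs each need their own parsers.

```python
from datetime import datetime, timedelta

# Hypothetical, simplified log format: "<ISO timestamp> <LEVEL> <message>".
def parse_line(source, line):
    timestamp, level, message = line.split(" ", 2)
    return {
        "time": datetime.fromisoformat(timestamp),
        "level": level,
        "source": source,
        "message": message,
    }

def correlate(logs, window=timedelta(seconds=30)):
    """Group ERROR events from different sources that occur close in time.

    `logs` maps a source name to a list of raw log lines. Events are sorted
    by time and chained into one candidate incident while each event is
    within `window` of the previous one. Only incidents that involve more
    than one source are returned.
    """
    events = sorted(
        (
            event
            for source, lines in logs.items()
            for event in (parse_line(source, line) for line in lines)
            if event["level"] == "ERROR"
        ),
        key=lambda event: event["time"],
    )

    incidents, current = [], []
    for event in events:
        # A gap larger than the window closes the current incident.
        if current and event["time"] - current[-1]["time"] > window:
            incidents.append(current)
            current = []
        current.append(event)
    if current:
        incidents.append(current)

    return [i for i in incidents if len({e["source"] for e in i}) > 1]

# Made-up sample data; the source names are placeholders.
SAMPLE_LOGS = {
    "nova-api": [
        "2018-05-31T11:00:00 INFO request accepted",
        "2018-05-31T11:00:05 ERROR instance build failed",
    ],
    "neutron-server": [
        "2018-05-31T11:00:10 ERROR port binding failed",
    ],
    "switch-01": [
        "2018-05-31T12:30:00 ERROR link flap detected",
    ],
}

if __name__ == "__main__":
    for incident in correlate(SAMPLE_LOGS):
        print("Possible related incident:")
        for event in incident:
            print(f"  {event['time']} {event['source']}: {event['message']}")
```

A fixed time window is a deliberately crude rule. A production platform would presumably combine many signals (shared request or resource identifiers, topology, learned patterns), but the sketch shows why a single view across sources can surface a relationship that is hard to spot when each log file is read in isolation.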
One such company, Loom Systems (https://www.loomsystems.com/), has created Loom Cloud Intelligence (LUCI), which addresses the log sprawl problem by applying AI to monitor all platforms and systems in a single pane of glass. LUCI, shown in the screenshot below, can see and correlate all IT sources in real time and push alerts via legacy alerting or ChatOps. LUCI also allows administrators to drill down on alerts and correlated issues, and it creates stories of incidents based on empirical data retrieved from numerous log files.
LUCI also suggests possible root causes and remediations based on AI-driven problem determination. Loom Systems supports many different platforms, including OpenStack, VMware, AWS, Microsoft Azure, and Google Cloud. Loom offers a SaaS solution as well as an on-premises solution for those who cannot send log data offsite.
