BigQuery for Data Warehousing by Mark Mucchetti

Author: Mark Mucchetti
Language: eng
Format: epub
ISBN: 9781484261866
Publisher: Apress


Aha! Orders over $200 fail at five times the rate of orders below $200. In fact, the failure rate climbs steadily as order price increases, becoming extremely high once orders exceed $500. The team then joins the specific errors that are occurring back in and plots them on the same graph.

Figure 12-3. Same graph with error type included

Now it’s crystal clear: this elevated error rate is caused by an error saying that the bank declined the card for insufficient funds. As customers try to charge cards for higher and higher dollar amounts, they are more likely to be declined for this reason. It turns out there’s no problem at all. The issue is caused by customers trying to exceed their credit limits. Relieved, the team leaves the new event sinks in place, creates some additional alerting to fire if this expectation is violated, and goes off to happy hour.
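The analysis the team performed can be sketched in a few lines. The following is a minimal illustration, not the team's actual query: the bucket boundaries, the field names `amount` and `error`, and the sample rows are all hypothetical, invented only to show the shape of the calculation. It groups orders into price buckets, computes the failure rate per bucket, and counts which error types account for the failures.

```python
from collections import Counter, defaultdict

# Hypothetical price buckets (lower bound inclusive, upper bound exclusive).
BUCKETS = [
    (0, 200, "under $200"),
    (200, 500, "$200-$500"),
    (500, float("inf"), "over $500"),
]


def bucket_for(amount):
    """Return the label of the price bucket that contains this amount."""
    for low, high, label in BUCKETS:
        if low <= amount < high:
            return label
    raise ValueError(f"amount out of range: {amount}")


def failure_rates(orders):
    """Compute failure rate and error-type counts for each price bucket."""
    totals = Counter()
    failures = Counter()
    errors = defaultdict(Counter)
    for order in orders:
        label = bucket_for(order["amount"])
        totals[label] += 1
        if order["error"] is not None:
            failures[label] += 1
            errors[label][order["error"]] += 1
    return {
        label: {
            "rate": failures[label] / totals[label],
            "errors": dict(errors[label]),
        }
        for label in totals
    }


# Made-up rows for illustration only; "error" is None for a successful order.
SAMPLE_ORDERS = [
    {"amount": 50, "error": None},
    {"amount": 120, "error": None},
    {"amount": 180, "error": None},
    {"amount": 90, "error": "insufficient_funds"},
    {"amount": 250, "error": "insufficient_funds"},
    {"amount": 300, "error": None},
    {"amount": 650, "error": "insufficient_funds"},
    {"amount": 700, "error": "insufficient_funds"},
    {"amount": 900, "error": None},
]

report = failure_rates(SAMPLE_ORDERS)
for label, stats in report.items():
    print(f"{label}: {stats['rate']:.0%} failed, errors={stats['errors']}")
```

In a real warehouse this would more likely be a single grouped query over the joined order and log tables; the Python version only makes the grouping logic explicit.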

It’s easy to see how this might have spiraled out of control in a less sophisticated organization. Pulling these datasets manually from multiple systems and consolidating them would lead to hours of work to look at a single static view. An inconsistent data warehouse might have made it prohibitively difficult to marry the error logs and the user account data. And as the team was analyzing increasingly stale data, they would have been unable to defend or explain new cases still arriving from customers.

This is a key insight for any business process. Shortening your feedback loop allows you to surface and react to relevant information as soon as it’s generated. Now that the team knows this could be an issue, they can monitor it continuously—and should a cluster of customers report the issue again, they can quickly determine if it has the same root cause or if a new issue has surfaced.

Using this as a template, you can easily imagine constructing more sophisticated scenarios. A live clickstream could be integrated to show how users react to unexpected conditions during payment and used to improve the website experience. An alert could be set on one customer receiving a large number of errors from many different credit cards, indicating potentially fraudulent transactions. You could see if users who experience these errors are likely to succeed at purchasing on another card or at a later time. Or you could just see if users who encounter errors on one order are more or less likely to become repeat purchasers. These are all ways you might fruitfully integrate Cloud Logging with BigQuery (or, generically, application performance monitoring with your data warehouse tool).
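As one example, the fraud alert described above reduces to counting distinct cards per customer among failed payment events. The sketch below assumes each error event carries a `customer_id` and a `card_fingerprint`; both field names, the threshold of three cards, and the sample events are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def suspicious_customers(error_events, min_distinct_cards=3):
    """Return customers whose payment errors span many distinct cards.

    The threshold is an assumed starting point, not a recommended value.
    """
    cards_by_customer = defaultdict(set)
    for event in error_events:
        cards_by_customer[event["customer_id"]].add(event["card_fingerprint"])
    return sorted(
        customer
        for customer, cards in cards_by_customer.items()
        if len(cards) >= min_distinct_cards
    )


# Made-up error events for illustration only.
SAMPLE_ERRORS = [
    {"customer_id": "c1", "card_fingerprint": "card-a"},
    {"customer_id": "c1", "card_fingerprint": "card-a"},  # same card retried
    {"customer_id": "c2", "card_fingerprint": "card-b"},
    {"customer_id": "c2", "card_fingerprint": "card-c"},
    {"customer_id": "c2", "card_fingerprint": "card-d"},
    {"customer_id": "c3", "card_fingerprint": "card-e"},
    {"customer_id": "c3", "card_fingerprint": "card-f"},
]

flagged = suspicious_customers(SAMPLE_ERRORS)
print(flagged)
```

Counting distinct cards rather than raw error counts keeps a customer who retries the same declined card several times from being flagged.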


