Data Analysis Using SQL and Excel by Gordon S. Linoff
Author: Gordon S. Linoff
Language: eng
Format: epub, pdf
ISBN: 9781119021445
Publisher: Wiley
Published: 2015-11-26
Comparison of Hazards by Stops in Year in Excel
The previous chapter showed two ways of comparing changes in survival probabilities over time. The first method was to use starts in a given year, which provides information about acquisition during the year, but not about all the customers who were active during that time. The second approach forces the right censorship date to be earlier, creating a snapshot of survival at the end of each year. Using starts, customers who start in 2006 have relatively lower survival than customers who start in 2004 or 2005. However, the snapshot method shows that 2006 survival looks better than survival at the end of 2004.
This section proposes another method, based on time windows. Using time windows, hazard probabilities are estimated based on customers’ activity during each year. Time windows make it possible to calculate hazard probabilities for all tenures.
The approach is to calculate the number of customers who enter, leave, and stop at a given tenure, taking into account the time window. The following query does the calculation for stops during 2006:
WITH const as (
      SELECT CAST('2006-01-01' as DATE) as WindowStart,
             CAST('2006-12-28' as DATE) as WindowEnd
     )
SELECT (CASE WHEN tenure < 1000 THEN tenure ELSE 1000 END) as tenure,
       SUM(enters) as numenters, SUM(leaves) as numleaves,
       SUM(isstop) as numstops
FROM ((SELECT (CASE WHEN StartDate >= WindowStart THEN 0
                    ELSE DATEDIFF(day, StartDate, WindowStart)
               END) as tenure,
              1 as enters, 0 as leaves, 0.0 as isstop
       FROM const CROSS JOIN Subscribers s
       WHERE Tenure >= 0 AND StartDate <= WindowEnd AND
             (StopDate IS NULL OR StopDate >= WindowStart)
      ) UNION ALL
      (SELECT (CASE WHEN StopDate IS NULL OR StopDate >= WindowEnd
                    THEN DATEDIFF(day, StartDate, WindowEnd)
                    ELSE Tenure
               END) as tenure,
              0 as enters, 1 as leaves,
              (CASE WHEN StopType IS NOT NULL AND StopDate <= WindowEnd
                    THEN 1 ELSE 0
               END) as isstop
       FROM const CROSS JOIN Subscribers s
       WHERE Tenure >= 0 AND StartDate <= WindowEnd AND
             (StopDate IS NULL OR StopDate >= WindowStart)
      )
     ) s
GROUP BY (CASE WHEN Tenure < 1000 THEN Tenure ELSE 1000 END)
ORDER BY tenure
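The three counts the query returns for each tenure (numenters, numleaves, numstops) are enough to estimate hazard probabilities. The excerpt does not show that step, so the following Python sketch is an illustration only. It assumes the usual left-truncated life-table rule: the population at risk at a tenure is everyone who has entered at that tenure or earlier, minus everyone who left at an earlier tenure, and the hazard is stops divided by that population. The function name and the sample counts are made up for the example.

```python
def hazards_from_counts(rows):
    """Estimate hazards from per-tenure window counts.

    rows: iterable of (tenure, numenters, numleaves, numstops) tuples,
    with at most one tuple per tenure.
    Returns a list of (tenure, population_at_risk, hazard) sorted by tenure.
    Assumption: customers who leave at tenure t are still at risk at t
    and are removed only for later tenures.
    """
    at_risk = 0
    result = []
    for tenure, enters, leaves, stops in sorted(rows):
        at_risk += enters  # customers first observed in the window at this tenure
        hazard = stops / at_risk if at_risk > 0 else None
        result.append((tenure, at_risk, hazard))
        at_risk -= leaves  # stopped or censored here, so not at risk afterwards
    return result


# Made-up counts for illustration; these are not from the Subscribers table.
example = [(0, 100, 5, 2), (1, 20, 10, 4), (2, 0, 15, 3)]
for tenure, at_risk, hazard in hazards_from_counts(example):
    print(tenure, at_risk, hazard)
```

Because the population at risk is a running total, tenures with no rows can simply be absent from the input.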
Notice first that the stop window ends on 2006-12-28 rather than 2006-12-31. The 28th is the cut-off date for the data; the table has no starts or stops beyond that date. If the later date were used, then active customers would have their tenures extended by three days. That is, a customer who started on 2006-12-28 would have a tenure of three rather than zero, and the resulting hazards would differ slightly from the point estimates in the last section.
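The three-day shift can be checked with simple date arithmetic. This small Python sketch assumes that subtracting calendar dates matches what DATEDIFF(day, ...) returns for the same pair of dates; the variable names are illustrative.

```python
from datetime import date

cutoff = date(2006, 12, 28)    # last date with data, per the text
year_end = date(2006, 12, 31)  # the alternative window end

start = date(2006, 12, 28)     # a customer who starts on the cut-off date

tenure_at_cutoff = (cutoff - start).days
tenure_at_year_end = (year_end - start).days
print(tenure_at_cutoff, tenure_at_year_end)  # prints: 0 3
```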
The variable enters counts the number of customers entering the time window at each tenure. This tenure is zero for customers who start during the window and a larger value for customers who start before the window. The variables leaves and isstop are calculated based on the tenure on the right censorship date or the tenure when a customer stops.
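To make the per-customer logic concrete, here is a Python sketch that mirrors the CASE expressions and the WHERE clause of the query for a single customer. It is an illustration rather than the book's code: the function name, the sample customers, and the stop type "V" are hypothetical, and a missing stop date is represented as None.

```python
from datetime import date

WINDOW_START = date(2006, 1, 1)
WINDOW_END = date(2006, 12, 28)


def window_tenures(start_date, stop_date, tenure, stop_type):
    """Return (enter_tenure, leave_tenure, is_stop) for one customer,
    or None if the customer was not active during the window."""
    active = (
        tenure >= 0
        and start_date <= WINDOW_END
        and (stop_date is None or stop_date >= WINDOW_START)
    )
    if not active:
        return None

    # Tenure on entering the window: zero for starts during the window,
    # otherwise the tenure already accumulated by the window start.
    if start_date >= WINDOW_START:
        enter_tenure = 0
    else:
        enter_tenure = (WINDOW_START - start_date).days

    # Tenure on leaving: censored at the window end, or the stop tenure.
    if stop_date is None or stop_date >= WINDOW_END:
        leave_tenure = (WINDOW_END - start_date).days
    else:
        leave_tenure = tenure

    # Counts as a stop only if the customer stopped by the window end.
    stopped = (
        stop_type is not None
        and stop_date is not None
        and stop_date <= WINDOW_END
    )
    return enter_tenure, leave_tenure, 1 if stopped else 0


# Hypothetical customers for illustration.
still_active = window_tenures(date(2005, 7, 1), None, 545, None)
stopped_in_window = window_tenures(date(2006, 3, 1), date(2006, 6, 1), 92, "V")
stopped_before = window_tenures(date(2004, 1, 1), date(2005, 6, 1), 517, "V")
print(still_active, stopped_in_window, stopped_before)
```

The first customer enters the window with tenure already accumulated and is censored at the window end, the second enters at tenure zero and stops inside the window, and the third is excluded because the stop happened before the window started.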
Each subquery has the same WHERE clause in order to select only customers active during the
