Statistical and Machine-Learning Data Mining by Bruce Ratner

Author: Ratner, Bruce
Language: eng
Format: epub
ISBN: 978-1-4665-5121-3
Publisher: CRC Press
Published: 2011-03-30T16:00:00+00:00


TABLE 14.7

Logistic Regression of Response on X2, X1X2, and X2_SQ

TABLE 14.8

Classification Table of Model with X2, X1X2, and X2_SQ

I conclude that the relationship is quadratic, and the corresponding model is a good fit of the data. Thus, the best RESPONSE model is defined by X2, X1X2, and X2_SQ.

14.9 Database Implication

Database marketers are among those who use response models to identify the individuals most likely to respond to their solicitations. They therefore place more value on the information in cell (1, 1), the number of responders correctly classified, than on the TCCR. Table 14.9 indicates the number of responders correctly classified for the models tested. The model that appears to be the best when judged by TCCR is actually not the best for a database marketer because it identifies the fewest responders (1,256).
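The distinction between TCCR and cell (1, 1) can be sketched in a few lines of Python. The counts below are hypothetical and chosen only to illustrate the point; they are not the values in Table 14.9, and the dictionary keys and function name are assumptions made for this sketch.

```python
# Hypothetical classification tables (NOT the book's data).
# Each key reads "actual class _as_ predicted class".

def tccr(table):
    """Total correct classification rate, in percent."""
    correct = (table["responder_as_responder"]
               + table["nonresponder_as_nonresponder"])
    total = sum(table.values())
    return 100.0 * correct / total

model_a = {
    "responder_as_responder": 100,        # cell (1, 1)
    "responder_as_nonresponder": 200,
    "nonresponder_as_responder": 100,
    "nonresponder_as_nonresponder": 600,
}
model_b = {
    "responder_as_responder": 250,        # cell (1, 1)
    "responder_as_nonresponder": 50,
    "nonresponder_as_responder": 300,
    "nonresponder_as_nonresponder": 400,
}

for name, table in (("A", model_a), ("B", model_b)):
    print(name, f"TCCR = {tccr(table):.2f}%",
          "responders found =", table["responder_as_responder"])
```

In this made-up example, model A has the higher TCCR, yet model B finds more responders, which is the quantity a database marketer is likely to care about.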

I summarize the modeling process as follows: The base RESPONSE model with the two original variables, X1 and X2, produces TCCR(X1, X2) = 48.37%. Adding the interaction variable X1X2 to the base model produces the full model with TCCR(X1, X2, X1X2) = 55.64%, for a 15.0% relative classification improvement over the base model.
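The 15.0% figure is consistent with a relative (not absolute) gain in TCCR. A minimal check, assuming the improvement is computed as the difference divided by the base rate; the function name is introduced here for illustration only:

```python
def relative_improvement(base_tccr, new_tccr):
    """Percentage gain of new_tccr over base_tccr, relative to base_tccr."""
    return 100.0 * (new_tccr - base_tccr) / base_tccr

base = 48.37   # TCCR(X1, X2), from the text
full = 55.64   # TCCR(X1, X2, X1X2), from the text

gain = relative_improvement(base, full)
print(f"{gain:.1f}%")  # prints 15.0%
```

The absolute gain is 55.64 - 48.37 = 7.27 percentage points; dividing by the base rate of 48.37 gives roughly 15.0%.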

Using the new CHAID-based data mining approach to determine whether a component variable can be omitted, I observe that X1 (but not X2) can be dropped from the full model. Thus, the best-so-far model, with X2 and X1X2, has no loss of performance over the full model: TCCR(X2, X1X2) = TCCR(X1, X2, X1X2) = 55.64%.


