Fundamentals of Predictive Analytics with JMP, Second Edition by Ron Klimberg, PhD & B. D. McCullough

Author: Ron Klimberg, PhD & B. D. McCullough
Language: eng
Format: azw3, mobi, pdf
Publisher: SAS Institute
Published: 2016-12-20T08:00:00+00:00


Often there is a natural break where the distance jumps up suddenly. Such breaks suggest natural cutting points for determining the number of clusters. The “best” number of clusters is typically chosen at or near this “elbow” of the curve. The elbow also suggests which clusters should be profiled and reviewed with the subject-matter expert. Based on Figure 9.11, the number of clusters would probably be 3, but 2 or 4 would also be possibilities. Choosing the “best” number of clusters is as much an art as a science. Sometimes a particular number of clusters produces a particularly interesting or useful result. In such a case, SSE can probably be ignored.
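The idea of looking for a sudden jump in distance can be sketched in a few lines of Python. This is only an illustration, not the JMP procedure the text describes: the function name suggest_cluster_count and the list of merge distances are hypothetical, and the rule used here (pick the single largest jump) is a simplifying assumption. In practice the neighboring cluster counts deserve a look as well.

```python
def suggest_cluster_count(merge_distances):
    """Return the number of clusters left just before the largest jump.

    merge_distances holds the distance at which each successive merge
    happened, in merge order. With n observations there are n - 1 merges.
    """
    n_obs = len(merge_distances) + 1
    if n_obs < 3:
        raise ValueError("need at least two merges to look for a jump")

    # jumps[j] is the increase in distance caused by merge j + 1
    jumps = [
        merge_distances[i] - merge_distances[i - 1]
        for i in range(1, len(merge_distances))
    ]

    # Index (in merge order) of the merge that caused the largest jump
    biggest_merge = max(range(len(jumps)), key=jumps.__getitem__) + 1

    # Stopping just before that merge leaves n_obs - biggest_merge clusters
    return n_obs - biggest_merge


# Hypothetical merge distances for 10 observations (made-up numbers)
example_distances = [0.5, 0.6, 0.8, 0.9, 1.1, 1.3, 1.4, 4.5, 5.2]
print(suggest_cluster_count(example_distances))
```

With these made-up distances the largest jump occurs at the eighth merge, so stopping just before it leaves three clusters. A second, smaller jump at the last merge is a reminder that two clusters could also be worth profiling.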

The difference between two and three clusters, or between three and four (indeed between any pair of partitions of the data set), is not always statistically obvious, so deciding on the number of clusters can be difficult. Yet this is a very important decision, because the proper number of clusters can be of great business importance. In Figure 9.12 it is difficult to say whether the data set has two, three, or four clusters. A pragmatic approach is necessary in this situation: choose the number of clusters that produces useful clusters.


