Pentaho Data Integration Cookbook Second Edition by 2013

Author:2013
Language: eng
Format: epub, mobi
Publisher: Packt Publishing


There's more...

The Fuzzy match step allows you to choose among several matching algorithms, which are classified in the following two groups:

Algorithms based on a metric distance: The comparison is based on how the compared terms are spelled

Phonetic algorithms: The comparison is based on how the compared terms sound, as read in English
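To make the phonetic group concrete, here is a minimal sketch of American Soundex, one well-known phonetic algorithm. The excerpt does not say which phonetic algorithms the step offers or how it implements them, so the function name `soundex` and the details below are assumptions for illustration only, not the step's actual code.

```python
# Illustrative sketch of American Soundex (assumed example, not Pentaho's code).
# Consonants that sound alike share a digit; vowels are not coded.
_CODES = {}
for _letters, _digit in (("BFPV", "1"), ("CGJKQSXZ", "2"), ("DT", "3"),
                         ("L", "4"), ("MN", "5"), ("R", "6")):
    for _ch in _letters:
        _CODES[_ch] = _digit


def soundex(word):
    """Return the 4-character Soundex code of word, or '' if it has no letters."""
    letters = [c for c in word.upper() if c.isalpha()]
    if not letters:
        return ""
    result = letters[0]                  # the first letter is kept as-is
    prev = _CODES.get(letters[0], "")    # its code still suppresses a repeat
    for ch in letters[1:]:
        if ch in "HW":
            continue                     # H and W do not separate equal codes
        code = _CODES.get(ch, "")        # vowels and other letters map to ""
        if code and code != prev:
            result += code
        prev = code                      # a vowel resets prev, allowing a repeat
    return (result + "000")[:4]          # pad with zeros, truncate to 4 chars


if __name__ == "__main__":
    for name in ("Robert", "Rupert", "Smith", "Smyth"):
        print(name, soundex(name))
```

Two terms that are spelled differently but sound alike in English, such as "Smith" and "Smyth", receive the same code, which is the property a phonetic comparison relies on.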

The following is a brief comparative table for the implemented algorithms:

| Algorithm | Classification | Explanation | Example |
|---|---|---|---|
| Levenshtein | Metric distance | The distance is the minimum number of single-character edits needed to transform one string into the other. An edit is the insertion, deletion, or substitution of one character. | Transforming "pciking" into "picking" requires two changes (the c and the i must each be replaced), so the distance is 2. |
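The Levenshtein distance described above can be sketched with the standard dynamic-programming approach. This is a minimal illustration, not the code the Fuzzy match step runs; the function name `levenshtein` is an assumption chosen for this example.

```python
def levenshtein(a, b):
    """Minimum number of single-character insertions, deletions,
    or substitutions needed to turn string a into string b."""
    if len(a) < len(b):
        a, b = b, a                      # keep the row for the shorter string
    # previous[j] holds the distance between the processed prefix of a
    # and the first j characters of b; initially the prefix of a is empty.
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]                    # deleting i characters to reach ""
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # delete ca
                current[j - 1] + 1,      # insert cb
                previous[j - 1] + cost,  # substitute ca with cb (or match)
            ))
        previous = current
    return previous[-1]


if __name__ == "__main__":
    print(levenshtein("pciking", "picking"))  # prints 2
```

Because plain Levenshtein has no transposition operation, swapping two adjacent characters costs two edits, which is why the "pciking" example yields 2 rather than 1.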


