Transactions on Large-Scale Data- and Knowledge-Centered Systems XXXII by Abdelkader Hameurlain, Josef Küng, Roland Wagner, Sanjay Madria & Takahiro Hara



3.4 DAAR Algorithm

DAAR integrates DCI and association rule classification to select discrimination-aware rules from all rules that have passed the minimum confidence and support thresholds. DAAR's algorithm is shown in Table 2.

Table 2. DAAR algorithm

The algorithm takes the maximum rule length k as an input, so every generated rule contains at most k – 1 antecedent items on the left-hand side and one class label on the right-hand side. In the loop, the algorithm merges the (i–1)-item rule set generated in the previous round with the 2-item rule set (the base case) to obtain the i-item rule set. In line 10, the rule set is sorted by DCI in ascending order for clearer presentation to users; this sorting does not affect the classification results. Majority voting is then used to classify new instances; if the vote is tied (i.e., the same number of rules supports each class), the sum of the DCI values of the rules supporting each class is computed and compared to determine the final class. As discussed in Sect. 3.1, a smaller DCI indicates less severe discrimination, so the vote selects the class value with the lower DCI sum as the less discriminatory choice.
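To make the classification step concrete, the following is a minimal Python sketch of majority voting with the DCI tie-break described above. The rule representation (a tuple of antecedent conditions, class label, and DCI value) and the function names are illustrative assumptions, not the paper's own notation; the actual procedure is the one given in Table 2.

```python
from collections import defaultdict

# Hypothetical rule layout: (antecedent, class_label, dci), where
# `antecedent` is a dict of attribute -> value conditions. This is an
# assumption for illustration only.

def rule_fires(antecedent, instance):
    """A rule supports an instance if all of its antecedent conditions hold."""
    return all(instance.get(attr) == value for attr, value in antecedent.items())

def daar_classify(instance, rules):
    """Majority vote over the rules that fire on the instance.
    Ties are broken by the smaller sum of DCI values, since a lower DCI
    indicates less severe discrimination (Sect. 3.1)."""
    votes = defaultdict(int)      # class label -> number of supporting rules
    dci_sum = defaultdict(float)  # class label -> total DCI of those rules

    for antecedent, label, dci in rules:
        if rule_fires(antecedent, instance):
            votes[label] += 1
            dci_sum[label] += dci

    if not votes:
        return None  # no rule fires; a default class could be returned instead

    top = max(votes.values())
    tied = [label for label, count in votes.items() if count == top]
    if len(tied) == 1:
        return tied[0]
    # Tie: choose the class whose supporting rules are least discriminatory.
    return min(tied, key=lambda label: dci_sum[label])
```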

To illustrate this with an example, let's apply the standard AR and DAAR to our traffic incident dataset, in which manager is the sensitive attribute. Two of the rules generated by the standard AR are "location = Cahill Expressway, Sydney → easy incident" and "manager = Henry → easy incident". Both rules have a confidence of 0.76, but the second rule includes the sensitive attribute, and its DCI is 0.259. If the DCI threshold is set to 0.1, our method filters out the second rule effectively. More examples will be presented in Sect. 5.1.
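A small sketch of this filtering step, using the same illustrative rule layout as above; the DCI value of the location rule is assumed for the example, while 0.259 for the manager rule is the value reported in the text.

```python
# Rules as (antecedent, class_label, dci), matching the earlier sketch.
# Both rules have confidence 0.76; the DCI of the location rule (0.03) is
# assumed here for illustration.
rules = [
    ({"location": "Cahill Expressway, Sydney"}, "easy incident", 0.03),
    ({"manager": "Henry"}, "easy incident", 0.259),
]

DCI_THRESHOLD = 0.1
non_discriminatory = [r for r in rules if r[2] <= DCI_THRESHOLD]
# Only the location-based rule survives; the rule built on the sensitive
# attribute `manager` exceeds the threshold and is filtered out.
```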

When the non-discriminatory rules are applied to new data (e.g., test datasets), accuracy, DS, and IncS are computed to evaluate the model.


