Redundancy Detection with Chi-Squared Test
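Redundancy detection is an important task in data integration. This step applies the chi-squared test to evaluate the correlation between two nominal attributes. The test compares the observed joint frequencies of the attribute values with the frequencies expected if the attributes were independent. A chi-squared value well above the critical value for the table's degrees of freedom rejects the independence hypothesis and indicates that the attributes are strongly correlated, so one of them may be removed as redundant.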
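As a minimal sketch of the test described above (not necessarily the implementation used by this step), the statistic can be computed from the observed counts o_ij and the expected counts e_ij = count(A = a_i) * count(B = b_j) / n as the sum of (o_ij - e_ij)^2 / e_ij over all value pairs. The function name `chi_squared`, the sample counts, and the chosen significance level of 0.001 are assumptions made for illustration only.

```python
from collections import Counter


def chi_squared(pairs):
    """Return (chi-squared statistic, degrees of freedom) for two nominal
    attributes, given as an iterable of (value_a, value_b) tuples.

    Hypothetical helper for illustration; assumes a non-empty input.
    """
    pairs = list(pairs)
    n = len(pairs)
    observed = Counter(pairs)                  # joint counts o_ij
    count_a = Counter(a for a, _ in pairs)     # marginal counts of attribute A
    count_b = Counter(b for _, b in pairs)     # marginal counts of attribute B

    chi2 = 0.0
    for a, total_a in count_a.items():
        for b, total_b in count_b.items():
            expected = total_a * total_b / n   # e_ij under independence
            chi2 += (observed[(a, b)] - expected) ** 2 / expected

    dof = (len(count_a) - 1) * (len(count_b) - 1)
    return chi2, dof


# Illustrative counts only: 1500 records with two binary nominal attributes.
pairs = (
    [("male", "fiction")] * 250
    + [("male", "non_fiction")] * 50
    + [("female", "fiction")] * 200
    + [("female", "non_fiction")] * 1000
)

chi2, dof = chi_squared(pairs)

# Critical value of the chi-squared distribution for 1 degree of freedom
# at the 0.001 significance level (assumed threshold for this example).
CRITICAL_VALUE = 10.828
redundant = chi2 > CRITICAL_VALUE

print(f"chi2 = {chi2:.2f}, dof = {dof}, redundant candidate: {redundant}")
```

With these counts the statistic is far above the critical value, so the two attributes would be flagged as correlated and one of them becomes a candidate for removal. The critical value depends on the degrees of freedom and the significance level, so a fixed threshold like the one above only fits a 2 x 2 table.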
Input Parameters
- Nominal data
Output Parameters
- Redundant attributes
Workflow
Algorithm
References
- J. Han, M. Kamber and J. Pei, Data Mining: Concepts and Techniques, 3rd ed., Amsterdam: Morgan Kaufmann Publishers, 2012.