Data Discretization with Chi-Squared Test

Causal Step

This step first performs an initial discretization of the input data, then repeats a bottom-up merging process until a termination condition is met. Each merging iteration consists of two steps: (1) compute the Chi-squared statistic for each pair of adjacent intervals, and (2) merge the pair of adjacent intervals with the lowest Chi-squared value. Merging continues until all pairs of adjacent intervals have Chi-squared values exceeding the threshold determined by the Chi-squared significance threshold parameter. For details, refer to the ChiMerge algorithm (Kerber, 1992).
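The merging loop described above can be sketched in a few lines of Python. This is a minimal illustration, not this step's actual implementation: the function names `chi2_pair` and `chimerge` are hypothetical, the data is assumed to be one numeric attribute with a class label per row, and the `threshold` argument is assumed to be the Chi-squared critical value already derived from the significance level (for example, 2.706 for a 0.10 level with two classes, i.e. one degree of freedom).

```python
from collections import Counter


def chi2_pair(a, b):
    """Chi-squared statistic for two adjacent intervals.

    `a` and `b` are Counters mapping class label -> number of examples.
    """
    n_a, n_b = sum(a.values()), sum(b.values())
    total = n_a + n_b
    stat = 0.0
    for c in set(a) | set(b):
        col = a[c] + b[c]  # examples of class c in both intervals
        for row_total, observed in ((n_a, a[c]), (n_b, b[c])):
            expected = row_total * col / total
            stat += (observed - expected) ** 2 / expected
    return stat


def chimerge(values, labels, threshold):
    """Return the lower bounds of the merged intervals (hypothetical helper).

    `threshold` is assumed to be a Chi-squared critical value, not a p-value.
    """
    # Initial discretization: one interval per distinct value.
    counts = {}
    for v, y in zip(values, labels):
        counts.setdefault(v, Counter())[y] += 1
    bounds = sorted(counts)
    intervals = [counts[v] for v in bounds]

    while len(intervals) > 1:
        # Step 1: Chi-squared statistic for every adjacent pair.
        stats = [chi2_pair(intervals[i], intervals[i + 1])
                 for i in range(len(intervals) - 1)]
        # Step 2: merge the pair with the lowest value, unless every
        # pair already exceeds the threshold (termination condition).
        i = min(range(len(stats)), key=stats.__getitem__)
        if stats[i] > threshold:
            break
        intervals[i] = intervals[i] + intervals[i + 1]
        del intervals[i + 1], bounds[i + 1]
    return bounds


cuts = chimerge([1, 2, 3, 4, 5, 6], ["a", "a", "a", "b", "b", "b"], 2.706)
print(cuts)  # → [1, 4]
```

In this toy input the two classes separate cleanly at the value 4, so the sketch ends with two intervals, one starting at 1 and one starting at 4. Each value can then be mapped to the index of the interval it falls in to obtain the discretized data.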

Input Parameters

  1. Input data
  2. Chi-squared significance threshold

Output Parameters

  1. Discretized data

Workflow

../../../_images/workflow42.svg

Algorithm

Chi-Squared Test
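
For reference, the statistic computed for a pair of adjacent intervals, as defined in the ChiMerge paper cited below, can be written as follows. The symbols are those of the paper rather than of this step's interface: k is the number of classes, A_ij is the number of examples in interval i that belong to class j, R_i is the number of examples in interval i, C_j is the number of examples of class j across both intervals, and N is the total number of examples in the two intervals.

```latex
\chi^2 = \sum_{i=1}^{2} \sum_{j=1}^{k} \frac{(A_{ij} - E_{ij})^2}{E_{ij}},
\qquad
E_{ij} = \frac{R_i \, C_j}{N},
\qquad
R_i = \sum_{j=1}^{k} A_{ij},
\quad
C_j = \sum_{i=1}^{2} A_{ij},
\quad
N = \sum_{j=1}^{k} C_j
```

A low value indicates that the class distribution is similar in the two intervals, which is why the pair with the lowest value is merged first.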

References

  • R. Kerber, ChiMerge: Discretization of Numeric Attributes, in Proceedings of the Tenth National Conference on Artificial Intelligence (AAAI-92), pp. 123-128, 1992.