Hampel Identifier

The Hampel Identifier algorithm decides whether a value lies outside the region of interest based on that value's distance from the median of the data. Generally, if

\[|Y_i - \text{Median}(Y)| > 3 \cdot \text{MAD} \, \text{,}\]

the data point \(Y_i\) is considered outside the region of interest. Here, \(\text{Median}(Y)\) is the median of \(Y\), \(Y_i\) is the \(i\)th element of \(Y\), and \(\text{MAD}\) is the Median Absolute Deviation of \(Y\).
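The criterion above can be sketched in a few lines of Python with NumPy. This is a minimal illustration under stated assumptions, not the reference implementation: the function name `hampel_identifier` and the `threshold` parameter are hypothetical, the MAD is taken as the unscaled median of absolute deviations from the median (some definitions multiply it by a consistency constant of about 1.4826), and the result is returned as a boolean mask because the text does not specify how flagged values are marked.

```python
import numpy as np


def hampel_identifier(y, threshold=3.0):
    """Return a boolean mask that is True where |Y_i - Median(Y)| > threshold * MAD.

    Hypothetical helper for illustration; names and defaults are assumptions.
    """
    y = np.asarray(y, dtype=float)
    median = np.median(y)
    # Assumption: unscaled MAD, i.e. the median of the absolute deviations
    # from the median. A scaled variant would multiply this by ~1.4826.
    mad = np.median(np.abs(y - median))
    return np.abs(y - median) > threshold * mad


# Example data: one value (50) lies far from the rest.
y = np.array([10, 11, 9, 12, 10, 50, 11], dtype=float)
mask = hampel_identifier(y)

# One possible way to "mark" flagged values (an assumption): replace with NaN.
y_marked = np.where(mask, np.nan, y)
```

For this sample the median is 11 and the unscaled MAD is 1, so the threshold is 3 and only the value 50 (deviation 39) is flagged. Note that if more than half of the values are identical, the MAD is 0 and every value that differs from the median is flagged.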

Input Parameters

Parameter | Type | Constraint | Description | Remarks
\(Y\) | \(Y \in \mathbb R^N\) | \(N \in \mathbb{N}\) | Input data sequence of length \(N\) | None

Output Parameters

Parameter | Type | Constraint | Description | Remarks
\(\hat{Y}\) | \(\hat{Y} \in \mathbb R^N\) | None | Values of \(Y\) that lie outside the region of interest are marked | None

Single Steps using the Algorithm

References

  • R.K. Pearson, Mining Imperfect Data, Society for Industrial and Applied Mathematics, Philadelphia, PA, 2005.