We want the drift detector to be robust to outliers.
It should not raise an alarm over a single broken input, for example, but react only once we see "enough" changes overall.
This can often be achieved by picking the right statistical test to compare the data distributions for the individual features: e.g., Kolmogorov–Smirnov, Anderson–Darling, or Chi-squared. Many nuances still apply as to which test to pick when; we are working on selecting reasonable defaults for the evidently open-source library to address this.
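For illustration, here is what such a per-feature check might look like with a two-sample Kolmogorov–Smirnov test from scipy. This is only a sketch: the feature data and the 0.05 significance threshold are example assumptions, not the defaults evidently will ship with.

```python
# A minimal sketch of per-feature drift detection with a two-sample
# Kolmogorov-Smirnov test. The data and the 0.05 threshold are
# illustrative assumptions.
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)

# Reference data (e.g., the training set) vs. current production data.
reference = {"feature_a": rng.normal(0, 1, 1000)}
current = {"feature_a": rng.normal(0.5, 1, 1000)}  # shifted mean -> drift

for name in reference:
    statistic, p_value = stats.ks_2samp(reference[name], current[name])
    drift_detected = p_value < 0.05  # hypothetical decision threshold
    print(f"{name}: KS statistic={statistic:.3f}, p={p_value:.4f}, "
          f"drift={drift_detected}")
```

Because the test operates on the whole sample, a single broken input barely moves the statistic, which is exactly the robustness we want here.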
On the other hand, we want the outlier detector to be sensitive. It should raise alarms when individual objects look "strange," even before changes accumulate and reach a critical mass. Here, we will likely opt for a different approach, such as the isolation forest algorithm or a one-class SVM.
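As a sketch, flagging individual objects with scikit-learn's isolation forest could look like the following; the toy data and the contamination rate are illustrative assumptions.

```python
# A minimal sketch of flagging individual "strange" inputs with an
# isolation forest. The data and contamination rate are illustrative.
import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(0)
train = rng.normal(0, 1, size=(1000, 3))           # "normal" reference data
incoming = np.vstack([rng.normal(0, 1, size=(5, 3)),
                      [[8.0, -9.0, 10.0]]])        # one clearly odd object

detector = IsolationForest(contamination=0.01, random_state=0).fit(train)
labels = detector.predict(incoming)                # +1 = inlier, -1 = outlier
for row, label in zip(incoming, labels):
    if label == -1:
        print("Outlier flagged:", np.round(row, 2))
```

Unlike the distribution test above, this fires on a single odd object, which is the sensitivity we are after.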
Outlier detection is often important when the cost of a single model mistake is high. In this case, we will likely tolerate some false positive alerts and perform extra spot-checking.