Anomaly detection, also called outlier detection, identifies unexpected events, observations, or items that differ significantly from the norm. Often applied to unlabeled data by data scientists in a process called unsupervised anomaly detection, any type of anomaly detection rests upon two basic assumptions:
- Anomalies in data occur very rarely
- The features of data anomalies are significantly different from those of typical instances
Typically, anomalous data is linked to some sort of problem or rare events such as hacking, bank fraud, malfunctioning equipment, structural defects/infrastructure failures, or textual errors. For this reason, identifying actual anomalies rather than false positives or data noise is essential from a business perspective.