(Master of Information Science)
- It differs from tasks such as classification and regression in that it detects things for which there are no previous examples.
- In terms of Predictive and Discovery-based Machine Learning, it is considered a discovery-based approach.
- It searches for outliers.
- It is also effective to use clustering to group past data and consider anything that does not fit into any group as an anomaly.
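
The clustering idea in the last point can be sketched roughly as follows. This is only a minimal illustration under several assumptions that are not in the notes: the cluster labels for past data are taken as given (from any clustering method), each cluster is summarized by a centroid and a radius, and the helper names `fit_clusters`, `is_anomaly`, and the `margin` parameter are hypothetical.

```python
import math

def fit_clusters(points, labels):
    """Summarize each cluster by its centroid and radius (max member distance)."""
    groups = {}
    for p, label in zip(points, labels):
        groups.setdefault(label, []).append(p)
    clusters = []
    for members in groups.values():
        # Centroid: coordinate-wise mean of the cluster members.
        centroid = tuple(sum(c) / len(members) for c in zip(*members))
        radius = max(math.dist(centroid, m) for m in members)
        clusters.append((centroid, radius))
    return clusters

def is_anomaly(point, clusters, margin=1.5):
    """True if the point lies outside margin * radius of every cluster.

    The margin value is an arbitrary assumption chosen for illustration.
    """
    return all(math.dist(point, c) > margin * r for c, r in clusters)

# Hypothetical past data with two groups; labels are assumed to be known.
past = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0),
        (8.0, 8.0), (8.0, 9.0), (9.0, 8.0), (9.0, 9.0)]
labels = [0, 0, 0, 0, 1, 1, 1, 1]
clusters = fit_clusters(past, labels)

fits_a_group = is_anomaly((0.5, 0.8), clusters)   # inside the first group
fits_no_group = is_anomaly((4.5, 4.5), clusters)  # between the two groups
```

A point that falls within some group's (slightly enlarged) radius is treated as normal, and a point that fits no group is flagged. How the groups are formed and how generous the margin is are design choices that depend on the data.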
(Master of Information Science) Machine Learning
- Anomaly Detection
- k-th NN
- First, measure the distance from the point being scored to every other point.
- Sort the distances in ascending order and use the distance to the k-th nearest point as the anomaly score.
- Why k-th?
- We want to find outliers.
- If k=1, outliers that lie close to each other can be missed: each one is the other's nearest neighbor, so both get a small score.
- So we use k=2, 3, or higher.
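
The scoring steps above, and the reason for preferring k greater than 1, can be sketched as follows. This is a minimal brute-force illustration, assuming Euclidean distance and a small in-memory dataset; the function name `kth_nn_scores` and the toy data are hypothetical.

```python
import math

def kth_nn_scores(points, k):
    """Return, for each point, the distance to its k-th nearest other point."""
    if not 1 <= k < len(points):
        raise ValueError("k must satisfy 1 <= k < number of points")
    scores = []
    for i, p in enumerate(points):
        # Distances from p to every other point (p itself is excluded), sorted ascending.
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        scores.append(dists[k - 1])
    return scores

# Hypothetical toy data: a tight group of four points,
# plus two outliers that sit close to each other.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0),
          (10.0, 10.0), (10.0, 11.0)]

scores_k1 = kth_nn_scores(points, k=1)
scores_k2 = kth_nn_scores(points, k=2)
```

With k=1 every point in this toy data gets the same score, because the two outliers are each other's nearest neighbor. With k=2 the two outliers get a much larger score than the points in the group, so they stand out.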