machine-learning training-data anomaly-detection

Anomaly detection with machine learning without labels

I am tracing multiple signals over a certain period of time and associating each sample with a timestamp, as follows:

t0 1 10 2 0 1 0 ...
t1 1 10 2 0 1 0 ...
t2 3  0 9 7 1 1 ... // pressed a button to change the mode
t3 3  0 9 7 1 1 ...
t4 3  0 8 7 1 1 ... // pressed a button to adjust a certain characteristic like temperature (signal 3)

where t0 is the timestamp, 1 is the value of signal 1, 10 the value of signal 2, and so on.

The data captured during that period should be considered the normal case. Significant deviations from this normal case should then be detected. By significant deviation I do NOT mean that a single signal changes to a value that was not seen during the tracing phase, but rather that many values change into a combination that has not been seen together before. I do not want to hardcode rules, since more signals might be added or removed in the future, and other modes with different signal values might be implemented.

Can this be achieved with a machine learning algorithm? If a small deviation occurs, I want the algorithm to first treat it as a minor change relative to the training set, and if it occurs multiple times in the future it should be "learned". The main goal is to detect the bigger changes / anomalies.

I hope I have explained my problem in enough detail. Thanks in advance.

Solution

You could just compute the nearest neighbor in your feature space and set a threshold on how far away it is allowed to be from your test point before that point counts as an anomaly.

Let's say you have 100 values in your "certain period of time".

Then you use a 100-dimensional feature space with your training data (which doesn't contain anomalies).

When you get a new sample you want to test, you find its k nearest neighbor(s) in the training data and compute the (e.g. Euclidean) distance to them in your feature space.

If that distance is larger than a certain threshold, it's an anomaly. To tune this, you have to find a good k and a good threshold, e.g. by grid search.
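The distance check described above can be sketched in a few lines of plain Python. This is only a minimal illustration under some assumptions: each row of signal values is treated as one point (for a whole fixed-length trace you would flatten it into one long vector first), the score is the mean distance to the k nearest training rows, and the names `knn_distance`, `is_anomaly` and the threshold of 3.0 are made up for the example. For real data a library such as scikit-learn's `NearestNeighbors` would probably be the better choice.

```python
import math

def knn_distance(train, point, k=1):
    """Mean Euclidean distance from `point` to its k nearest training rows."""
    dists = sorted(math.dist(row, point) for row in train)
    return sum(dists[:k]) / k

def is_anomaly(train, point, k=1, threshold=3.0):
    """Flag `point` if it lies farther than `threshold` from the training data.

    k and threshold are arbitrary illustration values, not tuned ones.
    """
    return knn_distance(train, point, k) > threshold

# Training rows taken from the trace in the question (timestamps dropped).
train = [
    [1, 10, 2, 0, 1, 0],
    [1, 10, 2, 0, 1, 0],
    [3, 0, 9, 7, 1, 1],
    [3, 0, 9, 7, 1, 1],
    [3, 0, 8, 7, 1, 1],
]

small_change = [3, 0, 7, 7, 1, 1]  # one signal moved by 1
big_change = [1, 0, 9, 0, 0, 1]    # several signals in an unseen combination

print(is_anomaly(train, small_change))  # small deviation, stays below threshold
print(is_anomaly(train, big_change))    # far from every training row
```

Without labeled anomalies there is nothing to grid-search against directly, so one option is to look at the distances that held-out normal data produces and put the threshold somewhat above them.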

(1) Note that something like this probably only works well if your data has a fixed start and end point. Otherwise you would need a huge amount of data, and even then it will not perform as well.

(2) Note that it may be worth creating a separate detector for every "mode" you mentioned in your question.
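A rough sketch of what note (2) could look like: the training rows are split by mode, and a new point is only compared against rows of its own mode. It assumes, purely for illustration, that signal 1 (index 0) identifies the mode; the question does not say which signal does. `fit_per_mode`, `is_anomaly` and the threshold are hypothetical names and values.

```python
import math
from collections import defaultdict

def fit_per_mode(rows, mode_index=0):
    """Group training rows by the value of the (assumed) mode signal."""
    by_mode = defaultdict(list)
    for row in rows:
        by_mode[row[mode_index]].append(row)
    return dict(by_mode)

def is_anomaly(detectors, point, mode_index=0, threshold=3.0):
    """Compare `point` only against training rows that share its mode."""
    train = detectors.get(point[mode_index])
    if train is None:
        return True  # a mode that never occurred during training
    nearest = min(math.dist(row, point) for row in train)
    return nearest > threshold

# Training rows taken from the trace in the question (timestamps dropped).
rows = [
    [1, 10, 2, 0, 1, 0],
    [1, 10, 2, 0, 1, 0],
    [3, 0, 9, 7, 1, 1],
    [3, 0, 9, 7, 1, 1],
    [3, 0, 8, 7, 1, 1],
]
detectors = fit_per_mode(rows)

print(is_anomaly(detectors, [3, 0, 7, 7, 1, 1]))  # close to the mode-3 rows
print(is_anomaly(detectors, [1, 0, 9, 7, 1, 1]))  # mode 1 with mode-3 values
```

The second test point is close to a mode-3 training row, so a single detector over all rows would let it pass. The per-mode detector flags it because it looks nothing like the rows recorded in mode 1.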