Anomaly detection is one of the more difficult areas of machine learning, but it is also one of the most important.
How do you predict an event that occurs only 0.1% of the time, or even 0.01% of the time? I
have an opportunity to dive into a problem that is even more difficult: hard drive failure.
Because drives fail so rarely, anomaly detection will be our modus operandi.
This will allow us to estimate the probability that a particular hard drive will fail.
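
To see why such rare events are hard to predict, here is a minimal sketch in plain Python. The fleet size of 100,000 drives and the exact 0.1% failure rate are assumptions chosen for illustration, not real data. It shows that a model which never predicts a failure still scores 99.9% accuracy while catching none of the failing drives.

```python
# Hypothetical fleet: 100,000 drives with a 0.1% failure rate (illustrative numbers only).
n_drives = 100_000
n_failed = n_drives // 1000  # 0.1% of the fleet

# Ground truth: 1 = failed, 0 = healthy.
labels = [1] * n_failed + [0] * (n_drives - n_failed)

# Naive baseline that predicts "healthy" for every drive.
predictions = [0] * n_drives

# Accuracy: fraction of drives labeled correctly.
correct = sum(p == y for p, y in zip(predictions, labels))
accuracy = correct / n_drives

# Recall: fraction of actual failures the baseline caught.
caught = sum(p == 1 and y == 1 for p, y in zip(predictions, labels))
recall = caught / n_failed

print(f"accuracy = {accuracy:.3f}")
print(f"recall   = {recall:.3f}")
```

Accuracy alone looks excellent here, which is why a problem like this is usually judged by how many of the rare failures are actually flagged.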