Outlier Detection - Basics

Outlier detection refers to detecting patterns in a given data set that do not conform to established normal behaviour. Be extra cautious before declaring a data point an outlier, as there might be an explanation for its value.

An outlier is actually an outlier only if there is no explanation for it to stand out.

Q3.  How to detect these outliers?

The choice of detection method depends on the requirement and purpose. There are multiple ways to detect outliers, which can be broadly classified as:

Univariate Methods: The outliers in a variable are detected based on the pattern of the variable itself, not on any external factor (variable). These methods are useful and simple in approach; however, they ignore any possible explanation for the "supposed outlier".
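To make the univariate idea concrete, here is a minimal sketch using one common rule, the interquartile range (IQR) fences. The choice of this rule, the function name `iqr_outliers`, the multiplier `k=1.5` and the sample ages are all assumptions made for illustration, not something prescribed above.

```python
import statistics

def iqr_outliers(values, k=1.5):
    """Return the values lying outside [Q1 - k*IQR, Q3 + k*IQR]."""
    # Quartiles of the variable itself; no other variable is consulted.
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < lower or v > upper]

# Hypothetical sample: ages of customers.
ages = [23, 25, 27, 28, 30, 31, 33, 35, 90]
print(iqr_outliers(ages))  # [90]
```

The rule flags 90 purely because it is far from the bulk of the ages. Whether it is a data entry error or a genuine 90-year-old customer is exactly the question a univariate rule cannot answer.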

Multivariate Methods: The outliers in a variable are detected by considering it together with other variables, and hence the aspect of possible explanation is retained. Multivariate methods are more complex, but quite robust in nature.
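One simple way to sketch the multivariate idea is to fit a straight line of one variable on another and flag only the points the line cannot explain. This is an illustrative assumption rather than a method named above: the function name `residual_outliers`, the threshold of 2.5 and the experience/salary figures are all hypothetical.

```python
from statistics import mean, pstdev

def residual_outliers(x, y, threshold=2.5):
    """Return indices of points whose residual from a least-squares
    line of y on x exceeds `threshold` residual standard deviations."""
    mx, my = mean(x), mean(y)
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    # Residual = the part of y that x does not explain.
    residuals = [b - (intercept + slope * a) for a, b in zip(x, y)]
    scale = pstdev(residuals)
    return [i for i, r in enumerate(residuals) if abs(r) > threshold * scale]

# Hypothetical sample: years of experience and salary (in thousands).
experience = [1, 2, 3, 4, 5, 6, 7, 8, 20, 5]
salary = [30, 35, 40, 45, 50, 55, 60, 65, 125, 85]
print(residual_outliers(experience, salary))  # [9]
```

In this sample the salary of 125 stands out on its own, but it is in line with 20 years of experience and is not flagged. The salary of 85 looks unremarkable on its own, yet it is flagged because 5 years of experience does not explain it. Note that a single extreme point also pulls the fitted line and inflates the residual spread, so a production-grade approach would usually use a more robust fit.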

During a modeling exercise, we need to use both kinds of methods at various stages. We will cover this part later in the article.

The following are the methods of outlier detection, grouped by the categories mentioned above:
