Provided by: International Journal of Advanced Research in Computer Science and Software Engineering (IJARCSSE)
Topic: Big Data
Date Added: Apr 2014
Learning data sampled from a non-stationary distribution has been shown to be a very challenging problem in machine learning, because the joint probability distribution between the data and classes changes over time. Most real time problems as they change with time can suffer concept drift. For example, a recommender or advertising system, in which customer's behavior may change depending on the time of the year, on the inflation and on new products made available. An additional challenge arises when the classes to be learned are not represented equally in the training data i.e. classes are imbalanced, as most machine learning algorithms work well only when the class distributions are balanced.