Excerption of User Profile from Web Log Data using Hadoop Framework
With the rapid growth of the Internet, e-commerce websites now routinely have to work with log datasets of up to a few terabytes in size. Removing messy data promptly and at low cost, and finding the useful information, is a problem users have to face. The mining process involves several steps, from pre-processing the raw data to establishing the final models. To address the problem of extracting and maintaining a very large number of user profiles from large-scale data, the authors first describe the different scalable implementations of the proposed framework. They then discuss the challenges they faced in the implementation. Finally, they show how Hadoop can be used as an efficient solution to the problem.