Content Based Web Sampling

Provided by: AICIT
Topic: Big Data
Format: PDF
Web characterization methods have been studied for many years. Most of these methods focus on text-based web content. Some of them analyze a web page through its HTML code, hyperlinks, and/or DOM structure. A web page is seldom characterized by its visual appearance. A good reason to also consider the visual appearance of a web page is that humans initially perceive the page as an image, and only then look in detail at its text and pictorial content. Hence, visual appearance offers a more natural way to analyze and classify the content of web pages.
