International Journal of Emerging Trends & Technology in Computer Science (IJETTCS)
Deep web data extraction has recently become a challenging problem because the structured data in deep web pages is embedded in an intricate page structure. Consequently, the extraction of data from deep web pages has received much attention from researchers. In this research, a vector space model and content features are used for deep web data extraction. First, the extracted deep web pages are taken as input to the proposed method and a Document Object Model (DOM) tree is constructed. Using the DOM tree, the information in each web page is split into blocks, and each block, together with its contents, is passed to the feature computation process.
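The DOM construction and block-wise splitting described above can be illustrated with a minimal sketch. It is not the authors' implementation: the set of tags treated as blocks (`BLOCK_TAGS`), the rule that a block is a block-level element with no block-level descendants, and the names `Node`, `DomBuilder` and `split_into_blocks` are all assumptions made for illustration. Only the Python standard library is used.

```python
from html.parser import HTMLParser

# Assumption: the source does not say which elements count as blocks.
BLOCK_TAGS = {"div", "table", "ul", "ol", "p", "section", "article"}
# Elements that never have a closing tag, so they must not become the open node.
VOID_TAGS = {"br", "img", "hr", "input", "meta", "link"}


class Node:
    """One element of a simplified DOM tree (hypothetical helper)."""

    def __init__(self, tag, parent=None):
        self.tag = tag
        self.parent = parent
        self.children = []  # child Node objects and text strings, in document order

    def full_text(self):
        """Concatenate all text found in this subtree."""
        parts = []
        for child in self.children:
            parts.append(child if isinstance(child, str) else child.full_text())
        return " ".join(p for p in parts if p)


class DomBuilder(HTMLParser):
    """Builds a Node tree from HTML markup."""

    def __init__(self):
        super().__init__()
        self.root = Node("#document")
        self.current = self.root

    def handle_starttag(self, tag, attrs):
        node = Node(tag, self.current)
        self.current.children.append(node)
        if tag not in VOID_TAGS:
            self.current = node

    def handle_endtag(self, tag):
        # Walk up to the nearest open element with this tag; ignore stray end tags.
        node = self.current
        while node is not None and node.tag != tag:
            node = node.parent
        if node is not None and node.parent is not None:
            self.current = node.parent

    def handle_data(self, data):
        text = data.strip()
        if text:
            self.current.children.append(text)


def has_block_descendant(node):
    for child in node.children:
        if isinstance(child, Node) and (
            child.tag in BLOCK_TAGS or has_block_descendant(child)
        ):
            return True
    return False


def split_into_blocks(html):
    """Return (tag, text) pairs for the lowest-level block elements.

    Simplification: text placed directly inside a block that also contains
    nested blocks is not reported.
    """
    builder = DomBuilder()
    builder.feed(html)
    builder.close()
    blocks = []

    def visit(node):
        if node.tag in BLOCK_TAGS and not has_block_descendant(node):
            text = node.full_text()
            if text:
                blocks.append((node.tag, text))
            return
        for child in node.children:
            if isinstance(child, Node):
                visit(child)

    visit(builder.root)
    return blocks


sample_html = (
    "<html><body><div><p>Title A</p><p>Price: 10</p></div>"
    "<div>Footer <b>text</b></div></body></html>"
)
print(split_into_blocks(sample_html))
# [('p', 'Title A'), ('p', 'Price: 10'), ('div', 'Footer text')]
```

Each returned pair corresponds to one block and its contents, which is the unit that would then be handed to a feature computation step. A production system would more likely rely on a full HTML parser and a block definition tuned to the target pages.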