International Journal Of Engineering And Computer Science
With more than two billion pages created by millions of web page authors and organizations, the World Wide Web is a tremendously rich knowledge base. The knowledge comes not only from the content of the pages themselves, but also from the unique characteristics of the Web, such as its hyperlink structure and its diversity of content and languages. A large portion of the information on the World Wide Web (WWW) today is in the form of unstructured or semi-structured text databases. Manually extracting the information a user actually needs from this material is tedious.