Science and Development Network (SciDev.Net)
The authors present a simple web search engine for indexing and searching HTML documents, written in the Python programming language. Because Python is well known for its simple syntax and strong support on the major operating systems, they hope it will be useful for learning information retrieval techniques, especially web search engine technology. Many papers in the web Information Retrieval (IR) field use their own web crawlers to crawl, index, and analyze the contents of pages (including hyperlink texts) and the network structure of the web. Sometimes a search function that returns relevant pages in response to users' queries is also provided. The crawler and the search function are considered the fundamental components of a search engine, and each has its own research challenges and problems.
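To make the two components concrete, the following is a minimal sketch of a crawler that builds an inverted index and a search function that queries it, using only the Python standard library. It is not the authors' implementation. The names `PageParser`, `crawl_and_index`, `search`, and the in-memory `PAGES` dictionary are assumptions introduced for illustration; a real crawler would fetch pages over HTTP and would need to handle relative URLs, politeness rules, and ranking, none of which are shown here.

```python
from collections import defaultdict, deque
from html.parser import HTMLParser
import re


class PageParser(HTMLParser):
    """Collects text content and outgoing links from one HTML document."""

    def __init__(self):
        super().__init__()
        self.text_parts = []
        self.links = []

    def handle_starttag(self, tag, attrs):
        # Record the target of every <a href="..."> element.
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)

    def handle_data(self, data):
        # Text between tags, including hyperlink (anchor) text.
        self.text_parts.append(data)


def crawl_and_index(pages, start):
    """Breadth-first crawl over `pages` (url -> HTML), building an inverted index."""
    index = defaultdict(set)   # term -> set of URLs containing the term
    seen = {start}
    queue = deque([start])
    while queue:
        url = queue.popleft()
        html = pages.get(url)
        if html is None:       # dangling link: nothing to index
            continue
        parser = PageParser()
        parser.feed(html)
        parser.close()
        text = " ".join(parser.text_parts).lower()
        for term in re.findall(r"[a-z0-9]+", text):
            index[term].add(url)
        for link in parser.links:
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return index


def search(index, query):
    """Return the sorted URLs containing every query term (boolean AND)."""
    terms = re.findall(r"[a-z0-9]+", query.lower())
    if not terms:
        return []
    results = set(index.get(terms[0], set()))
    for term in terms[1:]:
        results &= index.get(term, set())
    return sorted(results)


# Hypothetical in-memory "web" standing in for pages fetched over HTTP.
PAGES = {
    "a.html": '<html><body>Python search engine <a href="b.html">crawler notes</a></body></html>',
    "b.html": '<html><body>A web crawler written in Python <a href="a.html">home</a></body></html>',
    "c.html": "<html><body>Unlinked page about Python</body></html>",
}

index = crawl_and_index(PAGES, "a.html")
print(search(index, "python crawler"))
```

In this sketch, `c.html` never enters the index because no crawled page links to it, which illustrates how the link structure of the web determines what a crawler can reach. The search function performs only an unranked boolean match; ordering results by relevance is a separate problem.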