Your Friendly Neighborhood Web Crawler: A Guide to Crawling the Web With SAS

Date Added: Mar 2010
Format: PDF

The World Wide Web holds a plethora of information: from stock quotes to movie reviews, market prices to trending topics, almost anything can be found at the click of a button. Many SAS users are interested in analyzing data found on the Web, but how does one get this data into the SAS environment? Various methods are available, such as designing your own Web crawler in SAS DATA step code or utilizing the %TMFILTER macro in SAS Text Miner. This paper reviews the general architecture of a Web crawler, discusses methods of getting Web information into SAS, and examines code from an experimental internal project called SAS Search Pipeline.