Date Added: Nov 2010
As the prevalence of blogs, discussion forums, and online news services continues to grow, so too does the portion of this Web content that relates to health and medicine. The authors propose that everyday, medically oriented Web content, although noisy, is a valuable and viable data source for generating and testing medical hypotheses. In this paper, they present a proof-of-concept system that supports this notion. They construct a corpus of news articles published between 1998 and 2002 that relate to the drugs Vioxx, Naproxen and Ibuprofen.