I recently ran into an interesting new type of malicious code, and I wanted to share it.
A friend of mine asked me to help him with a little bit of PHP coding. He understands a bit about programming, but he has never used it for anything complex and does not do it very often. He sent me the script he had written, and I started tinkering with it. The more I played with it, the more I changed, and it soon ended up being a ground-up rewrite of his code.
The script itself had fairly simple logic: parse the Apache access log, find other pages that have referred visitors to this particular page, and create a link page with the number of referrals. While I was debugging, though, something odd happened. I was dumping the raw log output to the browser, and all of a sudden I was redirected to another web site!
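The logic described above can be sketched roughly as follows. This is a minimal illustration in Python rather than the PHP of the original script, which is not shown here. It assumes the log uses Apache's "combined" format, and the names `LOG_LINE` and `count_referrers` are made up for this example.

```python
import re
from collections import Counter

# Assumed layout (Apache "combined" format):
# host ident user [time] "request" status bytes "referer" "user-agent"
# Quoted fields may contain backslash-escaped characters.
QUOTED = r'"((?:[^"\\]|\\.)*)"'
LOG_LINE = re.compile(
    r'^\S+ \S+ \S+ \[[^\]]+\] ' + QUOTED + r' \d{3} \S+ ' + QUOTED + r' ' + QUOTED + r'$'
)

def count_referrers(lines, target_path):
    """Count how often each referrer sent a visitor to target_path."""
    counts = Counter()
    for line in lines:
        match = LOG_LINE.match(line.strip())
        if not match:
            continue  # skip lines that do not fit the assumed format
        request, referer, _user_agent = match.groups()
        parts = request.split()
        # A request line looks like: METHOD PATH PROTOCOL
        if len(parts) < 2 or parts[1] != target_path:
            continue
        if referer and referer != "-":
            counts[referer] += 1
    return counts
```

Note that the referrer string comes straight from whoever made the request, so everything in `counts` is still untrusted data at this point.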
What made this attack extremely interesting to me is that it did not actually attack a particular piece of software, nor did it care what OS I used or anything else. All it needed was for someone to run code that did not validate data. Indeed, it is a very common developer misconception that data, once it is in a database, is clean and does not need to be validated on its way out of the database. This is the real lesson here. I can put all of the input validation I want into my program, but if someone else's software also accesses the same database and does not properly validate data, my validation might as well not exist if I assume that the data is valid when I use it.
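One way to act on this lesson is to treat stored data as untrusted at the moment it is written into a page. The sketch below is in Python, not the PHP of the original script, and uses the standard library's `html.escape` and `urllib.parse.urlsplit`. The function name `safe_referrer_item` is hypothetical. It also assumes the dangerous value arrived in a request field such as the referrer, which is not spelled out above.

```python
from html import escape
from urllib.parse import urlsplit

def safe_referrer_item(referer, hits):
    """Render one referrer as an HTML list item, treating it as untrusted."""
    # Escape on output, no matter where the value was stored or who wrote it.
    text = escape(referer, quote=True)
    try:
        scheme = urlsplit(referer).scheme.lower()
    except ValueError:
        scheme = ""  # malformed URL: show it as plain text only
    # Only make a clickable link for ordinary web URLs (assumed policy).
    if scheme in ("http", "https"):
        return f'<li><a href="{text}">{text}</a> ({int(hits)})</li>'
    return f"<li>{text} ({int(hits)})</li>"
```

With a hypothetical hostile value such as `"><script>location="http://evil.example"</script>`, the angle brackets and quotes come out as HTML entities, so the browser displays the text instead of running it. Escaping at output time is a deliberate choice here: it still works when some other program wrote the data without validating it.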
This is yet another reason why I am down on Web applications; it is the only kind of system I can think of in which input from one user is presented to another user in a way that the second user's computer will parse, interpret, and maybe even execute the first user's input, outside of the control of the developer. In thin client and desktop application computing, the programmer has total and complete control over the presentation layer and what occurs there. In a Web application, the presentation layer is a complete no-man's land. There is no telling what will happen there. Data that is harmless today may become dangerous tomorrow if some new technology gets added to the browser and introduces a vulnerability. One example would be allowing users to post videos online; if there is a buffer overflow problem in the user's media player of choice, then you (the programmer) are giving malicious users a tool to attack other users. Web services are just as bad, particularly when using an