There have been lots of good posts so far about the ASP.NET vulnerability that was unveiled late Friday. In summary, the attack exposes cryptographic keys used in ASP.NET, which can lead to all sorts of nastiness.
I recommend you read and apply the advice from Scott Guthrie and then come back and read more.
Microsoft is releasing very little information on this issue until they have a fix ready. As such, some of the information here is based on my best guess of how it actually works. Although these are assumptions, the advice on how to avoid these attacks is not.
What I have inferred from the given workaround is that there are two components to this vulnerability. The first is an information disclosure vulnerability: certain requests get different error codes in response. The second is a side channel attack: how long it takes to yield a certain error code reveals further information.
Information Disclosure Vulnerabilities
An information disclosure vulnerability is just what the name implies: the system discloses something that helps an attacker. The classic example of this is a login system that states “Invalid Username” when you use a username that does not exist and “Invalid Password” when you use a username that DOES exist but the wrong password. The system should just give you a generic error message and not reveal the reason for the login failure.
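To make the login example concrete, here is a minimal sketch in Python. The `USERS` store, the function names, and the message text are all made up for illustration, and a real system would store password hashes rather than plain passwords.

```python
# Hypothetical in-memory user store, for illustration only.
# A real system would store salted password hashes, not plain text.
USERS = {"alice": "correct horse"}

GENERIC_ERROR = "Invalid username or password"


def leaky_login(username, password):
    """Discloses whether the username exists (the pattern to avoid)."""
    if username not in USERS:
        return "Invalid Username"
    if USERS[username] != password:
        return "Invalid Password"
    return "OK"


def safe_login(username, password):
    """Returns the same message for every kind of login failure."""
    stored = USERS.get(username)
    if stored is not None and stored == password:
        return "OK"
    return GENERIC_ERROR
```

With `leaky_login`, an attacker can tell valid usernames from invalid ones just by reading the message. With `safe_login`, both failures look identical.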
Side Channel Attacks
The attack is an example of a side channel attack. A side channel attack is when an attacker uses clues leaked by a system to determine information instead of brute forcing it. The most common form is a timing attack, in which an attacker looks at how long something takes to determine what is happening. Other side channels that have been used include the power consumption of systems, the sound the systems make, and the electromagnetic radiation they give off. Granted, timing attacks are the simplest to execute remotely (but network speed variability can introduce inaccurate timings).
A real world example of a timing attack would be my magic toaster. It has different settings for toast, bagels, waffles, and black pucks. I can also tweak the darkness of the item with a dial. If I knew what time something was put into the toaster and what time it popped up, I could probably determine the setting that was used on the toaster. With more refinement of my attack I could probably tell what the darkness dial was set to as well. This attack is not precise, though. Toast on 10 and Bagel on 1 might take the same amount of time (or close to it). Plus there may be differences in the materials inserted that affect timing (e.g. thin bread or thick bread).
As you can see, side channel attacks do not (usually) net you what you are looking for with ease 100% of the time. What they really do is narrow down the possibilities of what could be happening. If I said guess a number between 1 and 1,000,000,000, it would take a while to get it. If I said guess a number between 1 and 1,000,000,000, but the number is 7, 34, or 2 million... well, that makes it a lot easier.
As described in Scott’s post, the workaround is to address these two issues. Turning custom errors on (which you should have anyway) and redirecting all error codes to a single page should address the information disclosure vulnerability. The page that Scott shows also adds a random delay to the response to mitigate the timing side channel component.
As this attack is used to determine encryption keys, I would say that encrypted data is at risk of being decrypted. This would be things like viewstate, cookies, forms tickets, membership data, and more. As this attack is out there and should be fairly easy to automate, I would consider my keys as most likely compromised. It appears the attack can run fairly quickly too (the example I saw got the key in 5 minutes).
To this end, once you have applied the workaround, it may be a good time to update the keys used in your systems.
Mitigating Information Disclosure In Your Code
Combating information disclosure of exceptions is fairly simple. In a client server environment I take any error the server throws, log it (with as much information as possible), and then return a generic error. In this case it may be that a 404 (page not found) error is returned in one case and a 500 (internal server error) in another. Normally this is pretty standard behavior, but it really does not help the user. In the end they wanted something and it did not happen, so a generic error is sufficient. This can be achieved by setting a standard error page that hides the HTTP error code from the user.
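The log-everything, return-generic pattern can be sketched like this in Python. The `handle_request` wrapper, the `GENERIC_FAILURE` response shape, and the logger name are assumptions made for the example, not part of any particular framework.

```python
import logging

logger = logging.getLogger("app.errors")

# The one response every failure maps to, whatever actually went wrong.
GENERIC_FAILURE = {"ok": False, "message": "Something went wrong."}


def handle_request(handler, *args):
    """Run handler; on any error, log the details and return a generic response."""
    try:
        return {"ok": True, "body": handler(*args)}
    except Exception:
        # The full traceback goes to the server log, never to the client.
        logger.exception("Unhandled error in %s", getattr(handler, "__name__", handler))
        return dict(GENERIC_FAILURE)


def missing_page(_path):
    raise KeyError("no such page")          # would normally surface as a 404


def broken_page(_path):
    raise ValueError("padding is invalid")  # would normally surface as a 500
```

Both `handle_request(missing_page, "/a")` and `handle_request(broken_page, "/a")` return the same dictionary, so the caller cannot tell the two failure modes apart, while the log still records exactly what happened.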
Mitigating Timing Side Channel Attacks In Your Code
The timing side channel attack is tricky to spot, as it is not a typical attack vector, but it is fairly easy to fix. If you have code that can reveal information based on the time taken, it may be practical to add a random wait time to responses. Going back to the login scenario, it may take 2-5ms to determine if the user is in the database and then an additional 2-3ms to hash the password and compare it if the user exists. If the process takes 2-5ms before returning a login failure, we see that the user probably did not exist. If it takes 7-10ms, the user probably exists but the password was wrong. By adding a Thread.Sleep(cryptographically_random*) if the user is not found, we can simulate the amount of time taken to hash and compare the password.
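From the attacker's side, the reasoning above could look something like this Python sketch. The thresholds are the illustrative 2-5ms and 7-10ms figures from the paragraph, not measurements from a real system, and taking the median of several samples is just one assumed way to smooth out network jitter.

```python
import statistics


def classify_login_timing(samples_ms):
    """Guess what happened on the server from response times in milliseconds."""
    typical = statistics.median(samples_ms)
    if 2 <= typical <= 5:
        return "user probably does not exist"
    if 7 <= typical <= 10:
        return "user probably exists, wrong password"
    return "inconclusive"
```

For example, `classify_login_timing([3.1, 4.0, 2.5])` points at a missing user, while `classify_login_timing([8.0, 9.5, 7.2])` points at a valid username with a bad password.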
But wait! There’s a catch. If we wait longer than the time it would take to hash the password, we have just created the same problem. For example, if we sleep for 0-20ms, an attacker may know that 7-10ms means we have the right user, and that a time elsewhere in the 5-20ms range means the user was not found (but a sleep was added to lengthen the process). So adding wait time created the same issue we were trying to prevent! The most encompassing solution would be to wait on both success and failure in this case, so that the time it takes to execute is variable no matter what the outcome. This may negatively affect performance, though, as you have threads sleeping all over.
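Waiting on both success and failure might be sketched as a small wrapper like the one below. The name `with_random_delay`, the 20ms ceiling, and the injectable `sleep` parameter (there only to make the wrapper easy to test) are assumptions for the example. Note that random noise blurs the timing signal rather than removing it, so treat this as a mitigation and not a guarantee.

```python
import secrets
import time

# SystemRandom draws from the operating system's cryptographic random source.
_rng = secrets.SystemRandom()


def with_random_delay(func, max_delay_ms=20.0, sleep=time.sleep):
    """Wrap func so every call is followed by a random sleep, whatever the outcome."""
    def wrapper(*args, **kwargs):
        try:
            return func(*args, **kwargs)
        finally:
            # Runs on success, on failure, and when func raises.
            sleep(_rng.uniform(0.0, max_delay_ms) / 1000.0)
    return wrapper
```

A login check wrapped as `check = with_random_delay(check_login)` then sleeps for up to 20ms after every attempt, not only after the "user not found" branch.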
RANDOM IDEA AS I WRITE THIS POST: Every computer is different, so hardcoding that success takes a-b milliseconds and failure takes x-y milliseconds does not work; the software may be installed in many different places (or moved to faster systems in the future). It may be an idea to build a profiler around a sensitive method call that monitors the time it takes and then ensures that subsequent calls fall within that range.
*It is important not to just use Random, as the sequence of Random can be predicted. A crypto level random number generator produces values that are much harder to predict.
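One possible reading of that profiler idea, sketched in Python: remember the slowest run seen so far and pad every faster run up to it. The `TimingEqualizer` class is hypothetical, and padding to the slowest observed time is just one interpretation of "fall within that range". The injectable `clock` and `sleep` are there only so the behavior can be checked without real waiting.

```python
import time


class TimingEqualizer:
    """Pads calls to a sensitive function up to the slowest duration seen so far."""

    def __init__(self, clock=time.perf_counter, sleep=time.sleep):
        self._clock = clock
        self._sleep = sleep
        self.slowest = 0.0  # longest observed duration, in seconds

    def call(self, func, *args, **kwargs):
        start = self._clock()
        try:
            return func(*args, **kwargs)
        finally:
            elapsed = self._clock() - start
            self.slowest = max(self.slowest, elapsed)
            remaining = self.slowest - elapsed
            if remaining > 0:
                # Hide the fast path by sleeping off the difference.
                self._sleep(remaining)
```

Because the target is learned at runtime, nothing about the hardware is hardcoded. The first few calls are still unprotected until a slow path has been observed, so a real version would probably want a warm-up step.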
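The same distinction exists in Python, which makes for a quick illustration. The helper names are made up for the example. Seeding is only the simplest way to show that an ordinary generator's output is fully determined by its internal state; an attacker who recovers that state can predict every later value.

```python
import random
import secrets


def predictable_delays(seed, count):
    """Delays from an ordinary PRNG: the same seed always gives the same sequence."""
    rng = random.Random(seed)
    return [rng.randint(0, 20) for _ in range(count)]


def unpredictable_delay_ms(max_ms=20):
    """A delay in [0, max_ms] drawn from the OS cryptographic random source."""
    return secrets.randbelow(max_ms + 1)
```

Calling `predictable_delays(1234, 5)` twice returns the identical list both times, whereas `unpredictable_delay_ms()` has no seed to replay.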
Mitigating Timing Side Channel Attacks On Your Network Layer
The other component to this is that it takes multiple requests to the server to test different bits of data. The demo I saw made close to 40,000 requests. An attack like this may be stopped or slowed down by rate limiting requests to a server to a reasonable/human level. That many requests also leaves a pretty big log footprint on your servers.
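Rate limiting is usually done at the firewall, load balancer, or web server, but the idea can be sketched in a few lines of Python. The `SlidingWindowRateLimiter` class and the limits used below are hypothetical, and the `clock` parameter is injectable only for testing.

```python
import time
from collections import defaultdict, deque


class SlidingWindowRateLimiter:
    """Allows at most max_requests per client within any window_seconds span."""

    def __init__(self, max_requests, window_seconds, clock=time.monotonic):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._clock = clock
        self._hits = defaultdict(deque)  # client id -> timestamps of recent requests

    def allow(self, client_id):
        now = self._clock()
        hits = self._hits[client_id]
        # Forget requests that have aged out of the window.
        while hits and now - hits[0] >= self.window_seconds:
            hits.popleft()
        if len(hits) >= self.max_requests:
            return False
        hits.append(now)
        return True
```

At, say, 10 requests per second per client, the roughly 40,000 requests from the demo would take over an hour instead of minutes, which at least buys time to notice the log footprint.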