Study finds major computer virus threat

Posted by Rick C. Hodgin

San Diego (CA) - Two graduate students from UC-San Diego, Erik Buchanan and Ryan Roemer, have published a paper demonstrating that the process of creating a known form of computer virus can be automated more easily than previously thought. Application of a concept known as "return-oriented programming" can allow even properly written programs to be taken over, thereby becoming agents of the attacker. Is this threat real? And should we be concerned?





The call stack



When computer programs are running, they use a structure called a "call stack" internally. It keeps track of where a program is and what it's doing, and it also holds important information used by the program. Suppose the user is asked for a filename to save a file. One item on the call stack would then be a pointer to the name that was typed in. Here's a simple example:









In this example there are three functions: main(), function1() and function2(). They are "called" with parameters, and after executing their code they return to the line following the call. The way to read this is to say that "main calls function1 with the value (or parameter) three. Then, function1 calls function2 with five." After those two function calls, the program would be where it says "// here we are!", and the call stack holds the 3 and the 5, each stored alongside the address its function will return to.



Note that the call stack has a trail of breadcrumbs back to the top-level function, which in this case is main. When function2 is completed, it will return to function1 and remove the "5" from the stack in #0992. After function1 is done, it will return to main and remove the "3" from the stack in #1000.
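
The push-and-pop behavior described above can be sketched in a few lines. This is a toy model written in Python rather than the C-style listing the article refers to, and the `call` helper, the frame fields and the `snapshot` list are assumptions made for illustration. It only shows that each call adds an entry holding a parameter and a place to return to, and that each return removes it again.

```python
# Toy model of a call stack: each call pushes a frame, each return pops one.
call_stack = []  # the top of the stack is the end of the list
snapshot = []    # copy of the stack taken at the deepest point of the program


def call(caller, callee, arg):
    """Push a frame with the parameter and the place to return to, then run callee."""
    call_stack.append({"function": callee.__name__, "arg": arg, "return_to": caller})
    callee(arg)
    call_stack.pop()  # returning removes the frame, parameter included


def function2(value):
    # here we are!
    snapshot.extend(dict(frame) for frame in call_stack)


def function1(value):
    call("function1", function2, 5)


def main():
    call("main", function1, 3)


main()
for frame in reversed(snapshot):  # print the top of the stack first
    print(frame)
```

The two printed frames are the trail of breadcrumbs: function2 will return to function1, and function1 will return to main. Once main has finished, the stack is empty again.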





The virus exploit



If we consider that inside every program there are literally thousands of little functions like these that do specific things, then it becomes very easy to recognize a potential use for virus writers. If they can alter a return address so that it no longer points to function1 or main, then they can take control of the program and transfer it anywhere they want to.
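
To see why a single altered return address is enough, consider this small simulation. It is a sketch under stated assumptions and not code from the paper: the labels `main_after_call` and `attacker_choice` stand in for real memory addresses, and the `overwrite_return_to` parameter stands in for whatever bug lets data land on the stack.

```python
def run(overwrite_return_to=None):
    """Simulate function1 returning, optionally after its return address was altered."""
    trace = []
    # Hypothetical "addresses" mapped to what happens when execution arrives there.
    code = {
        "main_after_call": lambda: trace.append("back in main"),
        "attacker_choice": lambda: trace.append("somewhere the attacker chose"),
    }
    stack = ["main_after_call"]  # pushed when main called function1
    trace.append("function1 runs")
    if overwrite_return_to is not None:
        stack[-1] = overwrite_return_to  # the alteration described above
    return_address = stack.pop()  # a return jumps to whatever is on the stack
    code[return_address]()
    return trace


print(run())                   # ['function1 runs', 'back in main']
print(run("attacker_choice"))  # ['function1 runs', 'somewhere the attacker chose']
```

The function itself never changes. Only the value it finds on the stack when it returns is different.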



While this possible exploit has been known for a long time, it required the absolute best of the best to hand-craft the code to exploit the weakness. What the two researchers are demonstrating now is that for x86, SPARC and most RISC-based architectures, this type of exploit can be automated, making it far easier to use in viruses. They also warn that there isn't much protection possible from such an exploit, because the attack simply uses the innards of a good program in a different way to alter the system.





How does it work?



Suppose there exists a functioning, virus-free program published by Microsoft, Adobe, or another company. Each program contains a particular set of functions internally, and they are executed in a particular order - which is what makes the program do what it does. These functions, however, if executed in a different sequence, could do other things, potentially harmful things, to you or your system.



The virus exploit takes advantage of a known or discovered bug which allows the attacker to send a command to the remote computer which overwrites the return address. This is not the same as infecting the machine with new code. Simply, instead of the program going back to main or function1, it would go to some place that the virus writer has determined. And while this is still a difficult exploit to take advantage of, the two-person team has shown how the corruption could be crafted to run arbitrary code on the machine.
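
The idea of reusing a program's own functions in a new order can be sketched as follows. This is a simplified model, not the researchers' tooling: the three functions and the `PROGRAM` table are hypothetical, and a real attack chains short machine-code sequences rather than Python functions. What it does show is that the attacker supplies only a list of return addresses, never any new code.

```python
def load_two(state):
    state["value"] = 2


def add_three(state):
    state["value"] += 3


def double(state):
    state["value"] *= 2


# Everything the "program" contains; nothing is ever added to this table.
PROGRAM = {"load_two": load_two, "add_three": add_three, "double": double}


def run(return_addresses):
    """Run existing functions in the order dictated by a stack of return addresses."""
    state = {"value": 0}
    stack = list(reversed(return_addresses))  # the first address ends up on top
    while stack:
        address = stack.pop()    # each "return" takes the next address off the stack
        PROGRAM[address](state)  # only code already in the program ever runs
    return state["value"]


intended = run(["load_two", "add_three", "double"])             # (2 + 3) * 2
reordered = run(["load_two", "double", "double", "add_three"])  # 2 * 2 * 2 + 3
print(intended, reordered)  # 10 11
```

Both runs use exactly the same functions. The second result differs only because the addresses on the stack were arranged differently.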

This would make it possible to launch a web browser, for example, with a particular web address which would then automatically begin a download of infected code to run on a machine. The infected machine may not even show any signs of infection. It might only record keystrokes, or wait patiently for the day the virus writer calls all of its "army bot" machines into action on a denial-of-service attack, or for some other purpose (possibly even a non-harmful one, such as using your computer to compute some of their data in a stolen distributed computing platform). The range of uses for such an attack is quite wide.





Difficult or exploitable?



Both. This type of exploit still requires intimate knowledge of software internals. However, the tools these two are discussing in their paper demonstrate that an advanced assembly programmer with knowledge of the system could automate this process on common toolsets. A relatively straightforward analysis of the code generated by common compilers could allow the virus authors to find keys or triggers, meaning code patterns that the virus writer knows to be a sign of a particular library or function.

They could, for example, learn that a particular application uses the standard library of functions available to all C++ programmers. If there's a known exploit possible there, then they could take advantage of it. All there has to be is one doorway in, and the virus writer could exploit it using this kind of attack.



No longer does an attacker have to inject malicious code into a machine and then run it. By simply sending a packet of data, one which overwrites part of the stack, the good code already present inside of software can be run in new ways, resulting in potentially bad things.
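
How a packet of data can end up on top of a return address is easiest to see with a flat block of memory. The sketch below is an illustration only: the eight-slot buffer, the `receive` function and its `check_length` switch are assumptions, and Python lists stand in for raw stack memory. The buffer sits directly in front of the slot holding the return address, so a packet one item too long spills into that slot.

```python
BUFFER_SIZE = 8  # hypothetical size of the buffer reserved for incoming data


def receive(packet, check_length):
    """Copy a packet into a stack buffer and report where the function would return."""
    # Flat "stack" memory: the buffer slots, then the return-address slot.
    memory = [0] * BUFFER_SIZE + ["return_to_main"]
    if check_length and len(packet) > BUFFER_SIZE:
        raise ValueError("packet too long for buffer")
    for offset, item in enumerate(packet):
        memory[offset] = item  # without the check, nothing stops offset 8
    return memory[BUFFER_SIZE]


normal = ["A"] * BUFFER_SIZE
crafted = ["A"] * BUFFER_SIZE + ["attacker_choice"]

print(receive(normal, check_length=False))   # return_to_main
print(receive(crafted, check_length=False))  # attacker_choice
```

With `check_length=True`, the crafted packet is rejected before anything is copied, which is the kind of bug fix that closes this particular doorway.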





No practical defense against it



The authors indicate that existing exploit counter-measures are totally bypassed with this method of attack. These include the "No Execute" bit in AMD and Intel processors, which typically prevents altered memory regions from being used as program instructions, and anti-virus scans, which look for changes in code or virus-like code. There is nothing which identifies this attack as an attack until the machine starts acting differently.
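
A rough model shows why a "No Execute" check does not help here. The region names, the `jump` function and the `NoExecuteFault` exception are all hypothetical, and real processors enforce this per memory page in hardware. The point is only that the check asks where the instructions live, not in what order they are reached.

```python
class NoExecuteFault(Exception):
    """Raised when control is transferred into a non-executable region."""


# Hypothetical memory map: region name -> may the processor execute from it?
REGIONS = {"program_code": True, "stack": False}


def jump(region, label):
    """Transfer control to a location, enforcing the no-execute rule."""
    if not REGIONS[region]:
        raise NoExecuteFault(f"refused to execute {label} in {region}")
    return f"executed {label} in {region}"


def attempt(targets):
    """Try each jump in turn and record what happened."""
    results = []
    for region, label in targets:
        try:
            results.append(jump(region, label))
        except NoExecuteFault as fault:
            results.append(str(fault))
    return results


# Classic attack: new instructions are placed on the stack, and the jump is blocked.
injected = attempt([("stack", "injected code")])
# Reuse attack: every jump lands in code the program already shipped with.
reused = attempt([("program_code", "existing function A"),
                  ("program_code", "existing function B")])
print(injected)
print(reused)
```

The injected code is refused, while the chain of existing functions passes the check at every step.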



The authors believe these kinds of exploits may well be the central theme used in future attacks. This stems from the fact that all of the code used is validated, known to the system and is working properly from all outward appearances. Being immune to this kind of attack only requires that the software be completely bug-free. And in truth, so long as human beings are writing code, that will never be the case - meaning all machines everywhere are potentially vulnerable to this kind of attack.