How obfuscation helps protect Java from reverse engineering

Few things are more frustrating to programmers than running across a bug you can't solve without access to source code you don't have. Whether you're patching in code from an online open-source library or you're making calls to common operating system routines, you likely spend time each week crunching code that you didn't write and for which you may not have the source.

It's easy to reverse engineer Java class files because Java bytecode contains a lot of the same information as its original source code. In addition, Java programs have a good reputation as being "write once, run anywhere." This flexibility has a number of potential advantages in a distributed environment, but it also means the same class files are shipped to every user. While not unique to the Java language, code decompilation has never been deployed so publicly or ubiquitously as it is among Java developers. The flip side of decompilation is obfuscation.

What is obfuscation?

Given the ease with which decompilers extract source code from compiled code, protecting your code and the valuable secrets therein isn't easy. As Java decompilers have come into regular use, so have Java obfuscators, which effectively put a smoke screen around your code.

Code obfuscation is currently one of the best methods for protecting Java code from reverse engineering. Obfuscation renders software unintelligible to a human reader while keeping it functionally equivalent to the original code. Because the program is more difficult to understand, it is more resistant to reverse engineering.

In Figure A, a set of class files, P, becomes another set of class files, P', through an obfuscator. The result is that the code of P' is not equal to the code of P, and P'.code is more difficult to understand than P.code, but both function the same way.

For example, let's take simple Java code from the original source:

class OriginalHello {
   public OriginalHello() {
       int number=1;
   }

   public String getHello(String helloname){
       return helloname;
   }
}

After obfuscating this code with a simple obfuscator (such as KlassMaster), all names in this class are scrambled and the line numbers are removed. This is the obfuscated code:

class a {
   public static boolean a;

   public a() {
      int a=1;
   }

   public String a(String b){
       return b;
   }
}

From the listings above, you can see that KlassMaster has renamed OriginalHello.class to a.class and that its method, getHello(java.lang.String), has become a(java.lang.String). The method name a() is more difficult to understand than getHello(). When you compare the obfuscated bytecode with the original bytecode, you can also see that the line numbers have been removed from the obfuscated bytecode. This gives less information to reverse engineers.

This very simple example of code obfuscation just scrambles identifiers and removes the line numbers that compilers generate. Modern commercial obfuscators run quickly and produce code that is much tougher to decipher; however, there is still a lot of research being done in this area.
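
The renaming step can be sketched in a few lines. The following is only an illustration, not how KlassMaster or any other product works internally: the class NameScrambler and its methods are hypothetical, and the sketch gives every identifier a unique short name, whereas the listing above reuses the same name (a) for the class, the method, and the variable.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper for illustration; not part of any real obfuscator.
class NameScrambler {

    // Maps 0, 1, 2, ... to a, b, ..., z, aa, ab, ...
    static String shortName(int index) {
        StringBuilder sb = new StringBuilder();
        int n = index;
        do {
            sb.insert(0, (char) ('a' + n % 26));
            n = n / 26 - 1;
        } while (n >= 0);
        return sb.toString();
    }

    // Assigns each distinct identifier the next unused short name,
    // in order of first appearance.
    static Map<String, String> buildMapping(List<String> identifiers) {
        Map<String, String> mapping = new LinkedHashMap<>();
        for (String id : identifiers) {
            if (!mapping.containsKey(id)) {
                mapping.put(id, shortName(mapping.size()));
            }
        }
        return mapping;
    }
}
```

Because the mapping is thrown away after the class files are rewritten, nothing in the output reveals that b was once getHello.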

Obfuscation techniques

Besides identifier renaming and line number removal, there is a set of tricks that various obfuscators use. One popular way to obscure source is to take the meaningless string trick to the next level by replacing a symbol in the class file with a string that is illegal in Java source. The replacement might be a keyword like private or, even worse, a completely meaningless symbol such as ***. Some virtual machines, especially in browsers, don't take kindly to such antics. Technically, a variable with a symbol such as = as its name is contrary to the Java specification, but some virtual machines will overlook it.
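
To see why such names trouble a decompiler, it helps to check which strings could ever appear as identifiers in Java source. The sketch below uses the standard javax.lang.model.SourceVersion class; the wrapper class IdentifierCheck and the method isLegalSourceName are hypothetical names chosen for this example.

```java
import javax.lang.model.SourceVersion;

// Hypothetical helper for illustration.
class IdentifierCheck {

    // True when the string is made of valid identifier characters
    // and is not a reserved word, i.e. javac would accept it as a name.
    static boolean isLegalSourceName(String s) {
        return SourceVersion.isIdentifier(s) && !SourceVersion.isKeyword(s);
    }

    public static void main(String[] args) {
        String[] candidates = {"getHello", "a", "private", "***", "="};
        for (String s : candidates) {
            System.out.println(s + " -> " + isLegalSourceName(s));
        }
    }
}
```

A decompiler that emits private, ***, or = as a variable name produces output that will not recompile, which is the effect this technique is after.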

Another technique some obfuscators use is usually targeted to specific decompilers like Mocha and JODE. A bad instruction is injected into the code; it doesn't make a difference in running the code, but it crashes the decompiler.

As an example of such a bad instruction, let's take the original code (disassembled):

Method void main(java.lang.String[])
      0 new #4
      3 invokespecial #10
      6 return

and the code after obfuscation (but keeping the same names for simplicity):

Method void main(java.lang.String[])
      0 new #4
      3 invokespecial #10
      6 return
      7 pop

Note that the routine now has a pop instruction after the return. Obviously, a function can't do anything after it has returned, and that's the trick. Placing an instruction after the return ensures that it will never be executed. For a decompiler that expects compiler-generated code, the routine is essentially impossible to decompile; the instruction sequence doesn't make sense because it doesn't correspond to any possible Java source code.
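
The reason the extra instruction is harmless at run time can be shown with a toy reachability scan. This is a minimal sketch under strong assumptions: it only handles straight-line code with no branches or exception handlers, it works on mnemonic strings rather than real class files, and the class DeadCodeScan is a hypothetical name.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration; handles straight-line code only.
class DeadCodeScan {

    // Returns the instructions that can never execute: everything that
    // follows the first return-style instruction (return, ireturn, areturn, ...).
    static List<String> unreachable(List<String> instructions) {
        List<String> dead = new ArrayList<>();
        boolean returned = false;
        for (String insn : instructions) {
            String opcode = insn.trim().split("\\s+")[0];
            if (returned) {
                dead.add(insn);
            } else if (opcode.endsWith("return")) {
                returned = true;
            }
        }
        return dead;
    }
}
```

Run on the obfuscated listing above, the scan reports only pop as unreachable; a decompiler that performs a check like this can skip the injected instruction instead of failing on it.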

Other common obfuscation techniques include the following:

  • Layout obfuscations modify the layout structure of the program by two basic methods: renaming identifiers and removing debugging information. They make the program code less informative to a reverse engineer. Most layout obfuscations cannot be undone because they use one-way functions such as replacing identifiers with random symbols and removing comments, unused methods, and debugging information. Though layout obfuscations cannot prevent reverse engineers from understanding the program by studying the obfuscated code, they at least raise the cost of reverse engineering. Layout obfuscations are the best studied and most widely used form of code obfuscation. Almost all Java obfuscators include this technique.
  • Control obfuscations change the control flow of the program. The trick is simple: for a routine A(), the obfuscator creates an additional routine A_bug() and an "if" selector, if (PREDICATE) then A_bug(); else A();. The PREDICATE is generated on the fly so that it is always false (but in a way that makes this hard to deduce), so the A() routine is always selected instead of the buggy copy A_bug().
  • Data obfuscations break up the data structures used in the program and encrypt literals. This method includes modifying inheritance relations, restructuring arrays, etc. Data obfuscations thoroughly change the data structure of a program. They make the obfuscated code so complicated that it is impossible to recreate the original source code.
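
The control and data obfuscations described in the list above can be sketched by hand. This is only an illustration under stated assumptions: the class and method names are hypothetical, the predicate is a textbook example rather than one taken from a real tool, and the XOR encoding stands in for whatever literal encryption a real obfuscator applies.

```java
// Hypothetical sketch of a control obfuscation with an always-false predicate.
class ControlFlowSketch {

    // The original routine A().
    static String a() {
        return "hello";
    }

    // The decoy routine A_bug(): deliberately wrong, never meant to run.
    static String aBug() {
        return "olleh";
    }

    // The product of two consecutive ints is always even, even when the
    // multiplication overflows, so this returns false for every x.
    static boolean predicate(int x) {
        return (x * (x + 1)) % 2 != 0;
    }

    // Looks like a genuine choice between two routines; always calls a().
    static String call(int x) {
        if (predicate(x)) {
            return aBug();
        } else {
            return a();
        }
    }
}

// Hypothetical sketch of a data obfuscation that hides string literals.
class LiteralSketch {

    private static final int KEY = 0x5A; // arbitrary key chosen for this example

    // XORs every character with the key; applying it twice restores the input.
    static String xor(String s) {
        char[] out = s.toCharArray();
        for (int i = 0; i < out.length; i++) {
            out[i] ^= KEY;
        }
        return new String(out);
    }
}
```

An obfuscator would store the result of LiteralSketch.xor("hello") in the class file and insert a call that decodes it at run time, so the plain text never appears in the constant pool. A predicate this simple is easy for an analysis tool to fold away; real tools aim for predicates that are much harder to reason about.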


You should keep in mind that no obfuscator known today provides any guarantee about the difficulty of reverse engineering. Obfuscators therefore do not provide security at a level comparable to modern encryption schemes, and they should be used in tandem with other measures in cases where security is of high importance.

The most common software reverse engineering attacks target copy protection schemes. These schemes generally rely heavily on existing operating system procedure calls, making it easy to bypass basic code obfuscation using the same tools used with unobfuscated code. In addition, obfuscated code often depends on the particular characteristics of the platform and compiler, making it difficult to manage if either changes.

Peter V. Mikhalenko is a Sun certified professional who works for Deutsche Bank as a business consultant.

