Recently, I needed to write a bunch of Smali code to use in tests for Simplify. While Smali syntax is simple and fairly easy to write, it’s also tedious, and I needed to do some tricky, uncommon stuff that I wasn’t even sure how to do in Smali. Luckily, it’s pretty easy to write Java and convert it to Smali. In a previous post, I talked about how to make a small alias to do this and went over some other use cases. Writing Java and converting to Smali makes it easy to quickly prototype lots of Smali code without worrying about Smali syntax or conventions. In this post, I want to show how to use a new Android compiler called jack, which takes the place of dx and which you’ll need to know how to use if you want to continue converting Java to Smali.
Building with Jack
The original Android compiler is dx, and it works by translating Java .class files to Dalvik executables (.dex). Jack, however, compiles Java source code, so you don’t need to invoke javac at all.
I originally had to learn about Jack because it looks like dx doesn’t support newer Java .class versions. I tried converting a Java 8 compiled class and dx gave me the following error:
$ dx --dex AndroidException.class
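If the problem really is the class file version, as it looks like here, you can check it before handing a file to dx. The sketch below reads the version from the .class header. The ClassVersion class and its majorVersion method are hypothetical names for illustration; they are not part of dx or Jack.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.file.Files;
import java.nio.file.Paths;

// Hypothetical helper for illustration; not part of dx or Jack.
final class ClassVersion {
    // A .class file starts with the magic 0xCAFEBABE, followed by a
    // 2-byte minor version and a 2-byte major version, all big-endian.
    static int majorVersion(byte[] classBytes) {
        ByteBuffer buf = ByteBuffer.wrap(classBytes); // big-endian by default
        if (classBytes.length < 8 || buf.getInt(0) != 0xCAFEBABE) {
            throw new IllegalArgumentException("not a Java class file");
        }
        return buf.getShort(6) & 0xFFFF; // major version sits at offset 6
    }

    public static void main(String[] args) throws IOException {
        byte[] bytes = Files.readAllBytes(Paths.get(args[0]));
        // Major version 51 means Java 7 and 52 means Java 8.
        System.out.println(args[0] + ": major version " + majorVersion(bytes));
    }
}
```

Running it against a class compiled with Java 8 should report major version 52, while a Java 7 build reports 51.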
This was a problem. I was trying to fix a bug which exposed another bug, which exposed 4 or 5 things I could be doing much more cleanly, which in turn led to even more stuff I wanted to fix before fixing the original bugs. It’s a bit like this:
By the time I saw the dx error, I had about blown my stack. After a few milliseconds of panic, I calmed down and started looking through the build-tools directory of the most recent Android platform I had installed. I knew that Android must be able to convert Java 8 classes to .dex because people are making Android apps using it.
$ echo $ANDROID_BUILD_TOOLS
Lo and behold, there’s this jack.jar. I ran it just to see what it did and it gave this nice, long, helpful help message:
$ java -jar $ANDROID_BUILD_TOOLS/jack.jar --help
The main difference between jack.jar and dx is that, since Jack operates on Java source, it needs access to some Java runtime files. I assumed rt.jar would be enough. Here’s an example of how to use it with a file called Hello.java:
mkdir temp_dex
This is tweaked to work on a Mac, but it should be easy to translate to Linux or Windows.
Detecting Jack Created Files
I was curious if it was possible to fingerprint Jack-built .dex files. There’s a lot of useful stuff you can do with compiler fingerprinting, such as detecting malware. I wanted to add the fingerprints to the database in APKiD. To find how the files differ, I built two .dex files from the same Java but with different tools: dx and jack.jar.
Here’s a small .dex file created with dx:
Here’s the same Java converted with jack.jar:
See that cute little emitter: jack-4.12? Looks like Jack intentionally watermarks files it creates. It might be possible to turn it off with a command line parameter, but I haven’t looked. Here are the rules I added to APKiD to detect Jack: https://github.com/rednaga/APKiD/commit/ccca5ed519b7b2551a3205686be364c26020f1cd#diff-1731ed362177d8429d827a2b6ef3786bR131.
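To make the idea concrete, here’s a minimal sketch of a check for that watermark. It’s a plain byte scan for the "emitter: jack-" prefix, not the APKiD rule linked above. The JackEmitterScan class and findEmitter method are made-up names, and it assumes the watermark is stored as an ordinary NUL-terminated dex string.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;

// Hypothetical helper for illustration; this is not how APKiD does it.
final class JackEmitterScan {
    private static final byte[] MARKER =
            "emitter: jack-".getBytes(StandardCharsets.US_ASCII);

    // Returns the marker plus the version text after it, or null if absent.
    static String findEmitter(byte[] dex) {
        for (int i = 0; i + MARKER.length <= dex.length; i++) {
            if (matchesAt(dex, i)) {
                int end = i + MARKER.length;
                // Dex string data ends with a 0 byte, so read up to it.
                while (end < dex.length && dex[end] != 0) {
                    end++;
                }
                return new String(dex, i, end - i, StandardCharsets.US_ASCII);
            }
        }
        return null;
    }

    private static boolean matchesAt(byte[] data, int offset) {
        for (int j = 0; j < MARKER.length; j++) {
            if (data[offset + j] != MARKER[j]) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) throws IOException {
        String emitter = findEmitter(Files.readAllBytes(Paths.get(args[0])));
        System.out.println(emitter == null ? "no Jack emitter string found" : emitter);
    }
}
```

A scan like this is easy to fool, since anyone can add or remove a string, so treat a hit as a hint rather than proof.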
Update 12-06-2016 - Improved Jack Detection
Shortly after announcing this blog on Twitter, @iBotPeaches (Apktool developer) was kind enough to point out another distinguishing characteristic of Jack-generated DEX files which is described here: https://github.com/iBotPeaches/Apktool/issues/1354.
The new Jack compiler changes the names of compiler generated access$000 methods to names like -set0(), -get0(), and -wrap0(). To build some .dex files for testing, here’s some Java code:
public class OuterClass {
I saved this as OuterClass.java and compiled it explicitly with Java 1.7 because 1.8 isn’t supported by dx:
`/usr/libexec/java_home -v 1.7`/bin/javac OuterClass.java
Then, I used javap to get the names of the javac generated methods:
$ javap OuterClass.class
Yup, it’s all access$00\d stuff, so Jack must be the one changing these. No idea why it does this apart from making it a bit clearer what their function is, i.e. it’s easier to guess the behavior of -set0(int) than access$002(int).
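The naming difference is easy to turn into a check. Here’s a rough sketch that sorts a method name into javac style or Jack style. The AccessorNames class is a made-up name, and the Jack pattern only covers the three prefixes mentioned above (-get, -set and -wrap); Jack may well generate others.

```java
import java.util.regex.Pattern;

// Hypothetical helper for illustration only.
final class AccessorNames {
    // javac-style synthetic accessors, e.g. access$000 or access$002.
    private static final Pattern JAVAC_STYLE = Pattern.compile("access\\$\\d+");
    // Jack-style accessors named in this post: -get0, -set0, -wrap0.
    // Assumption: only these three prefixes are covered.
    private static final Pattern JACK_STYLE = Pattern.compile("-(get|set|wrap)\\d+");

    static String classify(String methodName) {
        if (JAVAC_STYLE.matcher(methodName).matches()) {
            return "javac";
        }
        if (JACK_STYLE.matcher(methodName).matches()) {
            return "jack";
        }
        return "unknown";
    }

    public static void main(String[] args) {
        for (String name : new String[] {"access$002", "-set0", "toString"}) {
            System.out.println(name + " -> " + classify(name));
        }
    }
}
```

The method names would come from whatever you use to list the strings or methods in the .dex file; this sketch only handles the naming check itself.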
To get a Jack compiled version of OuterClass:
mkdir jack
Now, to examine the differences in method names by looking at the strings:
Ok, seems obvious enough. I could look for these two sets of strings to figure out the compiler. But what happens if you use dexmerge to combine the dx and jack.jar produced .dex files?
Here’s the alias I use for dexmerge:
$ alias dexmerge
In case you’re curious, here’s the usage:
$ dexmerge
My guess was that this would merge all the strings but only the code in the first .dex would be kept, leaving several strings unreferenced. Unreferenced strings might actually be an interesting heuristic for finding weird .dex files, but it’s not something you could do without some disassembly.
$ dexmerge merge.dex dx.dex jack/classes.dex
Lo and behold, the strings from each are retained:
This .dex has an interesting history. If you know what to look for, you could tell quite a bit about how it was made, which may help you infer how technically sophisticated the creator was and what tools and environment they were using. You’d know straight away that it’s the result of dexmerge because of the map type ordering (search ABNORMAL_TYPE_ORDER (note to self: number slides in the future)). You would also know parts of the file were created with dx or dexlib 2.x and parts were created with jack.jar.
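If you want to look at the map type ordering yourself, here’s a small sketch that lists the type codes from a .dex file’s map list in the order they appear. It doesn’t decide what counts as abnormal; the idea is to compare the sequence against a file you know came straight from dx. The DexMapTypes class is a made-up name, and the code assumes a standard little-endian .dex file.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration; assumes a little-endian .dex file.
final class DexMapTypes {
    // Offset of the map_off field inside the dex header.
    private static final int MAP_OFF_FIELD = 0x34;
    // Each map item is: ushort type, ushort unused, uint size, uint offset.
    private static final int MAP_ITEM_SIZE = 12;

    // Returns the map item type codes in the order they appear in the file.
    static List<Integer> mapTypes(byte[] dex) {
        ByteBuffer buf = ByteBuffer.wrap(dex).order(ByteOrder.LITTLE_ENDIAN);
        int mapOff = buf.getInt(MAP_OFF_FIELD);
        int count = buf.getInt(mapOff); // the map list starts with its item count
        List<Integer> types = new ArrayList<>();
        for (int i = 0; i < count; i++) {
            int itemOff = mapOff + 4 + i * MAP_ITEM_SIZE;
            types.add(buf.getShort(itemOff) & 0xFFFF);
        }
        return types;
    }

    public static void main(String[] args) throws IOException {
        for (int type : mapTypes(Files.readAllBytes(Paths.get(args[0])))) {
            System.out.println(String.format("0x%04x", type));
        }
    }
}
```

Running it on dx.dex and merge.dex and comparing the two outputs should show whether the ordering differs.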