This story starts with someone reporting a very well written and concise issue for Simplify. After digging into it, I found a problem with how smalivm was handling static field initialization. In case you didn’t know, you can initialize a static field in smali like this:
    .field private static someInt:I = 5
I’d seen years ago that smali supported this format, and I included it in my Smali syntax definitions for Sublime, but I could never produce a DEX which used it. Whenever I had a simple, primitive static field, dx would generate bytecode which initialized the field in the class initializer, <clinit>.
Ok, so now I needed to support this in smalivm which means I had to figure out exactly how everything worked, what was valid, what was invalid, and how each type (float, long, int, …) looks. Yay!
Long ago, when I tried to create a DEX which had these “inline static field literals”, I may have failed because I was either using an older version of dx or invoking it weirdly (looking at you, --no-optimize). If some versions just didn’t use inline literals, it could be an interesting signature for compiler fingerprinting in APKiD.
When thinking about this problem, I realized I hadn’t thought very carefully about the values of uninitialized fields. Originally, I was assuming an UnknownValue for all fields until they were initialized. However, I know Java treats Objects as null and primitives as something sensible like 0 for numerics and false for boolean. I felt absolutely confident that Dalvik worked the same way, so of course I set up a way of testing it to be absolutely super confident + 1. So, I wrote some Java:
    class InitTests {
This will test the default value of int and char. I bet you didn’t think about char when I mentioned default values of primitives earlier! Oh no, you were probably smugly thinking “of course an int is 0, that just makes sense, duh!” but what about char, huh? I figured it’d probably be a null character, but I so infrequently use those in Javaland that I wasn’t even sure '\0' would work. Turns out, it does.
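For a rough idea of what this kind of test looks like, here is a minimal sketch. The class name, field names and the describe() helper are all made up for illustration and are not the original InitTests. It leaves a handful of static fields without initializers and prints whatever the runtime hands back.

```java
public class DefaultValueSketch {
    // None of these static fields has an initializer, so each one keeps
    // the default value the runtime assigns to it.
    static int someInt;
    static char someChar;
    static boolean someBoolean;
    static double someDouble;
    static Object someObject;

    // Hypothetical helper: renders each default as text so it can be printed.
    static String describe() {
        return "int=" + someInt
                + " charIsNul=" + (someChar == '\0')
                + " boolean=" + someBoolean
                + " double=" + someDouble
                + " object=" + someObject;
    }

    public static void main(String[] args) {
        // On a desktop JVM this prints:
        // int=0 charIsNul=true boolean=false double=0.0 object=null
        System.out.println(describe());
    }
}
```

The char is compared against '\0' instead of being printed directly, since a NUL character is invisible in most terminals.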
So I have all this Java. How am I going to run this on a Dalvik VM? You might be thinking, “Oh! I know this one! I’ll make an Android project, add this code as a class, wire it up to get executed when the main activity loads, and throw it on an emulator!” If that’s what you thought, give yourself an “F” because Fuck that. Way too slow. Enter java2smali.
    # Need javac, baksmali and dx on your path, bro
This bad bitch will convert your easy to read and write Java code into Smali. It has some limitations. Namely, it’s written in Bash, so a small percentage of you may go mad if you look at it too long. It also doesn’t work super good with multiple files or inner classes.
Now you can generate Smali from the Java, but you still need to execute it, right? Enter runsmali:
    function func_runsmali() {
Ideally, this should be in Python and installable using pip. It’s a rainy day project, and since it doesn’t rain in California, it might be a while before I get to it. Also, about Python packages, friends, let me tell you that Python packaging is a DARK ART. You want to know the best practices? Fuck you. That’s what they are. It’s a mess. Ruby Gems are much easier, but Ruby isn’t installed on everyone’s machine like Python is.
Disclaimer: Coding in Python is a joy, and I don’t dislike the language at all. There’s just some confusing shit if you’re new and trying to learn, like Python 2 vs Python 3 and all the different package distribution tools.
Anyway, just fire up an emulator and use runsmali to take care of invoking the dalvikvm and get the output:
    $ runsmali InitTests.smali InitTests
I wanted to know how to initialize all of the different types, so I had to go digging through the syntax to know everything that was valid. I started here where FIELD_DIRECTIVE is defined. This led me to some literal parsing code here. Ultimately, I found this which told me how type signifiers are defined. I also stumbled across this which showed me floats and doubles can be NaN in addition to numeric literals. That leaves us with all of these as valid static field literals:
    .field static myInt:I = -4
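One way to poke at these from the Java side is sketched below. The class and every field name are hypothetical. It also leans on an assumption this post doesn't verify: that javac's compile-time constants (static final fields with constant initializers) are what dx carries over as inline literals. What is certain is the javac half: a non-final static field with an initializer gets its value assigned inside the class initializer.

```java
public class LiteralCandidatesSketch {
    // static final fields with constant initializers: javac records these
    // as compile-time constants. Whether dx emits each one as an inline
    // static field literal is the assumption being probed.
    static final boolean MY_BOOLEAN = true;
    static final byte MY_BYTE = 0xf;
    static final short MY_SHORT = 0x100;
    static final char MY_CHAR = 'a';
    static final int MY_INT = -4;
    static final long MY_LONG = 0x100000000L;
    static final float MY_FLOAT = 1.5f;
    static final double MY_DOUBLE = 2.5;
    static final float MY_FLOAT_NAN = Float.NaN;
    static final double MY_DOUBLE_NAN = Double.NaN;
    static final String MY_STRING = "hello";

    // Not final: javac assigns this value inside the class initializer.
    static int plainStatic = 5;

    public static void main(String[] args) {
        System.out.println("int=" + MY_INT + " long=" + MY_LONG);
        System.out.println("floatIsNaN=" + Float.isNaN(MY_FLOAT_NAN));
        System.out.println("doubleIsNaN=" + Double.isNaN(MY_DOUBLE_NAN));
        System.out.println("plainStatic=" + plainStatic);
    }
}
```

Running a class like this through java2smali and reading the resulting .field lines would show which of these actually come out as inline literals for a given dx version.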
In the next post on this topic, I’ll talk about inheritance and valid ways of referencing fields.