The Problem
Dalvik doesn’t have a proper null type. A null is represented by a 0. Consider this example Smali code:
const/4 v0, 0x0
It could actually represent a few different types:
int v0 = 0;
boolean v0 = false;
byte v0 = 0x0;
short v0 = 0;
And even:
v0 = null;
In case you were wondering how char is handled, char c = 'a' looks like this:
const/16 v0, 0x61
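To make the ambiguity concrete, here is a minimal Java sketch. The class and method names are hypothetical, not from the original code. Each of the first five methods returns a different Java type, yet a compiler targeting Dalvik would be expected to load every one of those return values with the same const/4 v0, 0x0. That expectation is an assumption here, not something checked against a particular dx version.

```java
// Hypothetical illustration: five Java types whose zero value looks identical in Smali.
final class ZeroConstants {
    static int zeroInt() { return 0; }
    static boolean zeroBoolean() { return false; }
    static byte zeroByte() { return 0x0; }
    static short zeroShort() { return 0; }
    static Object zeroReference() { return null; }

    // char is an unsigned 16-bit number, so 'a' is just 0x61.
    static char letterA() { return 'a'; }

    public static void main(String[] args) {
        System.out.println(zeroInt());
        System.out.println(zeroBoolean());
        System.out.println(zeroByte());
        System.out.println(zeroShort());
        System.out.println(zeroReference());
        System.out.println(Integer.toHexString(letterA()));
    }
}
```

The Java compiler keeps these types apart; the question for the rest of this post is what Dalvik does once only the 0 is left.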
I wanted to know when Dalvik coaxed 0 values into null references for my work on Simplify. I tried searching and only found bits and pieces, and, of course, a bunch of source code. The first page I found that looked promising was http://forensics.spreitzenbarth.de/2012/08/27/comparison-of-dalvik-and-java-bytecode/ but all it said about nulls was:
Dalvik bytecode does not have a specific null type. Instead, Dalvik uses a 0 value constant. So, the ambiguous implication of constant 0 should be distinguished properly.
This wasn’t going to cut it.
The Experiment
I dug through the source code a little and felt like I only partially understood when it happened. To be sure, and to understand more deeply, and mostly because I like to do things the cheap, easy, ghetto way, I decided to write some Java, convert it to Smali, and execute it to see what happens!
Consider this bit of code, which handles null and 0 back to back:
public static void addNullAnd0ToList() {
This is the resulting Smali (with a main method that I added because I’m nice and want you to be able to easily execute this yourself):
.class public LHelloWorld;
These two lines are responsible for adding the null to wtf:Ljava/lang/List;:
const/4 v1, 0x0
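The Java listing above is truncated, so as a stand-in, here is a minimal sketch of what a method along these lines could look like. It is not the original code: the class name, method name, and local list are all hypothetical. What it illustrates is that adding null and adding 0 look nearly identical in Java, while in bytecode the 0 is expected to go through Integer.valueOf (autoboxing) before it reaches the list and the null is passed straight in as a reference.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in, not the author's listing.
final class NullAndZeroSketch {
    static List<Object> buildList() {
        List<Object> items = new ArrayList<>();
        items.add(null); // a null reference
        items.add(0);    // an int, autoboxed to Integer before the call
        return items;
    }

    public static void main(String[] args) {
        for (Object item : buildList()) {
            System.out.println(item == null
                    ? "null reference"
                    : item.getClass().getName() + " " + item);
        }
    }
}
```

Both values start life as a literal that could be a zero constant, but only the first one is ever handed to a method that wants an object.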
My first guess was that Dalvik sees that v1 contains an integer but is used as a Ljava/lang/Object; type argument, and so treats it as null. Does it have to be an integer? Does it work with other numbers? What if v1 was a short? I added a check-cast to force v1 into short:
const/4 v1, 0x0
Then I compiled and ran everything:
smali hello.smali -o classes.dex && zip Hello.zip classes.dex && adb push Hello.zip /data/local && adb shell dalvikvm -cp /data/local/Hello.zip HelloWorld
It failed:
DexOpt: --- BEGIN 'Hello.zip' (bootstrap=0) ---
The key part of this error is the S is not instance of Ljava/lang/Object;. Ok, that’s fair. There must be a difference between registers with and without explicit type casting. But does it work with integers? I tried with check-cast v1, I and got about the same error. The code didn’t get past the verifier, so Dalvik knew at runtime it was wrong. To use a short without a check-cast, I just added a getShort()S method. I didn’t think it would work because in both the method call and with check-cast, explicit type information is available.
invoke-static {}, LHelloWorld;->getShort()S
.method public static getShort()S
And I was right; it fails:
VFY: register1 v1 type 10, wanted ref
This is getting silly and I’m starting to think I should maybe just audit the source to fully understand. So I spend another 10-15 minutes poking around before giving up. I’ll just derive the behavior experimentally, hashtag yolo.
For the sake of completeness, I also try with a getInt()I:
invoke-static {}, LHelloWorld;->getInt()I
.method public static getInt()I
Another failure:
VFY: register1 v1 type 12, wanted ref
Dalvik can see through my cheap tricks. What if I try a wide value, like with const-wide? There’s no explicit type… Slight change to the code because longs are fat and take up two registers. I had to move the register to v2.
const-wide v2, 0x0L
NOPE:
VFY: register1 v2 type 13, wanted ref
Conclusion
Eventually, I found that only two things work for a null:
const/4 v1, 0x0
const/16 v1, 0x0
And these are considered null only if there’s no explicit type information available between assignment and use. Now I can take these delicious, esoteric trivialities and apply them towards creating failing tests. And I can’t help but simultaneously get excited by the prospect of failing tests and wonder what kind of life choices led to this.
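As a recap, the outcomes of the experiments can be written down as a small lookup table. This is only a summary of what was observed in this post, not a model of how Dalvik's verifier is actually implemented, and the enum and method names are made up for illustration.

```java
import java.util.EnumSet;
import java.util.Set;

// Hypothetical recap of the experiments; not Dalvik's real verifier logic.
final class NullRuleSketch {
    /** How register v1 (or v2) got its value before being passed as an object. */
    enum Source {
        CONST_4_ZERO,             // const/4 v1, 0x0
        CONST_16_ZERO,            // const/16 v1, 0x0
        CONST_4_ZERO_CHECK_CAST,  // const/4 v1, 0x0 followed by check-cast to S or I
        RETURNED_SHORT,           // result of getShort()S
        RETURNED_INT,             // result of getInt()I
        CONST_WIDE_ZERO           // const-wide v2, 0x0L
    }

    // Only untyped zero constants were accepted where a reference was wanted.
    private static final Set<Source> ACCEPTED_AS_NULL =
            EnumSet.of(Source.CONST_4_ZERO, Source.CONST_16_ZERO);

    static boolean acceptedAsNull(Source source) {
        return ACCEPTED_AS_NULL.contains(source);
    }

    public static void main(String[] args) {
        for (Source source : Source.values()) {
            System.out.println(source + " -> "
                    + (acceptedAsNull(source) ? "treated as null" : "rejected"));
        }
    }
}
```

A table like this is a convenient starting point for test cases: each entry is one Smali snippet plus the outcome it should produce.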