Understanding Dalvik Static Fields part 1 of 2

This story starts with someone reporting a very well written and concise issue for Simplify. After digging into it, I found a problem with how smalivm was handling static field initialization. In case you didn’t know, you can initialize a static field in smali like this:

.field private static someInt:I = 5

I’d seen that smali supported this format years ago, and included it in my Smali syntax definitions for Sublime, but I couldn’t ever produce a DEX which used this. Whenever I had a simple, primitive static field, dx would generate bytecode which initialized the field in the class initializer <clinit>.

Ok, so now I needed to support this in smalivm which means I had to figure out exactly how everything worked, what was valid, what was invalid, and how each type (float, long, int, …) looks. Yay!

Read More

Death and the Java Class Loader

When smalivm is virtually executing code, sometimes it needs to pass around Java Class objects. If it’s a Java API class like String or LinkedList, that’s no problem because smalivm is running in a JVM and has access to those classes already. But what if the class is of a type that’s from the app it’s trying to run? That class only exists in virtual execution imagination land, and if I don’t want to rewrite everything and implement core JVM stuff myself, I need to dynamically create classes.

What this means is, when you pass smalivm an input DEX, it’ll create a real life Java class which talks and walks just like the DEX class you give it, except it’ll be inert and soulless, without any code. This way it can pass around the dry husk of a Java class, and if input DEX code wants to check the number or names of methods or do tricky reflection stuff, all those properties are there.

Read More

What is a company?

People throw around the phrase “company culture” a lot. There are tons of articles on LinkedIn and Medium about “How to 10x Your Company Culture”, “The 7 Mistakes Managers Make Which Harm Culture”, “12 Steps to Improving Culture”, and so on. Some articles are really good and many are at least interesting, but I always felt like they all make assumptions that limit creativity in their approaches to understanding and improving culture. Companies are just people.

I’d always been a bit of a loner, and maybe that’s why when I started working it was endlessly fascinating for me to watch the company with the camera pulled way back, as if I were an alien trying to understand the fundamental forces which made the organization work. I observed and mused about how to understand companies from first principles for about a decade until one day I made some unexpected conceptual connections that really pulled back the curtain and helped me understand culture differently. And it’s all thanks to online video games.

Read More

Dalvik Virtual Execution with SmaliVM

Sometimes it’s useful to know what code does without executing it. You could read the code with your eyeballs and run it with your brain but that takes too long and it’s really hard, and executing code on a real machine can get messy, especially if it’s malicious. But what can you do if you want to understand a lot of malicious code? What if it’s obfuscated and even harder for your brain? Maybe you want to do some fancy analysis so you can accurately know when certain methods are called? Well, for this there’s executing on a virtual machine, i.e. virtual execution. There are many different ways of implementing a virtual machine. You could simulate an entire computer like VirtualBox and QEMU or you could simulate a smaller subset. The general idea is the same between all types: build a program which simulates executing other programs in all the important ways and gracefully fails for everything else.
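To make the "simulate the important parts, fail gracefully on the rest" idea concrete, here is a minimal sketch of a toy virtual executor. It is not how smalivm works: the instruction tuples, the opcode names `const` and `add`, and the function `virtually_execute` are all made up for illustration.

```python
def virtually_execute(instructions):
    """Run a list of (opcode, *args) tuples against a virtual register file.

    Returns (registers, stopped_at). stopped_at is None if every instruction
    was simulated, otherwise the index of the first unsupported instruction.
    The opcodes here are hypothetical, not real Dalvik opcodes.
    """
    registers = {}
    for index, (op, *args) in enumerate(instructions):
        if op == "const":
            # const dest, literal: store a literal in a register
            dest, value = args
            registers[dest] = value
        elif op == "add":
            # add dest, a, b: store the sum of two registers
            dest, a, b = args
            registers[dest] = registers[a] + registers[b]
        else:
            # Anything not modeled: stop and report what is known so far
            # instead of crashing.
            return registers, index
    return registers, None
```

Calling `virtually_execute([("const", "v0", 2), ("const", "v1", 3), ("add", "v2", "v0", "v1")])` leaves the sum in `v2`. An unknown opcode stops execution early and returns the register state gathered up to that point.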

Read More

Why Most Vulnerabilities Are Never Disclosed

When it comes to writing software, humans are the best game in town. Unfortunately, we’re absolutely terrible at it. Of course, we’re good at other stuff – recognizing faces, tool use, gossiping, and bipedal locomotion, but it turns out our brains are not so good at giving a computer the thousands of tiny, precise instructions necessary to validate an email address or properly deal with names. The fact that we get anything to work at all is amazing.

The bottom line is that if developers are writing code, they’re writing bugs and some bugs are vulnerabilities. Some are found and responsibly disclosed while others are kept secret or sold. For reasons which I shall explain, I believe that most security vulnerabilities are fixed but never disclosed.

Read More

Code Kata: Bloom Filter

If you’re unfamiliar with what a Code Kata is, check out my previous post, Code Kata: TDD and Run-length Encoding.

The goal for this kata is to learn an unfamiliar data structure. It’s called a bloom filter. I’ve read the Wikipedia article and have used bloom filters, but until I’ve built one myself I won’t understand them deeply. The more fundamental my understanding, the more flexible I can be in applying a concept. It’s just like calculus. There’s a world of difference between merely memorizing a formula and having a deep, intuitive understanding.
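For a rough idea of where the kata might end up, here is a minimal bloom filter sketch. It is not the solution from the full post: the class name, the default sizes, and the choice of salted SHA-256 for the hash functions are all assumptions made for illustration.

```python
import hashlib


class BloomFilter:
    """Fixed-size bit array plus k hash functions (a hypothetical kata result)."""

    def __init__(self, size_bits=1024, num_hashes=3):
        self.size_bits = size_bits
        self.num_hashes = num_hashes
        # One byte per bit keeps the sketch readable; a real one would pack bits.
        self.bits = bytearray(size_bits)

    def _positions(self, item):
        # Derive k positions by salting SHA-256 with the hash index.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.size_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos] = 1

    def might_contain(self, item):
        # False means definitely absent; True only means "probably present".
        return all(self.bits[pos] for pos in self._positions(item))
```

The asymmetry is the point of the structure: an added item is always reported as present, while an item that was never added can occasionally be reported as present too, because its positions may have been set by other items.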

Read More

Reversing an Open Source Vulnerability

Vulnerability disclosures rarely include enough technical detail to reproduce the exploit. This is a good thing. It wouldn’t do to arm every script kiddie with exact details of how to write an exploit with every disclosure. However, there are times when someone like an application security engineer or security researcher needs to “reverse engineer” the disclosure to reconstruct the technical detail in order to fully understand the vulnerability or write an exploit to test systems for weakness.

Read More

Code Kata: TDD and Run-length Encoding

A kata is a martial arts training method. It’s a set of detailed and choreographed movements and poses. The movements are performed repeatedly and internalized. A code kata is a training method for developing skill in programming. Take something you do frequently, or wish to do better, strip away everything not essential, and practice it repeatedly.
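Since the title names run-length encoding as the exercise, here is a small sketch of what that exercise involves. It is not the code from the full post: the function names and the list-of-pairs output format are assumptions chosen for illustration.

```python
from itertools import groupby


def rle_encode(text):
    """Collapse each run of identical characters into a (char, count) pair."""
    return [(ch, len(list(run))) for ch, run in groupby(text)]


def rle_decode(pairs):
    """Expand (char, count) pairs back into the original string."""
    return "".join(ch * count for ch, count in pairs)
```

In a TDD-style kata, each of these would typically grow one failing test at a time, starting from the empty string and single characters before handling longer runs.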

Read More

TetCon 2016 - Android Deobfuscation: Tools and Techniques

I gave a talk at TetCon 2016 about Android obfuscation and deobfuscation.

The talks at TetCon were great and the people there were super nice. I got all kinds of new ideas and spent the entire flight home furiously coding. Super motivating to hear from and talk to other people working on similar problems. Thanks to the organizers and volunteers for making everything happen.

Read More

Decompiling XAPK Files

While reviewing new Android reverse engineering questions on Stack Overflow, I came across this request to decompile an .xapk. A brief, non-technical description of the format is given on APKPure’s website:

XAPK is a brand new file format standard for Android APK package file. Contains all APK package and obb cache asset file to keep Android games or apps working, it always ends in “.xapk”. To ensure games, applications run perfectly, APK Install one click install makes it easy for Android users directly install .apk, .xapk file to the root directory.
obb cache data?

Read More