Using Markov Chains for Android Malware Detection

If you’re chatting with someone, and they tell you “aslkjeklvm,e,zk3l1”, then they’re speaking gibberish. But how can you teach a computer to recognize gibberish, and more importantly, why bother? I’ve looked at a lot of Android malware, and I’ve noticed that many samples have gibberish strings as literals in the code, as class names, in the signature, and so on. My hypothesis was that if you could quantify gibberishness, it would be a good feature in a machine learning model. I’ve tested this intuition, and in this post I’ll be sharing my results.
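To give a rough idea of what "quantifying gibberishness" with a Markov chain can look like, here is a minimal sketch. It is not necessarily the model from the full post: it assumes a character-level bigram chain with add-one smoothing, and the names `ALPHABET`, `CORPUS`, `train`, and `score` are made up for illustration. The toy corpus is far too small for real use; a real model would be trained on a large body of normal text.

```python
import math
from collections import defaultdict

# Assumption: only lowercase letters and spaces matter; everything else is dropped.
ALPHABET = "abcdefghijklmnopqrstuvwxyz "

# Hypothetical toy training text standing in for a large corpus of normal strings.
CORPUS = ("the quick brown fox jumps over the lazy dog "
          "the early bird catches the worm")


def normalize(text):
    """Lowercase the text and keep only characters found in ALPHABET."""
    return [c for c in text.lower() if c in ALPHABET]


def train(corpus):
    """Count character-to-character transitions and convert to log-probabilities."""
    counts = defaultdict(lambda: defaultdict(int))
    chars = normalize(corpus)
    for a, b in zip(chars, chars[1:]):
        counts[a][b] += 1
    model = {}
    for a in ALPHABET:
        # Add-one smoothing so unseen transitions get a small, nonzero probability.
        total = sum(counts[a].values()) + len(ALPHABET)
        model[a] = {b: math.log((counts[a][b] + 1) / total) for b in ALPHABET}
    return model


def score(model, text):
    """Average log-probability per transition; lower means more gibberish-like."""
    chars = normalize(text)
    pairs = list(zip(chars, chars[1:]))
    if not pairs:
        return float("-inf")
    return sum(model[a][b] for a, b in pairs) / len(pairs)


model = train(CORPUS)
english_score = score(model, "the early bird")
gibberish_score = score(model, "aslkjeklvm,e,zk3l1")
```

With this setup, `english_score` comes out higher than `gibberish_score`, because the gibberish string is made of transitions the chain never saw in training. A score like this could then be fed into a classifier as one feature among many.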


Monitoring HTTPS Traffic of a Single App on OSX

If you reverse engineer network protocols or do any other network security stuff, you’ve probably needed to collect network traffic at least once – either to understand a protocol or look for sensitive information. Back in the good old days, this simply meant firing up tcpdump and watching those sweet, plaintext packets flow on by. Now, everyone has a stick up their butts about encryption – bunch of cry babies couldn’t handle getting their accounts hacked and their private info sold on the deep dark web for a few hundred dogecoin.

These days, doing any network analysis absolutely requires knowledge of HTTPS / SSL / TLS interception, and it turns out to be nontrivial almost all of the time! Of course, this makes sense because the entire point of TLS is to secure your communication. Like any other seldom trodden path, intercepting TLS has some caveats. First, you have to grok how Man-in-the-Middle works, how certificates work and how to install them on your system, and how to massage your OS and certain apps into using those certs. Finally, you’ve got to navigate a bunch of proxy documentation and configuration to actually intercept and display the traffic.

In this post, I’ll be describing how to monitor the encrypted HTTPS traffic of a single app on macOS as well as solutions to some of the frustrating problems I encountered.


How Bitcoin Improves Free Speech and Government

Many people are introduced to Bitcoin and other cryptocurrencies merely as a way to make money investing. They see the price rising, buy in, and hope it goes “to the moon”, without really understanding what it is or why the price is moving. I’m glad Bitcoin is getting popular. I’m a huge fan, but I don’t give a single shit about how good of an investment it is. Even though you might make a lot of money investing, it pales in comparison to how Bitcoin can fundamentally change the world.

I can talk all day about how a big chunk of the world is unbanked and doesn’t have access to financial services, how this is a huge problem, and who knows what will happen when 4 billion people suddenly have access to savings accounts and loans with a simple feature phone with internet. I can write pages and pages about how Bitcoin enables truly micro transactions and how I think it’ll fit in nicely as payment models for AI powered robot services, self-driving cars, and media streaming services. The list goes on and on, but this post covers how Bitcoin bolsters the power of speech and improves our relationship with government.


Calling JNI Functions with Java Object Arguments from the Command Line

When analyzing malware or penetration testing an app which uses a native library, it’s helpful to isolate and execute the library’s functions. This opens the door for debugging and using the malware’s own code against it. For example, if the malware has encrypted strings and the decryption is done by a native function, you could either spend a bunch of time reversing the algorithm to write your own decryption routine or you could just harness the function such that you can execute it with arbitrary inputs. If the malware author completely changes their decryption, you might not have to change anything. In this post, I’ll explain how to harness a native library and execute its functions even if they require arguments from a live JVM instance.

In a previous post, I explained how to create a Java VM from Android native code but I didn’t give any real examples of how to use it. In this post, I’ll give a concrete example.


Creating a Java VM from Android Native Code

If you’re writing native / JNI code for Android, it’s probably as a native method of an Android app. These methods are always passed the Dalvik VM instance of the app as the first parameter. You need this to create jstrings and other Java objects, look up classes and fields, etc. It’s not normal for you to have to instantiate a VM from native code because most of the time, if you’re using the Java Native Interface (JNI), you started in Java land and are only dipping into native code land for them sweet, sweet performance benefits. However, if you’re reverse engineering or writing an exploit, you’re likely always delving into all kinds of unusual trouble which the developers reasonably believed would never happen or at least would only be a theoretical edge case.

I recently needed to create a VM from native code to pass Java object arguments to a JNI function. In this post, I want to share what I came up with and why I finally settled on this particular method.


Building with and Detecting Android's Jack Compiler

Recently, I needed to write a bunch of Smali code to use in tests for Simplify. While Smali syntax is simple and fairly easy to write, it’s also tedious and I needed to do some tricky, uncommon stuff. I wasn’t even sure how to do it in Smali. Luckily, it’s pretty easy to write Java and convert it to Smali. I’ve talked about how to make a small alias to do this and gone over some other use cases in a previous post. Writing Java and converting to Smali makes it easy to quickly prototype lots of Smali code without worrying about Smali syntax or conventions. In this post, I want to show how to use a new Android compiler called jack, which takes the place of dx and which you’ll need to know how to use if you want to continue converting Java to Smali.


Understanding Dalvik Static Fields part 2 of 2

In the first part of this series on Dalvik class fields, I wrote about how Dalvik handles static field literals. This article is focused on how field inheritance works and exploring all the different but equally valid ways of referencing fields at the bytecode level.

If you are familiar with Java, you probably already understand how Java field inheritance looks and behaves at the source code level, but bytecode is less strict and potentially more ambiguous (at least to humans) than source. JVM languages like Scala and Groovy compile to the same bytecode as Java, but both have very different source code restrictions.


Understanding Dalvik Static Fields part 1 of 2

This story starts with someone reporting a very well written and concise issue for Simplify. After digging into it, I found a problem with how smalivm was handling static field initialization. In case you didn’t know, you can initialize a static field in smali like this:

.field private static someInt:I = 5

I’d seen that smali supported this format years ago, and included it in my Smali syntax definitions for Sublime, but I couldn’t ever produce a DEX which used this. Whenever I had a simple, primitive static field, dx would generate bytecode which initialized the field in the class initializer <clinit>.

Ok, so now I needed to support this in smalivm which means I had to figure out exactly how everything worked, what was valid, what was invalid, and how each type (float, long, int, …) looks. Yay!


Death and the Java Class Loader

When smalivm is virtually executing code, sometimes it needs to pass around Java Class objects. If it’s a Java API class like String or LinkedList, that’s no problem because smalivm is running in a JVM and has access to those classes already. But what if the class is of a type that’s from the app it’s trying to run? That class only exists in virtual execution imagination land, and if I don’t want to rewrite everything and implement core JVM stuff myself, I need to dynamically create classes.

What this means is, when you pass smalivm an input DEX, it’ll create a real life Java class which talks and walks just like the DEX class you give it, except it’ll be inert and soulless, without any code. This way it can pass around the dry husk of a Java class, and if input DEX code wants to check the number or names of methods or do tricky reflection stuff, all those properties are there.


What is a company?

People throw around the phrase “company culture” a lot. There are tons of articles on LinkedIn and Medium about “How to 10x Your Company Culture”, “The 7 Mistakes Managers Make Which Harm Culture”, “12 Steps to Improving Culture”, and so on. Some articles are really good and many are at least interesting, but I always felt like they all make assumptions that limit creativity in their approaches to understand and improve culture. Companies are just people.

I’d always been a bit of a loner, and maybe that’s why when I started working it was endlessly fascinating for me to watch the company with the camera pulled way back as if I was an alien trying to understand the fundamental forces which made the organization work. I observed and mused about how to understand companies from first principles for about a decade until one day I made some unexpected conceptual connections that really pulled back the curtain and helped me understand culture differently. And it’s all thanks to online video games.
