I am trying to find an efficient way to get all class references from a dex file.
I have tried several solutions, but I want to generalize the approach while keeping it efficient. The expected result is a list of the class names that are used inside the dex file (class usages, not class declarations).
The goal is to use that list to understand which classes implemented in a specific dex file are called or used in other dex files.
According to http://pallergabor.uw.hu/androidblog/dalvik_opcodes.html , a class can appear as a reference in smali in the following places:
For example:
.method public static valueOf(Ljava/lang/String;)Lcom/google/ads/AdRequest$Gender;
For example:
const-class v0, Lcom/google/android/gms/internal/ads/zzalo;
For example:
check-cast p1, Lcom/google/android/gms/internal/ads/zzaao;
For example:
instance-of p2, p1, Lkotlin/coroutines/AbstractCoroutineContextKey;
For example:
new-array v1, v4, [Lcom/google/android/gms/internal/ads/zzabf;
For example:
filled-new-array {v0, v1, v2}, [Ljava/lang/String;
For example:
new-instance v0, Lcom/google/firebase/FirebaseApiNotAvailableException;
For example:
iget-object v0, p0, Lkotlinx/coroutines/AwaitAll$AwaitAllNode;->disposer:Ljava/lang/Object;
For example:
iput-object p1, p0, Landroidx/activity/ActivityViewModelLazyKt$viewModels$4;->$extrasProducer:Lkotlin/jvm/functions/Function0;
For example:
sget-object p1, Landroidx/lifecycle/Lifecycle$Event;->ON_STOP:Landroidx/lifecycle/Lifecycle$Event;
For example:
sput-object v0, Lcom/google/android/gms/internal/ads/zzab;->zza:Lcom/google/android/gms/internal/ads/zzab;
For example:
invoke-static {v0}, Lkotlin/Result;->constructor-impl(Ljava/lang/Object;)Ljava/lang/Object;
The solutions I have tried:
The first solution is to use the baksmali tool to produce all the smali files and then use grep with the keywords from the list above. Although this gives the expected result, it requires me to run a jar, run grep, and then parse the result.
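The parsing step of this first solution could be sketched roughly as follows in Python, assuming the smali text is already available. The helper name class_refs and the regular expression are illustrative assumptions and not part of baksmali. The pattern matches any class descriptor on a line, so declaration lines would still have to be filtered out by the caller.

```python
import re

# Matches a class descriptor such as Lcom/example/Foo$Bar;
# A leading "[" (array type) is simply left out of the match.
DESCRIPTOR = re.compile(r"L[\w$\-]+(?:/[\w$\-]+)*;")


def class_refs(smali_line):
    """Return every class descriptor that appears in one line of smali."""
    return DESCRIPTOR.findall(smali_line)


if __name__ == "__main__":
    line = "check-cast p1, Lcom/google/android/gms/internal/ads/zzaao;"
    print(class_refs(line))
```

Collecting the results into a set per dex file would give the list of used class names without a separate grep pass.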
The second solution was to use the dx tool, which provides the following command:
dx --find-usages <file.dex> <declaring type> <member>
  Find references and declarations to a field or method.
  <declaring type> is a class name in internal form, like Ljava/lang/Object;
  <member> is a field or method name, like hashCode.
So if I use the following command:
dx --find-usages classes.dex "L.*" ".*"
This gives a good result that is close to what I expect.
Very partial output:
dx --find-usages classes.dex "L.*" ".*"
Landroid/support/customtabs/ICustomTabsCallback;.extraCallback method declared Landroid/support/customtabs/ICustomTabsCallback;.extraCallback(Ljava/lang/String;Landroid/os/Bundle;)
Landroid/support/customtabs/ICustomTabsCallback;.extraCallbackWithResult method declared Landroid/support/customtabs/ICustomTabsCallback;.extraCallbackWithResult(Ljava/lang/String;Landroid/os/Bundle;)
Landroid/support/customtabs/ICustomTabsCallback;.onMessageChannelReady method declared Landroid/support/customtabs/ICustomTabsCallback;.onMessageChannelReady(Landroid/os/Bundle;)
Landroid/support/customtabs/ICustomTabsCallback;.onNavigationEvent method declared Landroid/support/customtabs/ICustomTabsCallback;.onNavigationEvent(ILandroid/os/Bundle;)
Landroid/support/customtabs/ICustomTabsCallback;.onPostMessage method declared Landroid/support/customtabs/ICustomTabsCallback;.onPostMessage(Ljava/lang/String;Landroid/os/Bundle;)
Landroidx/window/core/ValidSpecification;.require: field reference Ljava/lang/Object;.value (iget-object)
Landroidx/window/core/ValidSpecification;.require: method reference Lkotlin/jvm/functions/Function1;.invoke(Ljava/lang/Object;) (invoke-interface)
Landroidx/window/core/ValidSpecification;.require: method reference Ljava/lang/Boolean;.booleanValue() (invoke-virtual)
Landroidx/window/core/ValidSpecification;.require: field reference Ljava/lang/Object;.value (iget-object)
Landroidx/window/core/ValidSpecification;.require: field reference Ljava/lang/String;.tag (iget-object)
Landroidx/window/core/ValidSpecification;.require: field reference Landroidx/window/core/Logger;.logger (iget-object)
Landroidx/window/core/ValidSpecification;.require: field reference Landroidx/window/core/SpecificationComputer$VerificationMode;.verificationMode (iget-object)
Landroidx/window/core/ValidSpecification;.require: method reference Landroidx/window/core/FailedSpecification;.<init>(Ljava/lang/Object;Ljava/lang/String;Ljava/lang/String;Landroidx/window/core/Logger;Landroidx/window/core/SpecificationComputer$VerificationMode;) (invoke-direct/range)
Landroidx/window/embedding/ActivityRule; field declared Z.alwaysExpand
Landroidx/window/embedding/ActivityRule; field declared Ljava/util/Set;.filters
This approach has two issues:
a. It is VERY slow. I found the reason in the DX source code, https://cs.android.com/android/platform/superproject/+/master:dalvik/dx/src/com/android/dx/command/findusages/FindUsages.java , which uses recursion to find and match the pattern.
b. As a result of that recursion, the output also contains the recursion path and not just the usages, which can produce a huge output even for small classes.dex files. Iterating over the result and filtering out what I am looking for therefore takes much more time than expected.
Class names are stored in the string table, so you can use something like the following for a quick and dirty retrieval:
baksmali list strings classes.dex | grep \"L
You might get some false positives though.
Bonus points: with a slightly modified filter you'd catch classes that are being used with reflection as well.