The Dangers of Rooting: Data Leakage Detection in Android Applications

Mobile devices are widespread all over the world, and Android is the most popular operating system in use. According to Kaspersky Lab's threat statistics (June 2017), many users are tempted to root their mobile devices to get unrestricted access to the file system, to install different versions of the operating system, to improve performance, and so on. The result is that unintended data leakage flaws may exist. In this paper, we (i) analyze the security issues of several applications considered relevant in terms of handling sensitive user information, for example, financial, social, and communication applications, showing that 51.6% of the tested applications suffer from at least one issue and (ii) show how an attacker might retrieve a user access token stored inside the device, thus exposing users to a possible identity violation. Notice that such a token, and other sensitive information, can also be stolen by malicious users through a man-in-the-middle (MITM) attack.


Introduction
In everyday life, smartphones, laptops, tablets or, more generally, mobile devices have become essential for everyone. They are widely used to read e-mails, carry out financial transactions, browse maps, chat with other people, and so on. Mobile devices have to face a number of issues due to resource constraints (e.g., performance issues [1,2]) and also security issues (data leakage [3,4], privacy concerns [5,6], etc.). In particular, the latter may be affected by the applications installed. Users usually choose such applications by looking at the total number of downloads [7], the reviews provided by other users [8,9], and so on. A typical environment where ratings can be easily found is Google Play Store, the largest app store, which offers over 3 million applications [10] split into two major categories, Apps and Games, with 2.5 million and 500 thousand apps, respectively [11]. However, people who provide ratings often evaluate the appearance, functionality, usability, and performance of an application without considering its security aspects. In addition, as reported in Kaspersky Lab's threat statistics (June 2017) [12] and summarized in Table 1, security issues are further amplified when users root their phones. Notice that users obtain superuser access privileges to change the current Android version, to get access to the file system without restrictions, to install modified apps and gain more privileges, to improve performance, and so on. However, these access privileges may affect the security of installed applications [12,13,15], opening a door to a great deal of sensitive information [16][17][18]. In this scenario, unintended data leakage flaws may exist.
In order to identify such flaws, in this paper, we extend and improve our previous work [19]. In particular, we broaden our testing activities by analyzing not only the security issues of Android password managers but also those applications that are considered particularly relevant in terms of handling sensitive user information, such as financial, social, and communication applications. Notice that we do not describe innovative techniques; rather, we measure the impact of a well-known technique (e.g., the Xposed framework) on a rooted device, carrying out extensive testing activities and observing that several applications do not implement the minimum security requirements.
In addition, we show that it is possible to retrieve an access token, exposing users to a possible identity violation. Finally, we show that the same token (and much other sensitive information) can be retrieved through a man-in-the-middle (MITM) attack because several applications do not implement adequate cryptographic techniques for data protection or do not implement them at all.
The remainder of the paper is organized as follows. In Section 2, we describe a number of approaches that can be used to analyze applications. In Section 3, we show the solution adopted to retrieve sensitive information from Android applications, paying particular attention to hooking techniques. In Section 4, we present our testing activities, showing how malicious users might retrieve sensitive information. Finally, conclusions are drawn in Section 5.

Different Approaches to Analyze Applications
When an application lands on the market, it immediately becomes available to everyone. This means that it can be tested and analyzed under all possible conditions. Every internal element of an app should share only the information necessary to perform a specific task, without any data leakage. Unfortunately, this does not always happen.
In order to recognize possible data leakages, two well-known approaches can be used: static and dynamic analysis.
(i) Static analysis is based on the examination of an application without executing it [20]. Its scope is quite limited because many applications adopt obfuscation [21,22] and dynamic code loading [23] to restrict access to internal information. However, it can still be useful to determine whether the application's associated files, such as database, backup, or log files, are encrypted. In this case, entropy-based techniques are very useful [24].
(ii) Dynamic analysis, instead, relies on the execution of the application [25,26]. The main idea is to collect (at runtime) the values that gradually emerge from the called instructions. The advantage of this approach is that it is less susceptible to code obfuscation.
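The entropy-based check mentioned under static analysis can be illustrated with a minimal sketch. It computes the Shannon entropy of a byte sequence and flags content whose entropy is close to 8 bits per byte. The function names and the 7.5 threshold are assumptions made for illustration; they are not taken from [24], and high entropy may also indicate compressed rather than encrypted data.

```python
import math
import os
from collections import Counter


def shannon_entropy(data: bytes) -> float:
    """Return the Shannon entropy of data in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    counts = Counter(data)
    total = len(data)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def looks_encrypted(data: bytes, threshold: float = 7.5) -> bool:
    """Heuristic only: the threshold is an assumed value, not a standard."""
    return shannon_entropy(data) >= threshold


# Toy inputs: a repetitive plaintext record and random bytes standing in
# for the content of an encrypted file.
plain = b"username=alice;password=secret;" * 200
random_like = os.urandom(6200)

print(round(shannon_entropy(plain), 2), looks_encrypted(plain))
print(round(shannon_entropy(random_like), 2), looks_encrypted(random_like))
```

In practice, such a check would be applied to the database, backup, or log files of the target application rather than to in-memory samples.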
In general, Android applications can assume many behaviors; thus it is necessary to monitor their activities, for example, through interface or automatic event injectors [27][28][29].
There is also a third approach, situated halfway between the previous two: hybrid analysis [30,31]. To work well, a system that adopts this technique must be designed so that, where one kind of analysis falls short, the other takes over and covers the gap [30].
In mobile device analysis, there is no standard approach (static or dynamic) for collecting data optimally. In our work, we collect data via static analysis and then use them in a dynamic scan. This is accomplished through hooking techniques (hooking means intercepting methods with a known signature that are called by an application, thus acquiring complete control over them), setting up the scenario shown in Figure 1. Taking into account a Java class named Signature, we notice that (a) the method initSign is invoked, (b) initSign receives a PrivateKey object, (c) initSign passes the object itself to another method, that is, engineInitSign of Figure 1, and (d) Hooker can take control of the method call, spying on or replacing its contents.
To better understand how this mechanism works, we explain the hooking technique, based on the Xposed framework [32], in detail in Section 3.
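The idea of intercepting a method with a known signature can be sketched in a few lines. The example below is only a Python analogy of the scenario in Figure 1, not the Xposed API: the class `Signature` is a toy stand-in for the Java class, and the `hook` helper, the `captured` list, and the snake-case method names are hypothetical names introduced for illustration.

```python
import functools


class Signature:
    """Toy stand-in for the Java class discussed in the text."""

    def engine_init_sign(self, private_key):
        return f"initialized with {private_key}"

    def init_sign(self, private_key):
        # (c) init_sign passes the key on to engine_init_sign.
        return self.engine_init_sign(private_key)


captured = []  # everything the hooker has observed


def hook(cls, method_name, before=None):
    """Replace cls.method_name with a wrapper that records its arguments.

    If `before` is given, it receives the argument tuple and returns the
    tuple that is actually passed to the original method (replacement).
    """
    original = getattr(cls, method_name)

    @functools.wraps(original)
    def wrapper(self, *args, **kwargs):
        captured.append((method_name, args))  # spying
        if before is not None:
            args = before(args)  # replacing
        return original(self, *args, **kwargs)

    setattr(cls, method_name, wrapper)
    return original


# (d) The hooker takes control of engine_init_sign and sees the key.
hook(Signature, "engine_init_sign")
result = Signature().init_sign("PRIVATE-KEY-BYTES")
print(captured)
```

The application code is unchanged and still receives its normal return value, which is why a hooked application may not notice that it is being observed.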

How to Retrieve Sensitive Information
A generic Android application is a single compressed archive that includes essential information about the app [33]. Among all this information, we focus on the DEX file (Figure 2) because it provides interesting features related to the target application [34,35].
We developed a tool, called Apk2Method, which (i) opens the APK of the target application; (ii) identifies the classes.dex file by looking for a specific marker, that is, 6465 780a 3033 3500 in hex; (iii) reads all invoked methods related to cryptography; and (iv) finally, outputs a text file where all gathered data are stored in a format convenient for subsequent parsing. For the sake of simplicity, we call such a file file.txt.
Then, we developed an Android application which (v) takes as input the data previously stored in file.txt and parses the file using Java reflection and regular expressions and (vi) runs inside a module of the Xposed framework, called Prober, which is able to select the target application.
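Steps (i) and (ii) can be sketched as follows. This is not the Apk2Method tool itself, whose implementation is not shown in the paper; it is a minimal illustration that treats the APK as a ZIP archive and looks for entries beginning with the marker quoted above, which corresponds to the ASCII bytes "dex", a newline, "035", and a null byte. The helper names `find_dex_entries` and `build_toy_apk` are hypothetical.

```python
import io
import zipfile

# The marker from the text: 6465 780a 3033 3500 in hex.
DEX_MAGIC = bytes.fromhex("6465780a30333500")  # b"dex\n035\x00"


def find_dex_entries(apk_bytes: bytes) -> list:
    """Return the names of archive entries that start with the DEX marker."""
    matches = []
    with zipfile.ZipFile(io.BytesIO(apk_bytes)) as apk:
        for name in apk.namelist():
            with apk.open(name) as entry:
                if entry.read(len(DEX_MAGIC)) == DEX_MAGIC:
                    matches.append(name)
    return matches


def build_toy_apk() -> bytes:
    """Build an in-memory archive that mimics the layout of an APK."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as apk:
        apk.writestr("AndroidManifest.xml", b"<manifest/>")
        apk.writestr("classes.dex", DEX_MAGIC + b"\x00" * 32)
    return buffer.getvalue()


print(find_dex_entries(build_toy_apk()))
```

Checking the marker rather than the file name alone guards against entries that are named classes.dex but do not contain DEX data in this format version.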
More precisely, Prober represents the real execution engine of the hooking technique, implemented by Xposed.
The Xposed framework, in turn, takes control of each method called by the target application, spying on or replacing each passed argument. In this way, the control flow of an application can be changed, giving us the ability to execute our own code enriched with specific security tests. Notice that a portion of the target application's information may be encrypted or obfuscated [36] using specific tools such as ProGuard, DashO, and DexProtector. These tools rename classes, methods, and variables, assigning them meaningless names [37]. Consequently, the parsing activity becomes very difficult and sometimes impossible (even with the support of reflection [30]). In all other cases, if applications release sensitive information, our approach is able to detect these leaks.

The Xposed Framework.
The framework used [32] consists of four individual components: Xposed, XposedBridge, XposedInstaller, and the XposedMods system. Among these, the first two are responsible for preparing the device to accommodate the framework. Let us briefly explain what happens when two generic methods, A and B, are called (Figures 3 and 4).
When the device is switched on, (1) the boot sequence starts: (a) the Boot ROM code starts executing from a predefined location, loading the bootloader into RAM, (b) the bootloader sets up, in two stages, the resources (network and memory) needed to run the kernel, (c) the Android kernel sets up a group of resources (cache, protected memory, scheduling, and drivers) and looks for init in the system files, (d) init is the very first process, which sets the environment for Zygote [38] and daemons, and (e) daemons are invoked; (2) once the daemons are invoked, an extended version of the process /system/bin/app_process [39] is called, which is meant to load the classes required to perform hooking, that is, XposedBridge.jar; (3) as soon as an application calls a generic method (A), the call is intercepted and redirected first to hookMethodNative, which increases the privilege level of the method received as argument, and then to handleHookedMethod, which links the method implementation to its own native generic method. In this way, it is possible to read all the arguments; (4) finally, the flow resumes naturally.

Testing Activity
We downloaded and analyzed several applications from Google's official Android market, using two mobile devices: a Wiko Wax (Android KitKat, rooted with KingRoot [40]) and a Samsung Galaxy Nexus (Android Lollipop, rooted with Nexus Root Toolkit [41]). At the time of writing, Android KitKat and Lollipop represent nearly half (about 47%) of the market [42].
Our analysis follows two main directions. The first approach targets data leakage resulting from method calls. These leaks are usually characterized by an improper use of objects as arguments, for example, using strings for passwords, making whole structures visible, and so on. Then, to improve the ability to recognize data leakage, a second approach has been developed with the aim of finding leaks in data transmitted over the Internet by the phone.

First Approach.
We downloaded 135 Android applications from Google Play Store, of which 36 belong to the "TOOLS" category, 54 to "PRODUCTIVITY," 7 to "SOCIAL," 8 to "COMMUNICATION," and 30 to "FINANCE," taking the installation count into account. This indicator represents the number of users who installed the chosen application, and it can be found in the information panel of each application [43]. In addition, let us remark that each application was chosen because it is used for security purposes and deals with data that are particularly sensitive for the user. For each application, we collect and store classes, methods, arguments, and return values.
More precisely, our approach works as follows (Figure 5):
(1) An application alpha.apk is downloaded from Google Play Store and installed on the device.
(2) Then alpha.apk is transferred to the computer, using the Android Debug Bridge (ADB) [44].
(3) The Apk2Method tool takes alpha.apk as input.
(4) The Apk2Method tool outputs classes and methods, storing them in the file.txt previously mentioned in Section 3. The top of Figure 6 shows a toy example, pointing out that classes and methods of an application might be obfuscated.
(5) This file is copied to a specific path of our application Prober, and a reboot of the mobile device is required to apply the changes to the system.
(6) When the alpha application runs (for example, when the user enters an ID, password, e-mail, personal data, and so on), Prober stores the invoked methods, arguments, and return values in file.log, as shown in the lower part of Figure 6.
(7) Finally, by inspecting file.log, we are able to identify the presence of data leakage.
All analyzed apps have been cataloged using four levels of granularity: (1) no leakage: the application is safe; (2) abnormal behavior: the application suddenly freezes or crashes; (3) privacy concerns: the application releases unprotected sensitive information, such as IMEI, phone number, geolocation, OS, and so on; and (4) account info: the application reveals account information, that is, login IDs and passwords.
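To make the four levels concrete, the sketch below assigns one of them to the log entries recorded for an application. It rests on several assumptions: the paper does not give the exact format of file.log or the rules used for cataloging, so the sample lines, the regular expressions, and the choice to report the most severe level found are all hypothetical.

```python
import re

# Hypothetical indicators; the paper does not list the patterns it used.
PRIVACY_PATTERN = re.compile(
    r"\b(imei|phone[_ ]?number|latitude|longitude)\b", re.IGNORECASE
)
ACCOUNT_PATTERN = re.compile(
    r"\b(password|passwd|login[_ ]?id)\b", re.IGNORECASE
)


def classify(log_lines, crashed=False):
    """Return the most severe of the four levels found in the log lines.

    Assumed ordering: account info > privacy concerns > abnormal
    behavior > no leakage.
    """
    if any(ACCOUNT_PATTERN.search(line) for line in log_lines):
        return "account info"
    if any(PRIVACY_PATTERN.search(line) for line in log_lines):
        return "privacy concerns"
    if crashed:
        return "abnormal behavior"
    return "no leakage"


# Made-up log entries in an assumed "method(arguments) -> return" layout.
sample_log = [
    "com.example.Net.send(arg0=hello) -> ok",
    "com.example.Tel.getDeviceId() -> imei=356938035643809",
    "com.example.Auth.login(login_id=alice, password=hunter2) -> true",
]

print(classify(sample_log[:1]))
print(classify(sample_log[:2]))
print(classify(sample_log))
```

Keyword matching of this kind only finds values that appear in plaintext under recognizable names, which is exactly the situation the third and fourth levels describe.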
As shown in Tables 2 and 3 and in Figure 7, the testing results suggest that some issues have been identified for the categories tools, productivity, and finance. In particular, in such categories, 51.6% of the tested applications suffer from at least one of the following issues:
(i) The application does not detect that it is being observed.
(ii) The application does not warn the user about the presence of a jailbroken/rooted device.
(iii) Private keys used during a communication (e.g., the OpenSSLRSAPrivateCrtKey or the RSAPrivateKey and the associated parameters) are in plaintext.
(iv) Personal data, such as IMEI and geolocation, are not protected.
(v) The master password (of the password manager) or the user's account credentials (login IDs and passwords) are handled in plaintext.
On the contrary, the tested applications that belong to the social and communication categories are not affected by the same issues.

Second Approach.
A second issue is related to the leakage of encrypted data transmitted over the Internet and stored in the device itself. To avoid forcing a user to create a new account, a common practice is to rely on a third-party app that handles the authentication phase using a delegation protocol, for example, OAuth 2.0 [45]. In particular, the authentication phase is carried out through an access token that is stored in the application's internal directory, so that the user does not have to enter the login credentials (see Alice in Figure 8). Since the access token can be seen as a set of user attributes used to prove that a user is authenticated, the client application usually does not use a mechanism to validate it. On rooted devices, this token can be easily found by browsing the application's folder; an attacker may retrieve such a token and inject it during a new authentication phase, stealing the identity of the victim (see Eve in Figure 8). Moreover, for all users who ignore the alerts and unknowingly accept everything, the token may be stolen on the channel through a man-in-the-middle attack.
For this set of users, we also tried to identify different types of possible attacks. Therefore, we downloaded and analyzed 67 Android apps that send data over the Internet and should take care of sensitive user information. As described in Section 4.1, these applications belong to the following categories: 2 apps belong to "TOOLS," 16 to "PRODUCTIVITY," 4 to "SOCIAL," 10 to "COMMUNICATION," and 35 to "FINANCE." The main issue found is that several applications do not perform the SSL/TLS client authentication, thus making them potentially vulnerable to a man-in-the-middle attack. Tables 4 and 5 summarize our testing activities. More precisely, we found leaks in 55.2% of the apps tested, where 50.0% comes from "TOOLS," 75.0% from "PRODUCTIVITY," 25.0% from "SOCIAL," 60.0% from "COMMUNICATION," and 48.6% from "FINANCE."
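The reported percentages can be cross-checked against the category sizes with a short calculation. The per-category counts of leaking apps are not stated in the text; in the sketch below they are inferred by rounding each percentage of its category size to the nearest whole app, so they should be read as an assumption rather than as reported data.

```python
# Apps tested per category and leak percentages, both taken from the text.
tested = {
    "TOOLS": 2,
    "PRODUCTIVITY": 16,
    "SOCIAL": 4,
    "COMMUNICATION": 10,
    "FINANCE": 35,
}
reported_pct = {
    "TOOLS": 50.0,
    "PRODUCTIVITY": 75.0,
    "SOCIAL": 25.0,
    "COMMUNICATION": 60.0,
    "FINANCE": 48.6,
}

# Inferred, not stated in the text: nearest whole number of leaking apps.
leaking = {
    category: round(tested[category] * reported_pct[category] / 100)
    for category in tested
}

total_tested = sum(tested.values())
total_leaking = sum(leaking.values())
overall_pct = round(100 * total_leaking / total_tested, 1)

print(leaking)
print(total_tested, total_leaking, overall_pct)
```

Under this reading, the inferred counts sum to 37 of the 67 apps, which agrees with the overall figure of 55.2%.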

Conclusions
Since mobile devices are widespread and used for everything, the protection of information, transaction data, and privacy must be taken seriously.
In this paper, we focused on the real-case scenario of rooted devices, analyzing the most installed Android applications with the aim of checking how safe they are.

Table 3: Correlation between the installation count and the 4 levels of granularity.

Figure 1: An example of the hooking technique in action, specialized in spying.

Figure 4: The sequence diagram illustrates how the system changes while the framework is active.

Figure 6: A toy example of the outputs obtained by analyzing an application alpha.apk.

Figure 8: A graphical representation of the problem concerning the delegation scheme implemented by some applications.

Table 1: Top 10 (out of 100) countries where Android devices are rooted most frequently and where mobile devices are attacked most often by malware [12].

Table 2: The results of the analysis obtained with the Android 5.x device.

Table 4: Number of apps that are potentially vulnerable to a MITM attack.

Table 5: Correlation between the installation count and MITM vulnerability.