The goal of PrivacyGrade.org is to help raise awareness of the behaviors that many smartphone apps have that may affect people’s privacy. PrivacyGrade provides detailed information about an app's privacy-related behaviors. We summarize these behaviors in the form of a grade, ranging from A+ (fewest privacy concerns) to D (most privacy concerns).
You can also watch the following 5-minute video from PopTech 2013 that gives an overview of our work:
Today's smartphones have an incredible array of capabilities. Smartphones have access to our communications (email, contacts list, and social networks), activities (location, call logs, photos, accelerometers), and more. However, some apps access this sensitive data in ways that people do not expect. These are not just hypothetical risks either: several app developers have already been fined by the Federal Trade Commission for deceptive uses of people's data.
Some apps access sensitive personal data to offer new kinds of functionality. For example, map applications use location data to show you where you currently are.
Other apps access sensitive data for advertising reasons. For example, location data is used by many apps to help personalize what ads are shown. In our analyses, we have also found a few cases where highly sensitive data, such as contact list and account information, are used for advertising.
In rare cases, sensitive data is used by malware to steal information. PrivacyGrade currently does not focus on malware.
Grades are assigned using a privacy model that we built. This privacy model measures the gap between people's expectations of an app's behavior and the app's actual behavior.
By expectations, we mean what people think an app does versus what it actually does. For example, according to studies we have conducted, most people don't expect games like Cut the Rope to use location data, but many of them actually do (this is true for Android as of May 2015). This kind of surprise is represented in our privacy model as a penalty to an app’s overall privacy grade. In contrast, most people do expect apps like Google Maps to use location data. This lack of surprise is represented in our privacy model as a small or no penalty.
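The idea of penalizing surprise can be sketched in a few lines of code. This is only an illustration of the concept, not PrivacyGrade's actual model; the function names and all the numbers below are made up.

```python
# Illustrative sketch (NOT PrivacyGrade's actual model): penalize an app
# for each sensitive-data use in proportion to how many surveyed users
# did NOT expect that use.

def surprise_penalty(expected_fraction):
    """Penalty for one (resource, purpose) pair: 0.0 when everyone
    expects the behavior, 1.0 when no one does."""
    return 1.0 - expected_fraction

def app_penalty(behaviors):
    """behaviors maps (resource, purpose) -> fraction of surveyed
    users who expected that behavior. Total penalty is the sum of
    the per-behavior surprise penalties."""
    return sum(surprise_penalty(f) for f in behaviors.values())

# A game using location for ads surprises most users (big penalty);
# a map app using location for maps surprises almost no one.
game = {("location", "advertising"): 0.10}  # 10% expected it
maps = {("location", "maps"): 0.95}         # 95% expected it
print(round(app_penalty(game), 2))  # large penalty
print(round(app_penalty(maps), 2))  # small penalty
```

In the real model the penalty would feed into the app's overall score, but the core intuition is the same: the bigger the gap between expectation and behavior, the worse the grade.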
To build our privacy model, we use a combination of static analysis and crowdsourcing. By static analysis, we mean that we have built custom software that inspects an app and analyzes what sensitive data that app uses and how that data is used. More specifically, we check if the app is using the sensitive data primarily because of a third-party library* included with the app. For example, using this approach, we can assess if an app uses location data, and if so, whether it is for map purposes (because our analysis found that location data is only used by the Google Maps library) or for advertising purposes (because our analysis found that location data is only used by an advertising library). In other words, libraries give us a way of inferring the semantics of how some kinds of sensitive data are used by an app.
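The library-attribution step can be illustrated with a toy lookup. Real static analysis decompiles the app to find which code actually calls sensitive APIs; here we simply assume that step is done. The package names and the purpose table are hypothetical examples, not PrivacyGrade's actual data.

```python
# Hypothetical sketch of library-based purpose inference. We assume a
# prior analysis step has identified which code packages in the app
# access location data; this step attributes each one to a purpose.
PURPOSE_BY_LIBRARY = {
    "com.google.android.gms.maps": "maps",
    "com.example.adnetwork": "advertising",  # made-up ad library
}

def infer_purposes(packages_using_location):
    """Map each package that touches location data to a likely purpose.
    Anything not matching a known library prefix is attributed to the
    app's own code, whose purpose we cannot infer this way."""
    purposes = set()
    for pkg in packages_using_location:
        for lib, purpose in PURPOSE_BY_LIBRARY.items():
            if pkg.startswith(lib):
                purposes.add(purpose)
                break
        else:
            purposes.add("app's own functionality")
    return purposes

print(infer_purposes(["com.example.adnetwork.LocationTracker"]))
# {'advertising'}
```

The `else` branch captures the limitation noted later in this FAQ: when an app's own code (rather than a recognizable library) uses sensitive data, this approach cannot tell what the data is used for.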
By crowdsourcing, we mean that we posted a series of tasks on Amazon's Mechanical Turk to solicit crowd workers’ privacy expectations for a core set of 837 different apps. For example, we asked questions like "how comfortable do you feel with Cut the Rope using your location data for advertising purposes?"
Combined, this approach of using static analysis and crowdsourcing allows us to build a model of people's concerns. We then use this privacy model to assign grades ranging from A+ (no concerns according to our model) to D (many concerns according to our model).
More specifically, each app is assigned a score based on our model. Apps that have no privacy concerns are assigned an A+. The remaining apps are sorted by their score and then split into quartiles, with the top quartile assigned an A, the second quartile a B, the third quartile a C, and the fourth quartile a D.
* Libraries are reusable pieces of code. For example, Facebook, Google, and Twitter all offer libraries that make it easier for developers to use their services.
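The grading scheme described above can be sketched as follows. The demo scores are invented; only the bucketing logic (A+ for zero concerns, then quartiles A through D) follows the description in this FAQ.

```python
# Sketch of the grading scheme: apps with no privacy concerns get an
# A+; the rest are sorted by penalty score (lower is better) and
# split into quartiles, A for the best quartile down to D for the worst.

def assign_grades(scores):
    """scores: {app_name: penalty_score}; returns {app_name: grade}."""
    grades = {app: "A+" for app, s in scores.items() if s == 0}
    rest = sorted((app for app in scores if app not in grades),
                  key=lambda app: scores[app])
    n = len(rest)
    for i, app in enumerate(rest):
        # Quartile index 0..3 maps to grades A..D.
        grades[app] = "ABCD"[min(4 * i // n, 3)]
    return grades

demo = {"maps": 0, "game": 8, "flashlight": 5, "chat": 2, "weather": 3}
print(assign_grades(demo))
# {'maps': 'A+', 'chat': 'A', 'weather': 'B', 'flashlight': 'C', 'game': 'D'}
```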
Right now, for Android smartphones, you only have a few options. If you're feeling particularly sensitive about privacy, you can delete the app. Alternatively, you can try to find a replacement app that has similar functionality (e.g. there are many flashlight apps that don't need an Internet connection or location data). If you're worried about location data, you can turn off location services.
Also, generally speaking, for-pay apps tend to be better about privacy than free apps.
To remove an app from your Android device head to the Settings menu and touch either the Apps or Application Manager section (device dependent). You will see a page listing all downloaded apps. Touch the app you'd like to uninstall and hit Uninstall.
For more information see the following Google support document.
PrivacyGrade currently has several limitations. One limitation is that, presently, we only analyze free apps.
Another current limitation is our reliance on libraries to infer an app's semantics. For the most part, this is a pretty good approximation, but is not perfect. For example, an app might implement internal code that uses your location data for advertising, rather than a library. Currently, PrivacyGrade would not correctly catch this case.
We also only periodically crawl and update our analysis. So our privacy grades might not reflect the latest apps available on Google Play.
There may also be limitations with our use of Androguard for static analysis. Androguard has not been updated for the latest versions of Android, which means that we may have missed some permissions.
Lastly, we build our privacy model based on a core set of 837 apps. However, this model could still use some improvement, for example, taking into account specific categories (games vs productivity apps). We are looking at several different approaches to improve the accuracy of our models.
Each privacy grade has a link to a feedback form where you can send our team comments. You can also send us feedback by emailing firstname.lastname@example.org
The most common problem that leads to low privacy grades is using too much personal data for advertising. For example, we've seen some ad libraries that want to use location data, contact lists, and cameras for advertising.
As part of our research, we have interviewed and surveyed many app developers. It turns out that many developers don't realize that the ad libraries they are including in their apps are accessing a lot of personal data. So, if you are using a lot of ad libraries, we encourage you to take a second look at what those libraries are doing, and see if there are other APIs that you can use that don't require so much personal data.
Also, please note if you have updated your app on Google Play, we only periodically crawl and analyze apps. We are looking into streamlining our app analysis process to offer more up-to-date results.
Lastly, please note the limitations of our method, as described above in this FAQ.
We don't currently have ways for other people to get involved. But stay tuned; we're looking into options.
We are a team of researchers at Carnegie Mellon University, primarily part of the CHIMPS lab. The team is led by Professor Jason Hong. The main work behind PrivacyGrade was done by Jialiu Lin, Shahriyar Amini, Song Luan, Kevin Ku, Mike Villena, Bharadwaj Ramachandran, Richmond Wong, and Fanglin Chen. Other contributors include Professors Janne Lindqvist, Norman Sadeh, and Joy Zhang, as well as REUs Bharadwaj Ramachandran and Sunnia Ye.
We created PrivacyGrade to help three different audiences: end-users, developers, and policy makers.
By end-users, we mean anybody who uses a smartphone and installs apps. Right now, there is little privacy information available for people. Furthermore, when we have given talks about smartphone privacy, people often express a great deal of surprise about how much information apps are collecting (to the point where people have even deleted apps during the talk). We aren't against ads or other uses of sensitive data, but we want people to be better informed about how their data is being used.
By developers, we mean people who create apps. In some of our studies, we found that developers often have little understanding of what they should be doing with respect to privacy, and sometimes didn't know what their own apps were doing (primarily because they were not aware of the behaviors of the libraries they used).
By policy makers, we mean people and organizations that set guidelines governing use of personal data. This might include industry consortia, government agencies, and interested third parties. When talking to people in these various groups, we found that there was often a lack of information about what apps are actually doing, in terms of the range of behaviors as well as statistics.
The funding for PrivacyGrade is supported in part by the National Science Foundation (CNS-1228813), the Army Research Office (DAAD19-02-1-0389 and W911NF-09-1-0273), NQ Mobile, and by a Google Faculty Research Award. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of the funders.
We are considering expanding our work to other platforms. We're a small research team at a university, and so we only have limited resources.
Our original work on app scanning was published in Ubicomp 2012. You can see this paper here. A brief summary of the work is available here. We have also interviewed developers to understand what challenges they face, published at USEC 2014. You can see this paper here. We also have a paper that will appear at SOUPS 2014 that describes more of how we scaled up our privacy models.
You can cite PrivacyGrade with the following papers:
Lin, J., S. Amini, J.I. Hong, N. Sadeh, J. Lindqvist, J. Zhang. Expectation and Purpose: Understanding Users' Mental Models of Mobile App Privacy through Crowdsourcing. In The Fourteenth International Conference on Ubiquitous Computing (Ubicomp 2012). pp. 501-510. 2012.
You might also be interested in two dissertations examining the privacy of smartphone apps:
We are logging what web pages are being accessed, IP address, time of access, and other associated information. This information is pretty much what all web servers collect in their logs.
We are using this information to understand which pages are most popular, and how we can improve on the delivery of the content of this web site.