Abstract
When developing and maintaining distributed systems, auditing privacy properties gains more and more relevance. Nevertheless, this task is lacking support of automated tools and, hence, is mostly carried out manually. We present a formal approach which enables auditors to model the flow of critical data in order to shed new light on a system and to automatically verify given privacy constraints. The formalization is incorporated into a larger policy analysis and verification framework and overall soundness is proven with Isabelle/HOL. Using this solution, it becomes possible to automatically compute architectures which follow specified privacy conditions or to input an existing architecture for verification. Our tool is evaluated in two real-world case studies, where we uncover and fix previously unknown violations of privacy.
This work has been supported by the German Federal Ministry of Education and Research, project DecADe, grant 16KIS0538.
Access this chapter
Tax calculation will be finalised at checkout
Purchases are for personal use only
Similar content being viewed by others
Notes
- 1.
We make them available at https://github.com/diekmann/net-network.
References
Das Standard-Datenschutzmodell. Technical report, Konferenz der unabhängigen Datenschutzbehörden des Bundes und der Länder, Darmstadt (2015). https://www.datenschutzzentrum.de/uploads/sdm/SDM-Handbuch.pdf
Bock, K., Rost, M.: Privacy by design und die neuen schutzziele. DuD 35(1), 30–35 (2011)
Cavoukian, A.: Creation of a Global Privacy Standard, November 2006, Revised October 2009. https://www.ipc.on.ca/images/resources/gps.pdf
Cavoukian, A.: Privacy by Design – The 7 Foundational Principles, January 2011. https://www.ipc.on.ca/wp-content/uploads/Resources/7foundationalprinciples.pdf
Chair of Network Architectures, Services, TUM: MeasrDroid. http://www.droid.net.in.tum.de/
Common Criteria: Part 3: Security assurance components. Common Criteria for Information Technology Security Evaluation CCMB-2012-09-003(Version 3.1 Revision 4), September 2012
Danezis, G., Domingo-Ferrer, J., Hansen, M., Hoepman, J.H., Metayer, D.L., Tirtea, R., Schiffner, S.: Privacy and data protection by design – from policy to engineering. Technical report, ENISA (2015)
Denning, D.: A lattice model of secure information flow. Commun. ACM 19(5), 236–243 (1976)
Diekmann, C., Korsten, A., Carle, G.: Demonstrating topoS: theorem-prover-based synthesis of secure network configurations. In: 11th International Conference on Network and Service Management (CNSM), pp. 366–371, November 2015
Diekmann, C., Michaelis, J., Haslbeck, M., Carle, G.: Verified iptables firewall analysis. In: IFIP Networking 2016, Vienna, Austria, May 2016
Diekmann, C., Posselt, S.-A., Niedermayer, H., Kinkelin, H., Hanka, O., Carle, G.: Verifying security policies using host attributes. In: Ábrahám, E., Palamidessi, C. (eds.) FORTE 2014. LNCS, vol. 8461, pp. 133–148. Springer, Heidelberg (2014). doi:10.1007/978-3-662-43613-4_9
Enck, W., Gilbert, P., Han, S., Tendulkar, V., Chun, B.G., Cox, L.P., Jung, J., McDaniel, P., Sheth, A.N.: TaintDroid: an information-flow tracking system for realtime privacy monitoring on smartphones. ACM TOCS 32(2), 5 (2014)
Feilkas, M., Ratiu, D., Jürgens, E.: The loss of architectural knowledge during system evolution: an industrial case study. In: ICPC, pp. 188–197, May 2009
Kinkelin, H., Maltitz, M., Peter, B., Kappler, C., Niedermayer, H., Carle, G.: Privacy preserving energy management. In: Aiello, L.M., McFarland, D. (eds.) SocInfo 2014. LNCS, vol. 8852, pp. 35–42. Springer, Cham (2015). doi:10.1007/978-3-319-15168-7_5
Murray, T., Matichuk, D., Brassil, M., Gammie, P., Bourke, T., Seefried, S., Lewis, C., Gao, X., Klein, G.: seL4: from general purpose to a proof of information flow enforcement. In: IEEE S&P, pp. 415–429, May 2013
Nipkow, T., Paulson, L.C., Wenzel, M.: Isabelle/HOL: A Proof Assistant for Higher-Order Logic. LNCS, vol. 2283. Springer, Heidelberg (2016). http://isabelle.in.tum.de/
Rost, M., Pfitzmann, A.: Datenschutz-Schutzziele – revisited. Datenschutz und Datensicherheit DuD 33(6), 353–358 (2009)
Tromer, E., Schuster, R.: DroidDisintegrator: intra-application information flow control in Android apps (extended version). In: ASIA CCS 2016, pp. 401–412. ACM (2016). http://www.cs.tau.ac.il/~tromer/disintegrator/disintegrator.pdf
Author information
Authors and Affiliations
Corresponding author
Editor information
Editors and Affiliations
A Case Studies
A Case Studies
The main idea and usage of our model was already motivated by the fictional example of Fig. 1. In this appendix, we present details on two real-world case studies where we evaluate and audit two distributed systems for data collection which are deployed at the Technical University of Munich. For the sake of brevity, we only present the most interesting aspects. We will write node labels as \(X \text {---} Y\), where X corresponds to the tainted labels and Y corresponds to the untainted labels. For example, \(t\ Anonymizer = \lbrace \mathsf {energy}\rbrace \text {---} \lbrace \mathsf {location} \rbrace \).
1.1 A.1 Energy Monitoring System
Energy monitoring systems (EMS) have severe privacy implications: If installed in an office, such a system for example allows to draw conclusions about the effective working periods and behavior of employees by measuring the devices they use. EMS consist of at least two components: A logging unit which records energy usage locally and a server which stores and processes all recorded data. Considering privacy, storing all collected data in a single, possibly external place without fine-grained access control on the data level is critical.
We examined how to improve privacy of the data before persisting it [14]: Since the logger is an off-the-shelf component which we cannot modify, we suggest to add an additional component, called \( P4S \), directly after the logger. This component separates the data by different owners, recipients, or some given predicate and applies further protection measures. The separated data can then be forwarded to (possibly different) cloud services. For the sake of brevity, we do not discuss this service, key management, and how cloud services could collaborate. Our proposed architecture is shown in Fig. 2. We modeled four different kinds of privacy-related data the logger captures by the taint labels \(\mathsf {A}\), \(\mathsf {B}\), \(\mathsf {C}\), and \(\mathsf {D}\).
As input to our tool, we provided the set of components including taint labels and system boundaries as architectural constraints. The results are as follows: Our model shows that data flow from the \( Logger \) to \( P4S \) (which crosses a system boundary physically over the network) is highly critical. For this taint label specification, topoS verified that our architecture is compliant with the security invariants. It also asserts that any attempt to interlink the different data processing pipelines within \( P4S \) would be a severe privacy violation. These insights generated by topoS can be further incorporated into the architecture: The designed pipelines can be separated into individual, isolated, stateless containers within \( P4S \) that can be instantiated on demand for each different taint label. In summary, our extended topoS allowed us to formally assess privacy properties of our proposed architecture before we invested time implementing it.
1.2 A.2 MeasrDroid
MeasrDroid [5] is a system for collecting smartphone sensor data for research purposes. Via an app it may collect location data, information from the smartphone sensors, and networking properties such as signal strength, latency, and reliability. Ultimately, the data is stored and analyzed by a trusted machine, called \( Collect Droid \). To decrease the attack surface of this machine, it is not reachable over the Internet. Instead, the smartphones push the data to a server called \( Upload Droid \), which is regularly polled by \( Collect Droid \) for new information. Since \( Upload Droid \) is particularly exposed, a compromise of this machine must not lead to a privacy violation. Hence, it must be completely uncritical, i.e. not having any taint or untaint labels. This is achieved by having the smartphones encrypt the data for \( Collect Droid \) as only recipient. Consequently, \( Upload Droid \) only sees encrypted data. The model of the architecture is shown in Fig. 3. We modeled three users, each producing data with its individual tainting label \(\mathsf {A}\), \(\mathsf {B}\), or \(\mathsf {C}\). To model encryption of some taint label \( x \), we create a pair of related nodes \(( Enc _\mathrm { x }, Dec _\mathrm { x })\) where the first untaints and the second taints accordingly.
For this taint label specification, topoS verified that our architecture is compliant with the security invariants. In addition, given the taint label specification and adding an additional adversary node, topoS automatically computes an alternative architecture which is also compliant with the security invariants. Comparing our manually designed architecture with the topoS-generated architecture with adversary, we asserted that we did not overlook subtle information leaks. Our evaluation shows that our architecture is a subset of the topoS-generated architecture and only uncritical data can leak to an adversary. It also reveals that our architecture provides no protection against an adversary flooding \( Upload Droid \) with nonsensical data. We found topoS to be a suitable tool to formally support the previous informal privacy claims about the architecture.
Auditing the Real MeasrDroid. The previous paragraphs presents a theoretical evaluation of the architecture of MeasrDroid. The question arises how the real system, which exists since 2013, compares to our theoretical evaluation. Together with the authors of MeasrDroid we evaluate the implementation regarding our previous findings: We collect all machines which are associated with MeasrDroid. We find that they do not have a firewall set up, but instead rely on the central firewall of our lab. With over 5500 rules for IPv4, this firewall may be the largest real-world, publicly available iptables firewall in the worldFootnote 1 and handles many different use cases. MeasrDroid is only a tiny fragment of it, relying on the protocols http, https, and ssh. For brevity, we focus our audit port 80 (http).
The model of the MeasrDroid architecture (cf. Fig. 3) should be recognizable in the rules of our firewall. In particular, \( Collect Droid \) should not be reachable from the Internet while \( Upload Droid \) should, and the former should be able to pull data from the latter. This information may be hidden somewhere in the firewall rule set. We used fffuu [10] to extract the access control structure of the firewall. The result is visualized in Fig. 4. This figure reflects the sheer intrinsic complexity of the access control policy enforced by the firewall. We have highlighted three entities. First, the IP range enclosed in a cloud corresponds to the IP range which is not used by our department, i.e. the Internet. The large block on the left corresponds to most internal machines which are not globally accessible. The IP address we marked in bold red belongs to \( Collect Droid \). Inspecting the arrows, we have formally verified our first auditing goal: \( Collect Droid \) is not directly accessible from the Internet. The other large IP block on the right belongs to machines which are globally accessible. The IP address in bold red belongs to \( Upload Droid \). Therefore, we have verified our second auditing goal: \( Upload Droid \) should be reachable from the Internet. In general, it is pleasant to see that the two machines are in different access groups. Finally, we see that the class of IP addresses including \( Collect Droid \) can access \( Upload Droid \) which proves our third auditing goal.
For the sake of example, we disregard that most machines at the bottom of Fig. 4 could attack \( Collect Droid \). Under this assumption, we ignore this part of the graph and extract only the relevant and simplified parts in Fig. 5. So far, we presented only the positive audit finding. Our audit also reveals many problems, visualized with red arrows. They can be clearly recognized in Fig. 5: First, \( Upload Droid \) can connect to \( Collect Droid \). This is a clear violation of the architecture. We have empirically verified this highly severe problem by logging into \( Upload Droid \) and connecting to \( Collect Droid \). Second, most internal machines may access \( Collect Droid \). Third, there are no restrictions for \( Upload Droid \) with regard to outgoing connections. In theory, it should only passively retrieve data and never initiate connections by itself (disregarding system updates).
Therefore, our audit could verify some core assertions about the actual implementation. In addition, it could uncover and confirm serious bugs. These bugs were unknown prior to our audit and we could only uncover them with the help of the presented tools. Using the firewall serialization feature of topoS, we fixed the problems and reiterated our evaluation to assert that our fix is effective.
Rights and permissions
Copyright information
© 2017 IFIP International Federation for Information Processing
About this paper
Cite this paper
von Maltitz, M., Diekmann, C., Carle, G. (2017). Privacy Assessment Using Static Taint Analysis (Tool Paper). In: Bouajjani, A., Silva, A. (eds) Formal Techniques for Distributed Objects, Components, and Systems. FORTE 2017. Lecture Notes in Computer Science(), vol 10321. Springer, Cham. https://doi.org/10.1007/978-3-319-60225-7_16
Download citation
DOI: https://doi.org/10.1007/978-3-319-60225-7_16
Published:
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-60224-0
Online ISBN: 978-3-319-60225-7
eBook Packages: Computer ScienceComputer Science (R0)