
Proteus: A Practical and Rigorous Toolkit for Private Data Analysis

Supported by National Science Foundation CAREER Award CNS #1253327, "CAREER: Proteus: A Practical and Rigorous Toolkit for Privacy" (abstract), and a 2013 Google Faculty Research Award.

Principal investigator: Ashwin Machanavajjhala, Duke University


  • Xi He (PhD Student)
  • Benjamin Stoddard (PhD Student)
  • Ios Kotsogiannis (PhD Student) 
  • Ashwin Machanavajjhala (Faculty)
  • Daniel Kifer (Faculty Member, Penn State University)
  • John Abowd (Faculty Member, Cornell University & US Census Bureau) 
  • Amol Deshpande (Faculty Member, University of Maryland, College Park)
  • Thodoris Rekatsinas (PhD Student, University of Maryland, College Park)


Statistical privacy, the problem of disclosing aggregate statistics about data collected from individuals while protecting individual-level sensitive properties, is an important problem in today's age of big data. The key challenge is that applications for data collection and analysis operate on varied kinds of data and have diverse requirements for the information that must be kept secret and the adversaries that must be tolerated. As a result, application domain experts, who are frequently not experts in privacy, cannot directly use an existing general-purpose privacy definition. Instead, they must develop a new privacy definition or customize an existing one. Currently, no rigorous techniques exist for customizing privacy definitions to applications.

This project builds PROTEUS, a general-purpose toolkit for developing rigorous privacy definitions and mechanisms that can be customized to applications. The cornerstone of PROTEUS is a novel privacy framework that allows customized privacy protection by explicitly listing the secrets to be protected, enumerating the (potentially infinite set of) possible adversaries, and ensuring rigorous bounds on the information disclosed to each adversary about every secret. Novel theoretical tools in PROTEUS include methods to reason about privacy for correlated data, privacy against realistic adversaries, and techniques to express and enforce personalized privacy preferences. These tools result in practical privacy mechanisms for publishing social science survey data, social network analysis, and analysis of user-activity streams. Broader impacts of this project include developing new courses in privacy and big-data management, as well as technology transfer to the US Census.
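The three ingredients named above (a list of secrets, a set of adversaries, and a bound on what each adversary learns about each secret) can be illustrated with a toy check. The sketch below is not the PROTEUS implementation. It assumes two binary records, a single pair of secrets ("record 1 is 0" versus "record 1 is 1"), adversaries modeled as prior distributions over the two records, and randomized response as a stand-in mechanism. All function and variable names are hypothetical. It computes the largest log-ratio between output probabilities under the two secrets, which is the quantity a bound of the form e^ε would have to cover.

```python
import itertools
import math


def rr_prob(true_bit, reported_bit, p):
    """Randomized response: report the true bit with probability p."""
    return p if true_bit == reported_bit else 1.0 - p


def output_prob_given_secret(w, x1, prior, p):
    """P(output = w | record 1 = x1) under one adversary's prior.

    `prior` maps (x1, x2) to a probability; the secret must have
    positive probability under the prior.
    """
    mass = sum(prior[(x1, x2)] for x2 in (0, 1))
    total = 0.0
    for x2 in (0, 1):
        cond = prior[(x1, x2)] / mass  # P(x2 | x1) for this adversary
        total += cond * rr_prob(x1, w[0], p) * rr_prob(x2, w[1], p)
    return total


def worst_case_log_ratio(priors, p):
    """Largest |log ratio| over all adversaries (priors) and outputs."""
    worst = 0.0
    for prior in priors:
        for w in itertools.product((0, 1), repeat=2):
            a = output_prob_given_secret(w, 0, prior, p)
            b = output_prob_given_secret(w, 1, prior, p)
            worst = max(worst, abs(math.log(a / b)))
    return worst


# Hypothetical adversaries: one believes the records are independent,
# the other believes they are always equal (perfect correlation).
independent = {(0, 0): 0.25, (0, 1): 0.25, (1, 0): 0.25, (1, 1): 0.25}
correlated = {(0, 0): 0.5, (0, 1): 0.0, (1, 0): 0.0, (1, 1): 0.5}

eps_independent = worst_case_log_ratio([independent], p=0.75)
eps_correlated = worst_case_log_ratio([correlated], p=0.75)
```

In this toy setting the adversary who assumes correlated records learns twice as much (in log-ratio terms) about the first record as the one who assumes independence, which is one reason the set of adversaries has to be stated explicitly when a privacy definition is customized.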


  • Daniel Kifer, Ashwin Machanavajjhala, "Pufferfish: A framework for mathematical privacy definitions", to appear in ACM Transactions on Database Systems (TODS) 2013 39(1)
  • Thodoris Rekatsinas, Amol Deshpande, Ashwin Machanavajjhala, "SPARSI: Partitioning sensitive data amongst multiple adversaries", Proceedings of the VLDB Endowment (PVLDB) 2013 6(13)