The Best of Both Worlds: Achieving Privacy and Utility

Reaping the benefits of a data-rich world without sacrificing our privacy.

Summary

Every time we make a purchase, walk through a public place, use our cellphones, or go to the doctor, someone records data on our activities. This collection of data holds promise but also poses a threat. Take medical data as an example: researchers can use large collections of medical records to discover new treatments, yet patients do not want their records made public. Theoretical computer scientists have methods to achieve both goals of utility and privacy simultaneously.

One powerful approach to this problem is given by secure multi-party computation, which allows users that hold different sets of secret data to collaborate and analyze all their data together, in such a way that no user learns anything about anyone else's secret data except for whatever is revealed by the output of the analysis. Remarkably, theoretical computer scientists have shown that, in principle, any desired analysis can be done while maintaining this maximal level of privacy, in natural but somewhat simplified models of interaction between participants. Ongoing research is developing new techniques to allow such analyses to be efficiently carried out in a much richer and more realistic variety of software and networked settings.
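As a rough illustration of the idea, the sketch below computes one very simple analysis, a sum, using additive secret sharing. It is a toy under stated assumptions, not the general result described above: the parties are assumed to follow the protocol honestly, the names `share`, `secure_sum`, and `MODULUS` are hypothetical, the input counts are made up, and all "messages" are simulated inside a single process.

```python
import secrets

# Assumption: any modulus comfortably larger than the true total works here.
MODULUS = 2**61 - 1


def share(value, n_parties, modulus=MODULUS):
    """Split `value` into n_parties random shares that sum to it mod `modulus`."""
    shares = [secrets.randbelow(modulus) for _ in range(n_parties - 1)]
    last = (value - sum(shares)) % modulus
    return shares + [last]


def secure_sum(private_values, modulus=MODULUS):
    """Simulate parties jointly computing the sum of their private values."""
    n = len(private_values)
    # Party i splits its value and "sends" the j-th share to party j.
    all_shares = [share(v, n, modulus) for v in private_values]
    # Party j adds the shares it received; each share on its own looks random.
    partial_sums = [
        sum(all_shares[i][j] for i in range(n)) % modulus for j in range(n)
    ]
    # Publishing the partial sums and adding them yields only the total.
    return sum(partial_sums) % modulus


# Made-up example: three hospitals, each holding a private patient count.
hospital_counts = [12, 7, 30]
print(secure_sum(hospital_counts))
```

Each party sees only random-looking shares of the others' inputs, so the only thing it learns is the final total, which is exactly the "nothing beyond the output" guarantee in miniature.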

But what analyses should be permitted? Even in a setting in which the data are all held, unencrypted, by a trusted curator, how can the curator provide useful information while preserving the privacy of the individual data items? To appreciate the problem, consider a "differencing attack," in which a researcher makes two queries to the curator: "How many people are HIV positive?" and "How many people whose name is not John Q. Public are HIV positive?" Exact answers to these two queries reveal the HIV status of Mr. Public. Scientists are beginning to gain an understanding of what "privacy" should mean in this setting, and to develop techniques to provide accurate, privacy-preserving answers when theoretically possible.
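The differencing attack is easy to reproduce in a few lines. The sketch below uses a made-up three-record database with placeholder names; `count_positive` and `differencing_attack` are hypothetical helper names standing in for the curator's exact query interface and the researcher's inference.

```python
# Made-up toy database held by the trusted curator (placeholder names only).
records = [
    {"name": "John Q. Public", "hiv_positive": True},
    {"name": "Jane Roe", "hiv_positive": False},
    {"name": "Richard Roe", "hiv_positive": True},
]


def count_positive(records, exclude_name=None):
    """Curator's exact answer: how many people (other than exclude_name) are positive."""
    return sum(
        1 for r in records if r["hiv_positive"] and r["name"] != exclude_name
    )


def differencing_attack(records, target):
    """Infer the target's status from two aggregate queries alone."""
    everyone = count_positive(records)
    everyone_but_target = count_positive(records, exclude_name=target)
    # The two counts differ by exactly one if and only if the target is positive.
    return everyone - everyone_but_target == 1


print(differencing_attack(records, "John Q. Public"))
```

Neither query asks about an individual, yet their difference does, which is why restricting the curator to aggregate statistics is not enough on its own.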

Rationale

Encryption secures our stored data but seems to make it inert: Can we analyze encrypted data without having to decrypt it first? Cryptographers are developing a number of techniques to accomplish this goal in different settings, leading to a wide variety of applications. For instance, scientists have begun developing techniques to write software that incorporates secret information in such a way that the secrets are kept hidden even from the holder of the software, with applications ranging from digital rights management to homeland security. This work is closely related to one of the most surprising achievements of theoretical computer science -- the development of secure multi-party computation protocols, which allow users that hold different sets of secret data to collaborate and analyze all their data together, in such a way that no user learns anything about anyone else's secret data except for whatever is revealed by the output of the analysis. Such tools could be invaluable in allowing medical researchers to make use of the vast medical datasets held by different hospitals. For instance, AIDS researchers would be able to calculate population statistics on the disease without identifying the individuals who have it.

The HIV scenario mentioned in the summary highlights a weakness in the privacy guarantee offered by secure multi-party computation: the only promise is that nothing is revealed about an individual beyond what is revealed by the outcome of the computation -- it does not speak to the question of which computations, or sets of computations, can safely be carried out. Put differently, in multi-party computation, functionality is paramount and privacy is only as good as the functionality permits. Privacy researchers are now asking a different question: when privacy is paramount, what functionality can be achieved? This question makes sense even in a setting in which the data are all held, unencrypted, by a trusted curator. How can the curator provide useful information while preserving the privacy of the individual data items?
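The text does not name a specific technique for the curator. Purely as an illustration, the sketch below assumes one widely studied approach: the curator adds random Laplace noise to each count before releasing it, so that two answers differing by one person are hard to tell apart. The function name `noisy_count` and the parameter `epsilon` (smaller means more noise and more privacy) are assumptions of this sketch, not terms from the text.

```python
import random


def noisy_count(true_count, epsilon, rng):
    """Return the count plus Laplace noise with scale 1/epsilon.

    The difference of two independent exponential draws with rate `epsilon`
    follows a Laplace distribution centred at 0 with scale 1/epsilon.
    """
    noise = rng.expovariate(epsilon) - rng.expovariate(epsilon)
    return true_count + noise


rng = random.Random(0)  # seeded only to make the illustration repeatable

# The two queries from the differencing attack, answered with noise:
answer_all = noisy_count(2, epsilon=0.5, rng=rng)
answer_all_but_target = noisy_count(1, epsilon=0.5, rng=rng)

# The difference is no longer exactly 0 or 1, so it does not pin down
# one person's status, while averages over large populations stay useful.
print(answer_all - answer_all_but_target)
```

The trade-off is visible in `epsilon`: the noise that hides one individual's contribution is small relative to a count over thousands of patients, but large relative to a count that hinges on one person.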

Contributors and Credits

Cynthia Dwork, Kristin Yvonne Rozier, Amit Sahai, Salil Vadhan

Comments

  • to give feedback on this nugget, just add another bullet to this list
  • The summary seems too long; I think it's more fitting in length/style for the rationale. The first paragraph is great. Perhaps the summary can have one more paragraph that briefly states the two main directions being addressed, and just a sentence or two about ongoing/future challenges: SMPC - perform a given functionality in such a way that achieves the maximum level of privacy, i.e., only what is revealed by the output, and Private Data Analysis (is this the right term?) - what functionalities can be done without harming an individual's privacy. - Salil
  • Minor suggestions - Salil
    • "both goals" -> "the seemingly contradictory goals"
    • put secure multi-party computation in italics
    • I found the example in the "Or could they?" paragraph to be a bit hard to get; the example in the last paragraph was much clearer to me.
    • "beyond what is revealed...computation" -> "except what is revealed...computation" and put this entire phrase in italics.
    • "only as good" -> "(only) as good"
    • "Privacy researchers" -> "Theoretical computer scientists" (if appropriate, or else maybe find some other way to convey the role of TCS here)
    • "Exact answers" -> "Even though each question on its own seems to be relatively innocuous, exact answers..."
  • I agree with Salil's point about length. Also, I suggest replacing "highlights a weakness" with something that doesn't sound so negative about the research just advocated -- like perhaps something along the lines of "This shows that more research is needed beyond ..." or something like that. - Amit
  • Shorter summary drafted by Cynthia and me. - Salil
  • The rationale sounds a little bit redundant after the summary: some phrases are even identical (e.g., "To appreciate the problem, consider a differencing attack"). I also agree with Salil that the example that starts with "Or could they?" is hard to understand. --Luis
  • I'm a bit torn over the SMPC wording in the summary. I think that the current summary focuses too much on secure multi-party computation, which while still a vibrant area, might seem too "solved" for such a nugget. I've changed the last two sentences of that paragraph to give a better idea of the many challenges that remain in this area of "secure analysis". I've also eliminated the redundancy that Luis remarked on. - Amit
  • How about a more positive phrasing: "this maximal level of privacy, in natural but somewhat simplified models of interaction between participants. Ongoing research is developing new techniques to allow such analyses to be efficiently carried out in a much richer and more realistic variety of software and networked settings." - Salil
  • Salil's suggestion looks great to me. - Amit
  • Comment incorporated into text. - Salil
  • Note from designer Elaine Park: This one is tricky, with so much to get across. This image of a fingerprint (which has been blown up and printed for examination) being shredded may convey the main gist of the idea.
  • I agree with Kristin's concern (posted on the "design drafts" page) that the fingerprint image suggests that this line of work is too much about trying to protect criminals rather than everyday citizens. - Salil
  • The shredding metaphor is a good one. Can the shredder shred something else, like a credit card or Social Security card? -- Richard
  • The image is OK, but doesn't convey the privacy issue well, like having medical records on the web. - Bernard
  • Revised ppt uploaded. - Salil
  • Note from designer Elaine Park: Here is a pattern of social security cards being cut into strips.
  • Revised ppt uploaded. - Salil
  • Note from designer Elaine Park: A shadow figure may be using information (a credit card swipe, medical information, cellphone transmission and surveillance footage) for unknown purposes.
Page last modified on April 08, 2010, at 11:42 PM