April 2012


Options: Use Monospaced Font
Show Text Part by Default
Show All Mail Headers

Message: [<< First] [< Prev] [Next >] [Last >>]
Topic: [<< First] [< Prev] [Next >] [Last >>]
Author: [<< First] [< Prev] [Next >] [Last >>]

Print Reply
Sushil Jajodia <[log in to unmask]>
Reply To:
Wed, 4 Apr 2012 08:46:54 -0400
text/plain (43 lines)
                       SEMINAR ANNOUNCEMENT
                     GEORGE MASON UNIVERSITY
              Date :  Thursday, April 5, 2012
              Time :  11:00 - 12:00
          Location :  Research Hall, Room 401

Differentially Private Histogram Publication 

Yin Yang
Research Scientist
ADSC, Singapore

Differential privacy (DP) is a promising scheme for releasing the results of
statistical queries on sensitive data, with strong privacy guarantees
against adversaries with arbitrary background knowledge. Existing studies on
DP mostly focus on simple aggregations such as counts. This paper
investigates the publication of DP-compliant histograms, which is an
important analytical tool for showing the distribution of a random variable,
e.g., hospital bill size for certain patients. Compared to simple
aggregations whose results are purely numerical, a histogram query is
inherently more complex, since it must also determine its structure, i.e.,
the ranges of the bins. As we demonstrate in the paper, a DP-compliant
histogram with finer bins may actually lead to significantly lower accuracy
than a coarser one, since the former requires stronger perturbations in
order to satisfy DP. Moreover, the histogram structure itself may reveal
sensitive information, which further complicates the problem.

Motivated by this, we propose two novel algorithms, namely NoiseFirst and
StructureFirst, for computing DP-compliant histograms. Their main difference
lies in the relative order of the noise injection and the histogram
structure computation steps. NoiseFirst has the additional benefit that it
can improve the accuracy of an already published DP-complaint histogram
computed using a naive method. Going one step further, we extend both
solutions to answer arbitrary range queries. Extensive experiments, using
several real data sets, confirm that the proposed methods output highly
accurate query answers, and consistently outperform existing competitors. 
**Point of contact:  Prof Sushil Jajodia [log in to unmask]