Data management can now be outsourced to cloud service providers like Amazon Web Services or IBM SmartCloud.
This calls for encrypted data-representation schemes that also give way to efficient query processing.
State-of-the-art approaches are overly expensive for exact-match queries in the worst case, or they do not ensure privacy
if an adversary knows the data distribution. In this paper, we propose a new privacy approach without these shortcomings.
It makes use of encryption, obfuscated indices, and data fragmentation. To speed up query processing, we propose three novel data-transformation and query-execution schemes. For two schemes, we prove that an adversary capable of solving any polynomial problem cannot determine if any attribute values appear together in a tuple. Thus, with our schemes, sensitive data is not linked to personally identifiable information. To evaluate our third scheme, we propose a measure that quantifies the risk of disclosure.
We evaluate our approach on real-world folksonomy data. Our evaluation shows that its
average response time of exact-match queries with 15 million tuples is under one second on a conventional desktop PC.