A privacy-preserved full-text retrieval algorithm over encrypted data for cloud storage applications
- Source :
- Journal of Parallel and Distributed Computing. 99:14-27
- Publication Year :
- 2017
- Publisher :
- Elsevier BV, 2017.
- Abstract :
- As cloud computing becomes prevalent, more and more sensitive information is being outsourced to the cloud. A straightforward way to protect data privacy is to encrypt the data before outsourcing. Recently, many searchable encryption schemes have been proposed to allow users to execute keyword-based search over encrypted data. However, it is difficult for users to find exactly the files of interest among huge amounts of data by relying solely on keyword-based search. In the information retrieval domain, full-text retrieval is an established technology that allows efficient searches over massive amounts of web data. Unfortunately, full-text retrieval over encrypted cloud data has not been well studied. A full-text retrieval service requires extracting all the words in the contents of documents, and the resulting huge number of index words cannot be efficiently supported by existing searchable encryption schemes. Moreover, to protect users' privacy, a privacy-preserved full-text retrieval index is required. These problems make efficient full-text retrieval over a large amount of encrypted cloud data a very challenging task. In this paper, we first establish a set of strict privacy requirements for full-text retrieval in cloud storage systems. To address this challenging problem, we design a Bloom filter based tree index. Our scheme fine-tunes the similarity between the query and encrypted documents by introducing the membership entropies of index words. Our security analysis shows that the scheme is provably secure. We demonstrate the effectiveness and efficiency of the proposed scheme through extensive experimental evaluation. The experimental results show that a search operation can be completed in 60 milliseconds on a moderate off-the-shelf PC.
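The abstract names a Bloom filter based tree index but gives no construction details. The following is only a minimal plaintext sketch of the general idea, not the paper's scheme: it assumes each leaf holds a Bloom filter over one document's words and each internal node holds the bitwise OR of its children's filters, so a search can skip any subtree whose filter rules out a query word. Encryption, the privacy requirements, and the membership-entropy ranking are all omitted, and every name here (`BloomFilter`, `Node`, `build_tree`, `search`) as well as the filter size and hash count are illustrative assumptions.

```python
import hashlib


class BloomFilter:
    """Fixed-size Bloom filter; a Python int serves as the bit array."""

    def __init__(self, size=1024, num_hashes=3):
        # size and num_hashes are arbitrary illustrative values.
        self.size = size
        self.num_hashes = num_hashes
        self.bits = 0

    def _positions(self, word):
        # Derive num_hashes bit positions by salting SHA-256 with an index.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{word}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.size

    def add(self, word):
        for pos in self._positions(word):
            self.bits |= 1 << pos

    def might_contain(self, word):
        # False means "definitely absent"; True means "possibly present".
        return all((self.bits >> pos) & 1 for pos in self._positions(word))

    def union(self, other):
        merged = BloomFilter(self.size, self.num_hashes)
        merged.bits = self.bits | other.bits
        return merged


class Node:
    def __init__(self, bloom, doc_id=None, left=None, right=None):
        self.bloom = bloom
        self.doc_id = doc_id  # set only on leaves
        self.left = left
        self.right = right


def build_tree(docs):
    """Build a binary tree bottom-up from a dict of doc_id -> plain text."""
    nodes = []
    for doc_id, text in docs.items():
        bloom = BloomFilter()
        for word in text.lower().split():
            bloom.add(word)
        nodes.append(Node(bloom, doc_id=doc_id))
    if not nodes:
        return None
    while len(nodes) > 1:
        paired = []
        for i in range(0, len(nodes), 2):
            if i + 1 < len(nodes):
                left, right = nodes[i], nodes[i + 1]
                paired.append(Node(left.bloom.union(right.bloom), left=left, right=right))
            else:
                paired.append(nodes[i])  # odd node is carried up unchanged
        nodes = paired
    return nodes[0]


def search(node, query_words):
    """Return ids of documents that may contain every query word."""
    if node is None or not all(node.bloom.might_contain(w) for w in query_words):
        return []  # prune: no document below can contain all query words
    if node.doc_id is not None:
        return [node.doc_id]
    return search(node.left, query_words) + search(node.right, query_words)
```

Because Bloom filters never produce false negatives, every document that really contains all query words is returned; false positives are possible, so in this sketch the results are candidates rather than guaranteed matches.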
- Subjects :
- Information privacy
Information retrieval
Computer Networks and Communications
business.industry
Computer science
020206 networking & telecommunications
Cloud computing
02 engineering and technology
Bloom filter
Encryption
Theoretical Computer Science
Tree (data structure)
Information sensitivity
Data retrieval
Artificial Intelligence
Hardware and Architecture
0202 electrical engineering, electronic engineering, information engineering
020201 artificial intelligence & image processing
business
Cloud storage
Software
Details
- ISSN :
- 07437315
- Volume :
- 99
- Database :
- OpenAIRE
- Journal :
- Journal of Parallel and Distributed Computing
- Accession number :
- edsair.doi...........823415b703a888b48f4815274fba10c9