332 results on '"Cuckoo hashing"'
Search Results
2. Popularity Cuckoo Filter: Always Keeping Popular Items in Mind
- Author
-
Cheng, Xuetan, Luo, Lailong, Zou, Wei, Yang, Xiangrui, Guo, Deke, Goos, Gerhard, Founding Editor, Hartmanis, Juris, Founding Editor, Bertino, Elisa, Editorial Board Member, Gao, Wen, Editorial Board Member, Steffen, Bernhard, Editorial Board Member, Yung, Moti, Editorial Board Member, Tari, Zahir, editor, Li, Keqiu, editor, and Wu, Hongyi, editor
- Published
- 2024
- Full Text
- View/download PDF
3. CostCounter: A Better Method for Collision Mitigation in Cuckoo Hashing.
- Author
-
HAONAN WU, SHUXIAN WANG, ZHANFENG JIN, YUHANG ZHANG, RUYUN MA, SIJIN FAN, and RUILI CHAO
- Subjects
FIELD programmable gate arrays ,CUCKOOS ,RANDOM walks - Abstract
Hardware is often required to support fast search and high-throughput applications. Consequently, the performance of search algorithms is limited by storage bandwidth. Hence, the search algorithm must be optimized accordingly. We propose a CostCounter (CC) algorithm based on cuckoo hashing and an Improved CostCounter (ICC) algorithm. A better path can be selected when collisions occur using a cost counter to record the kick-out situation. Our simulation results indicate that the CC and ICC algorithms can achieve more significant performance improvements than Random Walk (RW), Breadth First Search (BFS), and MinCounter (MC). With two buckets and two slots per bucket, under the 95% memory load rate of the maximum load rate, CC and ICC are optimized on read-write times over 20% and 80% compared to MC and BFS, respectively. Furthermore, the CC and ICC algorithms achieve a slight improvement in storage efficiency compared with MC. In addition, we implement RW, MC, and the proposed algorithms using fine-grained locking to support a high throughput rate. From the test on field programmable gate arrays, we verify the simulation results and our algorithms optimize the maximum throughput over 23% compared to RW and 9% compared to MC under 95% of the memory capacity. The test results indicate that our CC and ICC algorithms can achieve better performance in terms of hardware bandwidth and memory load efficiency without incurring a significant resource cost. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
4. Load Thresholds for Cuckoo Hashing with Overlapping Blocks.
- Author
-
WALZER, STEFAN
- Subjects
STATISTICAL physics ,CUCKOOS - Abstract
We consider a natural variation of cuckoo hashing proposed by Lehman and Panigrahy (2009). Each of cn objects is assigned k = 2 intervals of size - in a linear hash table of size n and both starting points are chosen independently and uniformly at random. Each object must be placed into a table cell within its intervals, but each cell can only hold one object. Experiments suggested that this scheme outperforms the variant with blocks in which intervals are aligned at multiples of -. In particular, the load threshold is higher, i.e., the load c that can be achieved with high probability. For instance, Lehman and Panigrahy (2009) empirically observed the threshold for - = 2 to be around 96. 5% as compared to roughly 89. 7% using blocks. They pinned down the asymptotics of the thresholds for large -, but the precise values resisted rigorous analysis. We establish a method to determine these load thresholds for all - = 2, and, in fact, for general k = 2. For instance, for k = - = 2, we get ≈ 96. 4995%. We employ a theorem due to Leconte, Lelarge, and Massoulié (2013), which adapts methods from statistical physics to the world of hypergraph orientability. In effect, the orientability thresholds for our graph families are determined by belief propagation equations for certain graph limits. As a side note, we provide experimental evidence suggesting that placements can be constructed in linear time using an adapted version of an algorithm by Khosla (2013). [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
5. Accelerated Parallel Hybrid GPU/CPU Hash Table Queries with String Keys
- Author
-
Groth, Tobias, Groppe, Sven, Pionteck, Thilo, Valdiek, Franz, Koppehel, Martin, Goos, Gerhard, Founding Editor, Hartmanis, Juris, Founding Editor, Bertino, Elisa, Editorial Board Member, Gao, Wen, Editorial Board Member, Steffen, Bernhard, Editorial Board Member, Yung, Moti, Editorial Board Member, Strauss, Christine, editor, Cuzzocrea, Alfredo, editor, Kotsis, Gabriele, editor, Tjoa, A Min, editor, and Khalil, Ismail, editor
- Published
- 2022
- Full Text
- View/download PDF
6. Linear Complexity Private Set Intersection for Secure Two-Party Protocols
- Author
-
Karakoç, Ferhat, Küpçü, Alptekin, Goos, Gerhard, Founding Editor, Hartmanis, Juris, Founding Editor, Bertino, Elisa, Editorial Board Member, Gao, Wen, Editorial Board Member, Steffen, Bernhard, Editorial Board Member, Woeginger, Gerhard, Editorial Board Member, Yung, Moti, Editorial Board Member, Krenn, Stephan, editor, Shulman, Haya, editor, and Vaudenay, Serge, editor
- Published
- 2020
- Full Text
- View/download PDF
7. Fast lightweight accurate xenograft sorting
- Author
-
Jens Zentgraf and Sven Rahmann
- Subjects
Xenograft sorting ,Alignment-free method ,Cuckoo hashing ,k-mer ,Biology (General) ,QH301-705.5 ,Genetics ,QH426-470 - Abstract
Abstract Motivation With an increasing number of patient-derived xenograft (PDX) models being created and subsequently sequenced to study tumor heterogeneity and to guide therapy decisions, there is a similarly increasing need for methods to separate reads originating from the graft (human) tumor and reads originating from the host species’ (mouse) surrounding tissue. Two kinds of methods are in use: On the one hand, alignment-based tools require that reads are mapped and aligned (by an external mapper/aligner) to the host and graft genomes separately first; the tool itself then processes the resulting alignments and quality metrics (typically BAM files) to assign each read or read pair. On the other hand, alignment-free tools work directly on the raw read data (typically FASTQ files). Recent studies compare different approaches and tools, with varying results. Results We show that alignment-free methods for xenograft sorting are superior concerning CPU time usage and equivalent in accuracy. We improve upon the state of the art sorting by presenting a fast lightweight approach based on three-way bucketed quotiented Cuckoo hashing. Our hash table requires memory comparable to an FM index typically used for read alignment and less than other alignment-free approaches. It allows extremely fast lookups and uses less CPU time than other alignment-free methods and alignment-based methods at similar accuracy. Several engineering steps (e.g., shortcuts for unsuccessful lookups, software prefetching) improve the performance even further. Availability Our software xengsort is available under the MIT license at http://gitlab.com/genomeinformatics/xengsort . It is written in numba-compiled Python and comes with sample Snakemake workflows for hash table construction and dataset processing.
- Published
- 2021
- Full Text
- View/download PDF
8. Data De-duplication Using Cuckoo Hashing in Cloud Storage
- Author
-
Sridharan, J., Valliyammai, C., Karthika, R. N., Nihil Kulasekaran, L., Kacprzyk, Janusz, Series Editor, Pal, Nikhil R., Advisory Editor, Bello Perez, Rafael, Advisory Editor, Corchado, Emilio S., Advisory Editor, Hagras, Hani, Advisory Editor, Kóczy, László T., Advisory Editor, Kreinovich, Vladik, Advisory Editor, Lin, Chin-Teng, Advisory Editor, Lu, Jie, Advisory Editor, Melin, Patricia, Advisory Editor, Nedjah, Nadia, Advisory Editor, Nguyen, Ngoc Thanh, Advisory Editor, Wang, Jun, Advisory Editor, Nayak, Janmenjoy, editor, Abraham, Ajith, editor, Krishna, B. Murali, editor, Chandra Sekhar, G. T., editor, and Das, Asit Kumar, editor
- Published
- 2019
- Full Text
- View/download PDF
9. Data Deduplication and Fine-Grained Auditing on Big Data in Cloud Storage
- Author
-
Karthika, RN., Valliyammai, C., Abisha, D., Kacprzyk, Janusz, Series Editor, Pal, Nikhil R., Advisory Editor, Bello Perez, Rafael, Advisory Editor, Corchado, Emilio S., Advisory Editor, Hagras, Hani, Advisory Editor, Kóczy, László T., Advisory Editor, Kreinovich, Vladik, Advisory Editor, Lin, Chin-Teng, Advisory Editor, Lu, Jie, Advisory Editor, Melin, Patricia, Advisory Editor, Nedjah, Nadia, Advisory Editor, Nguyen, Ngoc Thanh, Advisory Editor, Wang, Jun, Advisory Editor, Reddy Edla, Damodar, editor, Lingras, Pawan, editor, and Venkatanareshbabu K., editor
- Published
- 2018
- Full Text
- View/download PDF
10. Fast lightweight accurate xenograft sorting.
- Author
-
Zentgraf, Jens and Rahmann, Sven
- Subjects
CUCKOOS ,GENOMES ,HETEROGENEITY ,MICE ,PYTHON programming language - Abstract
Motivation: With an increasing number of patient-derived xenograft (PDX) models being created and subsequently sequenced to study tumor heterogeneity and to guide therapy decisions, there is a similarly increasing need for methods to separate reads originating from the graft (human) tumor and reads originating from the host species' (mouse) surrounding tissue. Two kinds of methods are in use: On the one hand, alignment-based tools require that reads are mapped and aligned (by an external mapper/aligner) to the host and graft genomes separately first; the tool itself then processes the resulting alignments and quality metrics (typically BAM files) to assign each read or read pair. On the other hand, alignment-free tools work directly on the raw read data (typically FASTQ files). Recent studies compare different approaches and tools, with varying results. Results: We show that alignment-free methods for xenograft sorting are superior concerning CPU time usage and equivalent in accuracy. We improve upon the state of the art sorting by presenting a fast lightweight approach based on three-way bucketed quotiented Cuckoo hashing. Our hash table requires memory comparable to an FM index typically used for read alignment and less than other alignment-free approaches. It allows extremely fast lookups and uses less CPU time than other alignment-free methods and alignment-based methods at similar accuracy. Several engineering steps (e.g., shortcuts for unsuccessful lookups, software prefetching) improve the performance even further. Availability: Our software xengsort is available under the MIT license at http://gitlab.com/genomeinformatics/xengsort. It is written in numba-compiled Python and comes with sample Snakemake workflows for hash table construction and dataset processing. [ABSTRACT FROM AUTHOR]
- Published
- 2021
- Full Text
- View/download PDF
11. Adaptive Cuckoo Filters.
- Author
-
Mitzenmacher, Michael, Pontarelli, Salvatore, and Reviriego, Pedro
- Subjects
ADAPTIVE filters ,DATA structures ,CUCKOOS ,MARKOV processes - Abstract
We introduce the adaptive cuckoo filter (ACF), a data structure for approximate set membership that extends cuckoo filters by reacting to false positives, removing them for future queries. As an example application, in packet processing queries may correspond to flow identifiers, so a search for an element is likely to be followed by repeated searches for that element. Removing false positives can therefore significantly lower the false-positive rate. The ACF, like the cuckoo filter, uses a cuckoo hash table to store fingerprints. We allow fingerprint entries to be changed in response to a false positive in a manner designed to minimize the effect on the performance of the filter. We show that the ACF is able to significantly reduce the false-positive rate by presenting both a theoretical model for the false-positive rate and simulations using both synthetic data sets and real packet traces. [ABSTRACT FROM AUTHOR]
- Published
- 2020
- Full Text
- View/download PDF
12. Cuckoo Hashing
- Author
-
Pagh, Rasmus and Kao, Ming-Yang, editor
- Published
- 2016
- Full Text
- View/download PDF
13. Millions of Low-latency State Insertions on ASIC Switches
- Author
-
Caiazzi, Tommaso, Scazzariello, Mariano, Chiesa, Marco, Caiazzi, Tommaso, Scazzariello, Mariano, and Chiesa, Marco
- Abstract
Key-value data structures are an essential component of today's stateful packet processors such as load balancers, packet schedulers, and more. Realizing key-value data structures entirely in the data-plane of an ASIC switch would bring enormous energy savings. Yet, today's implementations are ill-suited for stateful packet processing as they support only a limited amount of flow-state insertions per second into these data structures. In this paper, we present SWITCHAROO, a mechanism for realizing key-value data structures on programmable ASIC switches that supports both high-frequency insertions and fast lookups entirely in the data plane. We show that SWITCHAROO can be realized on ASIC, supports millions of flow-state insertions per second with only limited amount of packet recirculation., QC 20241202
- Published
- 2023
- Full Text
- View/download PDF
14. A Faster Algorithm for Cuckoo Insertion and Bipartite Matching in Large Graphs.
- Author
-
Khosla, Megha and Anand, Avishek
- Subjects
- *
BIPARTITE graphs , *RANDOM graphs , *CUCKOOS , *ALGORITHMS , *TIME management , *RANDOM walks , *COMPUTER science , *ASSIGNMENT problems (Programming) - Abstract
Hash tables are ubiquitous in computer science for efficient access to large datasets. However, there is always a need for approaches that offer compact memory utilisation without substantial degradation of lookup performance. Cuckoo hashing is an efficient technique of creating hash tables with high space utilisation and offer a guaranteed constant access time. We are given n locations and m items. Each item has to be placed in one of the k ≥ 2 locations chosen by k random hash functions. By allowing more than one choice for a single item, cuckoo hashing resembles multiple choice allocations schemes. In addition it supports dynamically changing the location of an item among its possible locations. We propose and analyse an insertion algorithm for cuckoo hashing that runs in linear time with high probability and in expectation. Previous work on total allocation time has analysed breadth first search, and it was shown to be linear only in expectation. Our algorithm finds an assignment (with probability 1) whenever it exists. In contrast, the other known insertion method, known as random walk insertion, may run indefinitely even for a solvable instance. We also present experimental results comparing the performance of our algorithm with the random walk method, also for the case when each location can hold more than one item. As a corollary we obtain a linear time algorithm (with high probability and in expectation) for finding perfect matchings in a special class of sparse random bipartite graphs. We support this by performing experiments on a real world large dataset for finding maximum matchings in general large bipartite graphs. We report an order of magnitude improvement in the running time as compared to the Hopkraft–Karp matching algorithm. [ABSTRACT FROM AUTHOR]
- Published
- 2019
- Full Text
- View/download PDF
15. Dynamic Space Efficient Hashing.
- Author
-
Maier, Tobias, Sanders, Peter, and Walzer, Stefan
- Subjects
- *
HASHING , *SPACE , *CUCKOOS , *DATA structures - Abstract
We consider space efficient hash tables that can grow and shrink dynamically and are always highly space efficient, i.e., their space consumption is always close to the lower bound even while growing and when taking into account storage that is only needed temporarily. None of the traditionally used hash tables have this property. We show how known approaches like linear probing and bucket cuckoo hashing can be adapted to this scenario by subdividing them into many subtables or using virtual memory overcommitting. However, these rather straightforward solutions suffer from slow amortized insertion times due to frequent reallocation in small increments. Our main result is Dynamic Space Efficient Cuckoo Table (DySECT) which avoids these problems. DySECT consists of many subtables which grow by doubling their size. The resulting inhomogeneity in subtable sizes is counterbalanced by the flexibility available in bucket cuckoo hashing where each element can go to several buckets each of which containing several cells. Experiments indicate that DySECT works well with loads up to 98%. With up to 1.9 times better performance than the next best solution. Additionally, we give a tight theoretical analysis for the possible load threshold of DySECT, i.e., a bound where with high probability the table can be filled up to that load but not above said load. This load also matches our experimental findings. [ABSTRACT FROM AUTHOR]
- Published
- 2019
- Full Text
- View/download PDF
16. On the insertion time of random walk cuckoo hashing.
- Author
-
Frieze, Alan and Johansson, Tony
- Subjects
RANDOM walks ,CUCKOOS ,HASHING ,OPEN-ended questions - Abstract
Cuckoo Hashing is a hashing scheme invented by pervious study of Pagh and Rodler. It uses d ≥ 2 distinct hash functions to insert n items into the hash table of size m = (1 + ε)n. In their original paper they treated d = 2 and m = (2 + ε)n. It has been an open question for some time as to the expected time for Random Walk Insertion to add items when d > 2. We show that if the number of hash functions d ≥ dε = O(1) then the expected insertion time is O(1) per item. [ABSTRACT FROM AUTHOR]
- Published
- 2019
- Full Text
- View/download PDF
17. Hardness-Preserving Reductions via Cuckoo Hashing.
- Author
-
Berman, Itay, Haitner, Iftach, Komargodski, Ilan, and Naor, Moni
- Subjects
CUCKOOS ,HASHING ,SECURITY systems - Abstract
The focus of this work is hardness-preserving transformations of somewhat limited pseudorandom functions families (PRFs) into ones with more versatile characteristics. Consider the problem of domain extension of pseudorandom functions: given a PRF that takes as input elements of some domain U, we would like to come up with a PRF over a larger domain. Can we do it with little work and without significantly impacting the security of the system? One approach is to first hash the larger domain into the smaller one and then apply the original PRF. Such a reduction, however, is vulnerable to a "birthday attack": after U queries to the resulting PRF, a collision (i.e., two distinct inputs having the same hash value) is very likely to occur. As a consequence, the resulting PRF is insecure against an attacker making this number of queries. In this work, we show how to go beyond the aforementioned birthday attack barrier by replacing the above simple hashing approach with a variant of cuckoo hashing, a hashing paradigm that resolves collisions in a table by using two hash functions and two tables, cleverly assigning each element to one of the two tables. We use this approach to obtain: (i) a domain extension method that requires just two calls to the original PRF can withstand as many queries as the original domain size, and has a distinguishing probability that is exponentially small in the amount of non-cryptographic work; and (ii) a security-preserving reduction from non-adaptive to adaptive PRFs. [ABSTRACT FROM AUTHOR]
- Published
- 2019
- Full Text
- View/download PDF
18. An Analysis of Random-Walk Cuckoo Hashing
- Author
-
Frieze, Alan, Melsted, Páll, Mitzenmacher, Michael, Hutchison, David, editor, Kanade, Takeo, editor, Kittler, Josef, editor, Kleinberg, Jon M., editor, Mattern, Friedemann, editor, Mitchell, John C., editor, Naor, Moni, editor, Nierstrasz, Oscar, editor, Pandu Rangan, C., editor, Steffen, Bernhard, editor, Sudan, Madhu, editor, Terzopoulos, Demetri, editor, Tygar, Doug, editor, Vardi, Moshe Y., editor, Weikum, Gerhard, editor, Dinur, Irit, editor, Jansen, Klaus, editor, Naor, Joseph, editor, and Rolim, José, editor
- Published
- 2009
- Full Text
- View/download PDF
19. Weaknesses of Cuckoo Hashing with a Simple Universal Hash Class: The Case of Large Universes
- Author
-
Dietzfelbinger, Martin, Schellbach, Ulf, Hutchison, David, Series editor, Kanade, Takeo, Series editor, Kittler, Josef, Series editor, Kleinberg, Jon M., Series editor, Mattern, Friedemann, Series editor, Mitchell, John C., Series editor, Naor, Moni, Series editor, Nierstrasz, Oscar, Series editor, Pandu Rangan, C., Series editor, Steffen, Bernhard, Series editor, Sudan, Madhu, Series editor, Terzopoulos, Demetri, Series editor, Tygar, Doug, Series editor, Vardi, Moshe Y., Series editor, Weikum, Gerhard, Series editor, Nielsen, Mogens, editor, Kučera, Antonín, editor, Miltersen, Peter Bro, editor, Palamidessi, Catuscia, editor, Tůma, Petr, editor, and Valencia, Frank, editor
- Published
- 2009
- Full Text
- View/download PDF
20. KCOSS: an ultra-fast k-mer counter for assembled genome analysis
- Author
-
Yelei Tang, Jiabin Lin, Deyou Tang, Yuchen Li, Daqiang Tan, Juan Fu, Rong Zhao, Zhongming Zhao, and Hongli Du
- Subjects
Statistics and Probability ,Comparative genomics ,Computer science ,Parallel computing ,Bloom filter ,Biochemistry ,Genome ,Computer Science Applications ,Set (abstract data type) ,Computational Mathematics ,Cuckoo hashing ,Computational Theory and Mathematics ,Metagenomics ,k-mer ,Thread pool ,Molecular Biology - Abstract
Motivation The k-mer frequency in whole genome sequences provides researchers with an insightful perspective on genomic complexity, comparative genomics, metagenomics and phylogeny. The current k-mer counting tools are typically slow, and they require large memory and hard disk for assembled genome analysis. Results We propose a novel and ultra-fast k-mer counting algorithm, KCOSS, to fulfill k-mer counting mainly for assembled genomes with segmented Bloom filter, lock-free queue, lock-free thread pool and cuckoo hash table. We optimize running time and memory consumption by recycling memory blocks, merging multiple consecutive first-occurrence k-mers into C-read, and writing a set of C-reads to disk asynchronously. KCOSS was comparatively tested with Jellyfish2, CHTKC and KMC3 on seven assembled genomes and three sequencing datasets in running time, memory consumption, and hard disk occupation. The experimental results show that KCOSS counts k-mer with less memory and disk while having a shorter running time on assembled genomes. KCOSS can be used to calculate the k-mer frequency not only for assembled genomes but also for sequencing data. Availabilityand implementation The KCOSS software is implemented in C++. It is freely available on GitHub: https://github.com/kcoss-2021/KCOSS. Supplementary information Supplementary data are available at Bioinformatics online.
- Published
- 2021
- Full Text
- View/download PDF
21. Cuckoo Hashing : 2001; Pagh, Rodler 2001; Pagh, Rodler
- Author
-
Pagh, Rasmus and Kao, Ming-Yang, editor
- Published
- 2008
- Full Text
- View/download PDF
22. FPGA-Based Cuckoo Hashing for Pattern Matching in NIDS/NIPS
- Author
-
Tran, Thinh Ngoc, Kittitornkun, Surin, Hutchison, David, editor, Kanade, Takeo, editor, Kittler, Josef, editor, Kleinberg, Jon M., editor, Mattern, Friedemann, editor, Mitchell, John C., editor, Naor, Moni, editor, Nierstrasz, Oscar, editor, Pandu Rangan, C., editor, Steffen, Bernhard, editor, Sudan, Madhu, editor, Terzopoulos, Demetri, editor, Tygar, Doug, editor, Vardi, Moshe Y., editor, Weikum, Gerhard, editor, Ata, Shingo, editor, and Hong, Choong Seon, editor
- Published
- 2007
- Full Text
- View/download PDF
23. Fast Gapped k-mer Counting with Subdivided Multi-Way Bucketed Cuckoo Hash Tables
- Author
-
Jens Zentgraf and Sven Rahmann, Zentgraf, Jens, Rahmann, Sven, Jens Zentgraf and Sven Rahmann, Zentgraf, Jens, and Rahmann, Sven
- Abstract
Motivation. In biological sequence analysis, alignment-free (also known as k-mer-based) methods are increasingly replacing mapping- and alignment-based methods for various applications. A basic step of such methods consists of building a table of all k-mers of a given set of sequences (a reference genome or a dataset of sequenced reads) and their counts. Over the past years, efficient methods and tools for k-mer counting have been developed. In a different line of work, the use of gapped k-mers has been shown to offer advantages over the use of the standard contiguous k-mers. However, no tool seems to be available that is able to count gapped k-mers with the same efficiency as contiguous k-mers. One reason is that the most efficient k-mer counters use minimizers (of a length m < k) to group k-mers into buckets, such that many consecutive k-mers are classified into the same bucket. This approach leads to cache-friendly (and hence extremely fast) algorithms, but the approach does not transfer easily to gapped k-mers. Consequently, the existing efficient k-mer counters cannot be trivially modified to count gapped k-mers with the same efficiency. Results. We present a different approach that is equally applicable to contiguous k-mers and gapped k-mers. We use multi-way bucketed Cuckoo hash tables to efficiently store (gapped) k-mers and their counts. We also describe a method to parallelize counting over multiple threads without using locks: We subdivide the hash table into independent subtables, and use a producer-consumer model, such that each thread serves one subtable. This requires designing Cuckoo hash functions with the property that all alternative locations for each k-mer are located in the same subtable. Compared to some of the fastest contiguous k-mer counters, our approach is of comparable speed, or even faster, on large datasets, and it is the only one that supports gapped k-mers.
- Published
- 2022
- Full Text
- View/download PDF
24. GPU-accelerated SPH fluids surface reconstruction using two-level spatial uniform grids.
- Author
-
Wu, Wei, Li, Hongping, Su, Tianyun, Liu, Haixing, and Lv, Zhihan
- Subjects
- *
GRAPHICS processing units , *IMAGE reconstruction , *COMPUTER simulation , *PARALLEL processing , *DIGITAL image processing , *HYDRODYNAMICS - Abstract
An efficient two-level spatial uniform grid structure-based high-quality surface reconstruction method with Marching Cubes (MC) for smoothed particle hydrodynamics (SPH) fluids was presented in this paper. Compared with the traditional way that dividing the simulation domain with uniform grid directly, an enhanced narrow-band approach using the parallel cuckoo hashing method was taken to index the coarse-level surface vertices, hence decrease the memory consumption. Moreover, a two-level spatial uniform grid structure was employed with a scheme of arranging the fine surface vertices, which could preserve the spatial locality property to facilitate the coalesced memory access on the GPU. Our algorithm was designed for parallel architectures, based on which a parallel version of the optimized surface reconstruction was performed on the CUDA platform. In the experiment of comparison to traditional approaches, the results indicated that our surface reconstruction method was more efficient at the same level of quality of the reconstructed surfaces. [ABSTRACT FROM AUTHOR]
- Published
- 2017
- Full Text
- View/download PDF
25. A Collision-Mitigation Cuckoo Hashing Scheme for Large-Scale Storage Systems.
- Author
-
Sun, Yuanyuan, Hua, Yu, Feng, Dan, Yang, Ling, Zuo, Pengfei, Cao, Shunde, and Guo, Yuncheng
- Subjects
- *
HASHING , *INFORMATION storage & retrieval systems , *CLOUD storage , *CLOUD computing , *QUERYING (Computer science) - Abstract
With the rapid growth of the amount of information, cloud computing servers need to process and analyze large amounts of high-dimensional and unstructured data timely and accurately. This usually requires many query operations. Due to simplicity and ease of use, cuckoo hashing schemes have been widely used in real-world cloud-related applications. However, due to the potential hash collisions, the cuckoo hashing suffers from endless loops and high insertion latency, even high risks of re-construction of entire hash table. In order to address these problems, we propose a cost-efficient cuckoo hashing scheme, called MinCounter. The idea behind MinCounter is to alleviate the occurrence of endless loops in the data insertion by selecting unbusy kicking-out routes. MinCounter selects the “cold” (infrequently accessed), rather than random, buckets to handle hash collisions. We further improve the concurrency of the MinCounter scheme to pursue higher performance and adapt to concurrent applications. MinCounter has the salient features of offering efficient insertion and query services and delivering high performance of cloud servers, as well as enhancing the experiences for cloud users. We have implemented MinCounter in a large-scale cloud testbed and examined the performance by using three real-world traces. Extensive experimental results demonstrate the efficacy and efficiency of MinCounter. [ABSTRACT FROM PUBLISHER]
- Published
- 2017
- Full Text
- View/download PDF
26. Cuckoo Hashing with Three Hash Tables
- Author
-
Saeyoung Jang, Hyesook Lim, and Hayoung Byun
- Subjects
Cuckoo hashing ,Theoretical computer science ,Computer science ,Hash table - Published
- 2020
- Full Text
- View/download PDF
27. Insertion Time of Random Walk Cuckoo Hashing below the Peeling Threshold
- Author
-
Walzer, Stefan
- Subjects
FOS: Computer and information sciences ,Mathematics of computing → Random graphs ,Random Walk ,Random Hypergraph ,Theory of computation → Design and analysis of algorithms ,Computer Science - Data Structures and Algorithms ,Cuckoo Hashing ,Theory of computation → Bloom filters and hashing ,Cores ,Data Structures and Algorithms (cs.DS) ,Peeling - Abstract
Most hash tables have an insertion time of 𝒪(1), often qualified as "expected" and/or "amortised". While insertions into cuckoo hash tables indeed seem to take 𝒪(1) expected time in practice, only polylogarithmic guarantees are proven in all but the simplest of practically relevant cases. Given the widespread use of cuckoo hashing to implement compact dictionaries and Bloom filter alternatives, closing this gap is an important open problem for theoreticians. In this paper, we show that random walk insertions into cuckoo hash tables take 𝒪(1) expected amortised time when any number k ≥ 3 of hash functions is used and the load factor is below the corresponding peeling threshold (e.g. ≈0.81 for k = 3). To our knowledge, this is the first meaningful guarantee for constant time insertion for cuckoo hashing that works for k ∈ {3,…,9}. In addition to being useful in its own right, we hope that our key-centred analysis method can be a stepping stone on the path to the true end goal: 𝒪(1) time insertions for all load factors below the load threshold (e.g. ≈0.91 for k = 3)., LIPIcs, Vol. 244, 30th Annual European Symposium on Algorithms (ESA 2022), pages 87:1-87:11
- Published
- 2022
28. Fast Gapped k-mer Counting with Subdivided Multi-Way Bucketed Cuckoo Hash Tables
- Author
-
Zentgraf, Jens and Rahmann, Sven
- Subjects
Cuckoo hashing ,parallelization ,Applied computing → Molecular sequence analysis ,k-mer ,counting ,Theory of computation → Bloom filters and hashing ,gapped k-mer ,Applied computing → Bioinformatics ,Theory of computation → Data structures design and analysis - Abstract
Motivation. In biological sequence analysis, alignment-free (also known as k-mer-based) methods are increasingly replacing mapping- and alignment-based methods for various applications. A basic step of such methods consists of building a table of all k-mers of a given set of sequences (a reference genome or a dataset of sequenced reads) and their counts. Over the past years, efficient methods and tools for k-mer counting have been developed. In a different line of work, the use of gapped k-mers has been shown to offer advantages over the use of the standard contiguous k-mers. However, no tool seems to be available that is able to count gapped k-mers with the same efficiency as contiguous k-mers. One reason is that the most efficient k-mer counters use minimizers (of a length m < k) to group k-mers into buckets, such that many consecutive k-mers are classified into the same bucket. This approach leads to cache-friendly (and hence extremely fast) algorithms, but the approach does not transfer easily to gapped k-mers. Consequently, the existing efficient k-mer counters cannot be trivially modified to count gapped k-mers with the same efficiency. Results. We present a different approach that is equally applicable to contiguous k-mers and gapped k-mers. We use multi-way bucketed Cuckoo hash tables to efficiently store (gapped) k-mers and their counts. We also describe a method to parallelize counting over multiple threads without using locks: We subdivide the hash table into independent subtables, and use a producer-consumer model, such that each thread serves one subtable. This requires designing Cuckoo hash functions with the property that all alternative locations for each k-mer are located in the same subtable. Compared to some of the fastest contiguous k-mer counters, our approach is of comparable speed, or even faster, on large datasets, and it is the only one that supports gapped k-mers., LIPIcs, Vol. 242, 22nd International Workshop on Algorithms in Bioinformatics (WABI 2022), pages 12:1-12:20
- Published
- 2022
- Full Text
- View/download PDF
29. Palmprint template protection scheme based on randomized cuckoo hashing and MinHash
- Author
-
Jian Qiu, Hengjian Li, and Andrew Beng Jin Teoh
- Subjects
Biometrics ,Computer Networks and Communications ,Computer science ,business.industry ,Hash function ,020207 software engineering ,Pattern recognition ,02 engineering and technology ,MinHash ,Birthday attack ,Hash table ,Cuckoo hashing ,Hardware and Architecture ,Feature (computer vision) ,0202 electrical engineering, electronic engineering, information engineering ,Media Technology ,Artificial intelligence ,business ,Software - Abstract
To provide the necessary security and privacy privileges for registered users of biometric systems, a novel template protection method for generating cancelable palmprint features based on randomized cuckoo hashing and minHash is proposed in this paper. Firstly, the orthogonal features of palmprint are extracted using the anisotropic filter. Then, a novel randomized cuckoo hashing is applied on the binary palmprint feature as a means of first layer security. Randomized cuckoo hashing composes of two hash tables; before hashing, each column of original biometric template is filtered with a random matrix to improve discriminative ability. For the randomized cuckoo hashing with the same positions, we use different Gray coding methods instead, which improves irreversibility. Furthermore, to improve the efficiency and resist unlinkability attacks, another layer of privacy protection, MinHash is adopted. Experimental results on PolyU database show that our method can nearly preserve the original verification performance. The irreversibility, unlinkability and revocability properties are examined experimentally. Also, the privacy of palmprint protection scheme is analyzed under the Brute-force Attack, False Accept Attack and Birthday attack.
- Published
- 2020
- Full Text
- View/download PDF
30. Privacy-Preserving Image Retrieval and Sharing in Social Multimedia Applications
- Author
-
Zifeng Xu, Jia Qiang, Zhang Zongye, Qin Shiyue, and Fucai Zhou
- Subjects
multimedia ,Information retrieval ,General Computer Science ,business.industry ,Computer science ,image sharing ,Hash function ,General Engineering ,ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION ,Cloud computing ,privacy-preserving ,Encryption ,Secret sharing ,Cuckoo hashing ,Overhead (computing) ,General Materials Science ,lcsh:Electrical engineering. Electronics. Nuclear engineering ,business ,Key management ,Image retrieval ,lcsh:TK1-9971 - Abstract
Every day social multimedia applications generate millions of images. To handle such huge amount of images, an optimal solution is using the public cloud, since it has powerful storage capability. Images usually contain a wealth of sensitive information, therefore social service providers need not only to provide services such as retrieval and sharing but also to protect the privacies of the images. In this paper, we propose a privacy-preserving scheme for content-based image retrieval and sharing in social multimedia applications. First, the users extract visual features from the images, and perform locality-sensitive hashing functions on visual features to generate image profile vectors. We then model the retrieval on the images as the equality search on the image profile vectors. To enable accurate and efficient retrieval, we design the secure index structure based on cuckoo hashing, which has constant lookup time. To meet the requirements of dynamic image updating, we enrich our service with image insertion and deletion. In order to reduce the key management overhead and the access control overhead in social applications, we process keys using secret sharing techniques to enable the users holding similar images to query and decrypt images independently. Finally we implement the prototype of the proposed scheme, and perform experiments over encrypted image databases.
- Published
- 2020
31. A protocol for detecting missing target tags in RFID systems
- Author
-
Zhaobin Liu, Zhiyang Li, Weijiang Liu, Kaiye Zhang, and Wenyuan Han
- Subjects
Computer Networks and Communications ,business.industry ,Computer science ,Fingerprint (computing) ,Frame (networking) ,Hash function ,020206 networking & telecommunications ,02 engineering and technology ,computer.software_genre ,Computer Science Applications ,Set (abstract data type) ,Cuckoo hashing ,Hardware and Architecture ,0202 electrical engineering, electronic engineering, information engineering ,Radio-frequency identification ,020201 artificial intelligence & image processing ,Data mining ,business ,computer ,Protocol (object-oriented programming) - Abstract
Radio Frequency IDentification (RFID) technology is widely used in the inventory management of goods. Fast searching a subset consisting of some particular target tags is of practical importance for a variety of applications. Existing protocols are inefficient due to the low utilization of frame. In this paper, in order to improve the utilization of slot frame, we first generalize Cuckoo hash to extended Cuckoo hash which can use more than two hash functions. Then, we propose an extended CUckoo hash-based missing tag Detection (CUD) protocol, which can efficiently identify all missing tags within a predefined specific set. In CUD, the reader first uses the extended cuckoo hash to generate a fingerprint vector of the target tags. By using this vector, almost all target tags can be assigned a single slot and the majority of non-target tags can be filtered out. Furthermore, we optimize the parameters through theoretical analysis to minimize the execution time of the protocol. Extensive simulation results demonstrate that CUD is superior to the state-of-the-art protocols.
- Published
- 2019
- Full Text
- View/download PDF
32. RHKV: An RDMA and HTM friendly key–value store for data-intensive computing
- Author
-
Haojie Zhou, Renke Wu, and Linpeng Huang
- Subjects
Atomicity ,Hardware_MEMORYSTRUCTURES ,Remote direct memory access ,Computer Networks and Communications ,Computer science ,business.industry ,Hash function ,020206 networking & telecommunications ,02 engineering and technology ,Cuckoo hashing ,Data access ,Hardware and Architecture ,Scalability ,0202 electrical engineering, electronic engineering, information engineering ,Data-intensive computing ,020201 artificial intelligence & image processing ,business ,Software ,Computer network - Abstract
Serving DRAM as the storage through key–value abstraction has proved as an attractive option, which provides fast data access for data-intensive computing. However, due to the drawbacks of network round trips and requesting conflicts, remote data access over traditional commodity networking technology might incur high latency for the key–value data store. The major performance bottleneck lies in client-side request waiting and server-side I/O overhead. Accordingly, this paper proposes RHKV: a novel R DMA and H TM friendly k ey– v alue store to provide fast and scalable data management by using the designed G-Cuckoo hashing scheme. Our work expands the idea as follows: (i) An RHKV client transmits data requests to our improved Cuckoo hashing scheme — G-Cuckoo, which constructs a Cuckoo graph as directed pseudoforests in RHKV server. The RHKV server computes the reply for each data request. The server maintains a bucket-to-vertex mapping and pre-determines the possibility of a loop prior to data insertion. Through the use of this Cuckoo graph, the endless kick-out loop of data insertions that can potentially be experienced in the case of generic Cuckoo hashing can be detected. (ii) Despite messaging primitives are slower than RDMA READs for data requests, RHKV adopts RDMA messaging verbs unconventionally. It leverages rich network semantics and makes full use of RDMA’s high bandwidth and low latency for data access over high-performance RDMA interconnects. (iii) Moreover, in order to ensure the data operation’s atomicity, RHKV strives to utilize the advanced HTM technique. Experimental performance evaluation with YCSB workloads shows that, when basic data operations are conducted, RHKV outperforms several state-of-the-art key–value stores with lower latency and higher throughput.
- Published
- 2019
- Full Text
- View/download PDF
33. Incremental Edge Orientation in Forests
- Author
-
Michael A. Bender and Tsvi Kopelowitz and William Kuszmaul and Ely Porat and Clifford Stein, Bender, Michael A., Kopelowitz, Tsvi, Kuszmaul, William, Porat, Ely, Stein, Clifford, Michael A. Bender and Tsvi Kopelowitz and William Kuszmaul and Ely Porat and Clifford Stein, Bender, Michael A., Kopelowitz, Tsvi, Kuszmaul, William, Porat, Ely, and Stein, Clifford
- Abstract
For any forest G = (V, E) it is possible to orient the edges E so that no vertex in V has out-degree greater than 1. This paper considers the incremental edge-orientation problem, in which the edges E arrive over time and the algorithm must maintain a low-out-degree edge orientation at all times. We give an algorithm that maintains a maximum out-degree of 3 while flipping at most O(log log n) edge orientations per edge insertion, with high probability in n. The algorithm requires worst-case time O(log n log log n) per insertion, and takes amortized time O(1). The previous state of the art required up to O(log n / log log n) edge flips per insertion. We then apply our edge-orientation results to the problem of dynamic Cuckoo hashing. The problem of designing simple families ℋ of hash functions that are compatible with Cuckoo hashing has received extensive attention. These families ℋ are known to satisfy static guarantees, but do not come typically with dynamic guarantees for the running time of inserts and deletes. We show how to transform static guarantees (for 1-associativity) into near-state-of-the-art dynamic guarantees (for O(1)-associativity) in a black-box fashion. Rather than relying on the family ℋ to supply randomness, as in past work, we instead rely on randomness within our table-maintenance algorithm.
- Published
- 2021
- Full Text
- View/download PDF
34. Supporting Insertion in Encrypted Multi-Maps with Volume Hiding
- Author
-
Shunta Ishihara, Chiemi Watanabe, and Toshiyuki Amagasa
- Subjects
business.industry ,Computer science ,Volume (computing) ,Cryptography ,Cloud computing ,Computer security ,computer.software_genre ,Encryption ,Cuckoo hashing ,Key (cryptography) ,Differential privacy ,Cloud database ,business ,computer - Abstract
A new threat in encrypted databases, called volume leakage was reported by Kellaris et al. and Grubbs et al., where the volume of data associated with each key in a multi-map is leaked. An existing method addresses this problem by adding noise entries to the multi-map according to differential privacy so that adversaries cannot infer the volume of data. However, it assumes that the data is static and therefore does not support any update operations, such as insertion, deletion, etc. To this problem, this paper proposes a method that enables data insertion in the encrypted multi-map with volume hiding. The basic idea is to use local differential privacy instead of differential privacy in adding noise entries when performing insertions. Besides, we employ cuckoo hashing to perturb the place of insertion, thereby allowing insertions of new items without leaking the volume of entries.However, our proposed method is not suitable for large databases. We consider the use of cloud database in healthcare as a use case. The cloud database is necessary for healthcare in cases such as when multiple medical institutions want to share medical data or when home healthcare providers want to access medical data from outside of hospitals. There are privacy issues if the number of patients for each disease is known. Furthermore, when inserting data of a new patient, we want to hide which disease the patient is suffering from.We conduct experiments to assess the feasibility of the proposed method, and it presents a reasonable performance.
- Published
- 2021
- Full Text
- View/download PDF
35. Incremental Edge Orientation in Forests
- Author
-
Bender, Michael A., Kopelowitz, Tsvi, Kuszmaul, William, Porat, Ely, Stein, Clifford, Mutzel, Petra, Pagh, Rasmus, and Herman, Grzegorz
- Subjects
FOS: Computer and information sciences ,graph algorithms ,Theory of computation → Dynamic graph algorithms ,Cuckoo hashing ,hash functions ,Computer Science - Data Structures and Algorithms ,Data Structures and Algorithms (cs.DS) ,edge orientation ,Theory of computation → Data structures design and analysis ,MathematicsofComputing_DISCRETEMATHEMATICS - Abstract
For any forest G = (V, E) it is possible to orient the edges E so that no vertex in V has out-degree greater than 1. This paper considers the incremental edge-orientation problem, in which the edges E arrive over time and the algorithm must maintain a low-out-degree edge orientation at all times. We give an algorithm that maintains a maximum out-degree of 3 while flipping at most O(log log n) edge orientations per edge insertion, with high probability in n. The algorithm requires worst-case time O(log n log log n) per insertion, and takes amortized time O(1). The previous state of the art required up to O(log n / log log n) edge flips per insertion. We then apply our edge-orientation results to the problem of dynamic Cuckoo hashing. The problem of designing simple families ℋ of hash functions that are compatible with Cuckoo hashing has received extensive attention. These families ℋ are known to satisfy static guarantees, but do not come typically with dynamic guarantees for the running time of inserts and deletes. We show how to transform static guarantees (for 1-associativity) into near-state-of-the-art dynamic guarantees (for O(1)-associativity) in a black-box fashion. Rather than relying on the family ℋ to supply randomness, as in past work, we instead rely on randomness within our table-maintenance algorithm., LIPIcs, Vol. 204, 29th Annual European Symposium on Algorithms (ESA 2021), pages 12:1-12:18
- Published
- 2021
36. Parallel d-Pipeline: A Cuckoo Hashing Implementation for Increased Throughput.
- Author
-
Pontarelli, Salvatore, Reviriego, Pedro, and Maestro, Juan Antonio
- Subjects
- *
HASHING , *PARALLEL computers , *COMPUTER architecture , *FIELD programmable gate arrays , *DATA packeting - Abstract
Cuckoo hashing has proven to be an efficient option to implement exact matching in networking applications. It provides good memory utilization and deterministic worst case access time. The continuous increase in speed and complexity of networking devices creates a need for higher throughput exact matching in many applications. In this paper, a new Cuckoo hashing implementation named parallel d-pipeline is proposed to increase throughput. The scheme presented is targeted to implementations in which the tables are accessed in parallel. A parallel implementation increases the throughput and therefore is well suited to high speed applications. Parallel schemes are common for ASIC/FPGA implementations in which the tables are stored in several embedded memories. Using the proposed technique, the throughput can be significantly increased with gains that in practical scenarios can reach 60 percent compared to existing parallel implementations. The new scheme has been evaluated using a case study and detailed results for performance and implementation costs are reported. [ABSTRACT FROM AUTHOR]
- Published
- 2016
- Full Text
- View/download PDF
37. Towards Optimal Degree Distributions for Left-Perfect Matchings in Random Bipartite Graphs.
- Author
-
Dietzfelbinger, Martin and Rink, Michael
- Subjects
- *
MATCHING theory , *BIPARTITE graphs , *GRAPH theory , *PROBABILITY theory , *INFINITY (Mathematics) , *INTEGRALS - Abstract
Consider a random bipartite multigraph G with n left nodes and m ≥ n≥2 right nodes. Each left node x has d≥1 random right neighbors. The average left degree Δ is fixed, Δ≥2. We ask whether for the probability that G has a left-perfect matching it is advantageous not to fix d for each left node x but rather choose it at random according to some (cleverly chosen) distribution. We show the following, provided that the degrees of the left nodes are independent: If Δ is an integer, then it is optimal to use a fixed degree of Δ for all left nodes. If Δ is non-integral, then an optimal degree-distribution has the property that each left node x has two possible degrees, ⌊Δ⌋ and ⌈Δ⌉, with probability p and 1− p, respectively, where p is from the closed interval [0,1] and the average over all p equals ⌈Δ⌉−Δ. Furthermore, if c= n/ m and Δ>2 are constant, then each distribution of the left degrees that meets the conditions above determines the same threshold c(Δ) that has the following property as n goes to infinity: If c < c(Δ) then asymptotically almost surely there exists a left-perfect matching. If c> c(Δ) then asymptotically almost surely there exists no left-perfect matching. The threshold c(Δ) is the same as the known threshold for offline k-ary cuckoo hashing for integral or non-integral k=Δ. [ABSTRACT FROM AUTHOR]
- Published
- 2015
- Full Text
- View/download PDF
38. High-Throughput Cuckoo Hashing Accelerator on FPGA Using One-Step BFS
- Author
-
C.-J. Richard Shi, Yiyun Zhang, and Zihao Zhang
- Subjects
Cuckoo hashing ,Hardware_MEMORYSTRUCTURES ,Computer science ,Hash function ,Latency (audio) ,Table (database) ,Parallel computing ,Field-programmable gate array ,Throughput (business) ,Hash table - Abstract
Cuckoo hashing [1] is an efficient hash method with high space occupancy. For improved latency and throughput of cuckoo hashing, FPGA is used to accelerate basic operations of the hash table based on cuckoo hashing recently. However, the frequent kick out and reinsert operations degrade the throughput of cuckoo hashing implementation on FPGAs, especially under high table occupancy. In this paper, we propose a cuckoo hashing accelerator on FPGAs that uses one-step BFS (breadth-first search) to reduce the number of reinsert operations when the table occupancy becomes high. The average throughput of the accelerator achieves 185 million requests per second (MRPS) when the table occupancy reaches 92% at 200MHz. And the throughput at 90% table occupancy is 30% higher than the work in [2].
- Published
- 2021
- Full Text
- View/download PDF
39. Hardness-Preserving Reductions via Cuckoo Hashing
- Author
-
Itay Berman, Ilan Komargodski, Moni Naor, and Iftach Haitner
- Subjects
Pseudorandom number generator ,Discrete mathematics ,FOS: Computer and information sciences ,Computer Science - Cryptography and Security ,Computer science ,Applied Mathematics ,Hash function ,0102 computer and information sciences ,Extension (predicate logic) ,Birthday attack ,16. Peace & justice ,Collision ,01 natural sciences ,Computer Science Applications ,Reduction (complexity) ,03 medical and health sciences ,Cuckoo hashing ,0302 clinical medicine ,010201 computation theory & mathematics ,030220 oncology & carcinogenesis ,Domain (ring theory) ,Cryptography and Security (cs.CR) ,Software - Abstract
The focus of this work is \emph{hardness-preserving} transformations of somewhat limited pseudorandom functions families (PRFs) into ones with more versatile characteristics. Consider the problem of \emph{domain extension} of pseudorandom functions: given a PRF that takes as input elements of some domain $U$, we would like to come up with a PRF over a larger domain. Can we do it with little work and without significantly impacting the security of the system? One approach is to first hash the larger domain into the smaller one and then apply the original PRF. Such a reduction, however, is vulnerable to a "birthday attack": after $\sqrt{\size{U}}$ queries to the resulting PRF, a collision (\ie two distinct inputs having the same hash value) is very likely to occur. As a consequence, the resulting PRF is \emph{insecure} against an attacker making this number of queries. In this work we show how to go beyond the aforementioned birthday attack barrier by replacing the above simple hashing approach with a variant of \textit{cuckoo hashing}, a hashing paradigm that resolves collisions in a table by using two hash functions and two tables, cleverly assigning each element to one of the two tables. We use this approach to obtain: (i) a domain extension method that requires {\em just two calls} to the original PRF, can withstand as many queries as the original domain size, and has a distinguishing probability that is exponentially small in the amount of non-cryptographic work; and (ii) a {\em security-preserving} reduction from non-adaptive to adaptive PRFs., This is the final draft of this paper. The full version was published in the Journal of Cryptology 2019. An extended abstract of this work appeared in the Theory of Cryptography Conference (TCC) 2013
- Published
- 2021
40. DyCuckoo: Dynamic Hash Tables on GPUs
- Author
-
Zhongdong Huang, Yuchen Li, Qiwei Zhu, Jianling Sun, and Zheng Lyu
- Subjects
Scheme (programming language) ,050101 languages & linguistics ,Computer science ,05 social sciences ,02 engineering and technology ,Parallel computing ,Hash table ,Cuckoo hashing ,Information engineering ,Memory management ,Factor (programming language) ,0202 electrical engineering, electronic engineering, information engineering ,Table (database) ,020201 artificial intelligence & image processing ,0501 psychology and cognitive sciences ,Graphics ,computer ,computer.programming_language - Abstract
The hash table is a fundamental structure that has been implemented on graphics processing units (GPUs) to accelerate a wide range of analytics workloads. Most existing works have focused on static scenarios and occupy large GPU memory to maximize the insertion efficiency. In many cases, data stored in hash tables get updated dynamically, and existing approaches use unnecessarily large memory resources. One naive solution is to rebuild a hash table (known as rehashing) whenever it is either filled or mostly empty. However, this approach renders significant overheads for rehashing. In this paper, we propose a novel dynamic cuckoo hash table technique on GPUs, known as DyCuckoo. We devise a resizing strategy for dynamic scenarios without rehashing the entire table that ensures a guaranteed filled factor. The strategy trades search performance with resizing efficiency, and this tradeoff can be configured by users. To further improve efficiency, we propose a 2-in-d cuckoo hashing scheme that ensures a maximum of two lookups for find and delete operations, while retaining similar performance for insertions as a general cuckoo hash. Extensive experiments have validated the proposed design’s effectiveness over several state-of-the-art hash table implementations on GPUs. DyCuckoo achieves superior efficiency while enables fine-grained memory control, which is not available in existing GPU hash table approaches.
- Published
- 2021
- Full Text
- View/download PDF
41. A Cuckoo Hashing Variant with Improved Memory Utilization and Insertion Time.
- Author
-
Porat, Ely and Shalem, Bar
- Abstract
Cuckoo hashing [4] is a multiple choice hashing scheme in which each item can be placed in multiple locations, and collisions are resolved by moving items to their alternative locations. In the classical implementation of two-way cuckoo hashing, the memory is partitioned into contiguous disjoint xed-size buckets. Each item is hashed to two buckets, and may be stored in any of the positions within those buckets. Ref. [2] analyzed a variation in which the buckets are contiguous and overlap. However, many systems retrieve data from secondary storage in same-size blocks called pages. Fetching a page is a relatively expensive process, but once a page is fetched, its contents can be accessed orders of magnitude faster. We utilize this property of memory retrieval, presenting a variant of cuckoo hashing incorporating the following constraint: each bucket must be fully contained in a single page, but buckets are not necessarily contiguous. Empirical results show that this modification increases memory utilization and decreases the number of iterations required to insert an item. If each item is hashed to two buckets of capacity two, the page size is 8, and each bucket is fully contained in a single page, the memory utilization equals 89.71% in the classical contiguous disjoint bucket variant, 93.78% in the contiguous overlapping bucket variant, and increases to 97.46% in our new non-contiguous bucket variant. When the memory utilization is 92% and we use breadth rest search to look for a vacant position, the number of iterations required to insert a new item is dramatically reduced from 545 in the contiguous overlapping buckets variant to 52 in our new non-contiguous bucket variant. In addition to the empirical results, we present a theoretical lower bound on the memory utilization of our variation as a function of the page size. [ABSTRACT FROM PUBLISHER]
- Published
- 2012
- Full Text
- View/download PDF
42. A FPGA-based deep packet inspection engine for Network Intrusion Detection System.
- Author
-
Thinh, Tran Ngoc, Hieu, Tran Trung, Van Quoc Dung, and Kittitornkun, Surin
- Abstract
Pattern matching has became a bottleneck of software based Network Intrusion Detection System (NIDS) as the number of signature have recently increased dramatically. Many FPGA-based architectures for detecting malicious patterns have been proposed recently. However, these approaches have just considered matching pattern separately while more and more complex combination of several patterns are utilized to describe intrusion activities. In this paper we present our work which concentrates on multi-pattern signature and propose a FPGA-based deep packet inspection engine for NIDS. The system can support both static and dynamic patterns. We employ Snort signature set and realize our system on NetFPGA platform. The evaluation on real network environment shows that our system can maintain gigabit line rate throughput without dropping packets. [ABSTRACT FROM PUBLISHER]
- Published
- 2012
- Full Text
- View/download PDF
43. Design and Evaluation of Cascading Cuckoo Filters for Zero-False-Positive Membership Services
- Author
-
Sai Medury, Amani Altarawneh, and Anthony Skjellum
- Subjects
Cuckoo hashing ,Hardware_MEMORYSTRUCTURES ,biology ,Computer science ,Filter (video) ,False positive paradox ,Function (mathematics) ,Bloom filter ,biology.organism_classification ,Data structure ,Cuckoo ,Algorithm ,Zero (linguistics) - Abstract
The approximate set-membership data structures (ASMDS), like the Bloom filter and cuckoo filter, provide constant-time testing of set-membership. They produce false positives because of a loss of bits during compression. However, in case all potential false positives are known (or can be evaluated), it is possible to use filter cascades and collectively eliminate such false positives. The application of filter cascading algorithm to the Bloom filter was originally proposed for optimizing memory usage and is currently an integral part of CRLLite. Recently proposed cuckoo filters function similarly to Bloom filters but with cuckoo hashing techniques. They produce comparatively lower storage overheads and additionally support efficient deletions. Therefore, applying the cascading algorithms to the cuckoo filter will also produce lower storage overheads in comparison to cascading Bloom filters. Further, cuckoo filter's support for deletions enable efficient updates to the filter cascades. In this paper, we present the design and analysis of cascading cuckoo filters, a potentially more space-optimal ASMDS in comparison to cascading Bloom filters. A novel contribution of this paper is the application of filter cascading algorithm to cuckoo filter, which has not been proposed before to the best of our knowledge.
- Published
- 2021
- Full Text
- View/download PDF
44. Simple Storage-Saving Structure for Volume-Hiding Encrypted Multi-maps
- Author
-
Jiafan Wang and Sherman S. M. Chow
- Subjects
Cuckoo hashing ,Theoretical computer science ,business.industry ,Computer science ,Ciphertext ,Hash function ,Overhead (computing) ,Differential privacy ,Encryption ,business ,Consistent hashing ,Data structure - Abstract
Severe consequences in volume leakage (subject to the conditions required by specific attacks) stimulate a new research direction (Eurocrypt 2019) of volume-hiding structured encryption (\(\mathsf {STE}\)), particularly encrypted multi-maps (\(\mathsf {EMM}\)), in which all queries should share the same (as the largest) response size unless the scheme is lossy. Meanwhile, note that the responses are originated from the actual ciphertexts outsourced to the server. Conventional wisdom suggests that the ciphertexts (to be accessed by the server while answering a query) should also contain many dummy results to make a query look uniform with others. Supporting updates is also natural; however, attaching dummy results to a query also complicates the operation and leakage of updates, which excludes many advanced data structures, e.g., cuckoo hashing (CCS 2019). This paper proposes a space-efficient \(\mathsf {EMM}\) without storing any dummy ciphertext, which is volume hiding against passive adversaries (SP 2021) and compatible with dynamic extensions. Its crux structure is a hash ring, which is famous for load balancing but rarely appears in any \(\mathsf {STE}\). Efficiency-wise, our scheme beats the state-of-the-art (Eurocrypt 2019, CCS 2019), maintaining the necessary communication overhead and downsizing the server storage to be linear in the number of values in the \(\mathsf {EMM}\), while ruling out any data loss due to truncations or differential privacy.
- Published
- 2021
- Full Text
- View/download PDF
45. Alibi: A Flaw in Cuckoo-Hashing Based Hierarchical ORAM Schemes and a Solution
- Author
-
Daniel Noble, Rafail Ostrovsky, and Brett Hemenway Falk
- Subjects
Cuckoo hashing ,Theoretical computer science ,Computer science ,Alibi - Published
- 2021
- Full Text
- View/download PDF
46. Explicit and Efficient Hash Families Suffice for Cuckoo Hashing with a Stash.
- Author
-
Aumüller, Martin, Dietzfelbinger, Martin, and Woelfel, Philipp
- Subjects
- *
RANDOM graphs , *HASHING , *DATA structures , *ALGORITHMS , *GRAPH theory - Abstract
It is shown that for cuckoo hashing with a stash as proposed by Kirsch et al. (Proc. 16th European Symposium on Algorithms (ESA), pp. 611-622, Springer, Berlin, ) families of very simple hash functions can be used, maintaining the favorable performance guarantees: with constant stash size s the probability of a rehash is O(1/ n), the lookup time and the deletion time are O( s) in the worst case, and the amortized expected insertion time is O( s) as well. Instead of the full randomness needed for the analysis of Kirsch et al. and of Kutzelnigg (Discrete Math. Theor. Comput. Sci., 12(3):81-102, ) (resp. Θ(log n)-wise independence for standard cuckoo hashing) the new approach even works with 2-wise independent hash families as building blocks. Both construction and analysis build upon the work of Dietzfelbinger and Woelfel (Proc. 35th ACM Symp. on Theory of Computing (STOC), pp. 629-638, ). The analysis, which can also be applied to the fully random case, utilizes a graph counting argument and is much simpler than previous proofs. The results can be generalized to situations where the stash size is non-constant. [ABSTRACT FROM AUTHOR]
- Published
- 2014
- Full Text
- View/download PDF
47. OEBSA-Sketch: a Combination of Sketch and Hash Table for the Measurement of Stream Data
- Author
-
Hao Zhang, Yingying Deng, Haiting Zhu, Ning Liu, Gaofeng He, and Yuan Zhang
- Subjects
0209 industrial biotechnology ,Cuckoo hashing ,020901 industrial engineering & automation ,Approximation error ,Computer science ,Hash function ,Table (database) ,02 engineering and technology ,Bloom filter ,Data structure ,Algorithm ,Sketch ,Hash table - Abstract
Sketch is a hash-based data structure that can record traffic characteristic information in a high-speed network environment. Sketch only occupies a small amount of space resources while maintaining theoretically provable estimation accuracy. Although various sketches have been proposed, most of them achieve either high accuracy or high speed in limited memory, especially for skewed data sets. In order to achieve both goal, we combined our prior work, CBFSketch, with the odd-even cuckoo hash table which is a data structure that provides storage and accurate lookup. The new scheme is named OEBSA- Sketch, we compared our sketch with the several typical sketches in effectiveness and efficiency. The results showed the accuracy of Heavy Hitter in our scheme is 4 times higher than other sketches at most, and the accuracy of Heavy Changer is up to 16 times at most. The average absolute error is reduced by about 50% compared with CBFSketch. At the same average relative error, the space used can be reduced by about 50%.
- Published
- 2020
- Full Text
- View/download PDF
48. Efficient Flow Sampling With Back-Annotated Cuckoo Hashing.
- Author
-
Pontarelli, S., Reviriego, P., and Maestro, J. A.
- Abstract
One of the applications of network traffic monitoring is to detect anomalies and security threats. Due to the huge number of packets that traverse networks, monitoring is typically implemented by sampling the traffic. Sampling can be done per packet or per flow. For flow sampling, the decision to select a flow can be purely random or based on some properties of the flows. In this later case, each incoming packet has to be compared against the set of flows being monitored to determine if the packet belongs to any of those flows. This matching can be implemented using a content addressable memory (CAM) or hash based data structures. Among those, one option is Cuckoo hashing that provides good memory utilization and a deterministic worst number of memory accesses. However, in the case of flow sampling, most packets will not belong to any of the flows being monitored. Therefore, all tables will be accessed and the worst case number of accesses will be required thus reducing throughput. In this letter, a technique to reduce the average number of accesses to search for items that are not stored in the Cuckoo hash is proposed and evaluated. The results show that the proposed scheme can significantly reduce the average number of accesses in a flow sampling application. This means that the technique can be used to increase the throughput substantially. [ABSTRACT FROM PUBLISHER]
- Published
- 2014
- Full Text
- View/download PDF
49. Enhanced chained and Cuckoo hashing methods for multi-core CPUs.
- Author
-
Kim, Euihyeok and Kim, Min-Soo
- Subjects
- *
DATA structures , *CENTRAL processing units , *INTEGRATED circuits , *HASHING , *MICROPROCESSORS , *CACHE memory - Abstract
A hash table is a fundamental data structure implementing an associative memory that maps a key to its associative value. Besides, the paradigm of micro-architecture design of CPUs is shifting away from faster uniprocessors toward slower chip multiprocessors. In this paper, we propose enhanced chained hashing and Cuckoo hashing methods for modern computers having a lot of CPU cores with exploiting CPU cache line and hardware level lock-free operations. The proposed methods outperform the existing methods in most cases and are very scalable in terms of the number of CPU cores. In addition, their performances do not degrade much even with a high fill factor (e.g., 90 %). Through extensive experiments using Intel 32-core machine, we have shown our proposed methods improve performance compared with the state-of-the-art version of the four exiting major hashing methods of linear, chained, Cuckoo, and Hopscotch. [ABSTRACT FROM AUTHOR]
- Published
- 2014
- Full Text
- View/download PDF
50. Cuckoo Prefix: A Hash Set for Compressed IP Blocklists
- Author
-
Donovan Allen and Navid Shaghaghi
- Subjects
Set (abstract data type) ,Cuckoo hashing ,Memory management ,business.industry ,Computer science ,Linear probing ,Hash function ,business ,Data structure ,Throughput (business) ,Hash table ,Computer network - Abstract
IP blocking has become a vital task for all network attached devices. Every device from Internet of Things, to routers, to application servers requires the ability to filter certain IP addresses from delivering malicious information. Blocking IPs requires storing and checking lists of tens to hundreds of millions of IP addresses. Cuckoo hash sets provide strong performance by offering relatively low numbers of memory accesses per lookup. This makes them optimal for time sensitive applications like networking. Using cuckoo++ hash tables as a baseline, we propose a new data structure known as cuckoo prefix for the purpose of blocking IPs quickly with relatively little space. Leveraging IP subnets allows us to achieve similar throughput rates as implementations such as cuckoo++ with 8 times less memory usage. In addition, in this paper we offer a comparison of throughput and memory usage of several modern hash set and hash table implementations. In particular, we examine linear probing, robin hood hashing, bit sets (including EBVBL), and cuckoo hashing implementations to determine which provides the best throughput at the lowest memory cost.
- Published
- 2020
- Full Text
- View/download PDF
Catalog
Discovery Service for Jio Institute Digital Library
For full access to our library's resources, please sign in.