Article Preview
TopIntroduction
The calculation of the intersection between sets is an important operation in numerous application domains, such as information retrieval for text search engines, graph analytics for triangles counting and community detection and database systems. It serves to create indexes, especially inverted ones, by computing for every term, in the documents database, the set of documents in which it appears. Also, it allows search engines to find the responses to queries by computing their intersections with the global inverted index. Since 1971, when the first work has appeared, many techniques have been published in order to accelerate intersection time between sets, especially between sorted ones. At the basis, these techniques use well defined data structures and exploit hard components of machines. Logically, time execution depends on the manner by which the comparisons between the elements of the entries are done. In this context, three approaches are adopted to this end:
• Accelerate sequential processing time using data structures that are well adapted to requirements;
• Exploit graphics processors (GPUs) and multi-core processors (CPUs) to exploit advantage of the parallelism they offer; or
• Exploit new plate-forms or frameworks, like map-reduce, for distributing the workload.
Since current machines are equipped with GPU cards that can easily be integrated into them and since they offer a high-level parallel environment at low cost. We propose a new GTWJ (GPU Test With Jumps) algorithm for calculating the intersection between sorted lists. The objectives of our proposition are multiple. Our first objective is to reduce the number of tests by avoiding comparing parts that will not necessarily be shared between the lists in the inputs. To achieve this objective, we modify the structure of the lists to be compared in order to avoid these unnecessary tests. Our second objective is to exploit the parallelism of the graphics card to speed up calculation times by reducing redundant comparisons and distributing the workload evenly among GPU threads. Thus, we have implemented our solution with the CUDA language which provides direct access to the graphics processor programming. This programming language has a complete instruction set, such as double precision calculation, and allows thousands of threads to be simultaneously used. This solution is compared to other solutions (Hwang & Lin, 1971; Demaine et al., 2001; Inoue et al., 2014). We present and comment these solutions in this paper.
The rest of the article is structured as follows. Section 2 provides related works; so, a state of the art is given here, it covers the set of algorithms which are published since 1971. We extend this section by presenting some research papers which applied intersection between lists to other fields. In section 3, we present three important algorithms, as we will compare our proposition with them. Section 4 gives details of our solution. Experiments are presented and commented in section 5. Section 6 concludes this paper.