Range Selection Queries in Data Aware Space and Time


Külekci M. O., Thankachan S. V.

2015 Data Compression Conference, DCC 2015, Utah, United States Of America, 7-9 April 2015, vol.2015-July, pp.73-82

  • Publication Type: Conference Paper / Full Text
  • Volume: 2015-July
  • Doi Number: 10.1109/dcc.2015.53
  • City: Utah
  • Country: United States Of America
  • Page Numbers: pp.73-82
  • Keywords: compact integer coding, range selection, wavelet tree
  • Istanbul Medipol University Affiliated: No

Abstract

Given a vector X = (x_1, x_2, …, x_n) of integers, the range selection query (i, j, k) asks for the k-th smallest integer in (x_i, x_{i+1}, …, x_j), for any (i, j, k) such that 1 ≤ i ≤ j ≤ n and 1 ≤ k ≤ j − i + 1. Previous studies of the problem kept X intact and proposed data structures that occupy O(n log n) additional bits on top of X itself and answer queries in logarithmic time. In this study, we replace X and encode all of its integers in a single wavelet tree using S = n log u + Σ_i log x_i + o(n log u + Σ_i log x_i) bits, where the sums run over all i and u is the number of distinct log x_i values observed in X. Notice that u is at most 32 (64) for 32-bit (64-bit) integers, and that when x_i > u, the space used for x_i in the proposed data structure is less than the Elias-δ coding of x_i. Besides this data-aware coding of X, range selection is performed in O(log u + log x') time, where x' is the k-th smallest integer in the queried range. This somewhat adaptive result is notable because the query time does not depend on the size of X, but on the actual answer to the query. In summary, to the best of our knowledge, we present the first algorithm using data-aware space and time for the general range selection problem.
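
To make the query definition concrete, the following is a minimal sketch of range selection on a textbook pointer-based wavelet tree. It is not the data-aware encoding proposed in the paper: the class name WaveletTree, the method kth_smallest, and the choice to split nodes by value midpoint are assumptions made for illustration. Each node stores plain prefix counts instead of a succinct bitvector, so the sketch shows only the query logic, not the space bound.

```python
from typing import List, Optional


class WaveletTree:
    """Illustrative wavelet tree over the value range [lo, hi].

    Hypothetical helper for demonstration only; it does not implement
    the compact encoding described in the abstract.
    """

    def __init__(self, values: List[int],
                 lo: Optional[int] = None, hi: Optional[int] = None):
        if lo is None and not values:
            raise ValueError("cannot build a wavelet tree on an empty vector")
        self.n = len(values)
        self.lo = min(values) if lo is None else lo
        self.hi = max(values) if hi is None else hi
        self.left = self.right = None
        # prefix[t] = how many of the first t values go to the left child
        self.prefix = [0]
        if self.lo == self.hi or not values:
            return  # leaf, or an empty subtree that queries never enter
        mid = (self.lo + self.hi) // 2
        for v in values:
            self.prefix.append(self.prefix[-1] + (v <= mid))
        self.left = WaveletTree([v for v in values if v <= mid], self.lo, mid)
        self.right = WaveletTree([v for v in values if v > mid], mid + 1, self.hi)

    def kth_smallest(self, i: int, j: int, k: int) -> int:
        """Return the k-th smallest of x_i..x_j (1-based, inclusive)."""
        if not (1 <= i <= j <= self.n and 1 <= k <= j - i + 1):
            raise ValueError("need 1 <= i <= j <= n and 1 <= k <= j - i + 1")
        node, a, b = self, i - 1, j  # half-open 0-based window [a, b)
        while node.lo != node.hi:
            in_left = node.prefix[b] - node.prefix[a]
            if k <= in_left:
                # the answer lies among the values routed to the left child
                a, b = node.prefix[a], node.prefix[b]
                node = node.left
            else:
                # skip the in_left smaller values and continue on the right
                k -= in_left
                a, b = a - node.prefix[a], b - node.prefix[b]
                node = node.right
        return node.lo


X = [5, 1, 9, 3, 7, 1, 8]
tree = WaveletTree(X)
print(tree.kth_smallest(2, 5, 2))  # 2nd smallest of (1, 9, 3, 7) -> 3
```

In this sketch a query descends one level per halving of the value range, so its cost is logarithmic in the spread of the values and independent of n. The paper's bound of O(log u + log x') is stronger, since it depends on the answer itself; achieving it requires the authors' encoding, which is not reproduced here.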