TY - GEN
T1 - Cost estimation of spatial k-Nearest-neighbor operators
AU - Aly, Ahmed M.
AU - Aref, Walid G.
AU - Ouzzani, Mourad
N1 - Publisher Copyright:
© 2015, Copyright is with the authors.
PY - 2015
Y1 - 2015
N2 - Advances in geo-sensing technology have led to an unprecedented spread of location-aware devices. In turn, this has resulted into a plethora of location-based services in which huge amounts of spatial data need to be efficiently consumed by spatial query processors. For a spatial query processor to properly choose among the various query processing strategies, the cost of the spatial operators has to be estimated. In this paper, we study the problem of estimating the cost of the spatial k-nearest-neighbor (k-NN, for short) operators, namely, k-NN-Select and k-NN-Join. Given a query that has a k-NN operator, the objective is to estimate the number of blocks that are going to be scanned during the processing of this operator. Estimating the cost of a k-NN operator is challenging for several reasons. For instance, the cost of a k-NN-Select operator is directly affected by the value of k, the location of the query focal point, and the distribution of the data. Hence, a cost model that captures these factors is relatively hard to realize. This paper introduces cost estimation techniques that maintain a compact set of catalog information that can be kept in main-memory to enable fast estimation via lookups. A detailed study of the performance and accuracy trade-off of each proposed technique is presented. Experimental results using real spatial datasets from OpenStreetMap demonstrate the robustness of the proposed estimation techniques.
AB - Advances in geo-sensing technology have led to an unprecedented spread of location-aware devices. In turn, this has resulted into a plethora of location-based services in which huge amounts of spatial data need to be efficiently consumed by spatial query processors. For a spatial query processor to properly choose among the various query processing strategies, the cost of the spatial operators has to be estimated. In this paper, we study the problem of estimating the cost of the spatial k-nearest-neighbor (k-NN, for short) operators, namely, k-NN-Select and k-NN-Join. Given a query that has a k-NN operator, the objective is to estimate the number of blocks that are going to be scanned during the processing of this operator. Estimating the cost of a k-NN operator is challenging for several reasons. For instance, the cost of a k-NN-Select operator is directly affected by the value of k, the location of the query focal point, and the distribution of the data. Hence, a cost model that captures these factors is relatively hard to realize. This paper introduces cost estimation techniques that maintain a compact set of catalog information that can be kept in main-memory to enable fast estimation via lookups. A detailed study of the performance and accuracy trade-off of each proposed technique is presented. Experimental results using real spatial datasets from OpenStreetMap demonstrate the robustness of the proposed estimation techniques.
UR - http://www.scopus.com/inward/record.url?scp=84952815953&partnerID=8YFLogxK
U2 - 10.5441/002/edbt.2015.40
DO - 10.5441/002/edbt.2015.40
M3 - Conference contribution
AN - SCOPUS:84952815953
T3 - EDBT 2015 - 18th International Conference on Extending Database Technology, Proceedings
SP - 457
EP - 468
BT - EDBT 2015 - 18th International Conference on Extending Database Technology, Proceedings
A2 - Popa, Lucian
A2 - Alonso, Gustavo
A2 - Van den Bussche, Jan
A2 - Barcelo, Pablo
A2 - Teubner, Jens
A2 - Paredaens, Jan
A2 - Ugarte, Martin
A2 - Geerts, Floris
PB - OpenProceedings.org, University of Konstanz, University Library
T2 - 18th International Conference on Extending Database Technology, EDBT 2015
Y2 - 23 March 2015 through 27 March 2015
ER -