We evaluate the performance of range queries in the Recursive List of Clusters (RLC) metric data structure when the metric spaces are natural language dictionaries under the Levenshtein distance. The study compares RLC with five other data structures (GNAT, H-Dsatl, LAESA, LC, and vp-trees) on six dictionaries. The dictionaries (English, French, German, Italian, Portuguese, and Spanish) are characterised by the mean and the variance of their histograms of distances.
The experimental results show that RLC performs well in all tested cases and, in some of them, outperforms all the other data structures. In addition, RLC is the only data structure whose performance remains consistently good, whether the space dimension is lower or higher and whether the query radius is smaller or larger.
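To make the setting concrete, the following is a minimal sketch of a range query over a dictionary under the Levenshtein distance, together with the mean and variance of the pairwise distance histogram. It is a linear-scan baseline, not RLC or any of the compared structures; the function names and the toy word list are assumptions made for illustration and do not come from the study.

```python
from itertools import combinations
from statistics import mean, pvariance


def levenshtein(a: str, b: str) -> int:
    """Minimum number of insertions, deletions and substitutions turning a into b."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def range_query(words, query, radius):
    """Linear scan: every word whose distance to `query` is at most `radius`."""
    return [w for w in words if levenshtein(w, query) <= radius]


def distance_stats(words):
    """Mean and (population) variance over all pairwise distances."""
    dists = [levenshtein(u, v) for u, v in combinations(words, 2)]
    return mean(dists), pvariance(dists)


# Hypothetical toy dictionary, for illustration only.
words = ["cat", "cart", "card", "care", "dog"]
matches = range_query(words, "car", 1)
mu, var = distance_stats(words)
print(matches, mu, var)
```

A metric index such as those compared in the study aims to return the same answer as `range_query` while computing far fewer distances than this exhaustive scan.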
|Title of host publication||INTERNATIONAL SYMPOSIUM ON COMPUTER AND INFORMATION SCIENCES|
|Publication status||Published - 1 Jan 2007|
|Event||22nd International Symposium on Computer and Information Sciences (Duration: 1 Jan 2007 → …)|