What is it about?

Point-to-Hyperplane Nearest Neighbor Search (P2HNNS) is a fundamental yet challenging problem, and it has plenty of applications in various fields. In this paper, we develop the first two provable hyperplane hashing schemes, Nearest Hyperplane hashing (NH) and Furthest Hyperplane hashing (FH), for high-dimensional P2HNNS beyond the unit hypersphere. With a novel asymmetric transformation, we demonstrate that the hash functions of NH and FH are locality-sensitive to the hyperplane queries, and both of them enjoy quality guarantee on query results. Moreover, we propose a data-dependent multi-partition strategy to boost the search performance of FH. We evaluate NH and FH over five real-life datasets and show that we are around 3 ~ 100x faster than the best competitor in four out of five datasets, especially for the recall in [20%, 80%]. Code is available at https://github.com/HuangQiang/P2HNNS.

Featured Image

Why is it important?

Existing hyperplane hashing schemes enjoy sub-linear query time and achieve excellent performance on applications such as large-scale active learning with Support Vector Machines (SVMs). However, they only conditionally deal with the P2HNNS problem with a strong assumption that all of the data are normalized, located at the unit hypersphere. Those hyperplane hashing schemes may be arbitrarily bad without this assumption. In this paper, we propose the first provable sublinear time method NH to deal with the P2HNNS beyond the unit hypersphere.

Perspectives

I hope this paper could bring new thought to people who design basic data structures for P2HNNS problem or other fundamental similarity search problems.

Dr Qiang Huang
National University of Singapore

Read the Original

This page is a summary of: Point-to-Hyperplane Nearest Neighbor Search Beyond the Unit Hypersphere, June 2021, ACM (Association for Computing Machinery),
DOI: 10.1145/3448016.3457240.
You can read the full text:

Read

Resources

Contributors

The following have contributed to this page