What is it about?

In this work, we present a deep-learning-based approach that combines the transformer encoder architecture with visual saliency to predict the perceived visual quality of distorted point clouds. Given only a degraded point cloud, we render several 2D projection views from viewpoints surrounding the 3D object. We then weight each view by its corresponding saliency map through pointwise multiplication, highlighting the regions of interest that attract the human visual system. The sub-images extracted from the resulting salient images are fed to a pretrained vision transformer that we fine-tune on moderately sized benchmark databases for point cloud quality assessment. Finally, the quality score of the distorted point cloud is obtained by averaging the predicted scores of all the salient sub-images. Experimental results show that our model achieves promising performance compared to state-of-the-art point cloud quality assessment metrics.
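A minimal sketch of this pipeline is given below, under stated assumptions rather than as the implementation used in the paper: render_views and compute_saliency are hypothetical placeholders standing in for the projection and saliency-estimation steps, the view and patch sizes are illustrative, and the backbone is a standard torchvision ViT-B/16 whose classification head is replaced by a single-output regression head.

```python
# Illustrative sketch of the scoring pipeline (assumptions noted in comments),
# not the authors' released code.
import torch
import torch.nn as nn
from torchvision.models import vit_b_16

def render_views(point_cloud, num_views=6, size=448):
    # Hypothetical placeholder: project the 3D point cloud to `num_views` RGB
    # images from viewpoints surrounding the object (stubbed with random data).
    return torch.rand(num_views, 3, size, size)

def compute_saliency(views):
    # Hypothetical placeholder: one saliency map per view, values in [0, 1].
    return torch.rand(views.shape[0], 1, *views.shape[2:])

def extract_patches(image, patch=224):
    # Split one salient view into non-overlapping sub-images of the ViT input size.
    c, _, _ = image.shape
    tiles = image.unfold(1, patch, patch).unfold(2, patch, patch)  # C x nH x nW x p x p
    return tiles.permute(1, 2, 0, 3, 4).reshape(-1, c, patch, patch)

# Pretrained ViT backbone; the classifier is replaced by a one-output regression
# head, which would then be fine-tuned on PCQA benchmark databases.
vit = vit_b_16(weights="IMAGENET1K_V1")
vit.heads.head = nn.Linear(vit.heads.head.in_features, 1)
vit.eval()

@torch.no_grad()
def predict_quality(point_cloud):
    views = render_views(point_cloud)          # V x 3 x H x W projection views
    saliency = compute_saliency(views)         # V x 1 x H x W saliency maps
    salient_views = views * saliency           # pointwise weighting by saliency
    scores = []
    for view in salient_views:
        patches = extract_patches(view)        # N x 3 x 224 x 224 sub-images
        scores.append(vit(patches).squeeze(-1))  # one quality score per sub-image
    return torch.cat(scores).mean().item()     # average over all sub-image scores

print(predict_quality(point_cloud=None))
```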


Why is it important?

Estimating the quality of 3D objects/scenes represented by point clouds is a crucial and challenging task in computer vision. In real-world applications, reference data is not always available, motivating the development of new point cloud quality assessment (PCQA) metrics that do not require the original 3D point cloud (3DPC), known as blind or no-reference PCQA.

Perspectives

Through this work, we encourage the use of transformers and saliency maps for the evaluation of point cloud quality.

Salima Bourbia
Université Mohammed V Souissi

Read the Original

This page is a summary of: No-reference Point Clouds Quality Assessment using Transformer and Visual Saliency, October 2022, ACM (Association for Computing Machinery),
DOI: 10.1145/3552469.3555713.
You can read the full text via the DOI above.

Contributors

The following have contributed to this page