Quantitative analysis of visual codewords of a protein distance matrix

Jure Pražnikar; Nuwan Tharanga Attygalle

doi:10.1371/journal.pone.0263566

What is it about?

In this work, we analysed the visual words (the codebook) of protein distance matrices and studied the relationship between vocabulary size and classification accuracy. We found that codewords with higher relative frequency generally lie closer to the main diagonal of the distance matrix. We also showed that solenoid domains have a much lower proportion of unique codewords than globular proteins, and that the feature vector (the codeword histogram), combined with a support vector machine classifier, discriminates efficiently between globular and solenoid proteins.
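As a rough illustration of the codeword histogram idea, the sketch below builds a distance matrix from 3D coordinates, cuts it into small square patches, assigns each patch to its nearest codeword, and counts the assignments. It is a minimal sketch under stated assumptions, not the procedure used in the paper: the function names, the patch size, the two-entry toy codebook and the straight-line toy chain are all hypothetical. In bag-of-visual-words pipelines the codebook is commonly obtained by clustering many patches (for example with k-means); here it is simply passed in by hand.

```python
import math
from collections import Counter, defaultdict

def distance_matrix(coords):
    """Pairwise Euclidean distances between 3D points (assumed: one point per residue)."""
    return [[math.dist(a, b) for b in coords] for a in coords]

def extract_patches(matrix, size, step=1):
    """Slide a size x size window over the matrix; return (row, col, flattened patch)."""
    n = len(matrix)
    patches = []
    for i in range(0, n - size + 1, step):
        for j in range(0, n - size + 1, step):
            flat = [matrix[i + di][j + dj] for di in range(size) for dj in range(size)]
            patches.append((i, j, flat))
    return patches

def nearest_codeword(patch, codebook):
    """Index of the codebook entry closest to the patch (Euclidean distance)."""
    return min(range(len(codebook)), key=lambda k: math.dist(patch, codebook[k]))

def label_patches(matrix, codebook, size, step=1):
    """Return (row, col, codeword index) for every patch of the matrix."""
    return [(i, j, nearest_codeword(p, codebook))
            for i, j, p in extract_patches(matrix, size, step)]

def codeword_histogram(labelled, n_codewords):
    """Normalised codeword counts: the feature vector handed to a classifier."""
    counts = Counter(label for _, _, label in labelled)
    return [counts[k] / len(labelled) for k in range(n_codewords)]

def mean_diagonal_offset(labelled):
    """Average |row - col| of the patches assigned to each codeword."""
    offsets = defaultdict(list)
    for i, j, label in labelled:
        offsets[label].append(abs(i - j))
    return {label: sum(v) / len(v) for label, v in offsets.items()}

# Toy input (hypothetical): six points on a straight line, 3.8 apart.
coords = [(3.8 * i, 0.0, 0.0) for i in range(6)]
# Toy codebook (hypothetical): one near-diagonal pattern, one far-from-diagonal pattern.
codebook = [[0.0, 3.8, 3.8, 0.0], [7.6, 7.6, 7.6, 7.6]]

labelled = label_patches(distance_matrix(coords), codebook, size=2)
hist = codeword_histogram(labelled, len(codebook))
offsets = mean_diagonal_offset(labelled)
print(hist, offsets)
```

The histogram `hist` is the kind of fixed-length vector that could be passed to a support vector machine, and `offsets` shows how one might check where each codeword tends to occur relative to the main diagonal. The toy chain is constructed by hand, so its numbers say nothing about real proteins.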

Why is it important?

We showed that solenoid domains have a much lower proportion of unique codewords than globular proteins, and that the codeword histogram, used as a feature vector with a support vector machine classifier, discriminates efficiently between globular and solenoid proteins.

Perspectives

We believe that further work could investigate whether the codeword histogram is useful for classifying tandem repeats. In addition, a more advanced approach, such as pooling methods, could be used to incorporate spatial information from protein distance matrix patches.
Jure Pražnikar
University of Primorska Faculty of Mathematics, Natural Sciences and Information Technologies
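One way the pooling idea mentioned under Perspectives might be sketched is to group patches by their distance from the main diagonal and build a separate codeword histogram for each group, so that the final feature vector keeps some positional information. This is only an assumed illustration of one possible pooling scheme, not a method described by the author: the function name `pooled_histogram`, the band edges and the toy labels are all hypothetical.

```python
from collections import Counter

def pooled_histogram(labelled_patches, n_codewords, band_edges):
    """Concatenate one normalised codeword histogram per diagonal-offset band.

    labelled_patches: iterable of (row, col, codeword index)
    band_edges: ascending offsets; band b covers
                band_edges[b] <= |row - col| < band_edges[b + 1]
    """
    n_bands = len(band_edges) - 1
    counts = [Counter() for _ in range(n_bands)]
    for row, col, label in labelled_patches:
        offset = abs(row - col)
        for b in range(n_bands):
            if band_edges[b] <= offset < band_edges[b + 1]:
                counts[b][label] += 1
                break  # each patch belongs to at most one band
    feature = []
    for band in counts:
        total = sum(band.values())
        # An empty band contributes zeros so the vector length stays fixed.
        feature.extend(band[k] / total if total else 0.0 for k in range(n_codewords))
    return feature

# Toy labelled patches (hypothetical): (row, col, codeword index).
toy = [(0, 0, 0), (1, 1, 0), (0, 1, 0), (0, 3, 1), (4, 0, 1), (1, 4, 0)]
pooled = pooled_histogram(toy, n_codewords=2, band_edges=[0, 2, 5])
print(pooled)
```

The resulting vector has one block of `n_codewords` entries per band, so two proteins that use the same codewords at different distances from the diagonal would no longer produce identical features.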

This page is a summary of: Quantitative analysis of visual codewords of a protein distance matrix, PLoS ONE, February 2022, PLOS, DOI: 10.1371/journal.pone.0263566.
Contributors

The following have contributed to this page

Jure Pražnikar
University of Primorska Faculty of Mathematics, Natural Sciences and Information Technologies
