What is it about?

Because privacy policies are written in obscure language and padded with verbose explanations, the majority of users of hypermedia and websites either do not read them at all or read them only partially. A summarized version of a privacy policy can help users concentrate on the main points. To this end, we built a machine learning-based policy categorizer that classifies policy paragraphs under proposed attributes such as security and contact. We benchmarked several machine learning-based classifier models and show that an artificial neural network model achieves higher accuracy than the other models on a challenging dataset of textual privacy policies.
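
The categorization step described above can be illustrated with a minimal sketch. It is not the authors' pipeline: the paper reports an artificial neural network as the most accurate model, whereas this sketch substitutes a small multinomial naive Bayes classifier so that it runs with the Python standard library alone. The class name NaiveBayesCategorizer, the toy training paragraphs, and the restriction to the two labels "security" and "contact" are all assumptions made for illustration.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


class NaiveBayesCategorizer:
    """Hypothetical stand-in for the paper's classifier (not the authors' ANN)."""

    def fit(self, paragraphs, labels):
        self.word_counts = defaultdict(Counter)  # label -> word frequencies
        self.label_counts = Counter(labels)      # label -> number of paragraphs
        self.vocab = set()
        for text, label in zip(paragraphs, labels):
            tokens = tokenize(text)
            self.word_counts[label].update(tokens)
            self.vocab.update(tokens)
        return self

    def predict(self, text):
        tokens = tokenize(text)
        total = sum(self.label_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_paragraphs in self.label_counts.items():
            counts = self.word_counts[label]
            # Laplace (add-one) smoothing so unseen words do not zero the score.
            denom = sum(counts.values()) + len(self.vocab)
            score = math.log(n_paragraphs / total)
            for token in tokens:
                score += math.log((counts[token] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Toy training data, invented for this sketch (not the paper's dataset).
TRAIN = [
    ("We use encryption and firewalls to protect your data from unauthorized access.", "security"),
    ("Our servers are secured and access to personal data is restricted.", "security"),
    ("If you have questions, contact us by email or phone.", "contact"),
    ("You can reach our privacy officer at the postal address below.", "contact"),
]

model = NaiveBayesCategorizer().fit(
    [text for text, _ in TRAIN], [label for _, label in TRAIN]
)

if __name__ == "__main__":
    print(model.predict("We protect stored data with encryption."))  # security
    print(model.predict("Please email us with any questions."))      # contact
```

Once each paragraph carries a label, the paragraphs grouped under one attribute can be passed to a summarizer, so that a reader sees a few lines per topic instead of the full policy.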

Why is it important?

We showed that machine learning can help summarize the relevant paragraphs of verbose privacy policies under various attributes, so that the user can get the gist of each topic within a few lines.


How many times have you scrolled through pages and pages of a privacy policy and clicked "OK" without reading it, or after only skimming it? This work shows that machine learning can distill the main points by summarizing only the important parts. This will be useful for user agreements on our hypermedia-rich websites, so that users do not get bogged down by very verbose privacy policies.

Surya Prasath
Cincinnati Children's Hospital Medical Center

Read the Original

This page is a summary of: POCASUM: policy categorizer and summarizer based on text mining and machine learning, Soft Computing, June 2021, Springer Science + Business Media, DOI: 10.1007/s00500-021-05916-w.