Features for Text Comparison

  • Marek Krótkiewicz, Krystian Wojtkiewicz
  • Springer Science + Business Media
  • DOI: 10.1007/978-3-540-68168-7_52

An easy method for comparing documents to detect plagiarism.

What is it about?

The paper introduces an easy way to compare documents. It identifies a number of factors that help answer the question of how much of one document was built by copying content from another. The method encodes words as hash codes. The result is a vector of values that an identification algorithm can use to assess the plagiarism ratio. The method can be customized to the field and specific character of the texts. It also keeps sources anonymous if needed.
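The idea of encoding words as hash codes and producing a vector of values can be illustrated with a minimal sketch. Everything here is an assumption made for illustration, not the paper's actual feature set: the function names (`hash_words`, `feature_vector`), the choice of truncated SHA-256 as the word code, and the three features (share of words found in the other document, share of shared adjacent word pairs, and relative length of the longest common run of words).

```python
import hashlib
import re


def hash_words(text: str) -> list[str]:
    """Lower-case the text, split it into words, and replace each word by a hash code."""
    words = re.findall(r"\w+", text.lower())
    # Truncated SHA-256 is an illustrative choice, not the coding used in the paper.
    return [hashlib.sha256(w.encode("utf-8")).hexdigest()[:16] for w in words]


def ngrams(codes: list[str], n: int) -> set[tuple[str, ...]]:
    """Return the set of all runs of n consecutive word codes."""
    return {tuple(codes[i:i + n]) for i in range(len(codes) - n + 1)}


def longest_common_run(a: list[str], b: list[str]) -> int:
    """Length of the longest run of consecutive codes that appears in both lists."""
    best = 0
    prev = [0] * (len(b) + 1)
    for x in a:
        cur = [0] * (len(b) + 1)
        for j, y in enumerate(b, start=1):
            if x == y:
                cur[j] = prev[j - 1] + 1
                best = max(best, cur[j])
        prev = cur
    return best


def feature_vector(suspect: str, source: str) -> list[float]:
    """Hypothetical features describing how much of `suspect` also occurs in `source`."""
    s, o = hash_words(suspect), hash_words(source)
    if not s:
        return [0.0, 0.0, 0.0]
    o_set = set(o)
    word_overlap = sum(1 for code in s if code in o_set) / len(s)
    s_pairs, o_pairs = ngrams(s, 2), ngrams(o, 2)
    pair_overlap = len(s_pairs & o_pairs) / len(s_pairs) if s_pairs else 0.0
    run_share = longest_common_run(s, o) / len(s)
    return [word_overlap, pair_overlap, run_share]


if __name__ == "__main__":
    print(feature_vector("the quick brown fox", "The quick brown dog sleeps"))
```

Because only hash codes are compared, the source text does not have to be available in readable form at comparison time. In this sketch the codes are unsalted, so common words could still be recovered by hashing a dictionary; a keyed or salted hash would be a stronger choice where anonymity matters.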

Why is it important?

There are many different methods for assessing plagiarism; however, not all of them take into account the specifics of the field or document type. This was the reason for introducing this method, which can be adapted as needed.


Dr Krystian Wojtkiewicz
Politechnika Wroclawska

I truly hope that appropriate methods will help to make our work easier and more reliable. In today's research, many people try to take shortcuts by using other people's work. It is much better to cite someone's article than to rewrite it! I hope that either this method or some other will raise the quality of papers and thus of research.

Dr Marek Krótkiewicz
Politechnika Wroclawska

Most of the metrics shown in the paper can be used for assessing the similarity of texts. The paper itself is a resource for algorithms as well as ideas that might be used for text anonymisation.


The following have contributed to this page: Dr Krystian Wojtkiewicz and Dr Marek Krótkiewicz