What is it about?

The dataset, called the Medical Kurdish Dataset (MKD), consists of 6756 comments collected from Facebook.


Why is it important?

Each comment (a short text) is labeled for text classification as either the positive class (medical comment) or the negative class (non-medical comment). The negative class makes up 55% of the dataset and the positive class 45%.


The samples are user comments gathered from posts on several kinds of pages (Medical, News, Economy, Education, and Sport). A six-step preprocessing procedure is applied to the raw dataset to clean the comments and remove noise by replacing characters.
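
As a rough illustration of what cleaning by character replacement can look like, here is a minimal Python sketch. The six steps themselves are not listed on this page, so everything in it is an assumption made for illustration: the function name `clean_comment`, the replacement table (mapping Arabic-script code points to the forms commonly used in Kurdish text), the URL removal, and the whitespace collapsing. It should not be read as the authors' actual pipeline.

```python
import re

# Hypothetical replacement table: these mappings are illustrative assumptions,
# not the six preprocessing steps used for the dataset.
CHAR_REPLACEMENTS = {
    "\u0643": "\u06A9",  # Arabic letter kaf -> keheh
    "\u064A": "\u06CC",  # Arabic letter yeh -> Farsi yeh
    "\u0640": "",        # tatweel (elongation mark) is dropped
}

URL_PATTERN = re.compile(r"https?://\S+")
WHITESPACE_PATTERN = re.compile(r"\s+")


def clean_comment(text: str) -> str:
    """Return a cleaned copy of one raw comment (illustrative only)."""
    # Remove links, which carry no class information in this sketch.
    text = URL_PATTERN.sub(" ", text)
    # Replace each noisy character with its normalized form.
    for old, new in CHAR_REPLACEMENTS.items():
        text = text.replace(old, new)
    # Collapse runs of whitespace and trim both ends.
    return WHITESPACE_PATTERN.sub(" ", text).strip()
```

A function like this would be applied to every comment before training a classifier, so that visually identical letters stored as different code points are treated as the same character.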

Professor Tarik A. Rashid
University of Kurdistan Hewler

Read the Original

This page is a summary of: Medical dataset classification for Kurdish short text over social media, Data in Brief, June 2022, Elsevier,
DOI: 10.1016/j.dib.2022.108089.