What is it about?

In this article, we present a semantic approach to extracting profiles of person entities from Farsi text documents. Entity profiling, an important task in information extraction (IE) and Web mining, is the process of extracting the entities in question and their relevant information from relevant documents. From a computational viewpoint, Farsi is one of the less-studied and less-resourced languages, and it suffers from a lack of high-quality linguistic processing tools. This gap underscores the need to develop Farsi text processing systems. As a contribution to IE research, we present a person profile extraction system for Farsi, which includes three major components: (i) pre-processing, (ii) semantic analysis, and (iii) attribute extraction. First, our system takes raw text as input and annotates it using existing pre-processing tools. In the semantic analysis stage, we analyze the pre-processed text syntactically and semantically and enrich the locally processed information with semantic information obtained from a distant knowledge base. We then use a semantic rule-based approach to extract the information related to the persons in question. We show the effectiveness of our approach by testing it on a small Farsi corpus. The experimental results are encouraging and show that the proposed method outperforms baseline methods.
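The three-stage structure described above can be illustrated with a toy sketch. This is not the system from the article: it runs on English text for readability, the knowledge base is a small in-memory dictionary standing in for an external resource, and every name (`KNOWLEDGE_BASE`, `RULES`, `build_profile`, and so on) and the rule format itself are assumptions made purely for illustration.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Hypothetical stand-in for an external knowledge base:
# maps a lowercase surface form to a semantic type.
KNOWLEDGE_BASE = {
    "tehran": "City",
    "physicist": "Occupation",
    "teacher": "Occupation",
}

# Hypothetical rule format: (attribute, trigger word, required semantic type).
# A rule fires when the trigger word is immediately followed by a token
# whose semantic type matches.
RULES = [
    ("birthplace", "in", "City"),
    ("occupation", "a", "Occupation"),
]


@dataclass
class Token:
    text: str
    semantic_type: Optional[str] = None


def preprocess(raw_text):
    """Stage 1: split raw text into sentences, then into word tokens."""
    sentences = [s.strip() for s in re.split(r"[.!?]+", raw_text) if s.strip()]
    return [re.findall(r"\w+", s) for s in sentences]


def semantic_analysis(sentences):
    """Stage 2: enrich each token with a type looked up in the knowledge base."""
    return [
        [Token(word, KNOWLEDGE_BASE.get(word.lower())) for word in sentence]
        for sentence in sentences
    ]


def extract_attributes(person, annotated):
    """Stage 3: apply the rules to sentences that mention the person."""
    profile = {}
    for sentence in annotated:
        if person not in [tok.text for tok in sentence]:
            continue
        for attribute, trigger, sem_type in RULES:
            for prev, tok in zip(sentence, sentence[1:]):
                if prev.text.lower() == trigger and tok.semantic_type == sem_type:
                    # Keep the first value found for each attribute.
                    profile.setdefault(attribute, tok.text)
    return profile


def build_profile(person, raw_text):
    """Run the three stages in order and return the person's profile."""
    return extract_attributes(person, semantic_analysis(preprocess(raw_text)))
```

Calling `build_profile("Sara", "Sara was born in Tehran. Sara is a physicist.")` would collect a birthplace and an occupation for that person. A real system would need proper tokenization, named-entity recognition, and coreference handling for Farsi, none of which this sketch attempts.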

Perspectives

Writing this article was a great pleasure, as its co-authors are people with whom I have had long-standing collaborations. This article also led to a greater involvement in machine learning research.

Dr Hossein Shirazi
Malek-Ashtar University of Technology

Read the Original

This page is a summary of: A semantic approach to cross-document person profiling in Web, AI Communications, November 2017, IOS Press, DOI: 10.3233/aic-170472.
