Lexicon-based Sentiment Analysis in Persian
Pp. 154-183 (30)
Mohammad Ehsan Basiri, Nasser Ghasem-Aghaee and Ahmad Reza Naghsh-Nilchi
Sentiment analysis is a field of study concerned with extracting people's opinions and attitudes from their writings on the Web. Most research in this area has focused on English texts, and few works have considered the problem of Persian sentiment analysis. Persian is spoken by more than a hundred million people around the world and is an official language of Iran, Tajikistan, and Afghanistan. From a computational point of view, Persian is a challenging language because of its derivational nature, its use of Arabic words, its informal style of writing, and the different written forms of compound words. In this chapter, we present a lexicon-based framework for sentiment analysis in Persian. Specifically, we develop a Persian lexicon that associates sentiment words with their sentiment strengths. Within the proposed framework, we also address several problems of sentiment analysis in Persian, such as misspelling, word spacing, and stemming. We apply the framework to polarity detection and rating prediction for cellphone reviews. The results show that our approach outperforms supervised machine learning techniques in terms of accuracy and mean absolute error.
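To make the idea of a lexicon-based approach concrete, the following is a minimal sketch in Python and not the chapter's actual framework. It assumes a toy lexicon with made-up strength values, a simple character normalization (Arabic yeh and kaf mapped to their Persian forms, zero-width non-joiner treated as a space), and a sum-of-strengths rule for polarity. The names `LEXICON`, `normalize`, `score`, and `polarity` are hypothetical and chosen only for illustration. Spelling correction and stemming are omitted.

```python
import re

# Assumption: Arabic yeh (U+064A) and kaf (U+0643) are mapped to their Persian
# forms (U+06CC, U+06A9), and the zero-width non-joiner (U+200C) becomes a space.
CHAR_MAP = str.maketrans({"\u064A": "\u06CC", "\u0643": "\u06A9", "\u200C": " "})


def normalize(text):
    """Unify character variants so lexicon lookups are consistent."""
    return text.translate(CHAR_MAP)


# Hypothetical toy lexicon: sentiment word -> strength (values are invented).
_RAW_LEXICON = {
    "خوب": 2,    # "good"
    "عالی": 4,   # "excellent"
    "بد": -2,    # "bad"
    "ضعیف": -3,  # "weak"
}
# Normalize the keys too, so both spellings of a word hit the same entry.
LEXICON = {normalize(word): strength for word, strength in _RAW_LEXICON.items()}


def tokenize(text):
    """Split normalized text into word tokens (Unicode-aware)."""
    return re.findall(r"\w+", normalize(text))


def score(text):
    """Sum the strengths of all lexicon words found in the text."""
    return sum(LEXICON.get(token, 0) for token in tokenize(text))


def polarity(text):
    """Map the summed score to a polarity label."""
    total = score(text)
    if total > 0:
        return "positive"
    if total < 0:
        return "negative"
    return "neutral"
```

In this sketch, a review's polarity is the sign of the summed strengths. A rating predictor could in principle rescale the same score to a star range, but how the chapter does this is not stated in the abstract.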