Automatic Emotion Identification from Text

Wang, Wenbo

wenboDissertation with ack.pdf (679.48 KB)

Author Info

Wang, Wenbo

Permalink:

http://rave.ohiolink.edu/etdc/view?acc_num=wright1440974400

Year and Degree

2015, Doctor of Philosophy (PhD), Wright State University, Computer Science and Engineering PhD.

Abstract

People's emotions can be gleaned from their text using machine learning techniques to build models that exploit large amounts of self-labeled emotion data from social media. Further, the self-labeled emotion data can be effectively adapted to train emotion classifiers in different target domains where training data are sparse.

Emotions are both prevalent in and essential to most aspects of our lives. They influence our decision-making, affect our social relationships and shape our daily behavior. With the rapid growth of emotion-rich textual content, such as microblog posts, blog posts, and forum discussions, there is a growing need to develop algorithms and techniques for identifying people's emotions expressed in text. Such techniques have valuable implications for studies of suicide prevention, employee productivity, well-being of people, customer relationship management, etc.

However, emotion identification is quite challenging, partly for the following reasons: i) It is a multi-class classification problem that usually involves at least six basic emotions. Text describing an event or situation that causes the emotion can be devoid of explicit emotion-bearing words, so the distinction between different emotions can be very subtle, which makes it difficult to glean emotions purely from keywords. ii) Manual annotation of emotion data by human experts is very labor-intensive and error-prone. iii) Existing labeled emotion datasets are relatively small and therefore fail to provide comprehensive coverage of emotion-triggering events and situations.

This dissertation aims at understanding the emotion identification problem and developing general techniques to tackle the above challenges. First, to address the challenge of fine-grained emotion classification, we investigate a variety of lexical, syntactic, knowledge-based, context-based and class-specific features, and show how much these features contribute to the performance of the machine learning classifiers. We also propose a method that automatically extracts syntactic patterns to build a rule-based classifier to improve the accuracy of identifying minority emotions.

Second, to deal with the challenge of manual annotation, we leverage emotion hashtags to harvest Twitter "big data" and collect millions of self-labeled emotion tweets, the labeling quality of which is further improved by filtering heuristics. We discover that the size of the training data plays an important role in the emotion identification task, as it provides comprehensive coverage of different emotion-triggering events and situations. Further, the unigram and bigram features alone can achieve a performance that is competitive with the best performance of using a combination of n-gram, knowledge-based and syntactic features.

Third, to handle the paucity of labeled emotion datasets in many domains, we seek to exploit the abundant self-labeled tweet collection to improve emotion identification in text from other domains, e.g., blog posts and fairy tales. We propose an effective data selection approach to iteratively select source data that are informative about the target domain, and use the selected data to enrich the target domain training data. Experimental results show that the proposed method outperforms state-of-the-art domain adaptation techniques on datasets from four different domains: blog, experience, diary and fairy tales.

Finally, we apply the proposed research to analyze cursing, an emotion-rich activity, on Twitter. We explore a set of questions that prior studies have recognized as crucial for understanding cursing in offline communications, including ubiquity, utility, contextual dependencies, and people factors.
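As a rough illustration of the self-labeling idea in the abstract (emotion hashtags used as labels, cleaned up by filtering heuristics, then represented with unigram and bigram features), here is a minimal Python sketch. It is not the dissertation's implementation: the hashtag map, the three filtering heuristics, the `min_words` threshold, and the function names `self_label` and `ngram_features` are all assumptions made for this example, since the abstract does not list them.

```python
import re
from collections import Counter

# Hypothetical hashtag-to-emotion map (assumption; the real list is not given in the abstract).
EMOTION_HASHTAGS = {
    "#happy": "joy",
    "#sad": "sadness",
    "#angry": "anger",
    "#scared": "fear",
}

# Words, optionally prefixed by '#' or '@', with a simple apostrophe suffix (e.g. "don't").
TOKEN_RE = re.compile(r"[#@]?\w+(?:'\w+)?")


def self_label(tweet, min_words=5):
    """Return (text_without_label_hashtag, emotion), or None if the tweet is filtered out."""
    tokens = tweet.split()
    if not tokens:
        return None
    # Assumed heuristic 1: the emotion hashtag must be the final token of the tweet.
    emotion = EMOTION_HASHTAGS.get(tokens[-1].lower())
    if emotion is None:
        return None
    body = tokens[:-1]
    # Assumed heuristic 2: drop retweets and tweets that contain links.
    if body and body[0].upper() == "RT":
        return None
    if any(t.lower().startswith(("http://", "https://")) for t in body):
        return None
    # Assumed heuristic 3: require a minimum number of plain words
    # (hashtags and @mentions are not counted).
    words = [t for t in body if not t.startswith(("#", "@"))]
    if len(words) < min_words:
        return None
    # The label hashtag is removed so a classifier cannot simply read it off the text.
    return " ".join(body), emotion


def ngram_features(text):
    """Count lowercase unigrams and bigrams (bigrams joined with '_')."""
    tokens = [t.lower() for t in TOKEN_RE.findall(text)]
    bigrams = [f"{a}_{b}" for a, b in zip(tokens, tokens[1:])]
    return Counter(tokens + bigrams)


if __name__ == "__main__":
    labeled = self_label("I finally got the job offer today #happy")
    if labeled is not None:
        text, emotion = labeled
        print(emotion, dict(ngram_features(text)))
```

The resulting feature counts could be fed to any standard multi-class classifier; the choice of classifier is left open here because the abstract does not name one.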

Committee

Amit Sheth, Ph.D. (Advisor)
Keke Chen, Ph.D. (Committee Member)
Kevin Haas, Ph.D. (Committee Member)
Krishnaprasad Thirunarayan, Ph.D. (Committee Member)
Ramakanth Kavuluru, Ph.D. (Committee Member)

Pages

138 p.

Subject Headings

Computer Science

Keywords

Emotion Identification; Emotion Classification; Emotion Adaptation; Self-labeled Data Creation; Emotion Analysis

Recommended Citations

APA Style (7th edition)
Wang, W. (2015). Automatic Emotion Identification from Text [Doctoral dissertation, Wright State University]. OhioLINK Electronic Theses and Dissertations Center. http://rave.ohiolink.edu/etdc/view?acc_num=wright1440974400

MLA Style (8th edition)
Wang, Wenbo. Automatic Emotion Identification from Text. 2015. Wright State University, Doctoral dissertation. OhioLINK Electronic Theses and Dissertations Center, http://rave.ohiolink.edu/etdc/view?acc_num=wright1440974400.

Chicago Manual of Style (17th edition)
Wang, Wenbo. "Automatic Emotion Identification from Text." Doctoral dissertation, Wright State University, 2015. http://rave.ohiolink.edu/etdc/view?acc_num=wright1440974400

Document number:

wright1440974400

Download Count:

1,104
