Emotion, Sentiment, and Stance Labeled Data

This page lists collections of sentences, tweets, and documents annotated for categories such as sentiment, emotion, stance, and metaphor. Several word-emotion lexicons (such as the NRC Emotion Lexicon) and word-sentiment lexicons (such as our sentiment composition lexicons) are available on a separate page.

Contact: Saif M. Mohammad (saif.mohammad@nrc-cnrc.gc.ca)
Terms of use:
  • The resources listed here are available free for research purposes. Cite the papers associated with the resources in your research papers and articles that make use of them. (The papers associated with each resource are listed below, and also in the individual READMEs.)

  • Do not redistribute the data. Direct interested parties to this page:

  • National Research Council Canada (NRC) disclaims any responsibility for the use of the lexicons listed here and does not provide technical support. However, the contact listed above will be happy to respond to queries and clarifications.


Tweets annotated for degree of emotion (intensity)

Data available at the webpage for the WASSA-2017 shared task on detecting emotion intensity.

Tweets with emotion word hashtags

#Emotional Tweets. Saif Mohammad. In Proceedings of the First Joint Conference on Lexical and Computational Semantics (*Sem), June 2012, Montreal, Canada.
Paper (pdf)    BibTeX

The Hashtag Emotion Corpus (aka Twitter Emotion Corpus, or TEC) has tweets with emotion word hashtags. It was used to create the NRC Hashtag Emotion Lexicon.

Tweets annotated for sentiment and stance towards pre-chosen targets

Detecting Stance in Tweets And Analyzing its Interaction with Sentiment. Parinaz Sobhani, Saif M. Mohammad, and Svetlana Kiritchenko. In Proceedings of the Joint Conference on Lexical and Computational Semantics (*Sem), August 2016, Berlin, Germany.
Paper (pdf)   BibTeX     Presentation    Data and Visualization

Tweets annotated for sentiment as part of SemEval-2015 shared task #10

Data available at the webpage for SemEval-2015 shared task #10: Sentiment Analysis in Twitter.

Developing a Successful SemEval Task in Sentiment Analysis of Twitter and Other Social Media Texts. Preslav Nakov, Sara Rosenthal, Svetlana Kiritchenko, Saif M. Mohammad, Zornitsa Kozareva, Alan Ritter, Veselin Stoyanov, and Xiaodan Zhu. Language Resources and Evaluation. March 2016, Volume 50, Issue 1, pages 35-65.
Paper (pdf)    Preprint Version    BibTeX

SemEval-2015 Task 10: Sentiment Analysis in Twitter. Sara Rosenthal, Preslav Nakov, Svetlana Kiritchenko, Saif M. Mohammad, Alan Ritter, and Veselin Stoyanov. In Proceedings of the Ninth International Workshop on Semantic Evaluation Exercises (SemEval-2015), June 2015, Denver, Colorado.
Paper (pdf)   BibTeX

Electoral/Political tweets annotated for sentiment, emotion, purpose and style

Sentiment, Emotion, Purpose, and Style in Electoral Tweets. Saif M. Mohammad, Svetlana Kiritchenko, Xiaodan Zhu, and Joel Martin. Information Processing and Management, Volume 51, Issue 4, July 2015, Pages 480–499.
Paper (pdf)    BibTeX     AnnotatedData    UnannotatedData

Semantic Role Labeling of Emotions in Tweets. Saif M. Mohammad, Xiaodan Zhu, and Joel Martin. In Proceedings of the ACL 2014 Workshop on Computational Approaches to Subjectivity, Sentiment, and Social Media (WASSA), June 2014, Baltimore, MD.
Paper (pdf)    BibTeX     AnnotatedData    UnannotatedData

Arabic BBN blog posts and Syrian tweets translated manually and automatically into English and annotated for sentiment. The original Arabic text is also annotated for sentiment.

BBN blog posts: A subset of 1200 Arabic (Levantine dialect) sentences chosen from the BBN Arabic-Dialect/English Parallel Text. The sentences are extracted from social media posts and are provided with their translations. We manually annotated this subset and its translations (both manual and automatic) for sentiment (positive, negative, or neutral).

Syrian tweets: A dataset of 2000 tweets originating from Syria (a country where Levantine dialectal Arabic is commonly spoken). These tweets were collected in May 2014 by polling the Twitter API. This dataset is not provided with manual English translation. We manually annotated this dataset and its translations (both manual and automatic) for sentiment (positive, negative, or neutral).

Sentiment After Translation: A Case-Study on Arabic Social Media Posts. Mohammad Salameh, Saif M. Mohammad, and Svetlana Kiritchenko. In Proceedings of the North American Chapter of the Association for Computational Linguistics (NAACL-2015), June 2015, Denver, Colorado.
Paper (pdf)   BibTeX   Data: Arabic Sentiment Lexicons

WordNet sentences annotated for metaphoric vs. literal and emotional vs. not emotional impact of verbs.

Metaphor as a Medium for Emotion: An Empirical Study. Saif M. Mohammad, Ekaterina Shutova, and Peter Turney. In Proceedings of the Joint Conference on Lexical and Computational Semantics (*Sem), August 2016, Berlin, Germany.
Paper (pdf)   BibTeX    Presentation       Data and Interactive Visualization


Collections of love letters, hate mail, and suicide notes.
A mapping of directory names in the Enron email corpus to email ids and to gender.


Last Updated: March 2016