The NRC Emotion Lexicon is a list of English words and their associations with eight basic emotions (anger, fear, anticipation, trust, surprise, sadness, joy, and disgust) and two sentiments (negative and positive). The annotations were done manually through crowdsourcing.
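As a rough illustration of the structure described above, the sketch below stores each word's associations as a set of category names. The ten category names come from the description above. The tab-separated `word<TAB>category<TAB>flag` row layout, the function name `parse_lexicon`, and the sample rows are assumptions made for illustration only, so check the files in the actual distribution before relying on this.

```python
# The eight emotions and two sentiments named in the description above.
CATEGORIES = (
    "anger", "anticipation", "disgust", "fear", "joy",
    "sadness", "surprise", "trust", "negative", "positive",
)


def parse_lexicon(lines):
    """Map each word to the set of categories it is associated with.

    Assumes (hypothetically) one 'word<TAB>category<TAB>flag' row per line,
    where flag is '1' for an association and '0' otherwise.
    """
    lexicon = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        word, category, flag = line.split("\t")
        if category not in CATEGORIES:
            raise ValueError(f"unknown category: {category!r}")
        # Register the word even when all of its flags turn out to be 0.
        associations = lexicon.setdefault(word, set())
        if flag == "1":
            associations.add(category)
    return lexicon


# Made-up rows for illustration only; not copied from the lexicon.
SAMPLE_ROWS = [
    "storm\tfear\t1",
    "storm\tjoy\t0",
    "storm\tnegative\t1",
    "gift\tjoy\t1",
    "gift\tpositive\t1",
    "table\tjoy\t0",
]

sample_lexicon = parse_lexicon(SAMPLE_ROWS)
```

Keeping words with no associations in the mapping (here `table`) makes it possible to tell "annotated but neutral" apart from "not in the lexicon".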
Version: 0.92
Publicly Released: 10 July 2011
Created By: Dr. Saif M. Mohammad, Dr. Peter Turney
Home Page: http://saifmohammad.com/WebPages/NRC-Emotion-Lexicon.htm
Readme Last Updated: August 2022
Automatic translations from English to 108 languages were last updated: August 2022
Contact: Dr. Saif M. Mohammad (Senior Research Scientist, National Research Council Canada)
saif.mohammad@nrc-cnrc.gc.ca, uvgotsaif@gmail.com
Crowdsourcing a Word-Emotion Association Lexicon, Saif Mohammad and Peter Turney, Computational Intelligence, 29 (3), 436-465, 2013.
Emotions Evoked by Common Words and Phrases: Using Mechanical Turk to Create an Emotion Lexicon, Saif Mohammad and Peter Turney, In Proceedings of the NAACL-HLT 2010 Workshop on Computational Approaches to Analysis and Generation of Emotion in Text, June 2010, LA, California.
This study has been approved by the NRC Research Ethics Board (NRC-REB) under protocol number 2009-94. REB review seeks to ensure that research projects involving humans as participants meet Canadian standards of ethics.
Practical and Ethical Considerations
Please see the papers below for ethical considerations involved in automatic emotion detection and the use of emotion lexicons. (These also act as the Ethics and Data Statements for the lexicon.)
May 2020: Ten years back, with some anticipation and lots of excitement, Peter Turney and I introduced the NRC Word-Emotion Association Lexicon. So grateful that, over the years, so many people have put their hopes and trust in it. Such joy to see them boldly shine a light on the human condition: the positives and the negatives, not shying away even from sadness, fear, and anger. This blog post puts a spotlight on ten favorites (includes fun video and audio clips as well as links to papers and popular press articles):
The NRC Emotion Lexicon has affect annotations for English words. Despite some cultural differences, it has been shown that a majority of affective norms are stable across languages. Thus, we provide versions of the lexicon in over 100 languages by translating the English terms using Google Translate (August 2022).
The lexicon is thus available for English and these languages:
Note that an earlier version included translations obtained in 2017. The current 2022 translations are markedly better. That said, some of the translations may still be incorrect or they may simply be transliterations of the original English terms.
An Interactive Visualizer
Impact
Some notable ways in which the NRC Emotion Lexicon has made impact include:
First of its kind: It was the first word-emotion association lexicon, with entries for eight basic emotions as well as positive and negative sentiment. It remains the largest such lexicon. Prior work largely focused on positive and negative sentiment. While earlier work focused on words that denote emotion, this work included the larger set of words that are associated with or connote an emotion.
Quality Control: Careful attention was paid to ensuring appropriate annotations, including the use of a separate word choice question to make sure annotators knew the word and to guide them to the desired sense of the word for which annotations were solicited.
Impact on NLP: The lexicon impacted work in sentiment and emotion analysis in NLP, notably by facilitating work beyond just the positive-negative affect dimension. The lexicon has been used for word-, sentence-, tweet-, and document-level sentiment and emotion analysis, abusive language detection, personality trait identification, stance detection, etc. The lexicon is especially useful in unsupervised settings and when training data is limited or not available. However, even with the onset of deep learning methods, many top systems in shared tasks (such as SemEval-2018 Task 1 Affect in Tweets) continue to benefit from the lexicon by using it to initialize their embeddings and by adding lexicon-derived features.
Impact on fields beyond NLP:
Work on Well-Being and Health Disorders: Used in work on understanding pandemic response, feelings towards influenza vaccinations, depression detection, hate speech detection, identifying cyber-bullying, etc. Proceedings of workshops such as CLPsych and i2b2 describe systems that use the NRC Emotion Lexicon.
Psychology, Behavioural Science, Psycholinguistics, Fairness, and Social Science: Used in work on understanding how people express emotions, relationships between word characteristics (such as length and concreteness) and their associated emotions, gender attitudes, as well as the role of emotions in the spread of information, especially news, fake news, and viral videos. The highly cited paper "The spread of true and false news online" uses the lexicon to determine associations of emotions with fake news and its virality.
Digital Humanities and Computational Literature (detecting narrative arcs in novels and fairy tales). Notable works include:
Articles about a symphony orchestra that performed music composed using the NRC Emotion Lexicon under the glass of the Louvre museum in Paris on Sept. 20, 2016. A video of the performance is available.
Slashdot, March 23, 2014: Algorithm Composes Music By Text Analyzing the World's Best Novels.
Ethics and Fairness (work on comparing attitudes towards men and women at work)
Data Science: see example data science projects highlighted in popular press (bottom of this page). Also see: Chatty maps. Fast Company, March 25, 2016: An Emotional Map Of The City, As Captured Through Its Sounds.
Democratization of Emotion Analysis: The large lexicon allowed for the use of simple methods to track trends in emotions. This democratized emotion analysis: it was used not just by computer scientists, but also by journalists, psychologists, social scientists, and amateur data enthusiasts to detect trends in emotions in everything from the Brexit discourse and election tweets to Radiohead songs, abusive language, and Reddit posts.
The Telegraph, June 15, 2016: EU referendum: Remain uses Project Fear more in tweets than Leave, analysis shows. [Use of the NRC Emotion Lexicon, aka EmoLex, to track sentiment in EU referendum tweets (Brexit).]
Availability: The lexicon is made freely available for research, and has been commercially licensed to companies for a small fee.
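To make the "simple methods to track trends in emotions" mentioned under Democratization of Emotion Analysis more concrete, here is a minimal sketch that computes, for each time period, the share of tokens associated with a given emotion. It is not the method of any particular study cited here. The tokenizer, the function names, the period labels, and the toy lexicon entries are all assumptions for illustration.

```python
import re
from collections import defaultdict

# Toy stand-in for the lexicon: word -> associated categories (made up).
TOY_LEXICON = {
    "fear": {"fear", "negative"},
    "risk": {"fear", "negative"},
    "hope": {"anticipation", "joy", "positive"},
    "win": {"joy", "positive"},
}


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def emotion_share_by_period(docs, lexicon, emotion):
    """Return {period: fraction of tokens associated with `emotion`}.

    `docs` is an iterable of (period, text) pairs, e.g. ("week1", "...").
    Periods whose texts contain no tokens are omitted from the result.
    """
    hits = defaultdict(int)
    totals = defaultdict(int)
    for period, text in docs:
        for token in tokenize(text):
            totals[period] += 1
            if emotion in lexicon.get(token, ()):
                hits[period] += 1
    return {period: hits[period] / totals[period] for period in sorted(totals)}


docs = [
    ("week1", "Hope we win"),
    ("week1", "Big risk ahead"),
    ("week2", "Fear and risk everywhere"),
]
fear_trend = emotion_share_by_period(docs, TOY_LEXICON, "fear")
```

Normalizing by the number of tokens per period, rather than reporting raw counts, keeps periods with more text from looking more emotional simply because they are longer.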
Terms of Use
Research Use: The lexicon mentioned in this page can be used freely for non-commercial research and educational purposes.
Citation: Cite the papers associated with the lexicon in your research papers and articles that make use of it.
Media Mentions: In news articles and online posts on work using the lexicon, cite the lexicon. For example: "We make use of the <resource name>, created by <author(s)> at the National Research Council Canada." We would appreciate a hyperlink to the lexicon home page and an email to the contact author (saif.mohammad@nrc-cnrc.gc.ca). (Authors and homepage information provided at the top of the README.)
Credit: If you use the lexicon in a product or application, then acknowledge this in the 'About' page and other relevant documentation of the application by stating the name of the resource, the authors, and NRC. For example: "This application/product/tool makes use of the <resource name>, created by <author(s)> at the National Research Council Canada." We would appreciate a hyperlink to the lexicon home page and an email to the contact author (saif.mohammad@nrc-cnrc.gc.ca).
No Redistribution: Do not redistribute the data. Direct interested parties to the lexicon home page. You may not rent or license the use of the lexicon nor otherwise permit third parties to use it.
Proprietary Notice: You will ensure that any copyright notices, trademarks or other proprietary right notices placed by NRC on the lexicon remain in evidence.
Title: All intellectual property rights in and to the lexicon shall remain the property of NRC. All proprietary interests, rights, unencumbered titles, copyrights, or other Intellectual Property Rights in the lexicon and all copies thereof remain at all times with NRC.
Commercial License: If interested in commercial use of the lexicon, contact the author: saif.mohammad@nrc-cnrc.gc.ca
Disclaimer: National Research Council Canada (NRC) disclaims any responsibility for the use of the lexicon and does not provide technical support. NRC makes no representation and gives no warranty of any kind with respect to the accuracy, usefulness, novelty, validity, scope, or completeness of the lexicon and expressly disclaims any implied warranty of merchantability or fitness for a particular purpose of the lexicon. That said, the contact listed above welcomes queries and clarifications.
Limitation of Liability: You will not make claims of any kind whatsoever upon or against NRC or the creators of the lexicon, either on your own account or on behalf of any third party, arising directly or indirectly out of your use of the lexicon. In no event will NRC or the creators be liable on any theory of liability, whether in an action of contract or strict liability (including negligence or otherwise), for any losses or damages incurred by you, whether direct, indirect, incidental, special, exemplary or consequential, including lost or anticipated profits, savings, interruption to business, loss of business opportunities, loss of business information, the cost of recovering such lost information, the cost of substitute intellectual property or any other pecuniary loss arising from the use of, or the inability to use, the lexicon regardless of whether you have advised NRC or NRC has advised you of the possibility of such damages.
Code
There are many third party software packages that can be used in conjunction with the NRC Emotion Lexicon to analyze emotion word use in text. We recommend Emotion Dynamics. It is the primary package that we use to analyze text using the NRC Emotion Lexicon and the NRC VAD Lexicon. It can be used to generate a csv file with a number of emotion features pertaining to the text of interest, including metrics of utterance emotion dynamics. Associated Paper.
The AffectiveTweets Package: Felipe Bravo-Marquez implemented AffectiveTweets for the Weka machine learning workbench. It provides a collection of filters for extracting features from tweets for sentiment classification/regression and other related tasks. The package is especially useful for generating feature vectors from a large number of affect lexicons. The vector can then be concatenated to other feature vectors (say dense-distributed representations of the text) to improve performance. (You can use the feature vector with any classifier -- not just one with support from Weka.)
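The idea of concatenating a lexicon-derived vector to other feature vectors can be sketched in plain Python, independently of AffectiveTweets or Weka. This is not the package's API: the function names, the toy lexicon entries, and the placeholder dense vector below are all hypothetical and serve only to show the shape of the computation.

```python
import re

# Fixed category order so that every text yields a vector of the same layout.
CATEGORIES = (
    "anger", "anticipation", "disgust", "fear", "joy",
    "sadness", "surprise", "trust", "negative", "positive",
)

# Toy stand-in for the lexicon: word -> associated categories (made up).
TOY_LEXICON = {
    "happy": {"joy", "positive"},
    "afraid": {"fear", "negative"},
    "storm": {"fear", "negative"},
}


def lexicon_features(text, lexicon, categories=CATEGORIES):
    """Count, per category, how many tokens in `text` are associated with it."""
    counts = dict.fromkeys(categories, 0)
    for token in re.findall(r"[a-z]+", text.lower()):
        for category in lexicon.get(token, ()):
            if category in counts:  # ignore categories outside the fixed order
                counts[category] += 1
    return [float(counts[category]) for category in categories]


def concatenate(other_features, lexicon_vector):
    """Append the lexicon-derived vector to an existing feature vector."""
    return list(other_features) + list(lexicon_vector)


# Placeholder for a dense representation of the text (values are arbitrary).
dense = [0.12, -0.40, 0.88]
lex_vec = lexicon_features("So happy, yet afraid of the storm", TOY_LEXICON)
features = concatenate(dense, lex_vec)
```

The combined list can then be handed to whatever classifier is in use. Raw counts are the simplest choice here; dividing by the token count is a common alternative when texts vary a lot in length.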
These third party packages also facilitate the use of the NRC Emotion Lexicon:
Contact us:
if you are interested in having us analyze your data for sentiment, emotion, and other affectual information;
if you are interested in a collaborative research project.
We regularly collaborate with graduate students, post-docs, faculty, and research professionals from Computer Science, Psychology, Digital Humanities, Linguistics, Social Science, etc.
Email: Dr. Saif M. Mohammad (saif.mohammad@nrc-cnrc.gc.ca, uvgotsaif@gmail.com)