WorryLex: The NRC Word-- and MWE--Anxiety Association Lexicon

The NRC WorryLex is a list of over 44,000 English words and 10,000 multiword expressions (MWEs) with real-valued scores indicating their associations with anxiety: from -3 (maximum calmness) to 3 (maximum anxiety). The entries for words alone are known as the WorryWords subset; the entries for MWEs are called WorryMWEs. Each score is the mean of individual labels provided by native speakers of English through crowdsourced manual annotation. One can also make use of the discrete ordinal classes assigned to each entry:

  • 3: very anxious (when mean anxiety score >= 2.5)
  • 2: moderately anxious (when mean >= 1.5 and < 2.5)
  • 1: slightly anxious (when mean >= 0.5 and < 1.5)
  • 0: not associated with feeling anxious or calm (when mean > -0.5 and < 0.5)
  • -1: slightly calm (when mean > -1.5 and <= -0.5)
  • -2: moderately calm (when mean > -2.5 and <= -1.5)
  • -3: very calm (when mean <= -2.5)
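
The thresholds above can be sketched in Python as follows. This is only an illustration of the mapping as listed, not code shipped with the lexicon; the function name ordinal_class and the sample labels are hypothetical.

```python
from statistics import mean


def ordinal_class(mean_score: float) -> int:
    """Map a mean anxiety score in [-3, 3] to the ordinal classes listed above."""
    if mean_score >= 2.5:
        return 3    # very anxious
    if mean_score >= 1.5:
        return 2    # moderately anxious
    if mean_score >= 0.5:
        return 1    # slightly anxious
    if mean_score > -0.5:
        return 0    # not associated with feeling anxious or calm
    if mean_score > -1.5:
        return -1   # slightly calm
    if mean_score > -2.5:
        return -2   # moderately calm
    return -3       # very calm


# Hypothetical annotator labels for a single entry (made up for illustration).
labels = [3, 2, 3]
entry_class = ordinal_class(mean(labels))
```

Note that the boundaries are asymmetric by design: a mean of exactly 0.5 falls in class 1, and a mean of exactly -0.5 falls in class -1, so class 0 covers only the open interval between them.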

Download the WorryLex (Non-Commercial Use Only -- Research or Educational)
Copyright (C) 2011 National Research Council Canada (NRC)

Version: 1
Released: October 2024
Created By: Dr. Saif M. Mohammad
Home Page: http://saifmohammad.com/worrywords.html

Readme Last Updated: October 2024

Contact: Dr. Saif M. Mohammad (Principal Research Scientist, National Research Council Canada)
saif.mohammad@nrc-cnrc.gc.ca, uvgotsaif@gmail.com

See terms of use at the bottom of the page. See the Emotion Lexicons: Ethics and Data Statement before using the lexicon.

You may also be interested in these companion lexicons: NRC Emotion Lexicon and NRC Emotion Intensity Lexicon. The full list of word-emotion, word-sentiment, and word-colour lexicons is available on the Lexicons page.


Paper

From Composure to Catastrophe: Norms of Calmness–Anxiety Associations for 54,000 English Words and Multiword Expressions.

Practical and Ethical Considerations

Please see the papers below for ethical considerations involved in automatic emotion detection and the use of emotion lexicons. (These also act as the Ethics and Data Statements for the lexicon.)

  1. Ethics Sheet for Automatic Emotion Recognition and Sentiment Analysis. Computational Linguistics. June 2022.
    Paper (pdf)    BibTeX    Slides

  2. Best Practices in the Creation and Use of Emotion Lexicons.
    Saif M. Mohammad. Findings of the Association for Computational Linguistics: EACL 2023. Dubrovnik, Croatia. 2023.
    Paper (pdf)    BibTeX

Python Code to Analyze Emotions in Text

There are many third-party software packages that can be used in conjunction with the NRC Emotion Lexicon to analyze emotion word use in text. We recommend Emotion Dynamics.

It is the primary package that we use to analyze text using the NRC Emotion Lexicon and the NRC VAD Lexicon. It can be used to generate a csv file with a number of emotion features pertaining to the text of interest, including metrics of utterance emotion dynamics.
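
For readers who only want a quick look at lexicon-based scoring, the general idea can be sketched in a few lines of Python. This is not the Emotion Dynamics API and not the authors' method; it is a minimal word-averaging sketch. The function name mean_anxiety and the toy_lexicon entries (with made-up scores) are assumptions for illustration only, and the actual file layout should be checked in the downloaded lexicon before loading it.

```python
import re
from typing import Optional


def mean_anxiety(text: str, lexicon: dict[str, float]) -> Optional[float]:
    """Average the lexicon scores of the words in `text` that appear in `lexicon`.

    Returns None when no word of the text is covered by the lexicon.
    """
    # Simple lowercase word tokenizer; multiword expressions are not matched here.
    tokens = re.findall(r"[a-z']+", text.lower())
    scores = [lexicon[t] for t in tokens if t in lexicon]
    return sum(scores) / len(scores) if scores else None


# Toy entries with made-up scores (NOT taken from the real lexicon).
toy_lexicon = {"panic": 2.8, "deadline": 1.6, "relaxed": -2.4}

score = mean_anxiety("Panic before the deadline", toy_lexicon)
uncovered = mean_anxiety("Hello there", toy_lexicon)
```

Averaging over covered words only keeps the result on the same -3 to 3 scale as the lexicon; texts with no covered words yield None rather than a misleading 0.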

WorryWords in Various Languages

(Will be available in November 2024)

The NRC WorryWords Lexicon has affect annotations for English words. Despite some cultural differences, it has been shown that a majority of affective norms are stable across languages. Thus, we provide versions of the lexicon in over 100 languages by translating the English terms using Google Translate.

The lexicon is thus available for English and these languages:

Afrikaans, Albanian, Amharic, Arabic, Armenian, Azerbaijani, Basque, Belarusian, Bengali, Bosnian, Bulgarian, Burmese, Catalan, Cebuano, Chichewa, Corsican, Croatian, Czech, Danish, Dutch, Esperanto, Estonian, Filipino, Finnish, French, Frisian, Gaelic, Galician, Georgian, German, Greek, Gujarati, HaitianCreole, Hausa, Hawaiian, Hebrew, Hindi, Hmong, Hungarian, Icelandic, Igbo, Indonesian, Irish, Italian, Japanese, Javanese, Kannada, Kazakh, Khmer, Kinyarwanda, Korean, Kurmanji, Kyrgyz, Lao, Latin, Latvian, Lithuanian, Luxembourgish, Macedonian, Malagasy, Malay, Malayalam, Maltese, Maori, Marathi, Mongolian, Nepali, Norwegian, Odia, Pashto, Persian, Polish, Portuguese, Punjabi, Romanian, Russian, Samoan, Sanskrit, Serbian, Sesotho, Shona, Simplified, Sindhi, Sinhala, Slovak, Slovenian, Somali, Spanish, Sundanese, Swahili, Swedish, Tajik, Tamil, Tatar, Telugu, Thai, Traditional, Turkish, Turkmen, Ukrainian, Urdu, Uyghur, Uzbek, Vietnamese, Welsh, Xhosa, Yiddish, Yoruba, Zulu.

Applications

The NRC WorryWords Lexicon has a broad range of applications in Computational Linguistics, Psychology, Digital Humanities, Computational Social Sciences, and beyond. Notably, it can be used for:

  1. Understanding anxiety and the underlying mechanisms; how anxiety relates to other emotions; how it relates to our body; how anxiety changes with age, socio-economic status, weather, green spaces, etc.

  2. Determining how anxiety manifests in language; how language shapes anxiety; how culture shapes the language of anxiety; etc.

  3. Tracking the degree of anxiety towards targets of interest such as climate change, government policies, biological vectors, etc.

  4. Identifying effective coping mechanisms and clinical interventions to manage anxiety.

  5. Developing automatic systems for detecting anxiety; developing chat systems that are sensitive to nuances and diverse expressions of anxiety by people from various demographics.

  6. Studying anxiety and uneasiness in story telling; its relationship with central elements of narratology such as conflict and resilience.

  7. Studying how anxiety impacts social behaviour in physical and virtual environments.

Terms of Use

  1. Research Use: The lexicon mentioned in this page can be used freely for non-commercial research and educational purposes.

  2. Citation: Cite the papers associated with the lexicon in your research papers and articles that make use of it.

  3. Media Mentions: In news articles and online posts on work using the lexicon, cite the lexicon. For example: "We make use of the <resource name>, created by <author(s)> at the National Research Council Canada." We would appreciate a hyperlink to the lexicon home page and an email to the contact author (saif.mohammad@nrc-cnrc.gc.ca). (Authors and homepage information provided at the top of the README.)

  4. Credit: If you use the lexicon in a product or application, then acknowledge this in the 'About' page and other relevant documentation of the application by stating the name of the resource, the authors, and NRC. For example: "This application/product/tool makes use of the <resource name>, created by <author(s)> at the National Research Council Canada." We would appreciate a hyperlink to the lexicon home page and an email to the contact author (saif.mohammad@nrc-cnrc.gc.ca).

  5. No Redistribution: Do not redistribute the data. Direct interested parties to the lexicon home page. You may not rent or license the use of the lexicon nor otherwise permit third parties to use it. Do not upload the lexicon on a public website.

    ** Important **
    Do not upload or store the lexicon in any location that can be scanned by companies or other entities to create large language models. For example, do not upload to a public website, Hugging Face, GitHub, etc.

  6. Proprietary Notice: You will ensure that any copyright notices, trademarks or other proprietary right notices placed by NRC on the lexicon remain in evidence.

  7. Title: All intellectual property rights in and to the lexicon shall remain the property of NRC. All proprietary interests, rights, unencumbered titles, copyrights, or other Intellectual Property Rights in the lexicon and all copies thereof remain at all times with NRC.

  8. Commercial License: If interested in commercial use of the lexicon, contact the author: saif.mohammad@nrc-cnrc.gc.ca

  9. Disclaimer: National Research Council Canada (NRC) disclaims any responsibility for the use of the lexicon and does not provide technical support. NRC makes no representation and gives no warranty of any kind with respect to the accuracy, usefulness, novelty, validity, scope, or completeness of the lexicon and expressly disclaims any implied warranty of merchantability or fitness for a particular purpose of the lexicon. That said, the contact listed above welcomes queries and clarifications.

  10. Limitation of Liability: You will not make claims of any kind whatsoever upon or against NRC or the creators of the lexicon, either on your own account or on behalf of any third party, arising directly or indirectly out of your use of the lexicon. In no event will NRC or the creators be liable on any theory of liability, whether in an action of contract or strict liability (including negligence or otherwise), for any losses or damages incurred by you, whether direct, indirect, incidental, special, exemplary or consequential, including lost or anticipated profits, savings, interruption to business, loss of business opportunities, loss of business information, the cost of recovering such lost information, the cost of substitute intellectual property or any other pecuniary loss arising from the use of, or the inability to use, the lexicon regardless of whether you have advised NRC or NRC has advised you of the possibility of such damages.