BBN Arabic Sentiment Analysis dataset
Version 1.0
17 March 2015
Copyright (C) 2015 National Research Council Canada (NRC)

Contact: Mohammad Salameh (msalameh@ualberta.ca)
         Saif Mohammad (saif.mohammad@nrc-cnrc.gc.ca)
         Svetlana Kiritchenko (Svetlana.Kiritchenko@nrc-cnrc.gc.ca)

Terms of use:
1. This dataset can be used freely for research purposes.
2. The papers listed below provide details of the creation and use of
   the dataset. If you use the dataset, then please cite the associated
   papers.
3. If you use the dataset in a product or application, then please
   credit the authors and NRC appropriately. Also, if you send us an
   email, we will be thrilled to know about how you have used the
   dataset.
4. National Research Council Canada (NRC) disclaims any responsibility
   for the use of the dataset and does not provide technical support.
   However, the contacts listed above will be happy to respond to
   queries and clarifications.
5. Rather than redistributing the data, please direct interested
   parties to this page: [ADD LINK]

Please feel free to send us an email:
- with feedback regarding the dataset.
- with information on how you have used the dataset.
- if interested in a collaborative research project.

.......................................................................

BBN Arabic Sentiment Analysis dataset
----------------------------------

The BBN dataset is a random subset of 1200 Levantine dialectal
sentences chosen from the BBN Arabic-Dialect/English Parallel Text.
The sentences are extracted from social media posts and are provided
with their translations.
https://catalog.ldc.upenn.edu/LDC2012T09

We manually annotated this subset and its translations (both manual and
automatic) for sentiment (positive, negative, or neutral).

.......................................................................

PUBLICATIONS
------------

Details of the BBN dataset and its use in an Arabic sentiment analysis
system can be found in the following peer-reviewed publications:

--Sentiment After Translation: A Case-Study on Arabic Social Media
  Posts. Mohammad Salameh, Saif M Mohammad and Svetlana Kiritchenko,
  In Proceedings of the North American Chapter of the Association for
  Computational Linguistics (NAACL-2015), June 2015, Denver, Colorado.

--How Translation Alters Sentiment. Saif M Mohammad, Mohammad Salameh,
  and Svetlana Kiritchenko, In Journal of Artificial Intelligence
  Research, in press.

Links to the papers are available here:
http://saifmohammad.com/WebPages/WebDocs/arabicSA-JAIR.pdf
http://aclweb.org/anthology/N/N15/N15-1078.pdf

.......................................................................

VERSION INFORMATION
-------------------

Version 1.0 is the first version as of 17 March 2015.

.......................................................................

FORMAT
------

The BBN dataset has sheets with these titles:

1) ar_manual.sent.: the manual sentiment annotations for the Arabic
   posts
2) en_auto.trans-manl.sent: the automatic translations of the Arabic
   posts, manually annotated for sentiment
3) en_manl.trans-manl.sent: the manual translations of the Arabic
   posts, manually annotated for sentiment

Annotation categories are: positive, negative, neutral and both.

The sheets above also include the confidence of each annotation, as
calculated by CrowdFlower.

.......................................................................
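
EXAMPLE (ILLUSTRATIVE)
----------------------

The sketch below shows one way the sheets described under FORMAT might
be summarized once their rows have been loaded. It is not part of the
official distribution. It assumes each row is available as a dict with
the keys "sentiment" and "confidence"; these key names are hypothetical
and the actual column headers in the sheets may differ. It also assumes
that the rows of two sheets are aligned post by post, which should be
checked against the data before comparing labels. The sample rows are
made up for illustration only.

```python
from collections import Counter

# The four annotation categories named in the FORMAT section.
LABELS = ("positive", "negative", "neutral", "both")


def _label(row):
    """Normalize the label of one row (hypothetical "sentiment" key)."""
    return row["sentiment"].strip().lower()


def label_counts(rows, min_confidence=0.0):
    """Count labels among rows whose confidence is at least min_confidence.

    The "confidence" key is an assumed name for the CrowdFlower
    confidence value; adjust it to the real column header.
    """
    counts = Counter({label: 0 for label in LABELS})
    for row in rows:
        label = _label(row)
        if label not in LABELS:
            raise ValueError(f"unexpected label: {label!r}")
        if float(row["confidence"]) >= min_confidence:
            counts[label] += 1
    return dict(counts)


def agreement(rows_a, rows_b):
    """Fraction of row pairs with the same label in two aligned sheets."""
    if len(rows_a) != len(rows_b):
        raise ValueError("sheets must have the same number of rows")
    if not rows_a:
        return 0.0
    same = sum(_label(a) == _label(b) for a, b in zip(rows_a, rows_b))
    return same / len(rows_a)


if __name__ == "__main__":
    # Made-up rows, not taken from the dataset.
    arabic = [
        {"sentiment": "positive", "confidence": "1.0"},
        {"sentiment": "negative", "confidence": "0.6"},
    ]
    translated = [
        {"sentiment": "positive", "confidence": "0.8"},
        {"sentiment": "neutral", "confidence": "0.7"},
    ]
    print(label_counts(arabic, min_confidence=0.7))
    print(agreement(arabic, translated))
```

Filtering on a confidence threshold before counting is a design choice
of this sketch, not a recommendation from the dataset authors; the
threshold value should be chosen by the user.

.......................................................................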