This package contains the Sentiment Expression Annotation (SEA) described in Kaljahi and Foster (2018). The SEA annotations were created on the English dataset released for Task 5 of SemEval 2016, the shared task on aspect-based sentiment analysis (ABSA).
Due to licensing restrictions, the original data cannot accompany these annotations. As a workaround, a script has been included which takes in the original dataset and attaches the SEA annotations. The script should be used as follows:
python generate-sea.py -x <PATH_TO_XML_FILE> -d <DOMAIN> -s <SUBSET>
PATH_TO_XML_FILE points to the original XML file released by SemEval 2016 task 5 organisers (for subtask 1), which can be found at http://alt.qcri.org/semeval2016/task5/index.php?id=data-and-tools. DOMAIN is either laptop or restaurant (not hotel) and SUBSET is either train or test. For example, assuming that the original XML files have been downloaded into the current directory, the following command generates the annotation files for the laptop training set in the current directory:
python generate-sea.py -x ABSA16_Laptops_Train_SB1_v2.xml -d laptop -s train
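When several datasets need processing, it can help to build the command line programmatically and validate the arguments first. The following is a minimal sketch, not part of the package: `build_command` is a hypothetical helper, and it only assumes the `-x`, `-d` and `-s` options and the allowed values described above.

```python
import shlex

# Allowed values as described in this README (hotel is not supported).
VALID_DOMAINS = {"laptop", "restaurant"}
VALID_SUBSETS = {"train", "test"}


def build_command(xml_path, domain, subset):
    """Return the argument list for one generate-sea.py invocation."""
    if domain not in VALID_DOMAINS:
        raise ValueError(f"DOMAIN must be one of {sorted(VALID_DOMAINS)}: {domain!r}")
    if subset not in VALID_SUBSETS:
        raise ValueError(f"SUBSET must be one of {sorted(VALID_SUBSETS)}: {subset!r}")
    return ["python", "generate-sea.py", "-x", xml_path, "-d", domain, "-s", subset]


cmd = build_command("ABSA16_Laptops_Train_SB1_v2.xml", "laptop", "train")
# Print the command in a form that can be pasted into a shell.
print(shlex.join(cmd))
```

The returned list can be passed to `subprocess.run(cmd, check=True)` from the directory that contains `generate-sea.py`.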
The generated annotation files are as follows:
The SEA annotations in the aio file are aligned with the aspect terms in the at file: the first sentence in the aio file carries the SEA annotation for the first aspect term in the at file, the second sentence for the second aspect term, and so on.
The annotations are in columnar format where the tokens constituting the sentiment expression are tagged with I and the others with O.
Note that the original sentences have been tokenized by the script.
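As an illustration of how the columnar annotations might be consumed, the sketch below reads token/tag lines and collects the tokens tagged `I`. The exact file layout is an assumption here: one token and its tag per line, separated by whitespace, with a blank line between sentences. The sample text and the aspect term list are invented for the example and are not taken from the dataset; `read_columnar` and `sentiment_expression` are hypothetical helper names.

```python
def read_columnar(text):
    """Split columnar text into sentences, each a list of (token, tag) pairs.

    Assumes one "token tag" pair per line and a blank line between sentences.
    """
    sentences, current = [], []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            # A blank line closes the current sentence, if any.
            if current:
                sentences.append(current)
                current = []
            continue
        # The tag is taken to be the last whitespace-separated field.
        token, tag = line.rsplit(None, 1)
        current.append((token, tag))
    if current:
        sentences.append(current)
    return sentences


def sentiment_expression(sentence):
    """Return the tokens tagged I, i.e. the sentiment expression."""
    return [token for token, tag in sentence if tag == "I"]


# Invented sample in the assumed layout (not real SEA data).
sample = "The O\nscreen O\nis O\nreally I\ngreat I\n\nBattery O\ndies I\nquickly I\n"
# Invented aspect terms, standing in for the contents of the at file.
aspect_terms = ["screen", "Battery"]

# The i-th annotated sentence belongs to the i-th aspect term.
for term, sentence in zip(aspect_terms, read_columnar(sample)):
    print(term, sentiment_expression(sentence))
```

Pairing by position with `zip` mirrors the alignment between the aio and at files; if the two files differ in length, that would indicate a mismatch worth checking before use.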
 Rasoul Kaljahi and Jennifer Foster. Sentiment expression boundaries in sentiment polarity classification. In Proceedings of the 9th Workshop on Computational Approaches to Subjectivity, Sentiment and Social Media Analysis (WASSA), 2018.