Feature Extraction for Sentiment Analysis in Indonesian Twitter

Eka Dyar Wahyuni; Amalia Anjani Arifiyanti; Mohamad Irwan Afandi

doi:10.11594/nstp.2021.0913

Authors

Eka Dyar Wahyuni, Information System Department, Faculty of Computer Science, Universitas Pembangunan Nasional “Veteran” Jawa Timur, Indonesia
Amalia Anjani Arifiyanti, Information System Department, Faculty of Computer Science, Universitas Pembangunan Nasional “Veteran” Jawa Timur, Indonesia
Mohamad Irwan Afandi, Information System Department, Faculty of Computer Science, Universitas Pembangunan Nasional “Veteran” Jawa Timur, Indonesia

DOI:

https://doi.org/10.11594/nstp.2021.0913

Keywords:

Feature extraction, TF-IDF, Indonesian Tweet

Abstract

Sentiment analysis of Twitter data has recently become one of the most interesting fields of research. It combines natural language processing techniques with data mining. Many algorithms have been proposed to better understand sentiment from text. Proposed methods may focus on the preprocessing step, the dataset splitting method (training and testing), the dataset balancing method (when the data is imbalanced), or improvements to existing algorithms. The main focus of this paper, however, is feature extraction from tweets using TF-IDF. The features obtained from this process are expected to improve the accuracy of the classification process. The dataset used in this research is in Indonesian, whose form differs considerably from English. It consists of 1068 manually labeled tweets related to the "school from home" policy introduced in response to the COVID-19 outbreak, collected from March to July. All steps required to process this data are implemented in Python. To validate the utility of the proposed features, the performance of the classification algorithms is compared. Finally, the results are summarized by reflecting on the impact of including the proposed features on each classification algorithm for sentiment detection.
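As an illustration of the TF-IDF feature extraction described in the abstract, the following is a minimal sketch in plain Python. The abstract does not state which TF-IDF variant, tokenizer, or library the authors used, so everything here is an assumption: term frequency is taken as the count divided by tweet length, inverse document frequency as ln(N / df), tokenization is a naive whitespace split, and the three sample tweets are invented for the example and are not drawn from the paper's dataset.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: lowercase + whitespace split. A real pipeline for Indonesian
    # tweets would likely add normalization, stop-word removal, and stemming.
    return text.lower().split()


def tfidf(docs):
    """Return (vocabulary, list of TF-IDF vectors) for a list of strings."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)

    # Document frequency: in how many documents each term appears.
    df = Counter()
    for tokens in tokenized:
        df.update(set(tokens))

    vocab = sorted(df)
    # Assumed IDF variant: natural log of N / df, without smoothing.
    idf = {term: math.log(n_docs / df[term]) for term in vocab}

    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens)
        # Assumed TF variant: raw count divided by document length.
        vectors.append(
            [counts[t] / total * idf[t] if total else 0.0 for t in vocab]
        )
    return vocab, vectors


# Hypothetical example tweets (not from the paper's dataset).
tweets = [
    "belajar dari rumah seru",
    "belajar dari rumah bosan",
    "sekolah online bosan",
]
vocab, vectors = tfidf(tweets)
```

Each tweet becomes a vector with one weight per vocabulary term. Under these assumptions, a term that appears in only one tweet (such as "seru") receives a higher weight than a term shared by several tweets (such as "belajar"), which is the property that makes the features useful as input to a classifier.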

Published

02-05-2021

Conference Proceedings Volume

5th International Seminar of Research Month 2020

Section

Articles

License

This work is licensed under a Creative Commons Attribution 4.0 International License.

Authors who publish with this proceedings agree to the following terms:

Authors retain copyright and grant the Nusantara Science and Technology Proceedings right of first publication with the work simultaneously licensed under a Creative Commons Attribution License that allows others to share the work with an acknowledgement of the work's authorship and initial publication in this proceeding.

Authors are able to enter into separate, additional contractual arrangements for the non-exclusive distribution of the proceedings' published version of the work (e.g., post it to an institutional repository or publish it in a book), with an acknowledgement of its initial publication in this proceeding.

Authors are permitted and encouraged to post their work online (e.g., in institutional repositories or on their website) prior to and during the submission process, as it can lead to productive exchanges, as well as earlier and greater citation of published work (See the Effect of Open Access).

How to Cite

Wahyuni, E. D., Arifiyanti, A. A., & Afandi, M. I. (2021). Feature Extraction for Sentiment Analysis in Indonesian Twitter. Nusantara Science and Technology Proceedings, 86-92. https://doi.org/10.11594/nstp.2021.0913