SENTIMENT ANALYSIS OF AMAZON SHOPPING APPLICATION REVIEWS ON THE GOOGLE PLAY STORE USING THE NAIVE BAYES CLASSIFIER
DOI: https://doi.org/10.56127/jts.v1i3.434
Keywords: sentiment analysis, amazon shopping, machine learning, naive bayes classifier
Abstract
Sentiment analysis, or opinion mining, is the study of people's opinions, thoughts, and impressions about topics, subjects, products, and services. The growth of social media has made public opinion data easy to find on the internet. Because the volume of this data is large and classifying it manually is time-consuming, an automatic system is needed to classify it according to different aspects. This study applies a machine learning approach, the Naive Bayes algorithm, to user reviews of the Amazon Shopping application on the Google Play Store. Across the four Naive Bayes variants tested, classification achieves an average accuracy of 82.15%, precision of 72.25%, recall of 83.49%, and F1-score of 77.41%. Multinomial NB achieves the best accuracy of the four, 86.74%, with precision, recall, and F1-score of 78.82%, 85.90%, and 82.21%, respectively.
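To make the approach concrete, the following is a minimal sketch of a multinomial Naive Bayes text classifier with Laplace smoothing, together with the precision, recall, and F1 calculations used to report results. It is not the study's actual pipeline: the toy reviews, the labels, the whitespace tokenization, and the function names (`train_multinomial_nb`, `predict`, `f1_score`, `precision_recall_f1`) are all assumptions made for illustration.

```python
import math
from collections import Counter, defaultdict


def train_multinomial_nb(docs, labels, alpha=1.0):
    """Estimate log priors and Laplace-smoothed log likelihoods per class."""
    vocab = {word for doc in docs for word in doc.split()}
    class_counts = Counter(labels)
    word_counts = defaultdict(Counter)
    for doc, label in zip(docs, labels):
        word_counts[label].update(doc.split())

    model = {}
    for label, n_docs in class_counts.items():
        # Denominator: total words in the class plus alpha for each vocab word.
        denom = sum(word_counts[label].values()) + alpha * len(vocab)
        model[label] = {
            "log_prior": math.log(n_docs / len(labels)),
            "log_lik": {
                w: math.log((word_counts[label][w] + alpha) / denom)
                for w in vocab
            },
        }
    return model


def predict(model, doc):
    """Return the class with the highest log posterior.

    Words never seen during training are skipped.
    """
    scores = {}
    for label, params in model.items():
        scores[label] = params["log_prior"] + sum(
            params["log_lik"][w] for w in doc.split() if w in params["log_lik"]
        )
    return max(scores, key=scores.get)


def f1_score(precision, recall):
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    total = precision + recall
    return 2 * precision * recall / total if total else 0.0


def precision_recall_f1(y_true, y_pred, positive):
    """Compute precision, recall, and F1 for one class treated as positive."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall, f1_score(precision, recall)


# Hypothetical toy reviews, not taken from the study's dataset.
train_docs = [
    "great app fast delivery",
    "love this app easy checkout",
    "app keeps crashing terrible",
    "terrible update slow and buggy",
]
train_labels = ["positive", "positive", "negative", "negative"]

model = train_multinomial_nb(train_docs, train_labels)

test_docs = ["fast and easy app", "slow buggy app"]
test_labels = ["positive", "negative"]
preds = [predict(model, doc) for doc in test_docs]

print(preds)
print(precision_recall_f1(test_labels, preds, positive="positive"))
```

The same F1 formula can be used as a consistency check on the reported figures: `f1_score(78.82, 85.90)` evaluates to roughly 82.21, which agrees with the Multinomial NB F1-score stated in the abstract.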