Exploratory Data Analysis and Natural Language Processing Model for Analysis and Identification of the Dynamics of COVID-19 Vaccine Opinions on Small Datasets

Alexander Chkhartishvili
Dmitry Gubanov
Vladislav Melnichuk
Vladislav Sych

Abstract

This study demonstrates the successful application of an active learning algorithm to small-scale datasets. It also examines the dynamics of public opinion on COVID-19 vaccination, using comments from the VK social network about the COVID-19 vaccine and masks for opinion evaluation. The proposed methodology comprises four stages: natural language processing, classification with active learning, exploratory data analysis, and opinion dynamics analysis. Natural language processing covers text preprocessing, tokenization, and feature extraction. A machine learning model with active learning classifies opinions as positive, negative, or neutral/unknown. The models evaluated include classical machine learning and deep learning models. The highest classification accuracy is 69.1% without the active learning algorithm and 73.1% with it. These results suggest that classifiers using active learning outperform plain natural language processing classifiers on small-scale datasets.
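
The abstract does not state which query strategy or classifier the authors used, so the following is only a minimal sketch of classification with active learning under stated assumptions: a small multinomial Naive Bayes classifier over whitespace tokens, and uncertainty sampling by predictive entropy. All names (`NaiveBayes`, `active_learning`, `oracle`) and the English toy comments are hypothetical and are not taken from the paper or its dataset.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase and split on whitespace (stand-in for real preprocessing)."""
    return text.lower().split()


class NaiveBayes:
    """Multinomial Naive Bayes with add-one smoothing (assumed classifier)."""

    def fit(self, texts, labels):
        self.classes = sorted(set(labels))
        self.doc_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        for text, label in zip(texts, labels):
            self.word_counts[label].update(tokenize(text))
        self.vocab = {w for c in self.classes for w in self.word_counts[c]}
        return self

    def predict_proba(self, text):
        total_docs = sum(self.doc_counts.values())
        log_scores = {}
        for c in self.classes:
            total_words = sum(self.word_counts[c].values())
            score = math.log(self.doc_counts[c] / total_docs)
            for w in tokenize(text):
                if w in self.vocab:  # ignore words never seen in training
                    count = self.word_counts[c][w] + 1
                    score += math.log(count / (total_words + len(self.vocab)))
            log_scores[c] = score
        # Normalize log scores into probabilities (numerically stable softmax).
        top = max(log_scores.values())
        exp = {c: math.exp(s - top) for c, s in log_scores.items()}
        norm = sum(exp.values())
        return {c: v / norm for c, v in exp.items()}

    def predict(self, text):
        probs = self.predict_proba(text)
        return max(probs, key=probs.get)


def entropy(probs):
    """Shannon entropy of a class distribution; higher means less certain."""
    return -sum(p * math.log(p) for p in probs.values() if p > 0)


def active_learning(labeled, pool, oracle, rounds, batch_size=1):
    """Query labels for the most uncertain pool items, then retrain."""
    labeled, pool = list(labeled), list(pool)
    model = NaiveBayes().fit(*zip(*labeled))
    for _ in range(rounds):
        if not pool:
            break
        pool.sort(key=lambda t: entropy(model.predict_proba(t)), reverse=True)
        queried, pool = pool[:batch_size], pool[batch_size:]
        labeled.extend((t, oracle(t)) for t in queried)  # ask the annotator
        model = NaiveBayes().fit(*zip(*labeled))
    return model, labeled


# Hypothetical toy data; the dictionary lookup plays the role of the annotator.
seed = [
    ("the vaccine is safe and works", "positive"),
    ("the vaccine is dangerous", "negative"),
    ("where can i get a mask", "neutral"),
]
pool_labels = {
    "masks are useless and dangerous": "negative",
    "got my shot and feel safe": "positive",
    "what time does the clinic open": "neutral",
    "the vaccine works": "positive",
}
model, labeled = active_learning(seed, list(pool_labels), pool_labels.get, rounds=2)
print(len(labeled), model.predict("the vaccine is safe"))
```

In a real setting the oracle would be a human annotator, and the value of the loop is that labeling effort is spent on the comments the current model is least sure about, which matters most when the labeled set is small.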

Article Details

How to Cite
Chkhartishvili, A., Gubanov, D., Melnichuk, V., & Sych, V. (2023). Exploratory Data Analysis and Natural Language Processing Model for Analysis and Identification of the Dynamics of COVID-19 Vaccine Opinions on Small Datasets. Advances in Systems Science and Applications, 23(3), 108-126. https://doi.org/10.25728/assa.2023.23.3.1381
Section
Articles