Exploratory Data Analysis and Natural Language Processing Model for Analysis and Identification of the Dynamics of COVID-19 Vaccine Opinions on Small Datasets

Authors

  • Alexander Chkhartishvili, Laboratory No. 57, V.A. Trapeznikov Institute of Control Sciences of Russian Academy of Sciences, Moscow, Russia
  • Dmitry Gubanov, Laboratory No. 11, V.A. Trapeznikov Institute of Control Sciences of Russian Academy of Sciences, Moscow, Russia
  • Vladislav Melnichuk, Department of Applied Mathematics, Faculty of Fundamental Sciences, Bauman Moscow State Technical University, Moscow, Russia; Machine Learning Track, Technopark VK Education, Moscow, Russia
  • Vladislav Sych, Department of Applied Mathematics, Faculty of Fundamental Sciences, Bauman Moscow State Technical University, Moscow, Russia

DOI:

https://doi.org/10.25728/assa.2023.23.3.1381

Keywords:

active learning, deep neural network, opinion dynamics, opinion analysis, COVID-19 VK opinion classification

Abstract

This study demonstrates the successful application of an active learning algorithm to small-scale datasets. It also examines the dynamics of public opinion on COVID-19 vaccination, using comments from the VK social network about the COVID-19 vaccine and masks for opinion evaluation. The proposed methodology includes several stages: natural language processing, classification with active learning, exploratory data analysis, and opinion dynamics. Natural language processing is used for text preprocessing, tokenization, and feature extraction. A machine learning model with active learning is employed to classify opinions as positive, negative, or neutral/unknown. The models considered include classical machine learning and deep learning models. The highest classification accuracy is 69.1% without the active learning algorithm and 73.1% with it. The experimental results suggest that classifiers using active learning perform better than simple natural language processing classifiers on small-scale datasets.
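
The abstract does not state which query strategy the active learning algorithm uses. As an illustration only, the following minimal sketch shows one common choice, least-confidence uncertainty sampling, for the three opinion classes named above. The function names, the probability values, and the batch size are hypothetical and are not taken from the paper.

```python
from typing import List, Sequence


def least_confidence(probs: Sequence[float]) -> float:
    """Uncertainty score: 1 minus the probability of the most likely class."""
    return 1.0 - max(probs)


def select_queries(pool_probs: List[Sequence[float]], batch_size: int) -> List[int]:
    """Return indices of the `batch_size` most uncertain items in the pool."""
    ranked = sorted(
        range(len(pool_probs)),
        key=lambda i: least_confidence(pool_probs[i]),
        reverse=True,  # most uncertain first
    )
    return ranked[:batch_size]


# Hypothetical class probabilities (positive, negative, neutral/unknown)
# that some classifier assigned to four unlabeled comments.
pool_probs = [
    [0.90, 0.05, 0.05],  # confident prediction
    [0.40, 0.35, 0.25],  # uncertain prediction
    [0.34, 0.33, 0.33],  # nearly uniform, most uncertain
    [0.70, 0.20, 0.10],
]

# Indices of the comments that would be sent to a human annotator next.
queried = select_queries(pool_probs, batch_size=2)
print(queried)
```

In a full loop, the queried comments would be labeled by hand, added to the training set, and the classifier retrained before the next round of selection. On a small dataset this concentrates the limited labeling effort on the examples the model is least sure about.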

Published

2023-10-12

How to Cite

Chkhartishvili, A., Gubanov, D., Melnichuk, V., & Sych, V. (2023). Exploratory Data Analysis and Natural Language Processing Model for Analysis and Identification of the Dynamics of COVID-19 Vaccine Opinions on Small Datasets. Advances in Systems Science and Applications, 23(3), 108–126. https://doi.org/10.25728/assa.2023.23.3.1381