Quran and Bible Project: Exploring Sentiment Analysis
Project Manager: Dr. Farhana Akter
Research Team: AISoftSolution
Introduction
Sentiment analysis (or opinion mining) is a Natural Language Processing (NLP) technique for determining whether a piece of text is positive, negative, or neutral. Businesses commonly use it to monitor brand and product sentiment in customer feedback. The goal of our study was to apply sentiment analysis to the Quran and the Bible (King James Version) in order to understand the sentiments they express.
Analysis & Study
We used pre-trained models to run sentiment analysis on the textual data. These models pose a challenge, however, because they were trained on Twitter comments, tweets, and IMDB movie reviews. Since they were never trained on religious texts, their accuracy on this project's data may be compromised. Despite this limitation, the models produced encouraging results after data preprocessing. Some religious texts contain words that are uncommon in modern English, such as “thy,” which the model classifies as neutral. Standard stop-word removal does not filter these words out, because they are not part of the NLTK stop words list.
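One way to handle words such as “thy” is to supplement the standard stop words with a project-specific list. The following is a minimal sketch, not the project's actual preprocessing code: the word list `ARCHAIC_STOP_WORDS` and the function `remove_archaic_stop_words` are hypothetical names introduced here for illustration, and the list would need to be built by inspecting the corpus.

```python
import re

# Hypothetical list of archaic function words (an assumption, not from the project).
# Extend it after inspecting the actual corpus.
ARCHAIC_STOP_WORDS = {"thy", "thou", "thee", "thine", "ye", "hath", "unto", "shalt"}


def remove_archaic_stop_words(text, extra_stop_words=ARCHAIC_STOP_WORDS):
    """Lowercase the text, split it into word tokens, and drop the listed words."""
    tokens = re.findall(r"[a-z']+", text.lower())
    return [token for token in tokens if token not in extra_stop_words]


verse = "Thy word is a lamp unto my feet"
print(remove_archaic_stop_words(verse))
# ['word', 'is', 'a', 'lamp', 'my', 'feet']
```

In practice this set could be combined with the NLTK English stop words list so that both modern and archaic function words are filtered in one pass.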
Running the Code:
The code for this project is commented and organized under headings. It can be run in Google Colab or Jupyter Notebook.
Results:
The following results show the model's output for the Bible (King James Version).
Neutral Words:
These words are classified as neither positive nor negative. The large number of neutral words is due to the prevalence of religious terms and of words that are not commonly used in modern English.
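To illustrate why unfamiliar words tend to end up neutral, the sketch below uses a tiny word-score lexicon. It is only an assumption about how a lexicon-based scorer behaves; the pre-trained models used in this project may work differently. `TOY_LEXICON`, `label_word`, and `label_counts` are hypothetical names, and the scores are made up for the example.

```python
from collections import Counter

# Hypothetical toy lexicon (made-up scores); real lexicons hold thousands of entries.
TOY_LEXICON = {"love": 1.0, "joy": 1.0, "blessed": 1.0, "wrath": -1.0, "evil": -1.0}


def label_word(word, lexicon=TOY_LEXICON):
    """Label a word by its lexicon score; words missing from the lexicon score 0."""
    score = lexicon.get(word.lower(), 0.0)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


def label_counts(words):
    """Count how many words fall into each sentiment label."""
    return Counter(label_word(word) for word in words)


words = "thou shalt love thy neighbour".split()
print(label_counts(words))
# Counter({'neutral': 4, 'positive': 1})
```

Only "love" appears in the toy lexicon, so every other word, including the archaic "thou", "shalt", and "thy", defaults to neutral.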
Conclusion:
To improve the accuracy and precision of our model, we recommend creating a custom dataset of religious texts and training our own sentiment analysis model on it. Custom stop-word removal functions are also needed for better performance on religious texts. The current model reaches 67% accuracy, mainly because it was trained on Twitter and IMDB movie review texts. Training our model on religious texts, or applying TextBlob or VADER (vaderSentiment) to our custom dataset, should significantly improve the model's accuracy.
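Once a custom labeled dataset exists, accuracy can be measured by comparing the model's labels with the hand-assigned ones. The sketch below assumes accuracy means the fraction of items whose predicted label matches the expected label. The function name `accuracy` and the sample labels are hypothetical and are not taken from the project's data or its reported figure.

```python
def accuracy(predicted, expected):
    """Return the fraction of predictions that match the expected labels."""
    if len(predicted) != len(expected):
        raise ValueError("predicted and expected must have the same length")
    if not expected:
        raise ValueError("cannot compute accuracy on an empty dataset")
    correct = sum(p == e for p, e in zip(predicted, expected))
    return correct / len(expected)


# Made-up labels for four verses, for illustration only.
expected = ["positive", "negative", "neutral", "positive"]
predicted = ["positive", "neutral", "neutral", "positive"]

print(f"{accuracy(predicted, expected):.0%}")
# 75%
```

Running the same comparison before and after adding custom stop-word removal, or after switching models, would show whether each change actually helps on religious texts.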