Abstract
Social media platforms have grown rapidly in recent years, with billions of people worldwide using them for communication, entertainment, and information. Their development has dramatically affected society, changing how people interact, communicate, and consume information. While social media has numerous advantages, it has also raised concerns about privacy, misinformation, and its influence on mental health, especially among young people. The dissemination of rumors has been significantly affected by social media platforms. Twitter has been the major platform for spreading news about the Covid-19 pandemic, and a considerable amount of false material about the pandemic has circulated on social media. Several artificial intelligence methods have been proposed to curb the spread of fake news. In this study, we propose a model that discriminates between “fake” and “true” news tweets and is intended to work with any current topic. To this end, we explore various learning approaches and compare deep learning and machine learning methods for fake news detection, such as CNN, LSTM, Naïve Bayes, and Support Vector Machine. The models were evaluated on benchmark datasets and a self-collected dataset. This research aims to improve the model used for classifying rumors by using text representation techniques such as word embedding and TF-IDF. These techniques extract the underlying meaning of texts by searching for semantic relationships between words, phrases, and texts, which helps in analyzing and understanding them. The models were trained and tested on a set of tweets. New tweets were collected using Snscrape to track different writing styles and to build a model capable of detecting errors across all the changes that occur in a word and of returning the word to its original form.
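As an illustration of the TF-IDF text representation mentioned above, the following is a minimal sketch in plain Python, not the implementation used in this study. The weighting formula (term frequency normalized by document length, `idf = log(N / df) + 1`), the whitespace tokenizer, the function names `ngrams` and `tfidf`, and the example tweets are all assumptions made for illustration. The `n` parameter switches between uni-gram (`n=1`) and bi-gram (`n=2`) features.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Return the n-grams of a token list, each joined with a space."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def tfidf(docs, n=1):
    """Compute one TF-IDF vector (a dict) per whitespace-tokenized document.

    Assumed weighting: tf = count / number of terms, idf = log(N / df) + 1.
    """
    tokenized = [ngrams(doc.lower().split(), n) for doc in docs]

    # Document frequency: in how many documents each term appears.
    df = Counter()
    for terms in tokenized:
        df.update(set(terms))

    total = len(docs)
    vectors = []
    for terms in tokenized:
        counts = Counter(terms)
        length = len(terms) or 1  # avoid division by zero for short texts
        vectors.append({
            term: (count / length) * (math.log(total / df[term]) + 1.0)
            for term, count in counts.items()
        })
    return vectors


# Hypothetical example tweets, invented for illustration only.
tweets = [
    "vaccine causes illness claims post",
    "health agency approves vaccine",
    "agency denies vaccine claims",
]
unigram_vectors = tfidf(tweets, n=1)
bigram_vectors = tfidf(tweets, n=2)
```

In this sketch, a term that occurs in every tweet (such as "vaccine") receives a lower weight than a term that occurs in only one tweet, which is the property that makes TF-IDF useful as input to a classifier.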
The results of the first model, which combined TF-IDF with machine learning algorithms, showed the superiority of the Multi-Layer Perceptron for the English language, with an accuracy of 93.8% and an F-score of 93.6%. For the Arabic language, the Support Vector Machine achieved the best accuracy, 82.90%, while K-Nearest Neighbor achieved the best F-score, 57.5%. Uni-gram text vectorization outperformed bi-gram vectorization. GloVe word embedding was used with deep learning algorithms to improve text understanding and discover relationships between words. For the English language, recurrent neural networks achieved the best accuracy, 99%, while the ensemble learning model achieved the best F-score, 97%. For the Arabic language, the Convolutional Neural Network achieved the best accuracy, 83%, while the ensemble learning model achieved the best F-score, 81.7%. The second step was to test the model on a new test set that had not been used before. The English language model showed a significant decline of about 25%, to an accuracy of 74%. The experiments showed that modifying the word processing stage, so that the model could deal with all the changes that occur in a word, yielded an improvement of about 8%, to an accuracy of 83%. The proposed model for the Arabic language declined by about 5%, to an accuracy of 70%. The results varied between deep learning models, with the Bi-LSTM most clearly reflecting the differences in the data. Similar modifications to the word processing stage yielded an improvement of about 8%, to an accuracy of 78%.
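Accuracy and F-score are reported separately above because the two measures can diverge. The following minimal sketch shows how both are computed from the counts of a binary confusion matrix. The counts are hypothetical and are not taken from this study; they only illustrate that a classifier can reach a high accuracy while its F-score on the "fake" class stays low.

```python
def accuracy(tp, fp, fn, tn):
    """Fraction of all predictions that are correct."""
    return (tp + tn) / (tp + fp + fn + tn)


def f1_score(tp, fp, fn):
    """Harmonic mean of precision and recall for the positive class."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts for 100 tweets, 20 of which are fake (positive class).
tp, fp, fn, tn = 8, 5, 12, 75

acc = accuracy(tp, fp, fn, tn)   # (8 + 75) / 100
f1 = f1_score(tp, fp, fn)        # equals 2*tp / (2*tp + fp + fn)
print(f"accuracy={acc:.3f}, f1={f1:.3f}")
```

With these assumed counts the classifier labels most true tweets correctly, which keeps accuracy high, but it misses more than half of the fake tweets, which pulls the F-score down.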