DETECTING CYBERBULLYING TWEETS USING MACHINE LEARNING AND DEEP LEARNING WITH PYTHON GUI

·
· BALIGE PUBLISHING
4,5
4 atsauksmes
E-grāmata
275
Lappuses
Atsauksmes un vērtējumi nav pārbaudīti. Uzzināt vairāk

Par šo e-grāmatu

This project focuses on detecting cyberbullying tweets using both Machine Learning and Deep Learning techniques with a Python GUI implemented using PyQt. The first step involves data exploration, where the dataset is loaded and analyzed to gain insights into its structure and contents. Visualizations are created to understand the distribution of cyberbullying types and other features in the data.

 

After data exploration, preprocessing is performed to clean and prepare the tweets for analysis. Text cleaning techniques, such as removing emojis, punctuation, links, and stop words, are applied to the tweet text. The data is then categorized based on the length of the tweets to facilitate further analysis.

 

Next, the data is split into input and output variables, where the tweet text becomes the input feature (X) and the cyberbullying type becomes the output (y). The cyberbullying types are converted into numerical labels for ML models' compatibility.

 

Machine Learning models are trained and evaluated using TF-IDF, Count Vectorizer, and Hashing Vectorizer as feature extraction techniques. SMOTE is applied to handle class imbalance, and the data is split into training and testing sets. Grid Search is utilized to find the best hyperparameters for ML models, optimizing their performance. Machine Learning models used are Logistic Regression, Support Vector Machines, K-Nearest Neighbors, Decision Trees, Random Forests, Gradient Boosting, Extreme Gradient Boosting, Light Gradient Boosting.

 

Moving to Deep Learning, LSTM (Long Short-Term Memory) and 1D CNN (Convolutional Neural Network) models are constructed to detect cyberbullying types. The tweet text is embedded, and various layers are added to the models to extract meaningful features. The models are then compiled with appropriate loss functions and optimizers.

 

The evaluation process is carried out using the test set for both Machine Learning and Deep Learning models. Metrics like accuracy, precision, recall, and F1-score are used to assess the models' performance in detecting cyberbullying types.

 

To enable easy access to the functionalities, a Graphical User Interface (GUI) is developed using PyQt. The GUI allows users to interact with the models and dataset easily. Users can select the feature extraction technique, choose the classifier, and initiate the model training process using the GUI's buttons and dropdown menus.

For better data visualization, the GUI includes various plots, such as bar plots and pie charts, to show the distribution of cyberbullying types and other features. These visualizations help users understand the data and model performance intuitively.

 

The final stage involves testing the GUI with different inputs and exploring model predictions on user-provided text. The GUI provides predictions for the given tweet text, indicating the likelihood of each cyberbullying type based on the trained models.

 

Overall, this project combines data exploration, preprocessing, feature extraction using Machine Learning and Deep Learning models, GUI development, and data visualization to detect cyberbullying tweets effectively and provide an accessible interface for users to interact with the models. The end product allows users to analyze tweets for potential cyberbullying and contributes to promoting a safer and more respectful online environment.


Vērtējumi un atsauksmes

4,5
4 atsauksmes

Par autoru

Vivian Siahaan is a fast-learner who likes to do new things. She was born, raised in Hinalang Bagasan, Balige, on the banks of Lake Toba, and completed high school education from SMAN 1 Balige. She started herself learning Java, Android, JavaScript, CSS, C ++, Python, R, Visual Basic, Visual C #, MATLAB, Mathematica, PHP, JSP, MySQL, SQL Server, Oracle, Access, and other programming languages. She studied programming from scratch, starting with the most basic syntax and logic, by building several simple and applicable GUI applications. Animation and games are fields of programming that are interests that she always wants to develop. Besides studying mathematical logic and programming, the author also has the pleasure of reading novels. Vivian Siahaan has written dozens of ebooks that have been published on Sparta Publisher: Data Structure with Java; Java Programming: Cookbook; C ++ Programming: Cookbook; C Programming For High Schools / Vocational Schools and Students; Java Programming for SMA / SMK; Java Tutorial: GUI, Graphics and Animation; Visual Basic Programming: From A to Z; Java Programming for Animation and Games; C # Programming for SMA / SMK and Students; MATLAB For Students and Researchers; Graphics in JavaScript: Quick Learning Series; JavaScript Image Processing Methods: From A to Z; Java GUI Case Study: AWT & Swing; Basic CSS and JavaScript; PHP / MySQL Programming: Cookbook; Visual Basic: Cookbook; C ++ Programming for High Schools / Vocational Schools and Students; Concepts and Practices of C ++; PHP / MySQL For Students; C # Programming: From A to Z; Visual Basic for SMA / SMK and Students; C # .NET and SQL Server for High School / Vocational School and Students. At the ANDI Yogyakarta publisher, Vivian Siahaan also wrote a number of books including: Python Programming Theory and Practice; Python GUI Programming; Python GUI and Database; Build From Zero School Database Management System In Python / MySQL; Database Management System in Python / MySQL; Python / MySQL For Management Systems of Criminal Track Record Database; Java / MySQL For Management Systems of Criminal Track Records Database; Database and Cryptography Using Java / MySQL; Build From Zero School Database Management System With Java / MySQL.

Rismon Hasiholan Sianipar was born in Pematang Siantar, in 1994. After graduating from SMAN 3 Pematang Siantar 3, the writer traveled to the city of Jogjakarta. In 1998 and 2001 the author completed his Bachelor of Engineering (S.T) and Master of Engineering (M.T) education in the Electrical Engineering of Gadjah Mada University, under the guidance of Prof. Dr. Adhi Soesanto and Prof. Dr. Thomas Sri Widodo, focusing on research on non-stationary signals by analyzing their energy using time-frequency maps. Because of its non-stationary nature, the distribution of signal energy becomes very dynamic on a time-frequency map. By mapping the distribution of energy in the time-frequency field using discrete wavelet transformations, one can design non-linear filters so that they can analyze the pattern of the data contained in it. In 2003, the author received a Monbukagakusho scholarship from the Japanese Government. In 2005 and 2008, he completed his Master of Engineering (M.Eng) and Doctor of Engineering (Dr.Eng) education at Yamaguchi University, under the guidance of Prof. Dr. Hidetoshi Miike. Both the master's thesis and his doctoral thesis, R.H. Sianipar combines SR-FHN (Stochastic Resonance Fitzhugh-Nagumo) filter strength with cryptosystem ECC (elliptic curve cryptography) 4096-bit both to suppress noise in digital images and digital video and maintain its authenticity. The results of this study have been documented in international scientific journals and officially patented in Japan. One of the patents was published in Japan with a registration number 2008-009549. He is active in collaborating with several universities and research institutions in Japan, particularly in the fields of cryptography, cryptanalysis and audio / image / video digital forensics. R.H. Sianipar also has experience in conducting code-breaking methods (cryptanalysis) on a number of intelligence data that are the object of research studies in Japan. R.H. Sianipar has a number of Japanese patents, and has written a number of national / international scientific articles, and dozens of national books. R.H. Sianipar has also participated in a number of workshops related to cryptography, cryptanalysis, digital watermarking, and digital forensics. In a number of workshops, R.H. Sianipar helps Prof. Hidetoshi Miike to create applications related to digital image / video processing, steganography, cryptography, watermarking, non-linear screening, intelligent descriptor-based computer vision, and others, which are used as training materials. Field of interest in the study of R.H. Sianipar is multimedia security, signal processing / digital image / video, cryptography, digital communication, digital forensics, and data compression / coding. Until now, R.H. Sianipar continues to develop applications related to analysis of signal, image, and digital video, both for research purposes and for commercial purposes based on the Python programming language, MATLAB, C ++, C, VB.NET, C # .NET, R, and Java.

Novērtējiet šo e-grāmatu

Izsakiet savu viedokli!

Informācija lasīšanai

Viedtālruņi un planšetdatori
Instalējiet lietotni Google Play grāmatas Android ierīcēm un iPad planšetdatoriem/iPhone tālruņiem. Lietotne tiks automātiski sinhronizēta ar jūsu kontu un ļaus lasīt saturu tiešsaistē vai bezsaistē neatkarīgi no jūsu atrašanās vietas.
Klēpjdatori un galddatori
Varat klausīties pakalpojumā Google Play iegādātās audiogrāmatas, izmantojot datora tīmekļa pārlūkprogrammu.
E-lasītāji un citas ierīces
Lai lasītu grāmatas tādās elektroniskās tintes ierīcēs kā Kobo e-lasītāji, nepieciešams lejupielādēt failu un pārsūtīt to uz savu ierīci. Izpildiet palīdzības centrā sniegtos detalizētos norādījumus, lai pārsūtītu failus uz atbalstītiem e-lasītājiem.