TheoSimier/Moderation-online-content
Project: Moderation of online content through Natural Language Processing, Machine Learning and Deep Learning

The objective of this master's project is to illustrate how Natural Language Processing and Machine Learning techniques can be used to extract meaningful information from text. In the thesis, I explain the most commonly used techniques and apply them to a concrete example: building an algorithm that moderates questions by checking whether each one complies with the terms and conditions of Quora, a question-and-answer website.

I started by creating statistical features, normalizing the questions and transforming them into a format that classification algorithms can handle. I then used a Logistic Regression, a Random Forest and a Deep Learning model to predict whether the questions were compliant. The best model reached an accuracy of 87%.

The dataset can be found at the following URL: https://www.kaggle.com/c/quora-insincere-questions-classification/data

GloVe embeddings can be retrieved at the following URL: https://nlp.stanford.edu/projects/glove/

Details on the sources of the project can be found in the references of "Report - Moderation of online content.pdf".

This master's project was carried out during my MSc in Data Analytics & Artificial Intelligence at EDHEC.

Author: Theo Simier, under the direction of Prof. Dr. Christophe Croux
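
As an illustration of the preprocessing steps described above (statistical features, normalization, and conversion into a format a classifier can handle), here is a minimal sketch in plain Python. It is not the project's actual code: the function names, the choice of features and the two example questions are assumptions made for illustration, and a simple bag-of-words count stands in for whatever vectorization the report uses.

```python
import re
import string
from collections import Counter


def statistical_features(question):
    """Surface statistics computed on the raw text, before normalization.

    The feature set here is hypothetical; the report may use different ones.
    """
    words = question.split()
    return {
        "n_chars": len(question),
        "n_words": len(words),
        # Fully upper-case words of length > 1 (so the pronoun "I" is ignored).
        "n_upper_words": sum(1 for w in words if w.isupper() and len(w) > 1),
        "n_punct": sum(1 for ch in question if ch in string.punctuation),
    }


def normalize(question):
    """Lowercase, replace punctuation with spaces, split on whitespace."""
    lowered = question.lower()
    no_punct = re.sub(r"[^\w\s]", " ", lowered)
    return no_punct.split()


def build_vocabulary(tokenized_questions):
    """Map each distinct token to a column index, in sorted order."""
    vocab = sorted({tok for tokens in tokenized_questions for tok in tokens})
    return {tok: i for i, tok in enumerate(vocab)}


def bag_of_words(tokens, vocab):
    """Turn a token list into a fixed-length vector of token counts."""
    vector = [0] * len(vocab)
    for tok, count in Counter(tokens).items():
        if tok in vocab:  # tokens unseen at vocabulary-building time are skipped
            vector[vocab[tok]] = count
    return vector


# Two made-up questions, used only to exercise the functions.
questions = [
    "How do I learn Python?",
    "Why are people SO stupid?!",
]

stats = [statistical_features(q) for q in questions]
tokenized = [normalize(q) for q in questions]
vocab = build_vocabulary(tokenized)
matrix = [bag_of_words(tokens, vocab) for tokens in tokenized]
```

Each row of `matrix`, optionally concatenated with the values in `stats`, is a numeric vector of fixed length, which is the kind of input a Logistic Regression or Random Forest expects. A Deep Learning model would more typically consume sequences of token indices mapped to pretrained embeddings such as GloVe.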