This project implements the Kaggle Toxic Comment Classification Challenge and is hosted here as a consumable application.
This project has three parts:
- Preprocessing
- Modeling
- Dash Application
Preprocessing removes stopwords, punctuation, blank lines, and some URLs, hyperlinks and IP addresses from the input texts. Words are lemmatized with WordNetLemmatizer, and GloVe 100d word vectors are used as embeddings.
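The cleaning steps can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the regular expressions, the tiny `STOPWORDS` set and the `clean_comment` helper are all assumptions made for the example, and lemmatization is passed in as a callable so the sketch runs without NLTK data installed.

```python
import re
import string

# Hypothetical, deliberately tiny stopword set; a real run would use a fuller list.
STOPWORDS = {"a", "an", "the", "is", "are", "and", "or", "to", "of", "in", "at", "this"}

# Assumed patterns for URLs/hyperlinks and IPv4 addresses.
URL_RE = re.compile(r"(?:https?://|www\.)\S+")
IP_RE = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")

# Map every punctuation character to a space.
PUNCT_TABLE = str.maketrans(string.punctuation, " " * len(string.punctuation))


def clean_comment(text, lemmatize=None):
    """Return a list of cleaned tokens for one comment."""
    # Default to the identity function when no lemmatizer is supplied.
    lemmatize = lemmatize or (lambda word: word)
    text = URL_RE.sub(" ", text)
    text = IP_RE.sub(" ", text)
    text = text.lower().translate(PUNCT_TABLE)
    # split() with no arguments also discards blank lines and repeated whitespace.
    return [lemmatize(tok) for tok in text.split() if tok not in STOPWORDS]
```

With NLTK and its WordNet corpus installed, `WordNetLemmatizer().lemmatize` can be passed as the `lemmatize` argument.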
Modeling uses three different models built with Keras: a CNN, an RNN and a Naive Bayes SVM. The outputs of these models are stacked to produce the final output.
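How the outputs are stacked is not spelled out here. One common reading is that a small meta-model is fit on the base models' predicted probabilities, and the sketch below illustrates that idea for a single label. It is an assumption, not the project's implementation: `fit_meta_model` and `predict_stacked` are hypothetical names, and the meta-model is a plain logistic regression trained by gradient descent in pure Python.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_meta_model(base_preds, labels, lr=0.5, epochs=2000):
    """Fit logistic-regression weights on base-model probabilities.

    base_preds: one row per comment, e.g. [p_cnn, p_rnn, p_nbsvm].
    labels: 0/1 targets for a single toxicity label.
    """
    n_features = len(base_preds[0])
    n = len(base_preds)
    weights = [0.0] * n_features
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for row, y in zip(base_preds, labels):
            # Prediction error for this comment under the current weights.
            err = sigmoid(bias + sum(w * x for w, x in zip(weights, row))) - y
            for j, x in enumerate(row):
                grad_w[j] += err * x
            grad_b += err
        # Average-gradient step on the log loss.
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias


def predict_stacked(weights, bias, row):
    """Combine one comment's base-model probabilities into a final probability."""
    return sigmoid(bias + sum(w * x for w, x in zip(weights, row)))
```

In practice the meta-model would be fit on held-out (out-of-fold) predictions so that it does not simply learn the base models' training-set overfitting.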
The Dash application is a consumable UI built with Dash and hosted on Heroku.
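A Dash front end for this kind of classifier might look roughly like the sketch below. The layout, the component ids and the `score_comment` placeholder are assumptions for illustration only; the six label names are those of the Kaggle challenge. It assumes Dash 2 or later.

```python
LABELS = ["toxic", "severe_toxic", "obscene", "threat", "insult", "identity_hate"]


def score_comment(text):
    """Hypothetical placeholder for the stacked model: one probability per label."""
    return [0.0] * len(LABELS)


def format_scores(probabilities):
    """Render probabilities as 'label: 0.12' strings for display."""
    return [f"{label}: {p:.2f}" for label, p in zip(LABELS, probabilities)]


def build_app():
    # Imported here so the helpers above can be used without Dash installed.
    from dash import Dash, Input, Output, dcc, html

    app = Dash(__name__)
    app.layout = html.Div([
        dcc.Textarea(id="comment", value="", style={"width": "100%"}),
        html.Ul(id="scores"),
    ])

    # Re-score the comment whenever the text area changes.
    @app.callback(Output("scores", "children"), Input("comment", "value"))
    def update_scores(text):
        return [html.Li(line) for line in format_scores(score_comment(text or ""))]

    return app
```

Calling `build_app().run(debug=True)` starts a local development server (older Dash releases use `run_server` instead). For a Heroku deployment, the underlying Flask instance available as `app.server` is typically what a WSGI server such as gunicorn is pointed at.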