Project NLP | Business Case: Automated Customer Reviews

The goal of this project is to develop a product review system that can classify review sentiment as Positive, Neutral, or Negative, cluster reviews by product category, and write product summaries in the form of recommendation articles.

This project uses a dataset of Amazon reviews, specifically reviews from the Video Games category. The dataset can be found here 👉 https://amazon-reviews-2023.github.io/

Full project guidelines can be found here 👇
https://github.com/ironhack-labs/project-nlp-business-case-automated-customers-reviews-v2

📊 Best Results

The best-performing model for review sentiment classification was RoBERTa-base:

Accuracy: 0.875
F1 Score (Macro Avg): 0.734
Precision (Macro Avg): 0.719
Recall (Macro Avg): 0.766
Weighted F1 Score: 0.885
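
The macro-averaged F1 (0.734) and the weighted F1 (0.885) differ because they combine per-class scores differently: macro averaging counts every class equally, while weighted averaging scales each class by its number of true examples. The sketch below illustrates this in plain Python. The toy labels are made up and are not the project's data, and `f1_report` is a hypothetical helper, not a function from the notebooks.

```python
from collections import Counter


def f1_report(y_true, y_pred):
    """Return (macro_f1, weighted_f1) for two equal-length label sequences."""
    labels = sorted(set(y_true) | set(y_pred))
    support = Counter(y_true)  # number of true examples per class
    per_class = {}
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        per_class[label] = 2 * precision * recall / denom if denom else 0.0
    # Macro: plain mean over classes. Weighted: mean weighted by class support.
    macro = sum(per_class.values()) / len(labels)
    weighted = sum(per_class[l] * support[l] for l in labels) / len(y_true)
    return macro, weighted


# Toy, imbalanced example (illustrative only): the single Neutral review is missed.
y_true = ["Positive"] * 8 + ["Neutral"] + ["Negative"]
y_pred = ["Positive"] * 8 + ["Positive"] + ["Negative"]
macro, weighted = f1_report(y_true, y_pred)
print(f"macro F1 = {macro:.3f}, weighted F1 = {weighted:.3f}")
# prints "macro F1 = 0.647, weighted F1 = 0.853"
```

In this toy case the missed minority class pulls the macro score down much more than the weighted score, which is one common reason for a gap like the one reported above.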

🔍 Clusters found

Keyboards & Mice
Headsets & Audio
Games
Controllers

🤗 Models and deployment on HuggingFace

👉 https://huggingface.co/spaces/DaanBooy/games_and_accessories_reviews 👈

What you can do:

Classification: Upload a review and see its sentiment classification (RoBERTa-base).
Clustering: Explore product category clusters.
Summaries: Read AI-generated articles highlighting top products, main complaints, and the worst product per category.

The model files generated in this project can be found here: 👇

⚙️ How to Run & Reproduce Results

Clone the repository

git clone <your-repo-url>
cd <repo-folder>

Install dependencies

pip install -r requirements.txt

Data Preparation
Run data_prep.ipynb to preprocess the Amazon Video Games review dataset.
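
Because the classifier predicts Positive/Neutral/Negative, preprocessing has to attach a sentiment label to each review. The sketch below shows one common convention, deriving the label from the star rating. The thresholds, the `rating` and `text` field names as used here, and the helper `rating_to_sentiment` are assumptions for illustration and may differ from what data_prep.ipynb actually does.

```python
def rating_to_sentiment(rating: float) -> str:
    """Map a 1-5 star rating to a sentiment label (assumed thresholds)."""
    if not 1 <= rating <= 5:
        raise ValueError(f"rating out of range: {rating}")
    if rating < 3:
        return "Negative"
    if rating > 3:
        return "Positive"
    return "Neutral"


# Made-up example records, not taken from the dataset.
reviews = [
    {"rating": 5.0, "text": "Great controller, responsive buttons."},
    {"rating": 3.0, "text": "Works, but the cable is short."},
    {"rating": 1.0, "text": "Stopped working after a week."},
]
labeled = [{**r, "sentiment": rating_to_sentiment(r["rating"])} for r in reviews]
for row in labeled:
    print(row["rating"], row["sentiment"])
```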

Train Models

review_classification_dbu.ipynb → DistilBERT
review_classification_bbu.ipynb → BERT-base
review_classification_rb.ipynb → RoBERTa-base

Evaluate Models
Use model_comparison_viz.ipynb to visualize results (precision, recall, F1, confusion matrix).

Clustering

Run review_clustering_sample.ipynb for exploratory clustering.
Run review_clustering_full.ipynb to cluster the full dataset into 4 categories.

Summarization
Execute review_summarization.ipynb to generate product recommendation articles.
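
Before any article text is generated, each category needs a list of top products and a worst product. The sketch below shows one way that selection step could work, assuming products are ranked by mean star rating and must have a minimum number of reviews. The ranking criterion, the field names, and the helper `rank_products` are hypothetical and are not taken from review_summarization.ipynb.

```python
from collections import defaultdict
from statistics import mean


def rank_products(reviews, top_n=3, min_reviews=2):
    """Return {category: {"top": [names], "worst": name}} ranked by mean rating."""
    ratings = defaultdict(lambda: defaultdict(list))
    for r in reviews:
        ratings[r["category"]][r["product"]].append(r["rating"])

    result = {}
    for category, products in ratings.items():
        # Keep only products with enough reviews, best mean rating first.
        scored = sorted(
            ((mean(vals), name) for name, vals in products.items()
             if len(vals) >= min_reviews),
            reverse=True,
        )
        if not scored:
            continue
        result[category] = {
            "top": [name for _, name in scored[:top_n]],
            "worst": scored[-1][1],
        }
    return result


# Made-up reviews for illustration only.
reviews = [
    {"category": "Controllers", "product": "Pad A", "rating": 5},
    {"category": "Controllers", "product": "Pad A", "rating": 4},
    {"category": "Controllers", "product": "Pad B", "rating": 2},
    {"category": "Controllers", "product": "Pad B", "rating": 1},
    {"category": "Controllers", "product": "Pad C", "rating": 5},  # single review, filtered out
]
summary = rank_products(reviews, top_n=1)
print(summary)
```

The resulting per-category dictionary could then be passed into the prompt that asks the language model to write the article.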

Deployment
Use model_deployment_HF.ipynb to deploy the system to Hugging Face Spaces.

Notes:
To ensure you have enough RAM to run the notebooks, use Google Colab with a T4 GPU runtime (High-RAM).
To run review_summarization.ipynb, make sure you create a .env file in your project folder/Google Drive containing your OpenAI API key:

OPENAI_API_KEY=your_api_key_here
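
Inside the notebook, the key then has to be read from that file into the process environment. Libraries such as python-dotenv are commonly used for this. The stdlib-only sketch below shows the same idea; `load_env_file` is a hypothetical helper, and whether the notebook loads the key this way is an assumption.

```python
import os
from pathlib import Path


def load_env_file(path=".env"):
    """Read KEY=VALUE lines from a .env file into os.environ; return the parsed pairs."""
    loaded = {}
    env_path = Path(path)
    if not env_path.exists():
        return loaded
    for raw_line in env_path.read_text(encoding="utf-8").splitlines():
        line = raw_line.strip()
        # Skip blank lines, comments, and anything that is not KEY=VALUE.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        key = key.strip()
        value = value.strip().strip('"').strip("'")
        os.environ.setdefault(key, value)  # keep a variable that is already set
        loaded[key] = value
    return loaded


load_env_file()  # looks for .env in the current working directory
api_key = os.environ.get("OPENAI_API_KEY")
if api_key is None:
    print("OPENAI_API_KEY is not set; create the .env file first.")
```

Reading the key from the environment keeps it out of the notebook itself, so the notebook can be shared or committed without exposing the key.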

📂 Repository Contents

| File Name | Description |
| --- | --- |
| `requirements.txt` | Lists all Python dependencies needed to run and reproduce the entire project |
| `data_prep.ipynb` | Notebook used to prepare the review data for the rest of the project |
| `review_classification_dbu.ipynb` | Notebook with distilbert-base-uncased model training |
| `review_classification_bbu.ipynb` | Notebook with bert-base-uncased model training |
| `review_classification_rb.ipynb` | Notebook with roberta-base model training |
| `model_comparison_viz.ipynb` | Notebook used to visualize model results |
| `review_clustering_sample.ipynb` | Notebook used to cluster a sample of the data to inspect the dataset |
| `review_clustering_full.ipynb` | Notebook used to cluster the full dataset |
| `review_summarization.ipynb` | Notebook used to generate the review summarization articles |
| `model_deployment_HF.ipynb` | Notebook used to deploy to the Hugging Face Space |
| `Project NLP _ Business Case_ Automated Customer Reviews - Final report.pdf` | PDF containing the final report on this project |
| `Presentation.pptx` | PowerPoint presentation of this project |
