A Streamlit web application for exploring, visualizing, and analyzing datasets, with options to clean data and download reports as PDF or CSV.
Upload Data
- Upload CSV or Excel files.
- Automatically stores the dataset in the app session.
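A minimal sketch of how the upload step could parse a file, assuming the app picks the parser from the file extension. The helper name `load_dataset` and the session key `"df"` are hypothetical and not taken from the project's code.

```python
import io
from pathlib import Path

import pandas as pd


def load_dataset(filename: str, raw: bytes) -> pd.DataFrame:
    """Parse uploaded bytes into a DataFrame based on the file extension."""
    suffix = Path(filename).suffix.lower()
    buffer = io.BytesIO(raw)
    if suffix == ".csv":
        return pd.read_csv(buffer)
    if suffix == ".xlsx":
        # pandas relies on openpyxl to read .xlsx files
        return pd.read_excel(buffer, engine="openpyxl")
    raise ValueError(f"Unsupported file type: {suffix or filename}")


# Inside the Streamlit script this might be wired up roughly as follows
# (assumed usage, shown as comments so the helper stays importable on its own):
#   uploaded = st.file_uploader("Upload a dataset", type=["csv", "xlsx"])
#   if uploaded is not None:
#       st.session_state["df"] = load_dataset(uploaded.name, uploaded.getvalue())
```

Keeping the parsing in a plain function makes it possible to test it without starting Streamlit.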
Data Summary
- View number of rows & columns.
- Descriptive statistics for all columns.
- Check missing values.
- Quick data cleaning: drop missing values or duplicates.
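The summary and cleaning options listed above map onto a few pandas calls. The sketch below is one possible shape for them; the function names `summarize` and `clean` are assumptions made for illustration.

```python
import pandas as pd


def summarize(df: pd.DataFrame) -> dict:
    """Collect the figures a summary page would show."""
    return {
        "rows": df.shape[0],
        "columns": df.shape[1],
        # Number of missing values per column
        "missing": df.isna().sum().to_dict(),
        # Descriptive statistics for numeric and non-numeric columns alike
        "stats": df.describe(include="all"),
    }


def clean(df: pd.DataFrame, drop_missing: bool = False,
          drop_duplicates: bool = False) -> pd.DataFrame:
    """Return a cleaned copy; the input DataFrame is left untouched."""
    cleaned = df.copy()
    if drop_missing:
        cleaned = cleaned.dropna()
    if drop_duplicates:
        cleaned = cleaned.drop_duplicates()
    return cleaned.reset_index(drop=True)
```

Returning a copy rather than modifying in place lets the app keep the original upload available alongside the cleaned version.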
Visualization
- Generate charts: Line, Bar, Scatter, Pie, Histogram, Box, Heatmap, Pairplot.
- Option to use interactive Plotly charts or static Seaborn/Matplotlib charts.
- Select axes and columns dynamically.
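Dynamic axis selection usually means offering only the columns that make sense for the chosen chart. The sketch below shows one way to derive those choices from column dtypes. The function `axis_options` and the rules it encodes are assumptions, not the app's actual logic.

```python
import pandas as pd


def axis_options(df: pd.DataFrame, chart_type: str) -> dict:
    """Return candidate columns for each selector of the given chart type."""
    numeric = df.select_dtypes(include="number").columns.tolist()
    every = df.columns.tolist()
    if chart_type in ("Heatmap", "Pairplot"):
        # Matrix-style charts work on a set of numeric columns, not x/y axes
        return {"columns": numeric}
    if chart_type == "Histogram":
        return {"x": numeric}
    if chart_type == "Pie":
        # Any column can be counted by category
        return {"x": every}
    if chart_type in ("Line", "Bar", "Scatter", "Box"):
        return {"x": every, "y": numeric}
    raise ValueError(f"Unknown chart type: {chart_type}")
```

The returned lists could then feed the `options` argument of a selector widget for each axis.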
Advanced Analysis
- Correlation heatmap for numeric columns.
- Value counts for any column.
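Both analyses above are short pandas operations. The following sketch assumes the correlation is the default Pearson coefficient and that missing values are counted as their own category; the helper names are hypothetical.

```python
import pandas as pd


def numeric_correlation(df: pd.DataFrame) -> pd.DataFrame:
    """Pairwise correlation matrix restricted to numeric columns."""
    return df.select_dtypes(include="number").corr()


def column_value_counts(df: pd.DataFrame, column: str) -> pd.Series:
    """Frequency of each distinct value in a column, including missing values."""
    return df[column].value_counts(dropna=False)
```

The correlation matrix is what a heatmap function such as `seaborn.heatmap` would take as input.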
Download Report
- Generate PDF report with dataset overview, missing values, descriptive stats, and a sample chart.
- Download cleaned CSV dataset.
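For the CSV download, the cleaned DataFrame has to be serialized to bytes before it can be offered to the browser. A minimal sketch, assuming UTF-8 output without the index column; `to_csv_bytes` and the file name are hypothetical.

```python
import pandas as pd


def to_csv_bytes(df: pd.DataFrame) -> bytes:
    """Serialize a DataFrame to UTF-8 encoded CSV without the index column."""
    return df.to_csv(index=False).encode("utf-8")


# Assumed usage inside the Streamlit script:
#   st.download_button(
#       "Download cleaned CSV",
#       data=to_csv_bytes(cleaned_df),
#       file_name="cleaned_data.csv",
#       mime="text/csv",
#   )
```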
- Clone the repository:
    git clone <your-repo-url>
    cd <your-repo-folder>
- Create a virtual environment (optional but recommended):
    python -m venv venv
    source venv/bin/activate   # Linux/macOS
    venv\Scripts\activate      # Windows
- Install dependencies:
    pip install streamlit pandas matplotlib seaborn plotly fpdf openpyxl
- Run the Streamlit app:
    streamlit run app.py
Use the sidebar menu to navigate:
- Upload your dataset.
- View data summary & clean data.
- Generate visualizations.
- Perform advanced analysis.
- Download PDF/CSV report.
- CSV (.csv)
- Excel (.xlsx)

- On large datasets, multi-panel visualizations such as pairplots can be slow to render.
- PDF generation may truncate long text in tables.
This project is open-source and free to use.