diff --git a/lab-dw-aggregating.ipynb b/lab-dw-aggregating.ipynb
index fadd718..6b1fd95 100644
--- a/lab-dw-aggregating.ipynb
+++ b/lab-dw-aggregating.ipynb
@@ -1,165 +1,1766 @@
{
- "cells": [
+ "cells": [
+ {
+ "cell_type": "markdown",
+ "id": "31969215-2a90-4d8b-ac36-646a7ae13744",
+ "metadata": {
+ "id": "31969215-2a90-4d8b-ac36-646a7ae13744"
+ },
+ "source": [
+ "# Lab | Data Aggregation and Filtering"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "a8f08a52-bec0-439b-99cc-11d3809d8b5d",
+ "metadata": {
+ "id": "a8f08a52-bec0-439b-99cc-11d3809d8b5d"
+ },
+ "source": [
+ "In this challenge, we will continue to work with customer data from an insurance company. We will use the dataset called marketing_customer_analysis.csv, which can be found at the following link:\n",
+ "\n",
+ "https://raw.githubusercontent.com/data-bootcamp-v4/data/main/marketing_customer_analysis.csv\n",
+ "\n",
+ "This dataset contains information such as customer demographics, policy details, vehicle information, and the customer's response to the last marketing campaign. Our goal is to explore and analyze this data by first performing data cleaning, formatting, and structuring."
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "9c98ddc5-b041-4c94-ada1-4dfee5c98e50",
+ "metadata": {
+ "id": "9c98ddc5-b041-4c94-ada1-4dfee5c98e50"
+ },
+ "source": [
+ "1. Create a new DataFrame that only includes customers who:\n",
+ " - have a **low total_claim_amount** (e.g., below $1,000),\n",
+ " - have a response \"Yes\" to the last marketing campaign."
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "b9be383e-5165-436e-80c8-57d4c757c8c3",
+ "metadata": {
+ "id": "b9be383e-5165-436e-80c8-57d4c757c8c3"
+ },
+ "source": [
+ "2. Using the original DataFrame, analyze:\n",
+ " - the average `monthly_premium` and/or customer lifetime value by `policy_type` and `gender` for customers who responded \"Yes\", and\n",
+ " - compare these insights to `total_claim_amount` patterns, and discuss which segments appear most profitable or low-risk for the company."
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "7050f4ac-53c5-4193-a3c0-8699b87196f0",
+ "metadata": {
+ "id": "7050f4ac-53c5-4193-a3c0-8699b87196f0"
+ },
+ "source": [
+ "3. Analyze the total number of customers who have policies in each state, and then filter the results to only include states where there are more than 500 customers."
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "b60a4443-a1a7-4bbf-b78e-9ccdf9895e0d",
+ "metadata": {
+ "id": "b60a4443-a1a7-4bbf-b78e-9ccdf9895e0d"
+ },
+ "source": [
+ "4. Find the maximum, minimum, and median customer lifetime value by education level and gender. Write your conclusions."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 1,
+ "id": "42bf6a1a-a6ec-4a0b-8721-e954bdac8898",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "import pandas as pd"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 2,
+ "id": "e6107ca1-e917-4cd4-a35b-41bf8b23059a",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "url = \"https://raw.githubusercontent.com/data-bootcamp-v4/data/main/marketing_customer_analysis.csv\"\n",
+ "df = pd.read_csv(url)"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 3,
+ "id": "19ce8430-583e-4fc0-8e06-dfcd9621376e",
+ "metadata": {},
+ "outputs": [
{
- "cell_type": "markdown",
- "id": "31969215-2a90-4d8b-ac36-646a7ae13744",
- "metadata": {
- "id": "31969215-2a90-4d8b-ac36-646a7ae13744"
- },
- "source": [
- "# Lab | Data Aggregation and Filtering"
+ "data": {
+ "text/plain": [
+ " Unnamed: 0 Customer State Customer Lifetime Value Response \\\n",
+ "0 0 DK49336 Arizona 4809.216960 No \n",
+ "1 1 KX64629 California 2228.525238 No \n",
+ "2 2 LZ68649 Washington 14947.917300 No \n",
+ "3 3 XL78013 Oregon 22332.439460 Yes \n",
+ "4 4 QA50777 Oregon 9025.067525 No \n",
+ "\n",
+ " Coverage Education Effective To Date EmploymentStatus Gender ... \\\n",
+ "0 Basic College 2/18/11 Employed M ... \n",
+ "1 Basic College 1/18/11 Unemployed F ... \n",
+ "2 Basic Bachelor 2/10/11 Employed M ... \n",
+ "3 Extended College 1/11/11 Employed M ... \n",
+ "4 Premium Bachelor 1/17/11 Medical Leave F ... \n",
+ "\n",
+ " Number of Open Complaints Number of Policies Policy Type Policy \\\n",
+ "0 0.0 9 Corporate Auto Corporate L3 \n",
+ "1 0.0 1 Personal Auto Personal L3 \n",
+ "2 0.0 2 Personal Auto Personal L3 \n",
+ "3 0.0 2 Corporate Auto Corporate L3 \n",
+ "4 NaN 7 Personal Auto Personal L2 \n",
+ "\n",
+ " Renew Offer Type Sales Channel Total Claim Amount Vehicle Class \\\n",
+ "0 Offer3 Agent 292.800000 Four-Door Car \n",
+ "1 Offer4 Call Center 744.924331 Four-Door Car \n",
+ "2 Offer3 Call Center 480.000000 SUV \n",
+ "3 Offer2 Branch 484.013411 Four-Door Car \n",
+ "4 Offer1 Branch 707.925645 Four-Door Car \n",
+ "\n",
+ " Vehicle Size Vehicle Type \n",
+ "0 Medsize NaN \n",
+ "1 Medsize NaN \n",
+ "2 Medsize A \n",
+ "3 Medsize A \n",
+ "4 Medsize NaN \n",
+ "\n",
+ "[5 rows x 26 columns]"
]
- },
+ },
+ "execution_count": 3,
+ "metadata": {},
+ "output_type": "execute_result"
+ }
+ ],
+ "source": [
+ "df.head()"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 4,
+ "id": "2d87774c-4380-466c-95a6-999cb3fb4135",
+ "metadata": {},
+ "outputs": [
{
- "cell_type": "markdown",
- "id": "a8f08a52-bec0-439b-99cc-11d3809d8b5d",
- "metadata": {
- "id": "a8f08a52-bec0-439b-99cc-11d3809d8b5d"
- },
- "source": [
- "In this challenge, we will continue to work with customer data from an insurance company. We will use the dataset called marketing_customer_analysis.csv, which can be found at the following link:\n",
- "\n",
- "https://raw.githubusercontent.com/data-bootcamp-v4/data/main/marketing_customer_analysis.csv\n",
- "\n",
- "This dataset contains information such as customer demographics, policy details, vehicle information, and the customer's response to the last marketing campaign. Our goal is to explore and analyze this data by first performing data cleaning, formatting, and structuring."
+ "data": {
+ "text/plain": [
+ " Customer State Customer Lifetime Value Response Coverage Education \\\n",
+ "0 DK49336 Arizona 4809.216960 No Basic College \n",
+ "1 KX64629 California 2228.525238 No Basic College \n",
+ "2 LZ68649 Washington 14947.917300 No Basic Bachelor \n",
+ "3 XL78013 Oregon 22332.439460 Yes Extended College \n",
+ "4 QA50777 Oregon 9025.067525 No Premium Bachelor \n",
+ "\n",
+ " Effective To Date EmploymentStatus Gender Income ... \\\n",
+ "0 2/18/11 Employed M 48029 ... \n",
+ "1 1/18/11 Unemployed F 0 ... \n",
+ "2 2/10/11 Employed M 22139 ... \n",
+ "3 1/11/11 Employed M 49078 ... \n",
+ "4 1/17/11 Medical Leave F 23675 ... \n",
+ "\n",
+ " Number of Open Complaints Number of Policies Policy Type Policy \\\n",
+ "0 0.0 9 Corporate Auto Corporate L3 \n",
+ "1 0.0 1 Personal Auto Personal L3 \n",
+ "2 0.0 2 Personal Auto Personal L3 \n",
+ "3 0.0 2 Corporate Auto Corporate L3 \n",
+ "4 NaN 7 Personal Auto Personal L2 \n",
+ "\n",
+ " Renew Offer Type Sales Channel Total Claim Amount Vehicle Class \\\n",
+ "0 Offer3 Agent 292.800000 Four-Door Car \n",
+ "1 Offer4 Call Center 744.924331 Four-Door Car \n",
+ "2 Offer3 Call Center 480.000000 SUV \n",
+ "3 Offer2 Branch 484.013411 Four-Door Car \n",
+ "4 Offer1 Branch 707.925645 Four-Door Car \n",
+ "\n",
+ " Vehicle Size Vehicle Type \n",
+ "0 Medsize NaN \n",
+ "1 Medsize NaN \n",
+ "2 Medsize A \n",
+ "3 Medsize A \n",
+ "4 Medsize NaN \n",
+ "\n",
+ "[5 rows x 25 columns]"
]
- },
+ },
+ "execution_count": 4,
+ "metadata": {},
+ "output_type": "execute_result"
+ }
+ ],
+ "source": [
+ "df = df.drop(columns=[\"Unnamed: 0\"])\n",
+ "\n",
+ "df.head()"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 5,
+ "id": "bb7f8f40-5d0c-464c-a88c-aa8c49afd175",
+ "metadata": {},
+ "outputs": [
{
- "cell_type": "markdown",
- "id": "9c98ddc5-b041-4c94-ada1-4dfee5c98e50",
- "metadata": {
- "id": "9c98ddc5-b041-4c94-ada1-4dfee5c98e50"
- },
- "source": [
- "1. Create a new DataFrame that only includes customers who:\n",
- " - have a **low total_claim_amount** (e.g., below $1,000),\n",
- " - have a response \"Yes\" to the last marketing campaign."
- ]
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "1399\n"
+ ]
},
{
- "cell_type": "markdown",
- "id": "b9be383e-5165-436e-80c8-57d4c757c8c3",
- "metadata": {
- "id": "b9be383e-5165-436e-80c8-57d4c757c8c3"
- },
- "source": [
- "2. Using the original Dataframe, analyze:\n",
- " - the average `monthly_premium` and/or customer lifetime value by `policy_type` and `gender` for customers who responded \"Yes\", and\n",
- " - compare these insights to `total_claim_amount` patterns, and discuss which segments appear most profitable or low-risk for the company."
+ "data": {
+ "text/plain": [
+ " Customer State Customer Lifetime Value Response Coverage Education \\\n",
+ "3 XL78013 Oregon 22332.439460 Yes Extended College \n",
+ "8 FM55990 California 5989.773931 Yes Premium College \n",
+ "15 CW49887 California 4626.801093 Yes Basic Master \n",
+ "19 NJ54277 California 3746.751625 Yes Extended College \n",
+ "27 MQ68407 Oregon 4376.363592 Yes Premium Bachelor \n",
+ "\n",
+ " Effective To Date EmploymentStatus Gender Income ... \\\n",
+ "3 1/11/11 Employed M 49078 ... \n",
+ "8 1/19/11 Employed M 66839 ... \n",
+ "15 1/16/11 Employed F 79487 ... \n",
+ "19 2/26/11 Employed F 41479 ... \n",
+ "27 2/28/11 Employed F 63774 ... \n",
+ "\n",
+ " Number of Open Complaints Number of Policies Policy Type Policy \\\n",
+ "3 0.0 2 Corporate Auto Corporate L3 \n",
+ "8 0.0 1 Personal Auto Personal L1 \n",
+ "15 0.0 1 Special Auto Special L1 \n",
+ "19 1.0 1 Personal Auto Personal L2 \n",
+ "27 0.0 1 Personal Auto Personal L3 \n",
+ "\n",
+ " Renew Offer Type Sales Channel Total Claim Amount Vehicle Class \\\n",
+ "3 Offer2 Branch 484.013411 Four-Door Car \n",
+ "8 Offer2 Branch 739.200000 Sports Car \n",
+ "15 Offer2 Branch 547.200000 SUV \n",
+ "19 Offer2 Call Center 19.575683 Two-Door Car \n",
+ "27 Offer2 Agent 60.036683 Four-Door Car \n",
+ "\n",
+ " Vehicle Size Vehicle Type \n",
+ "3 Medsize A \n",
+ "8 Medsize NaN \n",
+ "15 Medsize NaN \n",
+ "19 Large A \n",
+ "27 Medsize NaN \n",
+ "\n",
+ "[5 rows x 25 columns]"
]
- },
+ },
+ "execution_count": 5,
+ "metadata": {},
+ "output_type": "execute_result"
+ }
+ ],
+ "source": [
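+ "# Keep customers with a total claim below $1,000 who responded \"Yes\".\n",
+ "low_claim_yes = df[\n",
+ " (df[\"Total Claim Amount\"] < 1000) &\n",
+ " (df[\"Response\"] == \"Yes\")\n",
+ "].copy()\n",
+ "\n",
+ "# Print the number of matching customers, then preview the rows.\n",
+ "print(low_claim_yes.shape[0])\n",
+ "low_claim_yes.head()"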
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 6,
+ "id": "15c82700-d6c8-4ef8-8e14-55d6d4a923c8",
+ "metadata": {},
+ "outputs": [
{
- "cell_type": "markdown",
- "id": "7050f4ac-53c5-4193-a3c0-8699b87196f0",
- "metadata": {
- "id": "7050f4ac-53c5-4193-a3c0-8699b87196f0"
- },
- "source": [
- "3. Analyze the total number of customers who have policies in each state, and then filter the results to only include states where there are more than 500 customers."
+ "data": {
+ "text/plain": [
+ " Policy Type Gender Monthly Premium Auto Customer Lifetime Value \\\n",
+ "0 Corporate Auto F 94.30 7712.63 \n",
+ "1 Corporate Auto M 92.19 7944.47 \n",
+ "2 Personal Auto F 99.00 8339.79 \n",
+ "3 Personal Auto M 91.09 7448.38 \n",
+ "4 Special Auto F 92.31 7691.58 \n",
+ "5 Special Auto M 86.34 8247.09 \n",
+ "\n",
+ " Total Claim Amount \n",
+ "0 433.74 \n",
+ "1 408.58 \n",
+ "2 452.97 \n",
+ "3 457.01 \n",
+ "4 453.28 \n",
+ "5 429.53 "
]
- },
+ },
+ "execution_count": 6,
+ "metadata": {},
+ "output_type": "execute_result"
+ }
+ ],
+ "source": [
+ "# Restrict to customers who responded \"Yes\" to the campaign.\n",
+ "df_yes = df[df[\"Response\"] == \"Yes\"].copy()\n",
+ "\n",
+ "# Average premium, lifetime value, and claim amount by policy type and gender.\n",
+ "agg_yes = (\n",
+ " df_yes\n",
+ " .groupby([\"Policy Type\", \"Gender\"], as_index=False)\n",
+ " .agg({\n",
+ " \"Monthly Premium Auto\": \"mean\",\n",
+ " \"Customer Lifetime Value\": \"mean\",\n",
+ " \"Total Claim Amount\": \"mean\"\n",
+ " })\n",
+ ")\n",
+ "\n",
+ "# Round the averaged columns to two decimals for display.\n",
+ "agg_yes[[\"Monthly Premium Auto\", \"Customer Lifetime Value\", \"Total Claim Amount\"]] = (\n",
+ " agg_yes[[\"Monthly Premium Auto\", \"Customer Lifetime Value\", \"Total Claim Amount\"]]\n",
+ " .round(2)\n",
+ ")\n",
+ "\n",
+ "agg_yes"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 7,
+ "id": "71269012-716a-4b7e-9b8e-f1bcbfe1892d",
+ "metadata": {},
+ "outputs": [
{
- "cell_type": "markdown",
- "id": "b60a4443-a1a7-4bbf-b78e-9ccdf9895e0d",
- "metadata": {
- "id": "b60a4443-a1a7-4bbf-b78e-9ccdf9895e0d"
- },
- "source": [
- "4. Find the maximum, minimum, and median customer lifetime value by education level and gender. Write your conclusions."
+ "data": {
+ "text/plain": [
+ " State Num_Customers\n",
+ "1 California 3552\n",
+ "3 Oregon 2909\n",
+ "0 Arizona 1937\n",
+ "2 Nevada 993\n",
+ "4 Washington 888"
]
+ },
+ "metadata": {},
+ "output_type": "display_data"
},
{
- "cell_type": "markdown",
- "id": "b42999f9-311f-481e-ae63-40a5577072c5",
- "metadata": {
- "id": "b42999f9-311f-481e-ae63-40a5577072c5"
- },
- "source": [
- "## Bonus"
+ "data": {
+ "text/plain": [
+ " State Num_Customers\n",
+ "1 California 3552\n",
+ "3 Oregon 2909\n",
+ "0 Arizona 1937\n",
+ "2 Nevada 993\n",
+ "4 Washington 888"
]
- },
+ },
+ "execution_count": 7,
+ "metadata": {},
+ "output_type": "execute_result"
+ }
+ ],
+ "source": [
+ "# Count customers per state, largest first.\n",
+ "customers_by_state = (\n",
+ " df.groupby(\"State\", as_index=False)\n",
+ " .agg({\"Customer\": \"count\"})\n",
+ " .rename(columns={\"Customer\": \"Num_Customers\"})\n",
+ " .sort_values(\"Num_Customers\", ascending=False)\n",
+ ")\n",
+ "\n",
+ "display(customers_by_state)\n",
+ "\n",
+ "# Keep only states with more than 500 customers.\n",
+ "big_states = customers_by_state[customers_by_state[\"Num_Customers\"] > 500]\n",
+ "\n",
+ "big_states"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 8,
+ "id": "89a38067-1c77-49a7-b551-177e40017846",
+ "metadata": {},
+ "outputs": [
{
- "cell_type": "markdown",
- "id": "81ff02c5-6584-4f21-a358-b918697c6432",
- "metadata": {
- "id": "81ff02c5-6584-4f21-a358-b918697c6432"
- },
- "source": [
- "5. The marketing team wants to analyze the number of policies sold by state and month. Present the data in a table where the months are arranged as columns and the states are arranged as rows."
+ "data": {
+ "text/plain": [
+ " Education Gender max min median\n",
+ "0 Bachelor F 73225.96 1904.00 5640.51\n",
+ "1 Bachelor M 67907.27 1898.01 5548.03\n",
+ "2 College F 61850.19 1898.68 5623.61\n",
+ "3 College M 61134.68 1918.12 6005.85\n",
+ "4 Doctor F 44856.11 2395.57 5332.46\n",
+ "5 Doctor M 32677.34 2267.60 5577.67\n",
+ "6 High School or Below F 55277.45 2144.92 6039.55\n",
+ "7 High School or Below M 83325.38 1940.98 6286.73\n",
+ "8 Master F 51016.07 2417.78 5729.86\n",
+ "9 Master M 50568.26 2272.31 5579.10"
]
- },
+ },
+ "execution_count": 8,
+ "metadata": {},
+ "output_type": "execute_result"
+ }
+ ],
+ "source": [
+ "# Maximum, minimum, and median customer lifetime value by education level and gender.\n",
+ "clv_stats = (\n",
+ " df.groupby([\"Education\", \"Gender\"])[\"Customer Lifetime Value\"]\n",
+ " .agg([\"max\", \"min\", \"median\"])\n",
+ " .round(2)\n",
+ " .reset_index()\n",
+ ")\n",
+ "\n",
+ "clv_stats"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "b42999f9-311f-481e-ae63-40a5577072c5",
+ "metadata": {
+ "id": "b42999f9-311f-481e-ae63-40a5577072c5"
+ },
+ "source": [
+ "## Bonus"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "81ff02c5-6584-4f21-a358-b918697c6432",
+ "metadata": {
+ "id": "81ff02c5-6584-4f21-a358-b918697c6432"
+ },
+ "source": [
+ "5. The marketing team wants to analyze the number of policies sold by state and month. Present the data in a table where the months are arranged as columns and the states are arranged as rows."
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "b6aec097-c633-4017-a125-e77a97259cda",
+ "metadata": {
+ "id": "b6aec097-c633-4017-a125-e77a97259cda"
+ },
+ "source": [
+ "6. Display a new DataFrame that contains the number of policies sold by month, by state, for the top 3 states with the highest number of policies sold.\n",
+ "\n",
+ "*Hint:*\n",
+ "- *To accomplish this, you will first need to group the data by state and month, then count the number of policies sold for each group. Afterwards, you will need to sort the data by the count of policies sold in descending order.*\n",
+ "- *Next, you will select the top 3 states with the highest number of policies sold.*\n",
+ "- *Finally, you will create a new DataFrame that contains the number of policies sold by month for each of the top 3 states.*"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "ba975b8a-a2cf-4fbf-9f59-ebc381767009",
+ "metadata": {
+ "id": "ba975b8a-a2cf-4fbf-9f59-ebc381767009"
+ },
+ "source": [
+ "7. The marketing team wants to analyze the effect of different marketing channels on the customer response rate.\n",
+ "\n",
+ "*Hint: You can use `melt` to unpivot the data and create a table that shows the customer response rate (those who responded \"Yes\") by marketing channel.*"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "e4378d94-48fb-4850-a802-b1bc8f427b2d",
+ "metadata": {
+ "id": "e4378d94-48fb-4850-a802-b1bc8f427b2d"
+ },
+ "source": [
+ "External Resources for Data Filtering: https://towardsdatascience.com/filtering-data-frames-in-pandas-b570b1f834b9"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 11,
+ "id": "449513f4-0459-46a0-a18d-9398d974c9ad",
+ "metadata": {
+ "id": "449513f4-0459-46a0-a18d-9398d974c9ad"
+ },
+ "outputs": [],
+ "source": [
+ "# Parse the policy date (for example 2/18/11) and derive the month name.\n",
+ "df[\"Effective To Date\"] = pd.to_datetime(df[\"Effective To Date\"], format=\"%m/%d/%y\")\n",
+ "df[\"Month\"] = df[\"Effective To Date\"].dt.month_name()\n",
+ "\n",
+ "# Use an ordered categorical so months sort chronologically, not alphabetically.\n",
+ "month_order = [\"January\", \"February\", \"March\", \"April\", \"May\", \"June\",\n",
+ " \"July\", \"August\", \"September\", \"October\", \"November\", \"December\"]\n",
+ "df[\"Month\"] = pd.Categorical(df[\"Month\"], month_order, ordered=True)"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 12,
+ "id": "1d329432-4251-461e-a808-2828c0d7b936",
+ "metadata": {},
+ "outputs": [
{
- "cell_type": "markdown",
- "id": "b6aec097-c633-4017-a125-e77a97259cda",
- "metadata": {
- "id": "b6aec097-c633-4017-a125-e77a97259cda"
- },
- "source": [
- "6. Display a new DataFrame that contains the number of policies sold by month, by state, for the top 3 states with the highest number of policies sold.\n",
- "\n",
- "*Hint:*\n",
- "- *To accomplish this, you will first need to group the data by state and month, then count the number of policies sold for each group. Afterwards, you will need to sort the data by the count of policies sold in descending order.*\n",
- "- *Next, you will select the top 3 states with the highest number of policies sold.*\n",
- "- *Finally, you will create a new DataFrame that contains the number of policies sold by month for each of the top 3 states.*"
+ "data": {
+ "text/plain": [
+ "Month January February March April May June July August \\\n",
+ "State \n",
+ "Arizona 3052 2864 0 0 0 0 0 0 \n",
+ "California 5673 4929 0 0 0 0 0 0 \n",
+ "Nevada 1493 1278 0 0 0 0 0 0 \n",
+ "Oregon 4697 3969 0 0 0 0 0 0 \n",
+ "Washington 1358 1225 0 0 0 0 0 0 \n",
+ "\n",
+ "Month September October November December \n",
+ "State \n",
+ "Arizona 0 0 0 0 \n",
+ "California 0 0 0 0 \n",
+ "Nevada 0 0 0 0 \n",
+ "Oregon 0 0 0 0 \n",
+ "Washington 0 0 0 0 "
]
- },
+ },
+ "execution_count": 12,
+ "metadata": {},
+ "output_type": "execute_result"
+ }
+ ],
+ "source": [
+ "# Total number of policies sold for each (State, Month) pair\n",
+ "policies_by_state_month = (\n",
+ " df.groupby([\"State\", \"Month\"], as_index=False, observed=False)[\"Number of Policies\"]\n",
+ " .sum()\n",
+ " .rename(columns={\"Number of Policies\": \"Policies_Sold\"})\n",
+ ")\n",
+ "\n",
+ "# Reshape to wide format: one row per state, one column per month; missing pairs become 0\n",
+ "pivot_state_month = (\n",
+ " policies_by_state_month\n",
+ " .pivot(index=\"State\", columns=\"Month\", values=\"Policies_Sold\")\n",
+ " .fillna(0)\n",
+ " .astype(int)\n",
+ ")\n",
+ "\n",
+ "pivot_state_month"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 13,
+ "id": "b403ccd0-b7e9-4cf5-9370-8173d5dd459b",
+ "metadata": {},
+ "outputs": [
{
- "cell_type": "markdown",
- "id": "ba975b8a-a2cf-4fbf-9f59-ebc381767009",
- "metadata": {
- "id": "ba975b8a-a2cf-4fbf-9f59-ebc381767009"
- },
- "source": [
- "7. The marketing team wants to analyze the effect of different marketing channels on the customer response rate.\n",
- "\n",
- "Hint: You can use melt to unpivot the data and create a table that shows the customer response rate (those who responded \"Yes\") by marketing channel."
+ "data": {
+ "text/plain": [
+ "State\n",
+ "California 10602\n",
+ "Oregon 8666\n",
+ "Arizona 5916\n",
+ "Nevada 2771\n",
+ "Washington 2583\n",
+ "Name: Policies_Sold, dtype: int64"
]
+ },
+ "metadata": {},
+ "output_type": "display_data"
},
{
- "cell_type": "markdown",
- "id": "e4378d94-48fb-4850-a802-b1bc8f427b2d",
- "metadata": {
- "id": "e4378d94-48fb-4850-a802-b1bc8f427b2d"
- },
- "source": [
- "External Resources for Data Filtering: https://towardsdatascience.com/filtering-data-frames-in-pandas-b570b1f834b9"
- ]
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "['California', 'Oregon', 'Arizona']\n"
+ ]
},
{
- "cell_type": "code",
- "execution_count": null,
- "id": "449513f4-0459-46a0-a18d-9398d974c9ad",
- "metadata": {
- "id": "449513f4-0459-46a0-a18d-9398d974c9ad"
- },
- "outputs": [],
- "source": [
- "# your code goes here"
+ "data": {
+ "text/plain": [
+ "Month January February March April May June July August \\\n",
+ "State \n",
+ "Arizona 3052 2864 0 0 0 0 0 0 \n",
+ "California 5673 4929 0 0 0 0 0 0 \n",
+ "Oregon 4697 3969 0 0 0 0 0 0 \n",
+ "\n",
+ "Month September October November December \n",
+ "State \n",
+ "Arizona 0 0 0 0 \n",
+ "California 0 0 0 0 \n",
+ "Oregon 0 0 0 0 "
]
+ },
+ "execution_count": 13,
+ "metadata": {},
+ "output_type": "execute_result"
}
- ],
- "metadata": {
- "colab": {
- "provenance": []
- },
- "kernelspec": {
- "display_name": "Python 3 (ipykernel)",
- "language": "python",
- "name": "python3"
- },
- "language_info": {
- "codemirror_mode": {
- "name": "ipython",
- "version": 3
- },
- "file_extension": ".py",
- "mimetype": "text/x-python",
- "name": "python",
- "nbconvert_exporter": "python",
- "pygments_lexer": "ipython3",
- "version": "3.9.13"
+ ],
+ "source": [
+ "# Rank states by total policies sold across all months\n",
+ "total_policies_by_state = (\n",
+ " policies_by_state_month.groupby(\"State\")[\"Policies_Sold\"]\n",
+ " .sum()\n",
+ " .sort_values(ascending=False)\n",
+ ")\n",
+ "\n",
+ "# Show the ranking (up to the first 10 states)\n",
+ "display(total_policies_by_state.head(10))\n",
+ "\n",
+ "# Keep the names of the three states with the most policies sold\n",
+ "top3_states = total_policies_by_state.head(3).index.tolist()\n",
+ "print(top3_states)\n",
+ "\n",
+ "# Restrict the state/month totals to those three states\n",
+ "top3_policies = policies_by_state_month[\n",
+ " policies_by_state_month[\"State\"].isin(top3_states)\n",
+ "]\n",
+ "\n",
+ "# Pivot to one row per top-3 state and one column per month\n",
+ "pivot_top3 = (\n",
+ " top3_policies\n",
+ " .pivot(index=\"State\", columns=\"Month\", values=\"Policies_Sold\")\n",
+ " .fillna(0)\n",
+ " .astype(int)\n",
+ ")\n",
+ "\n",
+ "pivot_top3"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 14,
+ "id": "045b04de-297e-4b6d-a8bc-b6d2b96fe54a",
+ "metadata": {},
+ "outputs": [
+ {
+ "data": {
+ "text/plain": [
+ " Sales Channel Response Rate (%)\n",
+ "0 Agent 18.01\n",
+ "3 Web 10.89\n",
+ "1 Branch 10.79\n",
+ "2 Call Center 10.32"
+ ]
+ },
+ "execution_count": 14,
+ "metadata": {},
+ "output_type": "execute_result"
}
+ ],
+ "source": [
+ "# Encode the response as 1 for \"Yes\" and 0 otherwise\n",
+ "df[\"Response_Flag\"] = (df[\"Response\"] == \"Yes\").astype(int)\n",
+ "\n",
+ "# The mean of the 0/1 flag per sales channel is that channel's response rate\n",
+ "response_rate = (\n",
+ " df.groupby(\"Sales Channel\")[\"Response_Flag\"]\n",
+ " .mean()\n",
+ " .mul(100)  # convert the fraction to a percentage\n",
+ " .round(2)\n",
+ " .rename(\"Response Rate (%)\")\n",
+ " .reset_index()\n",
+ " .sort_values(\"Response Rate (%)\", ascending=False)\n",
+ ")\n",
+ "\n",
+ "\n",
+ "response_rate"
+ ]
+ }
+ ],
+ "metadata": {
+ "colab": {
+ "provenance": []
+ },
+ "kernelspec": {
+ "display_name": "Python [conda env:base] *",
+ "language": "python",
+ "name": "conda-base-py"
},
- "nbformat": 4,
- "nbformat_minor": 5
+ "language_info": {
+ "codemirror_mode": {
+ "name": "ipython",
+ "version": 3
+ },
+ "file_extension": ".py",
+ "mimetype": "text/x-python",
+ "name": "python",
+ "nbconvert_exporter": "python",
+ "pygments_lexer": "ipython3",
+ "version": "3.13.5"
+ }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 5
}