diff --git a/Untitled.ipynb b/Untitled.ipynb new file mode 100644 index 0000000..363fcab --- /dev/null +++ b/Untitled.ipynb @@ -0,0 +1,6 @@ +{ + "cells": [], + "metadata": {}, + "nbformat": 4, + "nbformat_minor": 5 +} diff --git a/lab-dw-aggregating.ipynb b/lab-dw-aggregating.ipynb index fadd718..87f75db 100644 --- a/lab-dw-aggregating.ipynb +++ b/lab-dw-aggregating.ipynb @@ -1,165 +1,2407 @@ { - "cells": [ - { - "cell_type": "markdown", - "id": "31969215-2a90-4d8b-ac36-646a7ae13744", - "metadata": { - "id": "31969215-2a90-4d8b-ac36-646a7ae13744" - }, - "source": [ - "# Lab | Data Aggregation and Filtering" + "cells": [ + { + "cell_type": "markdown", + "id": "31969215-2a90-4d8b-ac36-646a7ae13744", + "metadata": { + "id": "31969215-2a90-4d8b-ac36-646a7ae13744" + }, + "source": [ + "# Lab | Data Aggregation and Filtering" + ] + }, + { + "cell_type": "markdown", + "id": "a8f08a52-bec0-439b-99cc-11d3809d8b5d", + "metadata": { + "id": "a8f08a52-bec0-439b-99cc-11d3809d8b5d" + }, + "source": [ + "In this challenge, we will continue to work with customer data from an insurance company. We will use the dataset called marketing_customer_analysis.csv, which can be found at the following link:\n", + "\n", + "https://raw.githubusercontent.com/data-bootcamp-v4/data/main/marketing_customer_analysis.csv\n", + "\n", + "This dataset contains information such as customer demographics, policy details, vehicle information, and the customer's response to the last marketing campaign. Our goal is to explore and analyze this data by first performing data cleaning, formatting, and structuring." + ] + }, + { + "cell_type": "markdown", + "id": "9c98ddc5-b041-4c94-ada1-4dfee5c98e50", + "metadata": { + "id": "9c98ddc5-b041-4c94-ada1-4dfee5c98e50" + }, + "source": [ + "1. Create a new DataFrame that only includes customers who:\n", + " - have a **low total_claim_amount** (e.g., below $1,000),\n", + " - have a response \"Yes\" to the last marketing campaign." 
+ ] + }, + { + "cell_type": "markdown", + "id": "b9be383e-5165-436e-80c8-57d4c757c8c3", + "metadata": { + "id": "b9be383e-5165-436e-80c8-57d4c757c8c3" + }, + "source": [ + "2. Using the original Dataframe, analyze:\n", + " - the average `monthly_premium` and/or customer lifetime value by `policy_type` and `gender` for customers who responded \"Yes\", and\n", + " - compare these insights to `total_claim_amount` patterns, and discuss which segments appear most profitable or low-risk for the company." + ] + }, + { + "cell_type": "markdown", + "id": "7050f4ac-53c5-4193-a3c0-8699b87196f0", + "metadata": { + "id": "7050f4ac-53c5-4193-a3c0-8699b87196f0" + }, + "source": [ + "3. Analyze the total number of customers who have policies in each state, and then filter the results to only include states where there are more than 500 customers." + ] + }, + { + "cell_type": "markdown", + "id": "b60a4443-a1a7-4bbf-b78e-9ccdf9895e0d", + "metadata": { + "id": "b60a4443-a1a7-4bbf-b78e-9ccdf9895e0d" + }, + "source": [ + "4. Find the maximum, minimum, and median customer lifetime value by education level and gender. Write your conclusions." + ] + }, + { + "cell_type": "markdown", + "id": "b42999f9-311f-481e-ae63-40a5577072c5", + "metadata": { + "id": "b42999f9-311f-481e-ae63-40a5577072c5" + }, + "source": [ + "## Bonus" + ] + }, + { + "cell_type": "markdown", + "id": "81ff02c5-6584-4f21-a358-b918697c6432", + "metadata": { + "id": "81ff02c5-6584-4f21-a358-b918697c6432" + }, + "source": [ + "5. The marketing team wants to analyze the number of policies sold by state and month. Present the data in a table where the months are arranged as columns and the states are arranged as rows." + ] + }, + { + "cell_type": "markdown", + "id": "b6aec097-c633-4017-a125-e77a97259cda", + "metadata": { + "id": "b6aec097-c633-4017-a125-e77a97259cda" + }, + "source": [ + "6. 
Display a new DataFrame that contains the number of policies sold by month, by state, for the top 3 states with the highest number of policies sold.\n", + "\n", + "*Hint:*\n", + "- *To accomplish this, you will first need to group the data by state and month, then count the number of policies sold for each group. Afterwards, you will need to sort the data by the count of policies sold in descending order.*\n", + "- *Next, you will select the top 3 states with the highest number of policies sold.*\n", + "- *Finally, you will create a new DataFrame that contains the number of policies sold by month for each of the top 3 states.*" + ] + }, + { + "cell_type": "markdown", + "id": "ba975b8a-a2cf-4fbf-9f59-ebc381767009", + "metadata": { + "id": "ba975b8a-a2cf-4fbf-9f59-ebc381767009" + }, + "source": [ + "7. The marketing team wants to analyze the effect of different marketing channels on the customer response rate.\n", + "\n", + "Hint: You can use melt to unpivot the data and create a table that shows the customer response rate (those who responded \"Yes\") by marketing channel." + ] + }, + { + "cell_type": "markdown", + "id": "e4378d94-48fb-4850-a802-b1bc8f427b2d", + "metadata": { + "id": "e4378d94-48fb-4850-a802-b1bc8f427b2d" + }, + "source": [ + "External Resources for Data Filtering: https://towardsdatascience.com/filtering-data-frames-in-pandas-b570b1f834b9" + ] + }, + { + "cell_type": "code", + "execution_count": 13, + "id": "449513f4-0459-46a0-a18d-9398d974c9ad", + "metadata": { + "id": "449513f4-0459-46a0-a18d-9398d974c9ad" + }, + "outputs": [ + { + "data": { + "text/html": [ + "
\n", + "\n", + "\n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " 
\n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + "
Unnamed: 0CustomerStateCustomer Lifetime ValueResponseCoverageEducationEffective To DateEmploymentStatusGender...Number of Open ComplaintsNumber of PoliciesPolicy TypePolicyRenew Offer TypeSales ChannelTotal Claim AmountVehicle ClassVehicle SizeVehicle Type
00DK49336Arizona4809.216960NoBasicCollege2/18/11EmployedM...0.09Corporate AutoCorporate L3Offer3Agent292.800000Four-Door CarMedsizeNaN
11KX64629California2228.525238NoBasicCollege1/18/11UnemployedF...0.01Personal AutoPersonal L3Offer4Call Center744.924331Four-Door CarMedsizeNaN
22LZ68649Washington14947.917300NoBasicBachelor2/10/11EmployedM...0.02Personal AutoPersonal L3Offer3Call Center480.000000SUVMedsizeA
33XL78013Oregon22332.439460YesExtendedCollege1/11/11EmployedM...0.02Corporate AutoCorporate L3Offer2Branch484.013411Four-Door CarMedsizeA
44QA50777Oregon9025.067525NoPremiumBachelor1/17/11Medical LeaveF...NaN7Personal AutoPersonal L2Offer1Branch707.925645Four-Door CarMedsizeNaN
..................................................................
1090510905FE99816Nevada15563.369440NoPremiumBachelor1/19/11UnemployedF...NaN7Personal AutoPersonal L1Offer3Web1214.400000Luxury CarMedsizeA
1090610906KX53892Oregon5259.444853NoBasicCollege1/6/11EmployedF...0.06Personal AutoPersonal L3Offer2Branch273.018929Four-Door CarMedsizeA
1090710907TL39050Arizona23893.304100NoExtendedBachelor2/6/11EmployedF...0.02Corporate AutoCorporate L3Offer1Web381.306996Luxury SUVMedsizeNaN
1090810908WA60547California11971.977650NoPremiumCollege2/13/11EmployedF...4.06Personal AutoPersonal L1Offer1Branch618.288849SUVMedsizeA
1090910909IV32877NaN6857.519928NaNBasicBachelor1/8/11UnemployedM...0.03Personal AutoPersonal L1Offer4Web1021.719397SUVMedsizeNaN
\n", + "

10910 rows × 26 columns

\n", + "
" + ], + "text/plain": [ + " Unnamed: 0 Customer State Customer Lifetime Value Response \\\n", + "0 0 DK49336 Arizona 4809.216960 No \n", + "1 1 KX64629 California 2228.525238 No \n", + "2 2 LZ68649 Washington 14947.917300 No \n", + "3 3 XL78013 Oregon 22332.439460 Yes \n", + "4 4 QA50777 Oregon 9025.067525 No \n", + "... ... ... ... ... ... \n", + "10905 10905 FE99816 Nevada 15563.369440 No \n", + "10906 10906 KX53892 Oregon 5259.444853 No \n", + "10907 10907 TL39050 Arizona 23893.304100 No \n", + "10908 10908 WA60547 California 11971.977650 No \n", + "10909 10909 IV32877 NaN 6857.519928 NaN \n", + "\n", + " Coverage Education Effective To Date EmploymentStatus Gender ... \\\n", + "0 Basic College 2/18/11 Employed M ... \n", + "1 Basic College 1/18/11 Unemployed F ... \n", + "2 Basic Bachelor 2/10/11 Employed M ... \n", + "3 Extended College 1/11/11 Employed M ... \n", + "4 Premium Bachelor 1/17/11 Medical Leave F ... \n", + "... ... ... ... ... ... ... \n", + "10905 Premium Bachelor 1/19/11 Unemployed F ... \n", + "10906 Basic College 1/6/11 Employed F ... \n", + "10907 Extended Bachelor 2/6/11 Employed F ... \n", + "10908 Premium College 2/13/11 Employed F ... \n", + "10909 Basic Bachelor 1/8/11 Unemployed M ... \n", + "\n", + " Number of Open Complaints Number of Policies Policy Type \\\n", + "0 0.0 9 Corporate Auto \n", + "1 0.0 1 Personal Auto \n", + "2 0.0 2 Personal Auto \n", + "3 0.0 2 Corporate Auto \n", + "4 NaN 7 Personal Auto \n", + "... ... ... ... \n", + "10905 NaN 7 Personal Auto \n", + "10906 0.0 6 Personal Auto \n", + "10907 0.0 2 Corporate Auto \n", + "10908 4.0 6 Personal Auto \n", + "10909 0.0 3 Personal Auto \n", + "\n", + " Policy Renew Offer Type Sales Channel Total Claim Amount \\\n", + "0 Corporate L3 Offer3 Agent 292.800000 \n", + "1 Personal L3 Offer4 Call Center 744.924331 \n", + "2 Personal L3 Offer3 Call Center 480.000000 \n", + "3 Corporate L3 Offer2 Branch 484.013411 \n", + "4 Personal L2 Offer1 Branch 707.925645 \n", + "... ... ... 
... ... \n", + "10905 Personal L1 Offer3 Web 1214.400000 \n", + "10906 Personal L3 Offer2 Branch 273.018929 \n", + "10907 Corporate L3 Offer1 Web 381.306996 \n", + "10908 Personal L1 Offer1 Branch 618.288849 \n", + "10909 Personal L1 Offer4 Web 1021.719397 \n", + "\n", + " Vehicle Class Vehicle Size Vehicle Type \n", + "0 Four-Door Car Medsize NaN \n", + "1 Four-Door Car Medsize NaN \n", + "2 SUV Medsize A \n", + "3 Four-Door Car Medsize A \n", + "4 Four-Door Car Medsize NaN \n", + "... ... ... ... \n", + "10905 Luxury Car Medsize A \n", + "10906 Four-Door Car Medsize A \n", + "10907 Luxury SUV Medsize NaN \n", + "10908 SUV Medsize A \n", + "10909 SUV Medsize NaN \n", + "\n", + "[10910 rows x 26 columns]" ] - }, + }, + "execution_count": 13, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "# your code goes here\n", + "import pandas as pd\n", + "import numpy as np\n", + "url = 'https://raw.githubusercontent.com/data-bootcamp-v4/data/main/marketing_customer_analysis.csv'\n", + "mkt_customer_df = pd.read_csv(url)\n", + "mkt_customer_df" + ] + }, + { + "cell_type": "code", + "execution_count": 14, + "id": "7093dddc-b5ce-4eaf-ba23-de872734c4f5", + "metadata": {}, + "outputs": [ { - "cell_type": "markdown", - "id": "a8f08a52-bec0-439b-99cc-11d3809d8b5d", - "metadata": { - "id": "a8f08a52-bec0-439b-99cc-11d3809d8b5d" - }, - "source": [ - "In this challenge, we will continue to work with customer data from an insurance company. We will use the dataset called marketing_customer_analysis.csv, which can be found at the following link:\n", - "\n", - "https://raw.githubusercontent.com/data-bootcamp-v4/data/main/marketing_customer_analysis.csv\n", - "\n", - "This dataset contains information such as customer demographics, policy details, vehicle information, and the customer's response to the last marketing campaign. Our goal is to explore and analyze this data by first performing data cleaning, formatting, and structuring." 
+ "data": { + "text/html": [ + "
\n", + "\n", + "\n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " 
\n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + "
Unnamed: 0CustomerStateCustomer Lifetime ValueResponseCoverageEducationEffective To DateEmploymentStatusGender...Number of Open ComplaintsNumber of PoliciesPolicy TypePolicyRenew Offer TypeSales ChannelTotal Claim AmountVehicle ClassVehicle SizeVehicle Type
33XL78013Oregon22332.439460YesExtendedCollege1/11/11EmployedM...0.02Corporate AutoCorporate L3Offer2Branch484.013411Four-Door CarMedsizeA
88FM55990California5989.773931YesPremiumCollege1/19/11EmployedM...0.01Personal AutoPersonal L1Offer2Branch739.200000Sports CarMedsizeNaN
1515CW49887California4626.801093YesBasicMaster1/16/11EmployedF...0.01Special AutoSpecial L1Offer2Branch547.200000SUVMedsizeNaN
1919NJ54277California3746.751625YesExtendedCollege2/26/11EmployedF...1.01Personal AutoPersonal L2Offer2Call Center19.575683Two-Door CarLargeA
2727MQ68407Oregon4376.363592YesPremiumBachelor2/28/11EmployedF...0.01Personal AutoPersonal L3Offer2Agent60.036683Four-Door CarMedsizeNaN
..................................................................
1084410844FM31768Arizona5979.724161YesExtendedHigh School or Below2/7/11EmployedF...0.03Personal AutoPersonal L1Offer2Agent547.200000Four-Door CarMedsizeNaN
1085210852KZ80424Washington8382.478392YesBasicBachelor1/27/11EmployedM...0.02Personal AutoPersonal L2Offer2Call Center791.878042NaNNaNA
1087210872XT67997California5979.724161YesExtendedHigh School or Below2/7/11EmployedF...0.03Personal AutoPersonal L3Offer2Agent547.200000Four-Door CarMedsizeNaN
1088710887BY78730Oregon8879.790017YesBasicHigh School or Below2/3/11EmployedF...0.07Special AutoSpecial L2Offer1Agent528.200860SUVSmallA
1089710897MM70762Arizona9075.768214YesBasicMaster1/26/11EmployedM...0.08Personal AutoPersonal L1Offer1Agent158.077504Sports CarMedsizeA
\n", + "

1399 rows × 26 columns

\n", + "
" + ], + "text/plain": [ + " Unnamed: 0 Customer State Customer Lifetime Value Response \\\n", + "3 3 XL78013 Oregon 22332.439460 Yes \n", + "8 8 FM55990 California 5989.773931 Yes \n", + "15 15 CW49887 California 4626.801093 Yes \n", + "19 19 NJ54277 California 3746.751625 Yes \n", + "27 27 MQ68407 Oregon 4376.363592 Yes \n", + "... ... ... ... ... ... \n", + "10844 10844 FM31768 Arizona 5979.724161 Yes \n", + "10852 10852 KZ80424 Washington 8382.478392 Yes \n", + "10872 10872 XT67997 California 5979.724161 Yes \n", + "10887 10887 BY78730 Oregon 8879.790017 Yes \n", + "10897 10897 MM70762 Arizona 9075.768214 Yes \n", + "\n", + " Coverage Education Effective To Date EmploymentStatus \\\n", + "3 Extended College 1/11/11 Employed \n", + "8 Premium College 1/19/11 Employed \n", + "15 Basic Master 1/16/11 Employed \n", + "19 Extended College 2/26/11 Employed \n", + "27 Premium Bachelor 2/28/11 Employed \n", + "... ... ... ... ... \n", + "10844 Extended High School or Below 2/7/11 Employed \n", + "10852 Basic Bachelor 1/27/11 Employed \n", + "10872 Extended High School or Below 2/7/11 Employed \n", + "10887 Basic High School or Below 2/3/11 Employed \n", + "10897 Basic Master 1/26/11 Employed \n", + "\n", + " Gender ... Number of Open Complaints Number of Policies \\\n", + "3 M ... 0.0 2 \n", + "8 M ... 0.0 1 \n", + "15 F ... 0.0 1 \n", + "19 F ... 1.0 1 \n", + "27 F ... 0.0 1 \n", + "... ... ... ... ... \n", + "10844 F ... 0.0 3 \n", + "10852 M ... 0.0 2 \n", + "10872 F ... 0.0 3 \n", + "10887 F ... 0.0 7 \n", + "10897 M ... 0.0 8 \n", + "\n", + " Policy Type Policy Renew Offer Type Sales Channel \\\n", + "3 Corporate Auto Corporate L3 Offer2 Branch \n", + "8 Personal Auto Personal L1 Offer2 Branch \n", + "15 Special Auto Special L1 Offer2 Branch \n", + "19 Personal Auto Personal L2 Offer2 Call Center \n", + "27 Personal Auto Personal L3 Offer2 Agent \n", + "... ... ... ... ... 
\n", + "10844 Personal Auto Personal L1 Offer2 Agent \n", + "10852 Personal Auto Personal L2 Offer2 Call Center \n", + "10872 Personal Auto Personal L3 Offer2 Agent \n", + "10887 Special Auto Special L2 Offer1 Agent \n", + "10897 Personal Auto Personal L1 Offer1 Agent \n", + "\n", + " Total Claim Amount Vehicle Class Vehicle Size Vehicle Type \n", + "3 484.013411 Four-Door Car Medsize A \n", + "8 739.200000 Sports Car Medsize NaN \n", + "15 547.200000 SUV Medsize NaN \n", + "19 19.575683 Two-Door Car Large A \n", + "27 60.036683 Four-Door Car Medsize NaN \n", + "... ... ... ... ... \n", + "10844 547.200000 Four-Door Car Medsize NaN \n", + "10852 791.878042 NaN NaN A \n", + "10872 547.200000 Four-Door Car Medsize NaN \n", + "10887 528.200860 SUV Small A \n", + "10897 158.077504 Sports Car Medsize A \n", + "\n", + "[1399 rows x 26 columns]" ] - }, + }, + "execution_count": 14, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "#1\n", + "mkt_customer_df_filt = mkt_customer_df[(mkt_customer_df[\"Total Claim Amount\"] < 1000) & (mkt_customer_df[\"Response\"] == \"Yes\")]\n", + "mkt_customer_df_filt" + ] + }, + { + "cell_type": "code", + "execution_count": 15, + "id": "84f78127-662b-43d3-8d05-56e6c1365b7b", + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "\n", + "RangeIndex: 10910 entries, 0 to 10909\n", + "Data columns (total 26 columns):\n", + " # Column Non-Null Count Dtype \n", + "--- ------ -------------- ----- \n", + " 0 Unnamed: 0 10910 non-null int64 \n", + " 1 Customer 10910 non-null object \n", + " 2 State 10279 non-null object \n", + " 3 Customer Lifetime Value 10910 non-null float64\n", + " 4 Response 10279 non-null object \n", + " 5 Coverage 10910 non-null object \n", + " 6 Education 10910 non-null object \n", + " 7 Effective To Date 10910 non-null object \n", + " 8 EmploymentStatus 10910 non-null object \n", + " 9 Gender 10910 non-null object \n", + " 10 Income 10910 non-null int64 
\n", + " 11 Location Code 10910 non-null object \n", + " 12 Marital Status 10910 non-null object \n", + " 13 Monthly Premium Auto 10910 non-null int64 \n", + " 14 Months Since Last Claim 10277 non-null float64\n", + " 15 Months Since Policy Inception 10910 non-null int64 \n", + " 16 Number of Open Complaints 10277 non-null float64\n", + " 17 Number of Policies 10910 non-null int64 \n", + " 18 Policy Type 10910 non-null object \n", + " 19 Policy 10910 non-null object \n", + " 20 Renew Offer Type 10910 non-null object \n", + " 21 Sales Channel 10910 non-null object \n", + " 22 Total Claim Amount 10910 non-null float64\n", + " 23 Vehicle Class 10288 non-null object \n", + " 24 Vehicle Size 10288 non-null object \n", + " 25 Vehicle Type 5428 non-null object \n", + "dtypes: float64(4), int64(5), object(17)\n", + "memory usage: 2.2+ MB\n" + ] + } + ], + "source": [ + "mkt_customer_df.info()" + ] + }, + { + "cell_type": "code", + "execution_count": 16, + "id": "92d5e376-9b91-40cb-b36f-0c3b366810af", + "metadata": {}, + "outputs": [ { - "cell_type": "markdown", - "id": "9c98ddc5-b041-4c94-ada1-4dfee5c98e50", - "metadata": { - "id": "9c98ddc5-b041-4c94-ada1-4dfee5c98e50" - }, - "source": [ - "1. Create a new DataFrame that only includes customers who:\n", - " - have a **low total_claim_amount** (e.g., below $1,000),\n", - " - have a response \"Yes\" to the last marketing campaign." + "data": { + "text/html": [ + "
\n", + "\n", + "\n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + "
mean
GenderPolicy Type
FCorporate Auto94.301775
Personal Auto98.998148
Special Auto92.314286
MCorporate Auto92.188312
Personal Auto91.085821
Special Auto86.343750
\n", + "
" + ], + "text/plain": [ + " mean\n", + "Gender Policy Type \n", + "F Corporate Auto 94.301775\n", + " Personal Auto 98.998148\n", + " Special Auto 92.314286\n", + "M Corporate Auto 92.188312\n", + " Personal Auto 91.085821\n", + " Special Auto 86.343750" ] - }, + }, + "execution_count": 16, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "#2\n", + "mkt_customer_df_filt2 = mkt_customer_df[(mkt_customer_df[\"Response\"] == \"Yes\")]\n", + "mkt_customer_df_filt2.groupby(['Gender', 'Policy Type'])['Monthly Premium Auto'].agg(['mean'])" + ] + }, + { + "cell_type": "code", + "execution_count": 17, + "id": "6b942d8a-bd63-4d45-89d8-448e19ac5aa0", + "metadata": {}, + "outputs": [ { - "cell_type": "markdown", - "id": "b9be383e-5165-436e-80c8-57d4c757c8c3", - "metadata": { - "id": "b9be383e-5165-436e-80c8-57d4c757c8c3" - }, - "source": [ - "2. Using the original Dataframe, analyze:\n", - " - the average `monthly_premium` and/or customer lifetime value by `policy_type` and `gender` for customers who responded \"Yes\", and\n", - " - compare these insights to `total_claim_amount` patterns, and discuss which segments appear most profitable or low-risk for the company." + "data": { + "text/html": [ + "
\n", + "\n", + "\n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + "
mean
State
Arizona8084.373066
California7951.655993
Nevada6827.612261
Oregon8063.150266
Washington7722.509147
\n", + "
" + ], + "text/plain": [ + " mean\n", + "State \n", + "Arizona 8084.373066\n", + "California 7951.655993\n", + "Nevada 6827.612261\n", + "Oregon 8063.150266\n", + "Washington 7722.509147" ] - }, + }, + "execution_count": 17, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "mkt_customer_df_filt3 = mkt_customer_df[(mkt_customer_df[\"Response\"] == \"Yes\")]\n", + "mkt_customer_df_filt3.groupby(['State'])['Customer Lifetime Value'].agg(['mean'])\t" + ] + }, + { + "cell_type": "code", + "execution_count": 7, + "id": "787bd73e-6f64-485f-919f-f3e6b2e515ec", + "metadata": {}, + "outputs": [ { - "cell_type": "markdown", - "id": "7050f4ac-53c5-4193-a3c0-8699b87196f0", - "metadata": { - "id": "7050f4ac-53c5-4193-a3c0-8699b87196f0" - }, - "source": [ - "3. Analyze the total number of customers who have policies in each state, and then filter the results to only include states where there are more than 500 customers." + "data": { + "text/html": [ + "
\n", + "\n", + "\n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + "
Monthly Premium AutoCustomer Lifetime ValueTotal Claim Amount
GenderPolicy Type
FCorporate Auto94.3017757712.628736433.738499
Personal Auto98.9981488339.791842452.965929
Special Auto92.3142867691.584111453.280164
MCorporate Auto92.1883127944.465414408.582459
Personal Auto91.0858217448.383281457.010178
Special Auto86.3437508247.088702429.527942
\n", + "
" + ], + "text/plain": [ + " Monthly Premium Auto Customer Lifetime Value \\\n", + "Gender Policy Type \n", + "F Corporate Auto 94.301775 7712.628736 \n", + " Personal Auto 98.998148 8339.791842 \n", + " Special Auto 92.314286 7691.584111 \n", + "M Corporate Auto 92.188312 7944.465414 \n", + " Personal Auto 91.085821 7448.383281 \n", + " Special Auto 86.343750 8247.088702 \n", + "\n", + " Total Claim Amount \n", + "Gender Policy Type \n", + "F Corporate Auto 433.738499 \n", + " Personal Auto 452.965929 \n", + " Special Auto 453.280164 \n", + "M Corporate Auto 408.582459 \n", + " Personal Auto 457.010178 \n", + " Special Auto 429.527942 " ] - }, + }, + "execution_count": 7, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "mkt_customer_df_filt4 = mkt_customer_df[(mkt_customer_df[\"Response\"] == \"Yes\")]\n", + "mkt_customer_df_filt4.groupby(['Gender', 'Policy Type'])[['Monthly Premium Auto', 'Customer Lifetime Value', 'Total Claim Amount']].agg('mean')" + ] + }, + { + "cell_type": "code", + "execution_count": 8, + "id": "8de37efe-cbf1-490c-a70e-8c527c5dc51d", + "metadata": {}, + "outputs": [], + "source": [ + "# The most profitable segments appear to be Personal Auto for women and Special Auto for men, since these have the highest average customer lifetime value within each gender." + ] + }, + { + "cell_type": "code", + "execution_count": 18, + "id": "b9ddce58-7e02-44a1-b200-be0cfe49c744", + "metadata": {}, + "outputs": [ { - "cell_type": "markdown", - "id": "b60a4443-a1a7-4bbf-b78e-9ccdf9895e0d", - "metadata": { - "id": "b60a4443-a1a7-4bbf-b78e-9ccdf9895e0d" - }, - "source": [ - "4. Find the maximum, minimum, and median customer lifetime value by education level and gender. Write your conclusions." 
+ "data": { + "text/plain": [ + "State\n", + "Arizona 1937\n", + "California 3552\n", + "Nevada 993\n", + "Oregon 2909\n", + "Washington 888\n", + "Name: Customer, dtype: int64" ] - }, + }, + "execution_count": 18, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "#3\n", + "mkt_df_states = mkt_customer_df.groupby('State')['Customer'].count()\n", + "mkt_df_states = mkt_df_states[mkt_df_states > 500]\n", + "mkt_df_states" + ] + }, + { + "cell_type": "code", + "execution_count": 19, + "id": "6248879f-ece2-41a6-ba85-c797e0ad521a", + "metadata": { + "scrolled": true + }, + "outputs": [ { - "cell_type": "markdown", - "id": "b42999f9-311f-481e-ae63-40a5577072c5", - "metadata": { - "id": "b42999f9-311f-481e-ae63-40a5577072c5" - }, - "source": [ - "## Bonus" + "data": { + "text/html": [ + "
\n", + "\n", + "\n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + "
maxminmedian
EducationGender
BachelorF73225.956521904.0008525640.505303
M67907.270501898.0076755548.031892
CollegeF61850.188031898.6836865623.611187
M61134.683071918.1197006005.847375
DoctorF44856.113972395.5700005332.462694
M32677.342842267.6040385577.669457
High School or BelowF55277.445892144.9215356039.553187
M83325.381191940.9812216286.731006
MasterF51016.067042417.7770325729.855012
M50568.259122272.3073105579.099207
\n", + "
" + ], + "text/plain": [ + " max min median\n", + "Education Gender \n", + "Bachelor F 73225.95652 1904.000852 5640.505303\n", + " M 67907.27050 1898.007675 5548.031892\n", + "College F 61850.18803 1898.683686 5623.611187\n", + " M 61134.68307 1918.119700 6005.847375\n", + "Doctor F 44856.11397 2395.570000 5332.462694\n", + " M 32677.34284 2267.604038 5577.669457\n", + "High School or Below F 55277.44589 2144.921535 6039.553187\n", + " M 83325.38119 1940.981221 6286.731006\n", + "Master F 51016.06704 2417.777032 5729.855012\n", + " M 50568.25912 2272.307310 5579.099207" ] + }, + "execution_count": 19, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "#4\n", + "mkt_customer_df.groupby(['Education', 'Gender'])['Customer Lifetime Value'].agg(['max', 'min', 'median'])" + ] + }, + { + "cell_type": "code", + "execution_count": 11, + "id": "7ec8dafe-b585-4936-9caa-728c1a5169bd", + "metadata": {}, + "outputs": [], + "source": [ + "# Generally, the higher the education level, the lower the customer lifetime value." + ] + }, + { + "cell_type": "code", + "execution_count": 22, + "id": "f7753c61-7d04-4742-a6eb-bd7899127824", + "metadata": {}, + "outputs": [ + { + "name": "stderr", + "output_type": "stream", + "text": [ + "C:\\Users\\julia\\AppData\\Local\\Temp\\ipykernel_18512\\1644421404.py:3: UserWarning: Could not infer format, so each element will be parsed individually, falling back to `dateutil`. To ensure parsing is consistent and as-expected, please specify a format.\n", + " mkt_customer_df['Effective To Date'] = pd.to_datetime(mkt_customer_df['Effective To Date'], errors='coerce')\n" + ] }, { - "cell_type": "markdown", - "id": "81ff02c5-6584-4f21-a358-b918697c6432", - "metadata": { - "id": "81ff02c5-6584-4f21-a358-b918697c6432" - }, - "source": [ - "5. The marketing team wants to analyze the number of policies sold by state and month. 
Present the data in a table where the months are arranged as columns and the states are arranged as rows." + "data": { + "text/html": [ + "
\n", + "\n", + "\n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + "
Number of Policies
Month | February | January
State
Arizona | 2864 | 3052
California | 4929 | 5673
Nevada | 1278 | 1493
Oregon | 3969 | 4697
Washington | 1225 | 1358
\n", + "
" + ], + "text/plain": [ + " Number of Policies \n", + "Month February January\n", + "State \n", + "Arizona 2864 3052\n", + "California 4929 5673\n", + "Nevada 1278 1493\n", + "Oregon 3969 4697\n", + "Washington 1225 1358" ] - }, + }, + "execution_count": 22, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "#5\n", + "\n", + "mkt_customer_df['Effective To Date'] = pd.to_datetime(mkt_customer_df['Effective To Date'], errors='coerce')\n", + "mkt_customer_df['Month'] = mkt_customer_df['Effective To Date'].dt.month_name()\n", + "mkt_pivot_df = mkt_customer_df.pivot_table(index='State', columns='Month', values=['Number of Policies'], aggfunc='sum')\n", + "mkt_pivot_df" + ] + }, + { + "cell_type": "code", + "execution_count": 24, + "id": "5713e323-bbad-4eeb-a6c0-479131bb7a3f", + "metadata": {}, + "outputs": [ { - "cell_type": "markdown", - "id": "b6aec097-c633-4017-a125-e77a97259cda", - "metadata": { - "id": "b6aec097-c633-4017-a125-e77a97259cda" - }, - "source": [ - "6. Display a new DataFrame that contains the number of policies sold by month, by state, for the top 3 states with the highest number of policies sold.\n", - "\n", - "*Hint:*\n", - "- *To accomplish this, you will first need to group the data by state and month, then count the number of policies sold for each group. Afterwards, you will need to sort the data by the count of policies sold in descending order.*\n", - "- *Next, you will select the top 3 states with the highest number of policies sold.*\n", - "- *Finally, you will create a new DataFrame that contains the number of policies sold by month for each of the top 3 states.*" + "data": { + "text/html": [ + "
\n", + "\n", + "\n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + "
Number of Policies
State
Arizona | 5916
California | 10602
Nevada | 2771
Oregon | 8666
Washington | 2583
\n", + "
" + ], + "text/plain": [ + " Number of Policies\n", + "State \n", + "Arizona 5916\n", + "California 10602\n", + "Nevada 2771\n", + "Oregon 8666\n", + "Washington 2583" ] - }, + }, + "execution_count": 24, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "#6\n", + "mkt_df_st = pd.DataFrame(mkt_customer_df.groupby(['State'])['Number of Policies'].sum())\n", + "mkt_df_st" + ] + }, + { + "cell_type": "code", + "execution_count": 25, + "id": "74cd214d-6396-4e0f-828a-b4aed762a63a", + "metadata": {}, + "outputs": [ { - "cell_type": "markdown", - "id": "ba975b8a-a2cf-4fbf-9f59-ebc381767009", - "metadata": { - "id": "ba975b8a-a2cf-4fbf-9f59-ebc381767009" - }, - "source": [ - "7. The marketing team wants to analyze the effect of different marketing channels on the customer response rate.\n", - "\n", - "Hint: You can use melt to unpivot the data and create a table that shows the customer response rate (those who responded \"Yes\") by marketing channel." + "data": { + "text/plain": [ + "Index(['California', 'Oregon', 'Arizona'], dtype='object', name='State')" ] - }, + }, + "execution_count": 25, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "mkt_df_st_month_filt = mkt_df_st.sort_values('Number of Policies', ascending=False).head(3).index\n", + "mkt_df_st_month_filt" + ] + }, + { + "cell_type": "code", + "execution_count": 26, + "id": "d59b9868-8708-472b-a8b7-34192d4350ca", + "metadata": {}, + "outputs": [ { - "cell_type": "markdown", - "id": "e4378d94-48fb-4850-a802-b1bc8f427b2d", - "metadata": { - "id": "e4378d94-48fb-4850-a802-b1bc8f427b2d" - }, - "source": [ - "External Resources for Data Filtering: https://towardsdatascience.com/filtering-data-frames-in-pandas-b570b1f834b9" + "data": { + "text/html": [ + "
\n", + "\n", + "\n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " 
\n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + "
Unnamed: 0 | Customer | State | Customer Lifetime Value | Response | Coverage | Education | Effective To Date | EmploymentStatus | Gender | ... | Number of Policies | Policy Type | Policy | Renew Offer Type | Sales Channel | Total Claim Amount | Vehicle Class | Vehicle Size | Vehicle Type | Month
0 | 0 | DK49336 | Arizona | 4809.216960 | No | Basic | College | 2011-02-18 | Employed | M | ... | 9 | Corporate Auto | Corporate L3 | Offer3 | Agent | 292.800000 | Four-Door Car | Medsize | NaN | February
1 | 1 | KX64629 | California | 2228.525238 | No | Basic | College | 2011-01-18 | Unemployed | F | ... | 1 | Personal Auto | Personal L3 | Offer4 | Call Center | 744.924331 | Four-Door Car | Medsize | NaN | January
3 | 3 | XL78013 | Oregon | 22332.439460 | Yes | Extended | College | 2011-01-11 | Employed | M | ... | 2 | Corporate Auto | Corporate L3 | Offer2 | Branch | 484.013411 | Four-Door Car | Medsize | A | January
4 | 4 | QA50777 | Oregon | 9025.067525 | No | Premium | Bachelor | 2011-01-17 | Medical Leave | F | ... | 7 | Personal Auto | Personal L2 | Offer1 | Branch | 707.925645 | Four-Door Car | Medsize | NaN | January
6 | 6 | IW72280 | California | 5035.035257 | No | Basic | Doctor | 2011-02-14 | Employed | F | ... | 4 | Corporate Auto | Corporate L2 | Offer2 | Branch | 287.556107 | Four-Door Car | Medsize | NaN | February
... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ...
10902 | 10902 | PP30874 | California | 3579.023825 | No | Extended | High School or Below | 2011-01-24 | Employed | F | ... | 1 | Personal Auto | Personal L2 | Offer2 | Agent | 655.200000 | Four-Door Car | Medsize | A | January
10903 | 10903 | SU71163 | Arizona | 2771.663013 | No | Basic | College | 2011-01-07 | Employed | M | ... | 1 | Personal Auto | Personal L2 | Offer2 | Branch | 355.200000 | Two-Door Car | Medsize | A | January
10906 | 10906 | KX53892 | Oregon | 5259.444853 | No | Basic | College | 2011-01-06 | Employed | F | ... | 6 | Personal Auto | Personal L3 | Offer2 | Branch | 273.018929 | Four-Door Car | Medsize | A | January
10907 | 10907 | TL39050 | Arizona | 23893.304100 | No | Extended | Bachelor | 2011-02-06 | Employed | F | ... | 2 | Corporate Auto | Corporate L3 | Offer1 | Web | 381.306996 | Luxury SUV | Medsize | NaN | February
10908 | 10908 | WA60547 | California | 11971.977650 | No | Premium | College | 2011-02-13 | Employed | F | ... | 6 | Personal Auto | Personal L1 | Offer1 | Branch | 618.288849 | SUV | Medsize | A | February
\n", + "

8398 rows × 27 columns

\n", + "
" + ], + "text/plain": [ + " Unnamed: 0 Customer State Customer Lifetime Value Response \\\n", + "0 0 DK49336 Arizona 4809.216960 No \n", + "1 1 KX64629 California 2228.525238 No \n", + "3 3 XL78013 Oregon 22332.439460 Yes \n", + "4 4 QA50777 Oregon 9025.067525 No \n", + "6 6 IW72280 California 5035.035257 No \n", + "... ... ... ... ... ... \n", + "10902 10902 PP30874 California 3579.023825 No \n", + "10903 10903 SU71163 Arizona 2771.663013 No \n", + "10906 10906 KX53892 Oregon 5259.444853 No \n", + "10907 10907 TL39050 Arizona 23893.304100 No \n", + "10908 10908 WA60547 California 11971.977650 No \n", + "\n", + " Coverage Education Effective To Date EmploymentStatus \\\n", + "0 Basic College 2011-02-18 Employed \n", + "1 Basic College 2011-01-18 Unemployed \n", + "3 Extended College 2011-01-11 Employed \n", + "4 Premium Bachelor 2011-01-17 Medical Leave \n", + "6 Basic Doctor 2011-02-14 Employed \n", + "... ... ... ... ... \n", + "10902 Extended High School or Below 2011-01-24 Employed \n", + "10903 Basic College 2011-01-07 Employed \n", + "10906 Basic College 2011-01-06 Employed \n", + "10907 Extended Bachelor 2011-02-06 Employed \n", + "10908 Premium College 2011-02-13 Employed \n", + "\n", + " Gender ... Number of Policies Policy Type Policy \\\n", + "0 M ... 9 Corporate Auto Corporate L3 \n", + "1 F ... 1 Personal Auto Personal L3 \n", + "3 M ... 2 Corporate Auto Corporate L3 \n", + "4 F ... 7 Personal Auto Personal L2 \n", + "6 F ... 4 Corporate Auto Corporate L2 \n", + "... ... ... ... ... ... \n", + "10902 F ... 1 Personal Auto Personal L2 \n", + "10903 M ... 1 Personal Auto Personal L2 \n", + "10906 F ... 6 Personal Auto Personal L3 \n", + "10907 F ... 2 Corporate Auto Corporate L3 \n", + "10908 F ... 
6 Personal Auto Personal L1 \n", + "\n", + " Renew Offer Type Sales Channel Total Claim Amount Vehicle Class \\\n", + "0 Offer3 Agent 292.800000 Four-Door Car \n", + "1 Offer4 Call Center 744.924331 Four-Door Car \n", + "3 Offer2 Branch 484.013411 Four-Door Car \n", + "4 Offer1 Branch 707.925645 Four-Door Car \n", + "6 Offer2 Branch 287.556107 Four-Door Car \n", + "... ... ... ... ... \n", + "10902 Offer2 Agent 655.200000 Four-Door Car \n", + "10903 Offer2 Branch 355.200000 Two-Door Car \n", + "10906 Offer2 Branch 273.018929 Four-Door Car \n", + "10907 Offer1 Web 381.306996 Luxury SUV \n", + "10908 Offer1 Branch 618.288849 SUV \n", + "\n", + " Vehicle Size Vehicle Type Month \n", + "0 Medsize NaN February \n", + "1 Medsize NaN January \n", + "3 Medsize A January \n", + "4 Medsize NaN January \n", + "6 Medsize NaN February \n", + "... ... ... ... \n", + "10902 Medsize A January \n", + "10903 Medsize A January \n", + "10906 Medsize A January \n", + "10907 Medsize NaN February \n", + "10908 Medsize A February \n", + "\n", + "[8398 rows x 27 columns]" ] - }, + }, + "execution_count": 26, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "filtered_df_top3 = mkt_customer_df[mkt_customer_df['State'].isin(mkt_df_st_month_filt)]\n", + "filtered_df_top3" + ] + }, + { + "cell_type": "code", + "execution_count": 27, + "id": "9350226e-1c85-4d90-a8b3-adc2a60e827d", + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Month February January\n", + "State \n", + "Arizona 2864 3052\n", + "California 4929 5673\n", + "Oregon 3969 4697\n" + ] + } + ], + "source": [ + "mkt_df_st_month_top3 = (\n", + " filtered_df_top3\n", + " .groupby(['State', 'Month'])['Number of Policies']\n", + " .sum()\n", + " .unstack(fill_value=0)\n", + ")\n", + "\n", + "print(mkt_df_st_month_top3)" + ] + }, + { + "cell_type": "code", + "execution_count": 41, + "id": "f6d406dd-98d1-4c8b-bf5c-4ee2bfc43826", + "metadata": {}, + "outputs": [ { 
- "cell_type": "code", - "execution_count": null, - "id": "449513f4-0459-46a0-a18d-9398d974c9ad", - "metadata": { - "id": "449513f4-0459-46a0-a18d-9398d974c9ad" - }, - "outputs": [], - "source": [ - "# your code goes here" + "data": { + "text/html": [ + "
\n", + "\n", + "\n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + "
Sales Channel | Response | Count
0 | Agent | No | 3148
1 | Agent | Yes | 742
2 | Branch | No | 2539
3 | Branch | Yes | 326
4 | Call Center | No | 1792
5 | Call Center | Yes | 221
6 | Web | No | 1334
7 | Web | Yes | 177
\n", + "
" + ], + "text/plain": [ + " Sales Channel Response Count\n", + "0 Agent No 3148\n", + "1 Agent Yes 742\n", + "2 Branch No 2539\n", + "3 Branch Yes 326\n", + "4 Call Center No 1792\n", + "5 Call Center Yes 221\n", + "6 Web No 1334\n", + "7 Web Yes 177" ] + }, + "execution_count": 41, + "metadata": {}, + "output_type": "execute_result" } - ], - "metadata": { - "colab": { - "provenance": [] - }, - "kernelspec": { - "display_name": "Python 3 (ipykernel)", - "language": "python", - "name": "python3" - }, - "language_info": { - "codemirror_mode": { - "name": "ipython", - "version": 3 - }, - "file_extension": ".py", - "mimetype": "text/x-python", - "name": "python", - "nbconvert_exporter": "python", - "pygments_lexer": "ipython3", - "version": "3.9.13" - } - }, - "nbformat": 4, - "nbformat_minor": 5 + ], + "source": [ + "#7\n", + "response_counts = mkt_customer_df.groupby(['Sales Channel', 'Response']).size().reset_index(name='Count')\n", + "response_counts" + ] + }, + { + "cell_type": "code", + "execution_count": 42, + "id": "06ead3e9-5a38-4c0f-8236-1e2106884224", + "metadata": {}, + "outputs": [ + { + "data": { + "text/html": [ + "
\n", + "\n", + "\n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + "
Response | Sales Channel | No | Yes
0 | Agent | 3148 | 742
1 | Branch | 2539 | 326
2 | Call Center | 1792 | 221
3 | Web | 1334 | 177
\n", + "
" + ], + "text/plain": [ + "Response Sales Channel No Yes\n", + "0 Agent 3148 742\n", + "1 Branch 2539 326\n", + "2 Call Center 1792 221\n", + "3 Web 1334 177" + ] + }, + "execution_count": 42, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "pivot_counts = response_counts.pivot(index='Sales Channel', columns='Response', values='Count').reset_index()\n", + "pivot_counts" + ] + }, + { + "cell_type": "code", + "execution_count": 44, + "id": "b60cae33-6bd3-43e8-86a9-a7caf0fec21c", + "metadata": {}, + "outputs": [ + { + "data": { + "text/html": [ + "
\n", + "\n", + "\n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + "
Sales Channel | Response | Count
0 | Agent | Yes | 742
1 | Branch | Yes | 326
2 | Call Center | Yes | 221
3 | Web | Yes | 177
4 | Agent | No | 3148
5 | Branch | No | 2539
6 | Call Center | No | 1792
7 | Web | No | 1334
\n", + "
" + ], + "text/plain": [ + " Sales Channel Response Count\n", + "0 Agent Yes 742\n", + "1 Branch Yes 326\n", + "2 Call Center Yes 221\n", + "3 Web Yes 177\n", + "4 Agent No 3148\n", + "5 Branch No 2539\n", + "6 Call Center No 1792\n", + "7 Web No 1334" + ] + }, + "execution_count": 44, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "melted_counts = pd.melt(pivot_counts, id_vars=['Sales Channel'], value_vars=['Yes', 'No'], var_name='Response', value_name='Count')\n", + "melted_counts" + ] + } + ], + "metadata": { + "colab": { + "provenance": [] + }, + "kernelspec": { + "display_name": "Python 3 (ipykernel)", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.13.5" + } + }, + "nbformat": 4, + "nbformat_minor": 5 }
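
Task 7 asks for the customer response rate by marketing channel, and the cells in this diff stop at raw Yes/No counts. The following is a minimal sketch of one way the rate could be derived, assuming the counts shown in the notebook output are correct. The frame is rebuilt by hand from that output so the snippet runs on its own, and the `Response Rate` column name is an assumption introduced here, not something from the notebook.

```python
import pandas as pd

# Counts copied from the task 7 output in the notebook (Sales Channel x Response).
pivot_counts = pd.DataFrame({
    "Sales Channel": ["Agent", "Branch", "Call Center", "Web"],
    "No": [3148, 2539, 1792, 1334],
    "Yes": [742, 326, 221, 177],
})

# Response rate = share of customers in each channel who answered "Yes".
# "Response Rate" is a hypothetical column name used only for this sketch.
pivot_counts["Response Rate"] = pivot_counts["Yes"] / (
    pivot_counts["Yes"] + pivot_counts["No"]
)

# Rank channels from highest to lowest response rate.
response_rate = (
    pivot_counts.set_index("Sales Channel")["Response Rate"]
    .sort_values(ascending=False)
)
print(response_rate.round(4))
```

In the notebook itself, the same column could presumably be added directly to the existing `pivot_counts` frame before the `pd.melt` call, so that the rate sits next to the counts it is derived from.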