Binary file added .DS_Store
Binary file not shown.
4,457 changes: 4,457 additions & 0 deletions .ipynb_checkpoints/EDA Notebook-checkpoint.ipynb

Large diffs are not rendered by default.

Binary file added Data Intake Report.pdf
Binary file not shown.
4,457 changes: 4,457 additions & 0 deletions EDA Notebook.ipynb

Large diffs are not rendered by default.

2 changes: 2 additions & 0 deletions README.md
@@ -1 +1,3 @@
# G2M Cab DataSets
## Resources
https://github.com/DataGlacier/DataSets.git
21 changes: 21 additions & 0 deletions city_data_cleaned.csv
@@ -0,0 +1,21 @@
,City,Population,Users
0,NEW YORK NY," 8,405,837 "," 302,149 "
1,CHICAGO IL," 1,955,130 "," 164,468 "
2,LOS ANGELES CA," 1,595,037 "," 144,132 "
3,MIAMI FL," 1,339,155 "," 17,675 "
4,SILICON VALLEY," 1,177,609 "," 27,247 "
5,ORANGE COUNTY," 1,030,185 "," 12,994 "
6,SAN DIEGO CA," 959,307 "," 69,995 "
7,PHOENIX AZ," 943,999 "," 6,133 "
8,DALLAS TX," 942,908 "," 22,157 "
9,ATLANTA GA," 814,885 "," 24,701 "
10,DENVER CO," 754,233 "," 12,421 "
11,AUSTIN TX," 698,371 "," 14,978 "
12,SEATTLE WA," 671,238 "," 25,063 "
13,TUCSON AZ," 631,442 "," 5,712 "
14,SAN FRANCISCO CA," 629,591 "," 213,609 "
15,SACRAMENTO CA," 545,776 "," 7,044 "
16,PITTSBURGH PA," 542,085 "," 3,643 "
17,WASHINGTON DC," 418,859 "," 127,001 "
18,NASHVILLE TN," 327,225 "," 9,270 "
19,BOSTON MA," 248,968 "," 80,021 "
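
Note that `Population` and `Users` are stored as quoted strings with thousands separators and surrounding spaces, so they load as text rather than numbers. A minimal sketch, assuming the column names shown above, for converting them to integers after loading:

```python
import pandas as pd

# The first, unnamed column is the saved DataFrame index
city = pd.read_csv('city_data_cleaned.csv', index_col=0)

# Values arrive as strings like " 8,405,837 ": strip the padding and
# thousands separators before casting to int
for col in ['Population', 'Users']:
    city[col] = (
        city[col]
        .str.strip()
        .str.replace(',', '', regex=False)
        .astype(int)
    )
```
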
359,393 changes: 359,393 additions & 0 deletions data_cleaning.csv

Large diffs are not rendered by default.

15 changes: 15 additions & 0 deletions data_cleaning.py
@@ -0,0 +1,15 @@
import pandas as pd

# Load the three source tables
df1 = pd.read_csv('Cab_Data.csv')
df2 = pd.read_csv('Transaction_ID.csv')
df3 = pd.read_csv('Customer_ID.csv')

# Inner-join trips with transactions, then attach customer details
inner = pd.merge(df1, df2)
new_dataframe = pd.merge(inner, df3)

# Drop the stray "0" column (presumably a leftover index column)
new_dataframe_out = new_dataframe.drop(["0"], axis=1)
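
As written, the script builds `new_dataframe_out` but never writes it to disk. Since this PR also adds `data_cleaning.csv`, the merged table is presumably persisted with a final step along these lines (file name and index handling assumed, not shown in the commit):

```python
# Assumed final step (not in the snippet above): save the merged table,
# which would produce the data_cleaning.csv added in this PR
new_dataframe_out.to_csv('data_cleaning.csv', index=False)
```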





5 changes: 5 additions & 0 deletions data_cleaning_2.py
@@ -0,0 +1,5 @@
import pandas as pd

# Load the raw city table and re-save it under the cleaned-data name
df = pd.read_csv('City.csv')

df.to_csv('city_data_cleaned.csv')
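
Note that `to_csv` writes the DataFrame index by default, which is where the unnamed leading column in `city_data_cleaned.csv` comes from. If that column is not actually wanted, a minimal variant (an assumption, not the committed script) would be:

```python
import pandas as pd

# Variant sketch: skip the index column on save so the output keeps
# only City, Population and Users
df = pd.read_csv('City.csv')
df.to_csv('city_data_cleaned.csv', index=False)
```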