@villanueval
Contributor

Setting sex to x is not used in the next step of the challenge.


When I took the course, challenge 1 in the "Challenge - Putting it all together" section of the lesson 03-index-slice-subset was a bit confusing.

  1. Create a new DataFrame that only contains observations with sex values that
    are not female or male. Assign each sex value in the new DataFrame to a
    new value of 'x'. Determine the number of null values in the subset.

The Lesson guide shows this solution:

new = surveys_df[~surveys_df['sex'].isin(['M', 'F'])].copy()
new['sex']='x'
print(len(new))

This returns 2511, which is the same as:

sum(surveys_df['sex'].isnull())

However, as written in the lesson guide, setting 'sex' to 'x' serves no purpose: len() counts the rows of the whole new DataFrame, so the 'sex' column is not used to count anything. Setting the 'sex' column to 'x' and then asking about the null values in the DataFrame was also confusing, because the question could be asking for the null values in 'sex' only:

print(len(new[pd.isnull(new['sex'])]['sex']))

which returns 0, or in all the columns of the new DataFrame:

print(len(new[pd.isnull(new).any(axis=1)]))

which returns 2449.
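To make the three readings concrete, here is a minimal sketch on a made-up DataFrame. The name toy_df and its values are assumptions for illustration only; it is not the real surveys data, so the counts differ from the ones above.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data standing in for surveys_df (not the real survey file)
toy_df = pd.DataFrame({
    "sex": ["M", "F", np.nan, np.nan, "M"],
    "weight": [40.0, np.nan, 12.0, np.nan, 35.0],
})

# Same steps as the lesson guide solution
new = toy_df[~toy_df["sex"].isin(["M", "F"])].copy()
new["sex"] = "x"

# Reading 1: number of rows in the subset (what len(new) reports)
rows_in_subset = len(new)

# Reading 2: null values in the 'sex' column only (always 0 after assigning 'x')
nulls_in_sex = int(new["sex"].isnull().sum())

# Reading 3: rows of the subset with a null value in any column
rows_with_any_null = int(new.isnull().any(axis=1).sum())

print(rows_in_subset, nulls_in_sex, rows_with_any_null)
```

On this toy data the three readings give three different numbers, which is the ambiguity described above.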

With the proposed edit to the lesson, the solution in the lesson guide could be:

new = surveys_df[~surveys_df['sex'].isin(['M', 'F'])].copy()
# Print the number of rows in the new DataFrame
new_no_rows = len(new)
print(new_no_rows)

2511

# Count the rows in surveys_df that have a null value in sex
surveys_df_sexnull = sum(surveys_df['sex'].isnull())

# Compare the two counts
new_no_rows == surveys_df_sexnull

True
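One caveat on the comparison, sketched on made-up data: the two counts match only when every 'sex' value that is not 'M' or 'F' is null. The toy_df frame and the 'U' value below are hypothetical and are not taken from the surveys data.

```python
import numpy as np
import pandas as pd

# Hypothetical data with one non-null value ('U') that is neither 'M' nor 'F'
toy_df = pd.DataFrame({"sex": ["M", "F", np.nan, "U", np.nan]})

# Rows whose sex is not 'M' or 'F' (null values and 'U' both qualify)
not_m_or_f = int((~toy_df["sex"].isin(["M", "F"])).sum())

# Rows whose sex is null
null_sex = int(toy_df["sex"].isnull().sum())

print(not_m_or_f, null_sex, not_m_or_f == null_sex)
```

Here the counts differ, so the equality check doubles as a quick test that the column holds nothing other than 'M', 'F', and nulls.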

Hope this makes sense, or maybe I missed something?

@wrightaprilm
Contributor

Sorry I'm so late getting back on this - busy month. I like these edits, and I think they make the challenge clearer. Happy if you are, @maxim-belkin

@wrightaprilm wrightaprilm requested a review from maxim-belkin May 17, 2019 15:53
@tobyhodges tobyhodges merged commit 5650e97 into datacarpentry:gh-pages Feb 28, 2023
zkamvar pushed a commit that referenced this pull request May 8, 2023
Clarifying challenge and removing unnecessary step