ENH add sample #2419 #7274

Closed
wants to merge 2 commits into from
hayd
Contributor

@hayd hayd commented May 29, 2014

fixes #2419

Hmm, np.random.choice is not available on numpy < 1.7.

@hayd hayd added this to the 0.14.1 milestone May 29, 2014
@@ -297,3 +297,20 @@ def _bucket_labels(series, k):
mat[v] = i

return mat + 1


def choice(arr, size, replace):
Contributor Author

Forgot that arr can be an int. Also, I should make this private.

Member

We don't need this anymore since we require numpy 1.7 now.

@jreback
Contributor

jreback commented Jan 18, 2015

Closing for now. @hayd, if you want to update this at some point, that's fine.

@jreback jreback closed this Jan 18, 2015
except ImportError:
from pandas.stats.misc import choice
msk = choice(len(self), size, replace=replace)
return self.iloc[msk]
Member

This should use take, which is faster
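
To make the suggestion concrete, the sketch below shows that take and iloc select the same rows when given an array of integer positions. The frame and the positions are made up for illustration, and the speed claim is the reviewer's; nothing is timed here.

```python
import numpy as np
import pandas as pd

# Toy frame and positions, invented for this example.
df = pd.DataFrame({"a": np.arange(5), "b": list("vwxyz")})
locs = np.array([3, 0, 3])  # positional indices; repeats are allowed

by_iloc = df.iloc[locs]
by_take = df.take(locs, axis=0)

# Both select rows by position and keep the original index labels.
assert by_iloc.equals(by_take)
```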

@shoyer
Member

shoyer commented Jan 21, 2015

Just wrote a little utility function to do this, so I may pick this up.

def sample_n(df, size, replace=False, weight=None, seed=None):
    rs = np.random.RandomState(seed)
    locs = rs.choice(df.shape[0], size=size, replace=replace, p=weight)
    return df.take(locs, axis=0)
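
A short usage sketch of the function above, with the imports it needs. The example frame, seeds, and weights are assumptions chosen for illustration. Note that np.random.RandomState.choice expects the p argument to sum to 1, so any weights passed in must already be normalized.

```python
import numpy as np
import pandas as pd


def sample_n(df, size, replace=False, weight=None, seed=None):
    rs = np.random.RandomState(seed)
    locs = rs.choice(df.shape[0], size=size, replace=replace, p=weight)
    return df.take(locs, axis=0)


df = pd.DataFrame({"x": range(10)})  # toy data for the example

# The same seed gives the same sample.
a = sample_n(df, 3, seed=42)
b = sample_n(df, 3, seed=42)

# Weighted sampling with replacement: all weight on row position 2.
w = np.zeros(10)
w[2] = 1.0
c = sample_n(df, 4, replace=True, weight=w, seed=0)
```

Without replacement the sampled rows are distinct, and with all the weight on one position every draw returns that row.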

Development

Successfully merging this pull request may close these issues.

Series/DataFrame sample method with/without replacement
4 participants