Predicates in itertools and filter #102105

Closed
pochmann opened this issue Feb 21, 2023 · 2 comments

@pochmann
Contributor

pochmann commented Feb 21, 2023

Documentation

filterfalse

The doc says "Make an iterator that filters elements from iterable returning only those for which the predicate is False". That's not correct: it also returns elements for which the predicate returns any other false value:

>>> from itertools import filterfalse
>>> list(filterfalse(lambda x: x, [False, 0, 1, '', ()]))
[False, 0, '', ()]
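
To make the difference concrete, here is a minimal sketch contrasting filterfalse with a literal reading of "the predicate is False". The only_false helper is hypothetical and exists only to show what the current wording would imply; it is not part of itertools.

```python
from itertools import filterfalse

data = [False, 0, 1, '', ()]

def identity(x):
    return x

def only_false(pred, iterable):
    # Hypothetical helper: keeps an element only if pred returns the False object itself.
    return (x for x in iterable if pred(x) is False)

# What the doc wording literally describes: only the False singleton survives.
literal = list(only_false(identity, data))

# What filterfalse actually does: every element with a false predicate value survives.
actual = list(filterfalse(identity, data))
```

Under the literal reading only one element is kept, while filterfalse keeps four.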

quantify

This recipe has the opposite issue:

def quantify(iterable, pred=bool):
    "Count how many times the predicate is true"
    return sum(map(pred, iterable))

It says "true", but it only really works with True. Other predicates, like re.compile(...).match returning re.Match objects or None, don't work and quantify crashes. Predicates returning numbers other than 1 and 0 don't crash it but lead to wrong "counts". So quantify is limited. A pity and an odd outlier, since the other six tools/recipes and filter all support Python's general truth value concept.
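
Both failure modes can be reproduced with a short sketch. The word list and the variable names below are made up for illustration; quantify is the recipe exactly as quoted above.

```python
import re

def quantify(iterable, pred=bool):
    "Count how many times the predicate is true"
    return sum(map(pred, iterable))

words = ["apple", "banana", "avocado"]

# A predicate returning re.Match objects or None: sum() cannot add a Match to an int.
try:
    quantify(words, re.compile("a").match)
    crashed = False
except TypeError:
    crashed = True

# A predicate returning arbitrary numbers: the values are summed, not counted.
wrong_count = quantify(words, len)  # 5 + 6 + 7 rather than 3
```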

The easy fix is to map bool over the predicate values:

def quantify(iterable, pred=bool):
    "Count how many times the predicate is true"
    return sum(map(bool, map(pred, iterable)))
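
A quick check of this version, as a sketch with made-up sample data, suggests it handles predicates that return arbitrary true or false values:

```python
import re

def quantify(iterable, pred=bool):
    "Count how many times the predicate is true"
    return sum(map(bool, map(pred, iterable)))

words = ["apple", "banana", "avocado"]

# re.Match objects are true, None is false, so this counts words starting with "a".
starts_with_a = quantify(words, re.compile("a").match)

# Non-zero lengths are true, so this counts the non-empty words.
non_empty = quantify(words, len)
```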

I don't see anything "nicer" with the available tools/recipes. If only there were a length recipe, then we could do:

def quantify(iterable, pred=bool):
    "Count how many times the predicate is true"
    return length(filter(pred, iterable))
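
For illustration only, one possible shape for such a recipe is sketched below. The name length and its implementation are assumptions; no such recipe exists in the itertools docs.

```python
import re

def length(iterable):
    # Hypothetical recipe: count the items of any iterable by consuming it.
    return sum(1 for _ in iterable)

def quantify(iterable, pred=bool):
    "Count how many times the predicate is true"
    return length(filter(pred, iterable))

words = ["apple", "banana", "avocado"]
count_a = quantify(words, re.compile("a").match)
count_all = quantify(words, len)
```

Because filter already applies Python's general truth value test, this variant needs no explicit bool call.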

filter

The doc says:

filter(function, iterable)
Construct an iterator from those elements of iterable for which function returns true.

The phrase "returns true" is imprecise, since function may return any true value, not only True. I'd say "returns a true value" or "for which function is true".

The last paragraph has the same issue:

See itertools.filterfalse() for the complementary function that returns elements of iterable for which function returns false.
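
As a small sketch of why the wording matters, the predicate below (str.strip, chosen here only as an example) never returns True or False, yet filter and filterfalse still split the input as expected:

```python
from itertools import filterfalse

lines = ["  ", "x ", "", " y"]

# str.strip returns a string: non-empty strings are true, empty strings are false.
kept = list(filter(str.strip, lines))
dropped = list(filterfalse(str.strip, lines))
```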

pochmann added the docs label on Feb 21, 2023
@picnixz
Member

picnixz commented Feb 21, 2023

You can patch quantify as follows if you don't care about efficiency:

def quantify(iterable, pred=bool):
    "Count how many times the predicate is true"
    return sum(1 for item in iterable if pred(item))
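
As a rough sanity check, the generator-expression version can be compared with the bool-mapping version on a few predicates. The function names quantify_genexp and quantify_mapped and the sample data are hypothetical, used only to keep the two variants apart.

```python
import re

def quantify_genexp(iterable, pred=bool):
    # Counts one for each item whose predicate value is true.
    return sum(1 for item in iterable if pred(item))

def quantify_mapped(iterable, pred=bool):
    # Converts each predicate value to True/False before summing.
    return sum(map(bool, map(pred, iterable)))

data = ["apple", "", "banana", "avocado"]
predicates = [bool, len, re.compile("a").match, str.upper]

# One (genexp, mapped) pair of counts per predicate.
results = [(quantify_genexp(data, p), quantify_mapped(data, p)) for p in predicates]
```

On this data both variants agree for every predicate, including those returning strings, numbers, or match objects.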

Remark

Formally speaking, a predicate must return True or False (in the sense of the language). We loosely call a function a predicate if it returns something which can be implicitly interpreted as a boolean value.

I think this is not limited to this specific part of the documentation, and there are probably other locations where we write "true" instead of True or vice versa. For instance, we could add the "predicate" term to the glossary and say that, for a predicate, returning "true" or "True" has the same meaning.

@rhettinger
Contributor

I concur that the "is False" (uppercase) in the filterfalse() doc is incorrect. Interestingly, the summary at the top of the page is correct and so is the docstring. While "predicate that returns a false value" is technically the most correct, it is more readable to say "is false" (lowercase f).

For quantify, just use the uppercase True. The intent of the tool is to count (sum) actual True and False values which are equivalent to 1 and 0 respectively.
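
The equivalence mentioned here can be seen in a short sketch: bool is a subclass of int, so summing actual True and False values counts the True ones. The flags list is made-up sample data.

```python
flags = [True, False, True, True]

# True behaves as 1 and False as 0 in arithmetic, so sum() counts the True values.
true_count = sum(flags)

# The same relationship, stated directly.
bool_is_int = issubclass(bool, int)
```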

Please make a PR and assign to me for review. Thanks for finding these. It's interesting that no one noticed in the 12 years they have been published.
