Skip to content

refactor(profiling): type pprof exporter and fix type comparison #2987

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 5 commits into from
Nov 15, 2021

Conversation

jd
Copy link
Contributor

@jd jd commented Nov 5, 2021

There was an ongoing issue wih the pprof encode mixing type between strings,
ints and None when dealing with labels. This has been fixed in multiple places
and in multiple occasion, but there were still holes.

When encoding Events to pprof, they are grouped by event fields. Those fields
might be str, int or something else. In order to group those events, all the
fields used for grouping must be:

  1. hashable (because we index in a dict)
  2. sortable (because we sort event to have reproducible)
  3. string (because we store those as labels in pprof)

The problem is that most value type used in events are hashable (None, int or
str) but sortability often fails when one of the field is None: you can't
compare int with None, making sorted() call fails like in #2962.

As we need string in the end, the fix is to convert everything to strings as
soon as possible when grouping events: that makes sure conditions 1, 2 and 3
are validated ASAP and everything works.

To do this, the code has been updated and typed to catch any future error.

Fixes #2962

@jd jd requested a review from a team as a code owner November 5, 2021 09:58
@jd jd force-pushed the profiling/type-pprof branch from 93eb939 to 21af6bf Compare November 5, 2021 09:58
@jd jd added the changelog/no-changelog A changelog entry is not required for this PR. label Nov 5, 2021
@jd jd force-pushed the profiling/type-pprof branch 3 times, most recently from 28c6d3e to abcbf70 Compare November 5, 2021 17:09
brettlangdon
brettlangdon previously approved these changes Nov 5, 2021
@mergify
Copy link
Contributor

mergify bot commented Nov 5, 2021

@jd this pull request is now in conflict 😩

@mergify
Copy link
Contributor

mergify bot commented Nov 5, 2021

@jd this pull request is now in conflict 😩

@mergify mergify bot added the conflict label Nov 5, 2021
P403n1x87
P403n1x87 previously approved these changes Nov 8, 2021
brettlangdon
brettlangdon previously approved these changes Nov 8, 2021
@jd jd dismissed stale reviews from brettlangdon and P403n1x87 via 18c8c56 November 9, 2021 10:47
@jd jd force-pushed the profiling/type-pprof branch from 3ab82cb to 18c8c56 Compare November 9, 2021 10:47
@mergify mergify bot removed the conflict label Nov 9, 2021
@jd
Copy link
Contributor Author

jd commented Nov 9, 2021

It looks like it's a real failure, I'll take a look.

There was an ongoing issue wih the pprof encode mixing type between strings,
ints and None when dealing with labels. This has been fixed in multiple places
and in multiple occasion, but there were still holes.

When encoding Events to pprof, they are grouped by event fields. Those fields
might be str, int or something else. In order to group those events, all the
fields used for grouping must be:

1. hashable (because we index in a dict)
2. sortable (because we sort event to have reproducible)
3. string (because we store those as labels in pprof)

The problem is that most value type used in events are hashable (None, int or
str) but sortability often fails when one of the field is None: you can't
compare `int` with `None`, making sorted() call fails like in DataDog#2962.

As we need string in the end, the fix is to convert everything to strings as
soon as possible when grouping events: that makes sure conditions 1, 2 and 3
are validated ASAP and everything works.

To do this, the code has been updated and typed to catch any future error.

Fixes DataDog#2962
@jd jd force-pushed the profiling/type-pprof branch from 5988a19 to 2c5555a Compare November 10, 2021 14:35
@jd jd requested a review from P403n1x87 November 15, 2021 10:49
@mergify mergify bot merged commit 6520212 into DataDog:master Nov 15, 2021
@jd
Copy link
Contributor Author

jd commented Dec 10, 2021

@Mergifyio backport 0.56

mergify bot added a commit that referenced this pull request Dec 10, 2021
There was an ongoing issue wih the pprof encode mixing type between strings,
ints and None when dealing with labels. This has been fixed in multiple places
and in multiple occasion, but there were still holes.

When encoding Events to pprof, they are grouped by event fields. Those fields
might be str, int or something else. In order to group those events, all the
fields used for grouping must be:

1. hashable (because we index in a dict)
2. sortable (because we sort event to have reproducible)
3. string (because we store those as labels in pprof)

The problem is that most value type used in events are hashable (None, int or
str) but sortability often fails when one of the field is None: you can't
compare `int` with `None`, making sorted() call fails like in #2962.

As we need string in the end, the fix is to convert everything to strings as
soon as possible when grouping events: that makes sure conditions 1, 2 and 3
are validated ASAP and everything works.

To do this, the code has been updated and typed to catch any future error.

Fixes #2962

Co-authored-by: Brett Langdon <[email protected]>
Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com>
(cherry picked from commit 6520212)
@mergify
Copy link
Contributor

mergify bot commented Dec 10, 2021

backport 0.56

✅ Backports have been created

mergify bot added a commit that referenced this pull request Dec 10, 2021
…) (#3062)

There was an ongoing issue wih the pprof encode mixing type between strings,
ints and None when dealing with labels. This has been fixed in multiple places
and in multiple occasion, but there were still holes.

When encoding Events to pprof, they are grouped by event fields. Those fields
might be str, int or something else. In order to group those events, all the
fields used for grouping must be:

1. hashable (because we index in a dict)
2. sortable (because we sort event to have reproducible)
3. string (because we store those as labels in pprof)

The problem is that most value type used in events are hashable (None, int or
str) but sortability often fails when one of the field is None: you can't
compare `int` with `None`, making sorted() call fails like in #2962.

As we need string in the end, the fix is to convert everything to strings as
soon as possible when grouping events: that makes sure conditions 1, 2 and 3
are validated ASAP and everything works.

To do this, the code has been updated and typed to catch any future error.

Fixes #2962

Co-authored-by: Brett Langdon <[email protected]>
Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com>
(cherry picked from commit 6520212)

Co-authored-by: Julien Danjou <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
changelog/no-changelog A changelog entry is not required for this PR.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Profiling pprof converter is raising exception when sorting _locations
3 participants