CLOUDP-317911 Add pprof integration in operator #101

Merged
merged 3 commits into from
May 15, 2025

Conversation

MaciejKaras
Collaborator

@MaciejKaras MaciejKaras commented May 9, 2025

Summary

pprof can be configured by two new environment variables:

  • MDB_OPERATOR_PPROF_ENABLED - together with OPERATOR_ENV, controls whether the pprof server is started. The rule is defined in the IsPprofEnabled function:
    // IsPprofEnabled checks if pprof is enabled based on the PPROF_ENABLED
    // and OPERATOR_ENV environment variables. It returns true if:
    // - PPROF_ENABLED is set to true
    // - OPERATOR_ENV is set to dev or local and PPROF_ENABLED is not set
    // Otherwise, it returns false.
  • MDB_OPERATOR_PPROF_PORT - the port the pprof server listens on; defaults to 10081
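
The rule above can be sketched roughly as follows. This is a minimal illustration, not the operator's actual code: the function names isPprofEnabled and pprofPort, their string-based signatures, the use of strconv.ParseBool, and the fallback to the default port on an invalid value are all assumptions made for the example. Only the enabling rule and the default of 10081 come from the description.

```go
package main

import "strconv"

// defaultPprofPort mirrors the default stated for MDB_OPERATOR_PPROF_PORT.
const defaultPprofPort = 10081

// isPprofEnabled is a hypothetical restatement of the documented rule.
// pprofEnabled is the raw value of MDB_OPERATOR_PPROF_ENABLED ("" when unset)
// and operatorEnv is the raw value of OPERATOR_ENV.
func isPprofEnabled(pprofEnabled, operatorEnv string) bool {
	if pprofEnabled != "" {
		// Assumption: an explicit value always wins, and anything that
		// does not parse as a boolean is treated as "disabled".
		enabled, err := strconv.ParseBool(pprofEnabled)
		return err == nil && enabled
	}
	// Not set: fall back to the environment-based default.
	switch operatorEnv {
	case "dev", "local":
		return true
	}
	return false
}

// pprofPort parses the raw value of MDB_OPERATOR_PPROF_PORT.
// Assumption: empty or invalid values fall back to the default port.
func pprofPort(raw string) int {
	port, err := strconv.Atoi(raw)
	if err != nil || port < 1 || port > 65535 {
		return defaultPprofPort
	}
	return port
}
```

A caller would typically pass os.Getenv("MDB_OPERATOR_PPROF_ENABLED") and os.Getenv("OPERATOR_ENV"). Taking plain strings keeps the rule easy to cover with table-driven unit tests.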

It's more than the one-line import _ "net/http/pprof" for a few reasons:

  • Being able to enable pprof in production is necessary for debugging memory issues, and it adds little overhead: https://stackoverflow.com/a/64057856. Previously pprof was enabled only for dev and local.
  • Exposing a pprof server by default in production is a no-go in many organisations, either because of the sensitive information it exposes or because merely opening an extra port is enough to alert security staff: https://cwe.mitre.org/data/definitions/200.html
  • The standard way of starting pprof (the snippet below) is discouraged for its lack of configurability and for security reasons:
    • G114: Use of net/http serve function that has no support for setting timeouts (gosec)
    • G108: Profiling endpoint is automatically exposed on /debug/pprof (gosec)
import _ "net/http/pprof"

[...]
  go func() {
    log.Println(http.ListenAndServe("localhost:10081", nil))
  }()
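
For contrast, a configurable alternative might look like the sketch below. It is not the code from this PR: newPprofServer and runPprofServer are hypothetical names, and the timeout values are arbitrary. It only illustrates the general idea of registering the pprof handlers on a dedicated mux, serving them from an http.Server with an explicit timeout, and shutting the server down when a context is cancelled.

```go
package main

import (
	"context"
	"net"
	"net/http"
	"net/http/pprof"
	"strconv"
	"time"
)

// newPprofServer (hypothetical helper) builds a server that exposes only
// the pprof handlers, on its own mux rather than http.DefaultServeMux.
func newPprofServer(port int) *http.Server {
	mux := http.NewServeMux()
	mux.HandleFunc("/debug/pprof/", pprof.Index)
	mux.HandleFunc("/debug/pprof/cmdline", pprof.Cmdline)
	mux.HandleFunc("/debug/pprof/profile", pprof.Profile)
	mux.HandleFunc("/debug/pprof/symbol", pprof.Symbol)
	mux.HandleFunc("/debug/pprof/trace", pprof.Trace)

	return &http.Server{
		Addr:    net.JoinHostPort("localhost", strconv.Itoa(port)),
		Handler: mux,
		// Only the header read is bounded here; a short WriteTimeout
		// would cut off long-running CPU profiles.
		ReadHeaderTimeout: 10 * time.Second,
	}
}

// runPprofServer (hypothetical helper) serves until ctx is cancelled,
// then shuts the server down gracefully.
func runPprofServer(ctx context.Context, srv *http.Server) error {
	errCh := make(chan error, 1)
	go func() { errCh <- srv.ListenAndServe() }()

	select {
	case err := <-errCh:
		// The server failed to start or stopped on its own.
		return err
	case <-ctx.Done():
		shutdownCtx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
		defer cancel()
		return srv.Shutdown(shutdownCtx)
	}
}
```

Keeping construction separate from running makes the address and timeouts checkable in a unit test without opening a port.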

Proof of Work

The pprof debug page is available on the default port at localhost:10081.
Screenshot 2025-05-09 at 16 33 59

Added unit tests that verify the IsPprofEnabled function. Shutdown also works:

2025-05-09T16:32:56.750+0200	INFO	pprof/pprof.go:57	Stopping pprof server
2025-05-09T16:32:56.750+0200	INFO	controller/controller.go:235	Shutdown signal received, waiting for all workers to finish	{"controller": "mongodbuser-controller"}
2025-05-09T16:32:56.750+0200	INFO	controller/controller.go:237	All workers finished	{"controller": "mongodbmulticluster-controller"}
2025-05-09T16:32:56.750+0200	INFO	controller/controller.go:237	All workers finished	{"controller": "mongodbreplicaset-controller"}
2025-05-09T16:32:56.750+0200	INFO	controller/controller.go:237	All workers finished	{"controller": "mongodbstandalone-controller"}
2025-05-09T16:32:56.750+0200	INFO	controller/controller.go:237	All workers finished	{"controller": "mongodbuser-controller"}
2025-05-09T16:32:56.750+0200	INFO	controller/controller.go:237	All workers finished	{"controller": "opsmanager-controller"}
2025-05-09T16:32:56.750+0200	INFO	controller/controller.go:237	All workers finished	{"controller": "mongodbshardedcluster-controller"}
2025-05-09T16:32:56.750+0200	INFO	manager/internal.go:537	Stopping and waiting for caches
2025-05-09T16:32:56.750+0200	INFO	pprof/pprof.go:52	pprof server stopped

Checklist

  • Have you linked a jira ticket and/or is the ticket in the title?
  • Have you checked whether your jira ticket required DOCSP changes?
  • Have you checked for release_note changes?

Reminder (Please remove this when merging)

  • Please try to Approve or Reject Changes on the PR, and keep PRs in review for as short a time as possible
  • Our Short Guide for PRs: Link
  • Remember the following Communication Standards - use comment prefixes for clarity:
    • blocking: Must be addressed before approval.
    • follow-up: Can be addressed in a later PR or ticket.
    • q: Clarifying question.
    • nit: Non-blocking suggestions.
    • note: Side-note, non-actionable. Example: Praise
    • --> no prefix is considered a question

@MaciejKaras MaciejKaras force-pushed the feature/mk-CLOUDP-317911 branch from 94fc9ea to 8590b00 Compare May 9, 2025 14:43
@MaciejKaras MaciejKaras marked this pull request as ready for review May 9, 2025 14:43
@MaciejKaras MaciejKaras requested a review from a team as a code owner May 9, 2025 14:43
@MaciejKaras MaciejKaras force-pushed the feature/mk-CLOUDP-317911 branch 2 times, most recently from 4ccc892 to 4162845 Compare May 9, 2025 15:03
Contributor

@lsierant lsierant left a comment

Consider one comment, but other than that it's awesome. LGTM!

@MaciejKaras MaciejKaras force-pushed the feature/mk-CLOUDP-317911 branch from a6ba353 to 890c0cb Compare May 15, 2025 07:53
@MaciejKaras MaciejKaras force-pushed the feature/mk-CLOUDP-317911 branch from 890c0cb to 904c0c8 Compare May 15, 2025 07:55
@MaciejKaras MaciejKaras merged commit 2404e92 into master May 15, 2025
31 of 34 checks passed
3 participants