VC-41203: Allow users to select the Machine Hub mode #653

Merged on Jun 4, 2025 (1 commit)

Member

maelvls commented May 14, 2025

Ref: VC-41203

Still to be done:

Context

Until now, the Venafi Kubernetes Agent (this project) has been able to push to two different backends (Jetstack Secure and Venafi Cloud). We want Machine Hub to act as a third backend ("control plane").

Unlike Jetstack Secure and Venafi Cloud, which are mutually exclusive, it was decided that the Venafi Cloud backend can be used alongside the Machine Hub backend. The intention is (probably) to make the transition painless.
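
To make the compatibility rules above concrete, here is a minimal sketch of how the allowed backend combinations could be checked. The `Backends` type and the `validateBackends` function are hypothetical names used for illustration, not the agent's actual code. Rejecting Jetstack Secure together with Machine Hub is an assumption: the text above only names Venafi Cloud as combinable with Machine Hub.

```go
package main

import (
	"errors"
	"fmt"
)

// Backends is a hypothetical type recording which backends were requested.
type Backends struct {
	JetstackSecure bool
	VenafiCloud    bool
	MachineHub     bool
}

// validateBackends rejects combinations that cannot be used together.
func validateBackends(b Backends) error {
	if b.JetstackSecure && b.VenafiCloud {
		// Stated above: these two backends are exclusive to each other.
		return errors.New("cannot combine Jetstack Secure with Venafi Cloud")
	}
	if b.JetstackSecure && b.MachineHub {
		// Assumption: only Venafi Cloud was named as combinable with Machine Hub.
		return errors.New("cannot combine Jetstack Secure with Machine Hub")
	}
	// Machine Hub alone, or Machine Hub alongside Venafi Cloud, is accepted.
	return nil
}

func main() {
	cases := []Backends{
		{MachineHub: true},
		{VenafiCloud: true, MachineHub: true},
		{JetstackSecure: true, VenafiCloud: true},
	}
	for _, c := range cases {
		fmt.Printf("%+v -> %v\n", c, validateBackends(c))
	}
}
```

The first two cases correspond to the two manual tests in the Testing section (Machine Hub only, and Machine Hub with Venafi Cloud); the third is the pre-existing exclusive pair.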

What am I adding?

This PR originates from Ashley's prototype (#652); I've added a new flag (--machine-hub) as a temporary hidden feature flag just to make sure we don't change the existing behavior unexpectedly. Later on, the way people will use the feature will be to just configure the machineHub field on config.yaml.

Just to clarify: the Machine Hub integration won't work yet after this PR. It's an empty shell for now: this PR only adds a way to turn the feature on; it doesn't implement the feature itself.

Things to consider

Why --machine-hub when everything must be configured in the config file?

You might have noticed that I could have just used the presence of the machineHub field in the configuration to turn on the feature, e.g.:

machineHub:
  subdomain: foo
  credentialsSecretName: secret-1
period: 1h

However, since we want to merge unfinished work to master, we need a way to turn this feature on and off. So we've decided to have a hidden flag, --machine-hub, that gates the feature.

Later on, once the feature is ready, we will turn on this flag by default (it will thus become useless).
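
As an illustration of the gating described above, here is a minimal sketch using only Go's standard `flag` package (the project itself registers the flag through `c.PersistentFlags()`, as shown in the review thread). The `Config` and `MachineHubConfig` types and both functions are hypothetical. Requiring the `machineHub` block in addition to the flag is an assumption about the intended behavior.

```go
package main

import (
	"flag"
	"fmt"
)

// MachineHubConfig mirrors the machineHub block of config.yaml shown above.
type MachineHubConfig struct {
	Subdomain             string
	CredentialsSecretName string
}

// Config is a hypothetical, trimmed-down agent configuration.
type Config struct {
	MachineHub *MachineHubConfig // nil when the machineHub field is absent
}

// parseMachineHubFlag reads the feature flag from the command-line arguments.
func parseMachineHubFlag(args []string) (bool, error) {
	fs := flag.NewFlagSet("agent", flag.ContinueOnError)
	on := fs.Bool("machine-hub", false, "feature flag gating Machine Hub mode")
	if err := fs.Parse(args); err != nil {
		return false, err
	}
	return *on, nil
}

// machineHubEnabled reports whether Machine Hub mode should be active.
// The config block alone does not enable the feature; the flag gates it.
// Assumption: the flag without a machineHub block also leaves it disabled.
func machineHubEnabled(flagOn bool, cfg Config) bool {
	return flagOn && cfg.MachineHub != nil
}

func main() {
	cfg := Config{MachineHub: &MachineHubConfig{
		Subdomain:             "foo",
		CredentialsSecretName: "secret-1",
	}}
	for _, args := range [][]string{{}, {"--machine-hub"}} {
		on, err := parseMachineHubFlag(args)
		if err != nil {
			panic(err)
		}
		fmt.Printf("args=%v enabled=%t\n", args, machineHubEnabled(on, cfg))
	}
}
```

With the same configuration, the feature stays off until `--machine-hub` is passed. Once the feature is ready, changing the flag's default to `true` would make the configuration block the only switch.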

Testing

I've added two new unit tests to check that the configuration logic works.

Here is the manual testing I've done:

Case 1: Machine Hub only

$ go run . agent --machine-hub --install-namespace=default -c /dev/stdin <<EOF
period: 30s
machineHub:
  subdomain: "machinehub"
  credentialsSecretName: creds
data-gatherers: []
EOF
I0514 11:11:52.880764   10543 run.go:59] "Starting" logger="Run" version="development" commit=""
I0514 11:11:52.881631   10543 config.go:432] "Will push to CyberArk MachineHub using a username and password loaded from a Kubernetes Secret" logger="Run" credentialsSecretName="creds"
I0514 11:11:52.881639   10543 config.go:620] "Using period from config" logger="Run" period="30s"
I0514 11:11:52.881645   10543 run.go:117] "Healthz endpoints enabled" logger="Run.APIServer" addr=":8081" path="/healthz"
I0514 11:11:52.881652   10543 run.go:121] "Readyz endpoints enabled" logger="Run.APIServer" addr=":8081" path="/readyz"
E0514 11:11:52.882350   10543 run.go:275] "Error messages will not show in the pod's events because the POD_NAME environment variable is empty" logger="Run"
I0514 11:11:52.882373   10543 run.go:344] "machine hub mode not yet implemented" logger="Run.gatherAndOutputData"

Case 2: Machine Hub + Venafi Cloud Key Pair Auth mode

For this test, I used the tenant https://ven-tlspk.venafi.cloud/. To get the API key, log in as the user [email protected]; the password is listed on the Production Accounts page (private to Venafi). Then go to the settings, find the API key, and set it in the APIKEY environment variable.

export APIKEY=...
venctl iam service-account firefly create --name $USER-temp \
  --output json \
  --owning-team "$(curl -sS https://api.venafi.cloud/v1/teams -H "tppl-api-key: $APIKEY" | jq '.teams[0].id')" \
  --output-file svc-acct.json \
  --api-key "$APIKEY"
jq -r .private_key svc-acct.json >svc-acct-priv-key.pem
go run . agent --machine-hub --venafi-cloud --client-id "$(jq -r .client_id svc-acct.json)" --private-key-path svc-acct-priv-key.pem --install-namespace=default -c /dev/stdin <<EOF
period: 30s
machineHub:
  subdomain: "machinehub"
  credentialsSecretName: creds
data-gatherers: []
venafi-cloud:
  upload_path: /v1/tlspk/upload/clusterdata
cluster_id: mael temp
EOF

I got:

I0514 11:39:52.431463   30938 run.go:59] "Starting" logger="Run" version="development" commit=""
I0514 11:39:52.432450   30938 config.go:432] "Will push to CyberArk MachineHub using a username and password loaded from a Kubernetes Secret" logger="Run" credentialsSecretName="creds"
I0514 11:39:52.432462   30938 config.go:620] "Using period from config" logger="Run" period="30s"
I0514 11:39:52.432467   30938 config.go:841] "Loading upload_path from \"venafi-cloud\" configuration." logger="Run"
I0514 11:39:52.432875   30938 run.go:117] "Healthz endpoints enabled" logger="Run.APIServer" addr=":8081" path="/healthz"
I0514 11:39:52.432885   30938 run.go:121] "Readyz endpoints enabled" logger="Run.APIServer" addr=":8081" path="/readyz"
E0514 11:39:52.433426   30938 run.go:275] "Error messages will not show in the pod's events because the POD_NAME environment variable is empty" logger="Run"
I0514 11:39:52.433451   30938 run.go:344] "machine hub mode not yet implemented" logger="Run.gatherAndOutputData"
I0514 11:39:52.997928   30938 run.go:339] "Warning: PushingErr: retrying" logger="Run.gatherAndOutputData" in="43.997497139s" reason=<
	post to server failed: received response with status code 400. Body: [{"message":"Bad Request"}
	]
 >

That's seemingly because https://ven-tlspk.venafi.cloud/ no longer has the feature flag enabled; from Datadog:

Feature flag evaluated to true. Tenant temporarily restricted from pushing data to Venafi Control Plane


Contributor

SgtCoDFish left a comment

Looking good! I don't think we can merge with the flag as-is because I think it should be hidden or have a disclaimer (or both).

This looks great!

Comment on lines 339 to 347
c.PersistentFlags().BoolVar(
	&cfg.MachineHubMode,
	"machine-hub",
	false,
	"Enables MachineHub mode. The agent will push data to CyberArk MachineHub. Can be used in conjunction with --venafi-cloud.",
)

Contributor

suggestion: I agree that this is probably not required in the long term; we can just enable it if configuration is provided.

In the short term, I think this feature will be rolled out slowly and we'll want it to be there for testing before it's ready for production.

With that in mind, I think it makes sense to have this flag, but hidden by default or else with a description that indicates the feature is for testing only. What do you think?

Member Author

I like the idea of hiding it so that we can roll this out without waiting. I'll do that.

Member Author

Done. Here is how I describe --machine-hub:

This is a hidden feature flag we use to build the "Machine Hub" feature gradually without impacting customers. Once the feature is GA, we will turn this flag "on" by default.

Comment on lines +102 to +105
// CredentialsSecretName is the name of a Kubernetes Secret in the same
// namespace as the agent, which will be watched for a username and password
// to send to CyberArk Identity for authentication.
CredentialsSecretName string `yaml:"credentialsSecretName"`

Contributor

suggestion: do you think we could include the secret watcher in this PR, or should that be left for a separate PR? It's definitely something we'll need!

Member Author

I'll add the watcher to the next PR 😅

@SgtCoDFish
Contributor

@maelvls what's the latest on this? I think once tests are passing we can probably get merged, right?

Member Author

maelvls commented Jun 3, 2025

Hey, I keep forgetting about this PR. It looks ready to be merged, all feedback seems to have been addressed. I'll resolve the conflicts and hope that I don't get 429s during the go-licenses check 😅

maelvls force-pushed the cyberark-enable-machinehub branch from 4cd49ed to 82295ef (June 3, 2025 10:37)
maelvls force-pushed the cyberark-enable-machinehub branch from 82295ef to 129b53d (June 3, 2025 10:45)

Member Author

maelvls commented Jun 3, 2025

I rebased; it should be good to go. I hope I didn't make any mistakes when migrating to backoff/v5 🙏 PTAL @SgtCoDFish

Contributor

SgtCoDFish left a comment

/lgtm
/approve

Seems like a great base to build upon, thank you 😁

SgtCoDFish merged commit 214f4d7 into master on Jun 4, 2025 (2 checks passed)
SgtCoDFish deleted the cyberark-enable-machinehub branch on June 4, 2025 06:33