Implement sending via sidecar #192

Merged
merged 4 commits into from
Aug 4, 2023

Contributor

@bwoebi bwoebi commented Jul 10, 2023

What does this PR do?

  • Endpoint is exposed via FFI
  • Sidecar gains an API to handle agent sampling rates independently of the backend (i.e. working with sidecar and without sidecar). This is handled via mapped (shared) memory pages.
    • The goal is to stay as up-to-date as possible while remaining lock-free. This is achieved by ignoring the shared data while a write to the shared memory is in progress and falling back to a prior process-local copy.
  • Sidecar config now targets an Endpoint, which can be dynamically manipulated to fit the target URL, with an optional API key. (The telemetry mock file targets a file:// Endpoint.)
  • Sidecar uses trace-* crates to send data to an agent or intake endpoint.
  • Some common paths between sidecar and trace-mini-agent have been factored out into trace-utils. @DataDog/libdatadog-serverless, please review that part.
  • File passing in ipc got a macro handling the boilerplate code for FileTransferHandles.
    • Primary motivation: I was utterly confused about how this works (I was expecting some magic recognizing a trait on some args, or something similar alongside serde) until I eventually discovered the explicit impl TransferHandles for ExampleInterfaceRequest, for both, which is just trivial boilerplate. The macro makes it explicit in the service definition what is passed.
  • Global configuration is now directly passed via env upon sidecar creation. (guarding the self-telemetry for example)

The changes to the sidecar crate have been tested and validated in the dd-trace-php repository.

I'm aware the PR is quite big, but everything needed to come together to do proper end-to-end testing.

Contributor

@thedavl thedavl left a comment

Thanks for doing the re-factoring, I really like how you abstracted common logic. I left a couple questions regarding some stuff I wasn't clear on.

};

send_request(req, serialized_trace_payload, StatusCode::ACCEPTED).await
} else {
Contributor

Could you explain this else block to me? Is it for sending traces to the trace agent from within the sidecar?

Contributor Author

Yes :-)
The Agent just takes individual collections of trace chunks, not a whole payload.

pub fn serialize_agent_payload(payload: pb::AgentPayload) -> anyhow::Result<Vec<u8>> {
pub fn serialize_proto_payload<T>(payload: T) -> anyhow::Result<Vec<u8>>
where
T: ::prost::Message,
Contributor

just a general rust question, but what's the purpose of the prefix :: in ::prost::Message? Is this a stylistic choice, as it seems like it would work without it?

Contributor Author

It can probably be removed, I believe. I copied it from the generated protobuf files (where it's done to avoid conflicts...).
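
As an illustration of the conflicts mentioned above, here is a minimal sketch of what the leading `::` guards against. It assumes a hypothetical module `generated` containing a local module that happens to be named `std`; it uses the standard library rather than prost so that it compiles without dependencies, and it is not taken from the PR.

```rust
// Hypothetical stand-in for a module that contains generated code.
mod generated {
    // A local module that happens to share its name with the `std` crate,
    // the kind of clash generated code has to be prepared for.
    pub mod std {
        pub mod cmp {
            pub fn max(_a: i32, _b: i32) -> i32 {
                -1 // deliberately wrong, to make the shadowing visible
            }
        }
    }

    // Relative path: `std` resolves to the local module above.
    pub fn relative() -> i32 {
        std::cmp::max(1, 2)
    }

    // Leading `::` skips the local module and reaches the real
    // standard library.
    pub fn absolute() -> i32 {
        ::std::cmp::max(1, 2)
    }
}
```

In hand-written code where no local item is named `prost`, `prost::Message` and `::prost::Message` should resolve to the same trait, which is why the prefix looks redundant there.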


let mut req = hyper::Request::builder()
.uri(target.url.clone())
.header(hyper::header::USER_AGENT, concat!("Tracer/", env!("CARGO_PKG_VERSION")))
Contributor

what's this header used for?

Contributor Author

Telemetry and profiler use the header to clearly distinguish which code is the sender, in case some things are observed on the receiving side so that it's obvious what code sends it.
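
A rough sketch of how such a sender-identifying value can be built. The helper `user_agent` is hypothetical; unlike the `concat!`/`env!` expression in the diff, which is evaluated entirely at compile time and requires Cargo, this version falls back to a placeholder when `CARGO_PKG_VERSION` is not set.

```rust
// Hypothetical helper: builds a "<component>/<version>" User-Agent value.
fn user_agent(component: &str) -> String {
    // CARGO_PKG_VERSION is provided by Cargo at compile time; fall back to
    // a placeholder when the code is compiled without Cargo.
    let version = option_env!("CARGO_PKG_VERSION").unwrap_or("unknown");
    format!("{}/{}", component, version)
}
```

Calling `user_agent("Tracer")` yields a string beginning with `Tracer/`, which the receiving side can use to tell senders apart.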

thedavl previously approved these changes Jul 13, 2023
Contributor

@thedavl thedavl left a comment

looks good on the serverless end, thanks for the refactoring!

@bwoebi bwoebi force-pushed the bob/sidecar-traces branch 2 times, most recently from c17b1f1 to 2cbbe1e Compare July 18, 2023 17:21
@bwoebi bwoebi changed the base branch from bob/fix-telemetry-leaks to main July 21, 2023 15:23
@bwoebi bwoebi dismissed thedavl’s stale review July 21, 2023 15:23

The base branch was changed.

}
}

camel_ty.shrink_to_fit();
Contributor

Is this necessary? I'd think that in terms of perf the reallocation is worse than a few extra bytes.

Contributor Author

This is macro-code though, so just compile time overhead.
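
To make the trade-off in this thread concrete, here is a small sketch of a name conversion followed by `shrink_to_fit`. The function `to_camel_case` is hypothetical and only assumes that `camel_ty` holds a CamelCase name derived from a snake_case one; the actual macro code is not shown in this conversation.

```rust
// Hypothetical helper: converts a snake_case name to CamelCase.
fn to_camel_case(snake: &str) -> String {
    // Reserve room for the whole input up front; the result is usually
    // shorter because underscores are dropped.
    let mut camel = String::with_capacity(snake.len());
    let mut upper_next = true;
    for ch in snake.chars() {
        if ch == '_' {
            upper_next = true;
        } else if upper_next {
            camel.extend(ch.to_uppercase());
            upper_next = false;
        } else {
            camel.push(ch);
        }
    }
    // Hand back the unused capacity; this may reallocate, which is the
    // cost the reviewer is asking about.
    camel.shrink_to_fit();
    camel
}
```

When this runs inside a procedural macro, the possible reallocation is paid once per expansion at compile time, not at runtime.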

@bwoebi bwoebi force-pushed the bob/sidecar-traces branch 2 times, most recently from 857284f to 33705f1 Compare August 3, 2023 13:40
Signed-off-by: Bob Weinand <[email protected]>
Contributor Author

bwoebi commented Aug 3, 2023

@paullegranddc Thanks for the review. I've addressed the comments in the newest commit. Also rebased on main to fix conflicts.

The only remaining question is: this CI failure https://github.com/DataDog/libdatadog/actions/runs/5751516424/job/15590430623?pr=192
Do you have any idea how I work around that?

@bwoebi bwoebi force-pushed the bob/sidecar-traces branch from 33705f1 to a0a697d Compare August 3, 2023 13:47
Contributor

paullegranddc commented Aug 3, 2023

The only remaining question is: this CI failure https://github.com/DataDog/libdatadog/actions/runs/5751516424/job/15590430623?pr=192
Do you have any idea how I work around that

I think this is because you updated to nix 0.26, which calls libc::memfd_create (https://docs.rs/nix/latest/src/nix/sys/memfd.rs.html#56), and even though this function ...
CentOS 7 has a fairly old glibc (2.17), and memfd_create was only added in 2.28, so the linking process fails.

The maintainers are aware of the issue, but it seems like they don't want to fix it: nix-rust/nix#1972

Can you use nix 0.24 in the sidecar crate? (which is already used by the spawn_worker crate by the way)

@paullegranddc
Contributor

Another option they propose in the nix repo thread is to use full LTO for all builds, so that the code calling libc::memfd_create is removed as dead code before linking, since everything then happens in a single compilation unit.

Or we could disable the memfd code on centos like we do for non linux os using features?

@bwoebi bwoebi force-pushed the bob/sidecar-traces branch from 02b8f4e to 6363b02 Compare August 3, 2023 20:23
Contributor Author

bwoebi commented Aug 3, 2023

I'm not even using the memfd thing from nix. Trying to disable with features seemingly did not help, even though it's the fs feature if I see that correctly, which isn't used?

@bwoebi bwoebi force-pushed the bob/sidecar-traces branch 6 times, most recently from b697a6f to 604b76e Compare August 4, 2023 13:57
@bwoebi bwoebi force-pushed the bob/sidecar-traces branch from 604b76e to e9f2698 Compare August 4, 2023 14:00
paullegranddc previously approved these changes Aug 4, 2023

// Ensure the next write hasn't started yet *and* the data is from the expected generation
if !new_data.meta.writing.load(Ordering::SeqCst)
&& new_generation == copied_data.meta.generation.load(Ordering::Acquire)
Contributor

If you want to check that no new generation has happened while you were writing, you should probably load the atomic here instead of relying on the one read at the beginning of the function.

Contributor Author

The code was using the wrong data as its source; it needs to be new_data. The goal is to check whether the generation read before I start copying is identical to the one after copying.
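
The check discussed in this thread can be sketched as follows. This is a simplified, in-process model and not the PR's implementation: the types `Shared` and `Reader`, the four-word payload, and the split into `begin_write`/`finish_write` are all assumptions made for illustration. It uses ordinary atomics in place of a mapped memory page, assumes a single writer, and uses `SeqCst` everywhere to stay conservative.

```rust
use std::sync::atomic::{AtomicBool, AtomicU64, Ordering::SeqCst};

// Stand-in for the shared memory page (hypothetical layout).
struct Shared {
    writing: AtomicBool,
    generation: AtomicU64,
    payload: [AtomicU64; 4],
}

impl Shared {
    fn new() -> Self {
        Shared {
            writing: AtomicBool::new(false),
            generation: AtomicU64::new(0),
            payload: [
                AtomicU64::new(0),
                AtomicU64::new(0),
                AtomicU64::new(0),
                AtomicU64::new(0),
            ],
        }
    }

    // Writer side, step 1: announce that the payload is about to change.
    fn begin_write(&self) {
        self.writing.store(true, SeqCst);
    }

    // Writer side, step 2: store the payload, bump the generation, and
    // only then clear the flag.
    fn finish_write(&self, values: [u64; 4]) {
        for (slot, value) in self.payload.iter().zip(values.iter()) {
            slot.store(*value, SeqCst);
        }
        self.generation.fetch_add(1, SeqCst);
        self.writing.store(false, SeqCst);
    }
}

// Process-local state: the last consistent copy and its generation.
struct Reader {
    generation: u64,
    copy: [u64; 4],
}

impl Reader {
    fn new() -> Self {
        Reader { generation: 0, copy: [0; 4] }
    }

    fn read(&mut self, shared: &Shared) -> [u64; 4] {
        // Generation observed before copying.
        let before = shared.generation.load(SeqCst);
        if before != self.generation && !shared.writing.load(SeqCst) {
            let mut fresh = [0u64; 4];
            for (dst, src) in fresh.iter_mut().zip(shared.payload.iter()) {
                *dst = src.load(SeqCst);
            }
            // Accept the copy only if no write has started and the
            // generation after copying matches the one read before.
            if !shared.writing.load(SeqCst) && before == shared.generation.load(SeqCst) {
                self.generation = before;
                self.copy = fresh;
            }
        }
        // Otherwise fall back to the prior process-local copy.
        self.copy
    }
}
```

A reader that calls `read` while `begin_write` has been issued but `finish_write` has not yet completed keeps returning its previous copy, and picks up the new values on the next call after the write finishes.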

Signed-off-by: Bob Weinand <[email protected]>
@bwoebi bwoebi force-pushed the bob/sidecar-traces branch from 6794d7c to c46f5a3 Compare August 4, 2023 16:02
@bwoebi bwoebi merged commit 2a605f1 into main Aug 4, 2023
@bantonsson bantonsson deleted the bob/sidecar-traces branch March 7, 2024 07:14
duncanpharvey pushed a commit that referenced this pull request Mar 28, 2025
* Implemented tracing and agent sampling in sidecar

* Address CR feedback

Signed-off-by: Bob Weinand <[email protected]>

* Polyfill memfd on old glibc targets

Signed-off-by: Bob Weinand <[email protected]>

* Small nit from CR applied

Signed-off-by: Bob Weinand <[email protected]>

---------

Signed-off-by: Bob Weinand <[email protected]>
3 participants