Is it immediate UB to cancel a thread and deallocate its stack #405

Closed
chorman0773 opened this issue May 23, 2023 · 9 comments

Comments

@chorman0773
Contributor

The system APIs pthread_exit and pthread_cancel are able to terminate a thread and deallocate its stack without running destructors.

Because of APIs like Pin, this cannot be sound for Rust code. Is there any reason that this should be considered immediate undefined behaviour (if any Rust frame is on the thread's stack), or is it just unsound in the face of stack pinning? Additionally, if it is defined, is there anything beyond Pin that would make it unsound?
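
To make the Pin concern concrete, here is a minimal sketch. `Registry` and `Node` are hypothetical names invented for illustration, not types from any real crate. A stack-pinned `Node` records its own address in the registry and removes it again in `drop`. The registry stays consistent only because the destructor runs before the stack slot goes away; if the thread's stack were deallocated without running `drop`, the registry would be left holding a stale address.

```rust
use std::cell::RefCell;
use std::marker::PhantomPinned;
use std::pin::{pin, Pin};

/// Hypothetical registry that remembers the addresses of live nodes.
struct Registry {
    addrs: RefCell<Vec<usize>>,
}

/// Hypothetical node that must not move once it has been registered.
struct Node<'r> {
    registry: &'r Registry,
    _pin: PhantomPinned,
}

impl<'r> Node<'r> {
    fn new(registry: &'r Registry) -> Self {
        Node { registry, _pin: PhantomPinned }
    }

    /// Record this node's address; requiring Pin means the address stays valid
    /// until the node is dropped.
    fn register(self: Pin<&Self>) {
        let addr = &*self as *const Self as usize;
        self.registry.addrs.borrow_mut().push(addr);
    }
}

impl Drop for Node<'_> {
    /// Remove this node's address before its storage is released.
    fn drop(&mut self) {
        let addr = &*self as *const Self as usize;
        self.registry.addrs.borrow_mut().retain(|a| *a != addr);
    }
}

/// Returns how many addresses are registered while the node is alive
/// and after its scope has ended.
fn registered_counts() -> (usize, usize) {
    let registry = Registry { addrs: RefCell::new(Vec::new()) };
    let during = {
        let node = pin!(Node::new(&registry));
        node.as_ref().register();
        // Read the count into a local so the RefCell borrow ends before `node` drops.
        let count = registry.addrs.borrow().len();
        count
    };
    let after = registry.addrs.borrow().len();
    (during, after)
}
```

The sketch only stores addresses as integers, so it stays in safe Rust. A real intrusive data structure would dereference those addresses, which is where a skipped destructor would turn into a use-after-free.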

@CAD97

CAD97 commented May 24, 2023

Generally, Rust code is allowed to assume that stack locals' destructors run, so deallocating the stack without running destructors is isomorphic to a forced unwind (longjmp) that skips destructors, and is presumably UB, since this is a guarantee from the language to developers.
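
As an illustration of the assumption described above, the following sketch uses a hypothetical `BusyGuard` type (not from the thread) that sets a flag on creation and clears it on drop. With ordinary Rust control flow, including a panic that unwinds, the flag is always cleared when the scope ends. A longjmp or thread cancellation that discards the frame would bypass this cleanup, which is the situation the issue asks about.

```rust
use std::cell::Cell;
use std::panic::{catch_unwind, AssertUnwindSafe};

/// Hypothetical guard: marks a resource as busy and clears the mark on drop.
struct BusyGuard<'a> {
    busy: &'a Cell<bool>,
}

impl<'a> BusyGuard<'a> {
    fn new(busy: &'a Cell<bool>) -> Self {
        busy.set(true);
        BusyGuard { busy }
    }
}

impl Drop for BusyGuard<'_> {
    fn drop(&mut self) {
        self.busy.set(false);
    }
}

/// Returns the flag's value after the guarded scope has ended,
/// either by a normal return or by a panic that unwinds through it.
fn busy_after_scope(panic_inside: bool) -> bool {
    let busy = Cell::new(false);
    // AssertUnwindSafe is needed because the closure captures a reference to a Cell.
    let _ = catch_unwind(AssertUnwindSafe(|| {
        let _guard = BusyGuard::new(&busy);
        if panic_inside {
            panic!("simulated failure");
        }
    }));
    busy.get()
}
```

Both `busy_after_scope(false)` and `busy_after_scope(true)` return `false` when the program is built with the default unwinding panic strategy.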

Before pinning, we already had scoped threads and take_mut: various APIs that rely on destructors running (sometimes just those of unwind-to-abort shims) for soundness, because otherwise other threads could see the mess left behind when we don't get a chance to clean up.
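
The scoped-thread case can be sketched with today's `std::thread::scope` (stable since Rust 1.63), which is used here as a stand-in for the older guard-based API the comment alludes to. The helper name `sum_in_scoped_threads` is an assumption made for this example. The spawned threads borrow data from the caller's stack frame, which is sound only because `scope` does not return until every thread has been joined. If the calling thread's frame were discarded before that point, the workers would be left with dangling references.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::thread;

/// Sums `data` using worker threads that borrow from the caller's stack.
fn sum_in_scoped_threads(data: &[usize]) -> usize {
    // `total` lives in this function's stack frame.
    let total = AtomicUsize::new(0);
    thread::scope(|s| {
        for chunk in data.chunks(2) {
            let total = &total;
            // Each worker holds plain references into the caller's frame.
            s.spawn(move || {
                total.fetch_add(chunk.iter().sum::<usize>(), Ordering::Relaxed);
            });
        }
        // All spawned threads are joined before `scope` returns.
    });
    total.load(Ordering::Relaxed)
}
```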

Carving an open middle in language semantics is hard; you can't really have both that stack destructors are guaranteed to run before deallocation (i.e. stack pinning is sound) and that it's not UB if destructors aren't run; failure to prevent NonBehavior is exactly as meaningless as the execution of UB; you've exited the domain of the well-defined language and now all bets are off.

@digama0

digama0 commented May 24, 2023

> Carving an open middle in language semantics is hard; you can't really have both that stack destructors are guaranteed to run before deallocation (i.e. stack pinning is sound) and that it's not UB if destructors aren't run; failure to prevent NonBehavior is exactly as meaningless as the execution of UB; you've exited the domain of the well-defined language and now all bets are off.

I don't agree with this. Skipping destructors violates the safety condition on the affected types, hence longjmp or equivalents must be unsafe, but it does not mean that it has to be immediate UB unless the compiler also takes advantage of the running of destructors in some way (which may well exist but does not seem to be indicated by this line of investigation). The UB would happen when one of those types with a violated safety condition goes on to assume that its invariants still hold and later performs an operation that is UB.

In practice, this still means that you (the user) can't meaningfully use longjmp safely unless you know you are not jumping over destructors, or you control all the code being jumped over and know that it is safe to mem::forget those types, since if you jump over code out of your control then you have no idea whether doing this will cause UB in some later operation.
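
For comparison, skipping a destructor with `mem::forget` is already possible in safe Rust, as this small sketch shows. `CountsDrops` is a hypothetical type used only for illustration. The difference from longjmp or thread cancellation is that `mem::forget` takes the value by ownership, so a value that is only reachable through a pinned reference cannot be forgotten this way, and forgetting never releases the storage of a value that something else still points into.

```rust
use std::cell::Cell;
use std::mem;

/// Hypothetical type that counts how many times its destructor has run.
struct CountsDrops<'a>(&'a Cell<usize>);

impl Drop for CountsDrops<'_> {
    fn drop(&mut self) {
        self.0.set(self.0.get() + 1);
    }
}

/// Returns the number of destructor calls observed for one value,
/// depending on whether the value is forgotten instead of dropped.
fn drops_observed(forget: bool) -> usize {
    let counter = Cell::new(0);
    {
        let value = CountsDrops(&counter);
        if forget {
            // Safe: the destructor is skipped and the value is never touched again.
            mem::forget(value);
        }
        // If it was not forgotten, `value` is dropped at the end of this block.
    }
    counter.get()
}
```

Here `drops_observed(false)` yields 1 and `drops_observed(true)` yields 0, with no unsafe code involved.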

@RalfJung
Member

In the context of the Rust AM, asking whether thread canceling / setjmp-longjmp are UB doesn't even make sense since these operations literally cannot be expressed in the language. Calling such operations is akin to using inline assembly to modify state that is private to the abstract machine implementation: it is your responsibility to ensure that all invariants are preserved. If you fail to do that, you have UB -- but a somewhat different kind of UB than what Miri raises, since this UB is not triggered by the AM, it arises from violating the assumptions of the "AM to ASM" lowering proof.

We don't currently have a great way to write down these invariants; I guess the best we can do is tell programmers some things they can do and not let them do anything else. (Yes, this is axiomatic.)

Having said all this I guess it doesn't really help to answer the question... except for one observation: axiomatically speaking, the default is that things are UB, and we have to explicitly write down each guarantee ("you can mess with the ASM state in the following ways and it's okay"). We should probably be careful not to guarantee too much here until we have a better understanding of which obligations this imposes on the compiler.

@RalfJung
Member

Why is this thread separate from #404? Both are basically the same question: removing Rust stack frames from the stack without giving the AM implementation a chance to do anything. We should answer these questions in a consistent way and also discuss them together, IMO.

@chorman0773
Contributor Author

chorman0773 commented May 24, 2023 via email

@nbdd0121

In glibc, pthread_exit is implemented as a forced unwind and, if that fails, a longjmp. So they are exactly the same.

For implementations that just discard the stack completely, we can still conceptually model it as a longjmp to the point before the first Rust frame.

If we solve the partial frame deallocation problem then it essentially solves the total frame deallocation problem.

@RalfJung
Member

RalfJung commented May 25, 2023 via email

@JakobDegen
Contributor

Ah, the point about this actually being the same question as longjmp is a good one; I hadn't realized. I'm OK with closing this issue then, if we want.

@RalfJung
Member

Closing as duplicate of #404 then.
