Allow disabling segmented stacks. #8955

emberian · 2013-09-03T18:33:00Z

This is one of the very common options a standalone ("runtimeless") profile
would want to use. This disables all segmented stack support at the LLVM
level.

This is one of the very common options a standalone ("runtimeless") profile would want to use. This disables all segmented stack support at the LLVM level.

emberian · 2013-09-03T18:34:50Z

r? @alexcrichton

thestinger · 2013-09-03T20:38:24Z

I don't think this is the right approach, because it means all functions must be treated as unsafe. The __morestack function should be pluggable at compile-time to allow you to supply one not using TLS and aborting on out-of-stack.

emberian · 2013-09-03T21:23:46Z

@thestinger __morestack is pluggable afaik: just link to a library providing it. It's the prolog to before-calling it that's the problem.

alexcrichton · 2013-09-04T04:54:59Z

This is certainly a tricky topic. There's debate about what to do about stacks and what's necessary in various places, and I don't think that this should be merged as-is.

I wrote up my opinions in another bug about this issue. I think that additionally this can cause lots of problems in linking with various libraries.

It seems to me that you shouldn't be able to mix crates of split stacks and no split stacks. Additionally, no split stacks essentially means that you shouldn't be running with the runtime either, so it shouldn't link that in either. The discussion in that bug kinda petered out, but I think that this warrants more discussion to settle on a strategy for how to manage stacks. There'd also been some discussion in the #[stack] attribute bug, but not quite along the lines of this.

I would love to see something like this merged though, although it definitely needs to be done very carefully. Right now the specific things which I would want to see (or also just discussed about seeing) would be:

I think this should be specified as a crate attribute instead of a -Z flag
You shouldn't be able to mix crates that have different settings of split stacks.
This crate-level attribute should disable all fixed_stack_segment checks in ffi functions
Any functions encountered with fixed_stack_segment should possibly be warned/errored about
This still links crates without segmented stacks to librustrt and libmorestack which is dubious (mostly the libmorestack part)

A concrete use case for this would be useful to design around. Right now my best idea for this is a very basic kernel. You don't necessarily need stack checks in kernels, and you're definitely not using libstd at all.

thestinger · 2013-09-04T05:07:20Z

You do still need overflow checks if Rust is going to keep up the semblance of safety. A stack overrun is an exploitable vulnerability, so functions with them omitted are unsafe.

Segmented stacks and stack safety are separate topics. Rust always needs to provide safety from stack overruns in functions not marked unsafe. Stack growth is definitely not required for accurate accounting of stack usage per-thread/task.

emberian · 2013-09-04T13:18:20Z

@alexcrichton yup: my sample usecase driving this work is indeed a kernel.

In my environment I am going to use guard pages, but for MMU-less environments with small amounts of flash (cortex m0 comes to mind), the prolog is going to hurt code size a lot. Is there an alternative to something functionally equivalent to this patch?

I agree that omitting stack size checking is unsafe. Perhaps the attribute could be #[unsafe_no_stack_size_check]?

auroranockert · 2013-09-04T14:02:07Z

I'm not sure what my final opinion on this is, but I'll chime in with an anecdote.

I'm currently writing some Rust code for Cortex M3/M4 (thumb1/2-only ARM cores without MMUs) and I don't really care about stack, the stack simply grows until I overwrite the heap, and if I overflow, it will crash the entire system with or without segmented stacks, since if I run over, I will have run out of heap as well, no chance of recovery.

But the price is double, since without an MMU, checking for overflow is extra expensive as well, since we cannot simply use a guard page. But have to check the size of the heap vs. the size of the stack and see if it overflows total memory. (And with 256k flash, every bit is precious)

I don't want to mark everything unsafe, but I am well aware that stack overflows are an issue. In this case, I think disabling segmented stacks would make total sense.

thestinger · 2013-09-04T20:19:11Z

Overflowing the stack is an exploitable bug, and will not necessarily crash the system right away. Especially if you have multiple threads, the stacks will overrun into each other. Segmented stacks are not a requirement for stack safety - something Rust does have to provide as part of the contract in safe functions.

If you do have a guard page or multiple guard pages, you still need the compiler to know how large they are so it can insert checks into functions allocating more than that size. In my opinion, if we're going to provide the option to disable them, we're going to need to implement that along with a method working without guard pages (just __morestack with abort).

alexcrichton · 2013-09-05T06:57:10Z

From these discussions, I would be OK with the following course of action. We have two different modes of stacks. One is segmented stacks, one is stack-overflow checks. Segmented stacks would be as is today where there's a silent contract with the OS that the stack limit is stored in a very special and easily accessible location.

With stack-overflow checks, we would have a silent contract with whatever runtime is in place that there is a guard page at the end of the stack (triggering a fault if it is hit). The compiler will then insert calls to a function called __chkstk (or something like that) in any function which allocates more than a page of stack. This __chkstk function is not pre-defined (unless we have something like another profile of libstd/librustrt). The reason that this isn't part of every function is that it's way too much overhead for all functions to call this __chkstk function.

That way the use cases can be as follows:

Normal rust user - segmented stacks
Normal rust user who doesn't want segmented stacks - use the alternate version of libstd/librustrt which has tasks disabled but supplies a rust-maintained version of __chkstk
Kernel writers - disable segmented stacks and can implement __chkstk depending on how fancy their kernel is.
Embedded systems writers - disable segmented stacks and have __chkstk to be a no-op

That way everyone is always forced to think about stack protection and the stack being overflown. You can opt out of actually performing the check (__chkstk is a no-op), but you must explicitly do so more than once.

Even with this method I would want most of the requirements I laid out above (especially about mixing crates with different types of stacks), but those are implementation details.

What do you guys think of this?

emberian · 2013-09-05T13:46:25Z

@alexcrichton that looks great to me, except the following:

The compiler will then insert calls to a function called __chkstk (or something like that) in any function which allocates more than a page of stack.

Why does only checking when using more than a page make sense, and how would it be implemented? If you need 32 bytes of stack and you're near the page boundary, you're going to spill into the next page anyway. I don't really see why it's harmful.

alexcrichton · 2013-09-06T02:10:19Z

This is the idea of a guard page. Whenever a small function overflows the stack, it'll guarantee to trigger a segfault when writing to the guard page (which is when it's reallocated and the instruction is retried). Large function's aren't guaranteed to hit the guard page though so they need the check function.

Implementation-wise __chkstk would receive the amount of stack requested, and it will check the current stack limit and see if it's inbounds, if not it will allocate more stack, set up a new guard page, and return.

emberian · 2013-09-06T03:11:24Z

@alexcrichton that doesn't help if you don't have an MMU

thestinger · 2013-09-06T03:17:09Z

If you don't have an MMU, you need the thread (or task) local __morestack calling abort on overflow.

emberian · 2013-09-06T03:19:35Z

Ok, fair enough.

On Thu, Sep 5, 2013 at 11:17 PM, Daniel Micay [email protected]:

If you don't have an MMU, you need the thread (or task) local __morestackcalling abort on overflow.

—
Reply to this email directly or view it on GitHubhttps://github.com//pull/8955#issuecomment-23916560
.

alexcrichton · 2013-09-06T03:19:42Z

Hmm, so I would revise my strategy to there would be a strict/documented __morestack contract (where the stack limit is stored) and then there should be an option to not use libmorestack, but rather provide your own. That way if you don't have an MMU you can still have stack safety if you want.

emberian · 2013-09-06T03:21:31Z

So perhaps there is another function, __get_stack_size, that users also
implement if they need to? I know LLVM does it for us, is there a way to
configure what it does?

On Thu, Sep 5, 2013 at 11:19 PM, Alex Crichton [email protected]:

Hmm, so I would revise my strategy to there would be a strict/documented
__morestack contract (where the stack limit is stored) and then there
should be an option to not use libmorestack, but rather provide your own.
That way if you don't have an MMU you can still have stack safety if you
want.

—
Reply to this email directly or view it on GitHubhttps://github.com//pull/8955#issuecomment-23916609
.

alexcrichton · 2013-09-06T03:23:19Z

Sadly I don't think that a segmented-stack scheme could be performant at all if you double all function calls by default (by calling __get_stack_size), so to be reasonable I think that they'd have to be stored in defined locations.

I'm not actually sure where that's initialized for us. I think that we do it somewhere between rt::start and the main closure, but I wouldn't guarantee that.

emberian · 2013-09-06T03:39:22Z

I imagine inlining and constant propagation would take care of us.

On Thu, Sep 5, 2013 at 11:23 PM, Alex Crichton [email protected]:

Sadly I don't think that a segmented-stack scheme could be performant at
all if you double all function calls by default (by calling
__get_stack_size), so to be reasonable I think that they'd have to be
stored in defined locations.

I'm not actually sure where that's initialized for us. I think that we
do it somewhere between rt::start and the main closure, but I wouldn't
guarantee that.

—
Reply to this email directly or view it on GitHubhttps://github.com//pull/8955#issuecomment-23916696
.

alexcrichton · 2013-09-06T03:51:37Z

It depends, I think that would fall under the category of link-time-optimization because the __get_stack_size function would likely be in a library (not always), and rust currently doesn't do that.

emberian · 2013-09-06T03:52:24Z

Ok, sure.

On Thu, Sep 5, 2013 at 11:51 PM, Alex Crichton [email protected]:

It depends, I think that would fall under the category of
link-time-optimization because the __get_stack_size function would likely
be in a library (not always), and rust currently doesn't do that.

—
Reply to this email directly or view it on GitHubhttps://github.com//pull/8955#issuecomment-23917357
.

auroranockert · 2013-09-06T09:33:30Z

@alexcrichton: Wouldn't something like this make more sense?

Normal rust user - use fixed size (huge) stacks, for fast FFI, but checked for overflow
Normal rust user with small amounts of virtual memory (Some 32-bit systems, systems without overcommit, &c.) - enables segmented stacks, at the cost of slow FFI and some other stuff semi-randomly breaking (like some llvm intrinsics).
Kernel writers / Embedded systems writers / Others with special needs - Implements their own __chkstk.

nikomatsakis · 2013-09-06T15:12:22Z

fwiw, the usual strategy is not __chkstk but rather that the prologue stores a value once every N-kbytes where N is the size of a page.

alexcrichton · 2013-09-06T16:31:26Z

@jensnockert sounds reasonable to me, I think that's kinda what we're doing right now as well. The primary use case would still be segmented-stacks, although they just wouldn't be that segmented. You'd still want your program to continue if you overflow and you also want multiple threads so no __chkstk.

@nikomatsakis oh interesting! Although one thing I would be worried about would be thrashing the segfault handler if it didn't allocate enough stack. That could probably be easily measured/gotten around though.

emberian · 2013-09-25T20:17:21Z

Closing this and adding it to my todo list.

Allow disabling segmented stacks.

ebe3eb3

This is one of the very common options a standalone ("runtimeless") profile would want to use. This disables all segmented stack support at the LLVM level.

emberian closed this Sep 25, 2013

emberian mentioned this pull request May 24, 2014

New codegen option "no-split-stack" #14394

Closed

Allow disabling segmented stacks. #8955

Allow disabling segmented stacks. #8955

Uh oh!

Conversation

emberian commented Sep 3, 2013

Uh oh!

emberian commented Sep 3, 2013

Uh oh!

thestinger commented Sep 3, 2013

Uh oh!

emberian commented Sep 3, 2013

Uh oh!

alexcrichton commented Sep 4, 2013

Uh oh!

thestinger commented Sep 4, 2013

Uh oh!

emberian commented Sep 4, 2013

Uh oh!

auroranockert commented Sep 4, 2013

Uh oh!

thestinger commented Sep 4, 2013

Uh oh!

alexcrichton commented Sep 5, 2013

Uh oh!

emberian commented Sep 5, 2013

Uh oh!

alexcrichton commented Sep 6, 2013

Uh oh!

emberian commented Sep 6, 2013

Uh oh!

thestinger commented Sep 6, 2013

Uh oh!

emberian commented Sep 6, 2013

Uh oh!

alexcrichton commented Sep 6, 2013

Uh oh!

emberian commented Sep 6, 2013

Uh oh!

alexcrichton commented Sep 6, 2013

Uh oh!

emberian commented Sep 6, 2013

Uh oh!

alexcrichton commented Sep 6, 2013

Uh oh!

emberian commented Sep 6, 2013

Uh oh!

auroranockert commented Sep 6, 2013

Uh oh!

nikomatsakis commented Sep 6, 2013

Uh oh!

alexcrichton commented Sep 6, 2013

Uh oh!

emberian commented Sep 25, 2013

Uh oh!

Uh oh!