This repository was archived by the owner on Apr 25, 2025. It is now read-only.

Proposal: using only catch_all to catch all exceptions #31

@aheejin

TL;DR

The current form of separate catch instructions (catch i, catch j, ...,
catch_all) is very hard to generate from the compiler's side and possibly
detrimental for code size and performance. So I propose to merge all catch
blocks into one catch_all block that handles all tags (meaning both C++
exceptions and foreign exceptions).

Problem

Suppose we have this original C++ code.

try {
  ...
} catch (int e) {
  action 1;
} catch (...) {
  action 2;
}

action 1 and action 2 can be arbitrary user code.

The Itanium C++ ABI
specifies that a catch (...) clause should be able to catch not only C++
exceptions but also foreign exceptions, i.e. exceptions that were not thrown by
C++ code. Even if we do not need to follow the Itanium ABI spec strictly, it
makes the most sense for catch (...) to handle foreign exceptions as well,
because that is the only way for a C++ programmer to specify some action when
a foreign exception is caught. That means that when there is a catch (...), we
should generate a catch_all instruction. The generated Wasm code will then
look like this, in pseudocode:

block $label0
  try
    ...
  catch i
    if (int e)
      action 1
    else
      br $label0
  catch_all
    br $label0
  end
end
action 2

Here the action 2 part is factored out so that it can be shared between catch i
and catch_all, which avoids duplicating code. But whether we duplicate the
action 2 code or factor it out, we must be able to tell which part of the code
corresponds to catch (...), so that we can generate correct code by either
factoring out or duplication.

Let's see another case. When the original C++ code is like

MyClass obj;
try {
  ...
} catch (int e) {
  action 1;
}

There is no catch (...) in this code, but we still have to generate cleanup
code that calls the destructor of obj. That cleanup code should run regardless
of whether we catch a C++ exception or a foreign exception, so it should be
either duplicated or factored out as well:

block $label0
  try
    ...
  catch <c++>
    if (int e)
      action 1
    else
      br $label0
  catch_all
    br $label0
  end
end
delete obj
rethrow

Now we face the same problem: we should know which part of the code corresponds
to the cleanup code.
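
To make the cleanup requirement concrete, here is a minimal C++ sketch of the second example. ForeignLike is only a stand-in (an assumption for illustration): a real foreign exception would not be a C++ type at all, but it takes the same path in that catch (int) does not match it. The g_log vector and the run/demo helpers are hypothetical names added so the order of events can be observed.

```cpp
#include <string>
#include <vector>

// Hypothetical log, used only to make the order of events observable.
static std::vector<std::string> g_log;

struct MyClass {
    ~MyClass() { g_log.push_back("~MyClass"); }
};

// Stand-in for a foreign exception: any type that catch (int) does not match.
struct ForeignLike {};

void run(bool throw_int) {
    MyClass obj;
    try {
        if (throw_int) throw 42;
        throw ForeignLike{};
    } catch (int) {
        g_log.push_back("action 1");
    }
    // On the normal path obj is destroyed here, at scope exit.
}

std::vector<std::string> demo(bool throw_int) {
    g_log.clear();
    try {
        run(throw_int);
    } catch (...) {
        // The exception escaped run(); obj was destroyed during unwinding.
        g_log.push_back("caller caught");
    }
    return g_log;
}
```

In both calls, demo(true) and demo(false), the destructor entry appears in the log. In the second case it is recorded before the caller's handler runs, which is the cleanup that the Wasm code has to perform for exceptions it does not otherwise handle.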

Separating the code within catch (...) by examining and pattern matching LLVM IR
(or any other compiler's IR) is not always possible, because code can be
transformed or optimized in many different ways. Windows EH developers once
tried it and failed. Windows EH requires identifying not only catch (...) but
also all the other catch clauses, but the problem here is inherently the same.
Pattern matching cleanup code is not simple either, because from the IR's point
of view it is just function (destructor) calls.

Can this be done if we demarcate these parts in clang (or, more generally, in
the frontend code generation phase)? The answer seems to be maybe, but it will
be much harder, and I'm not sure it's worth it. Basically, what we need is a way
to demarcate some code regions and prevent any code from entering or escaping
those regions in all IR-level and backend-level passes. Code hoisting or
sinking across the boundaries should not occur in any pass, and instruction
scheduling in the backend should treat these boundaries as fences not only for
memory instructions but for all instructions. Windows EH developers faced
similar problems and came up with new specialized instructions, but their
objectives were different: they did this because they didn't use the Itanium
ABI and had to satisfy MSVC's spec. So it does not look possible to reuse their
approach. There would also be more work to do, such as matching each landing
pad to its parent scope's cleanup code.

Even if it is possible by creating new instructions and doing more work on the
clang side, it will also cost us optimization opportunities, because it
basically fences off certain parts of the code and does not allow any
optimization across their borders. For example, shared expressions may not be
able to be merged.

Proposal

Considering the amount of work needed to satisfy the current spec and the
expected downsides in code size and performance, I think having one catch_all
instruction that handles all exceptions is the best way to go. Actually, we can
do this even with the current spec by using only catch_all and none of the
other catch tag instructions, but that raises another question: is the
catch tag instruction ever useful?

To use only catch_all, there should be a way to tell if the current exception
is a C++ exception or not within a catch_all clause. While I think it can be
done by setting some variable within some libcxxabi functions (such as
__cxa_throw and __cxa_begin_catch), it would still be better if there is an
easy way to access the currently caught exception's tag within a catch_all
block. Maybe catch_all block can return the caught exception's tag.
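
As a rough model of the flag-based variant, the following C++ sketch uses one catch (...) as a stand-in for the single catch_all and a flag to tell C++ exceptions from foreign ones. Everything here is hypothetical and for illustration only: g_is_cxx_exception, cxx_throw (playing the role of a __cxa_throw-like hook), Foreign and run_with_single_handler are not libcxxabi names, and the nested rethrow merely imitates the type dispatch the personality routine would do.

```cpp
#include <string>

// Hypothetical flag standing in for state that libcxxabi-like hooks would keep.
static thread_local bool g_is_cxx_exception = false;

// Hypothetical stand-in for __cxa_throw: marks the exception as a C++ one.
template <class T>
[[noreturn]] void cxx_throw(T value) {
    g_is_cxx_exception = true;
    throw value;
}

// Stand-in for an exception raised outside C++; it never sets the flag.
struct Foreign {};

void throw_int() { cxx_throw(42); }
void throw_string() { cxx_throw(std::string("oops")); }
void throw_foreign() { throw Foreign{}; }
void no_throw() {}

std::string run_with_single_handler(void (*body)()) {
    try {
        body();
    } catch (...) {  // models the single catch_all
        const bool is_cxx = g_is_cxx_exception;
        g_is_cxx_exception = false;  // models a reset in a __cxa_begin_catch-like hook
        if (!is_cxx) return "action 2 (foreign)";
        try {
            throw;  // dispatch on the C++ type, as the generated code would
        } catch (int) {
            return "action 1";
        } catch (...) {
            return "action 2 (C++)";
        }
    }
    return "no exception";
}
```

Calling run_with_single_handler(throw_int) takes the action 1 path, while throw_string and throw_foreign both end up in action 2, reached from the same handler. If catch_all returned the tag directly, the flag and its reset would not be needed.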

Even if the catch_all instruction does not put an exception object on top of the
Wasm stack, there are ways to relay an exception object from throw to
catch_all. One possible way is to use a Wasm global: throw sets a Wasm global
with the pointer to the exception object, so that within a catch_all block we
can retrieve it.

The only possible downside of this scheme is that when a foreign exception is
thrown and a certain stack frame has no cleanup code to run, unwinding still
has to stop at that frame, because the exception is caught by its catch_all
instruction. But I can hardly imagine this case being common enough to affect
performance.
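
The cost described above can be illustrated with a small C++ model, assuming every frame on the stack has a catch-all handler with nothing to clean up. The names g_stops, frame and count_stops are hypothetical, and the counter exists only to make the extra stops visible.

```cpp
// Hypothetical counter: how many frames the exception stopped at on its way up.
static int g_stops = 0;

// Stand-in for an exception that no frame in this chain actually handles.
struct Foreign {};

void frame(int depth) {
    try {
        if (depth == 0) throw Foreign{};
        frame(depth - 1);
    } catch (...) {  // models a catch_all with no cleanup to run
        ++g_stops;   // unwinding stopped here anyway
        throw;       // and has to be resumed
    }
}

int count_stops(int depth) {
    g_stops = 0;
    try {
        frame(depth);
    } catch (...) {
        // Swallow the exception at the top so the count can be returned.
    }
    return g_stops;
}
```

With a chain of depth + 1 frames, count_stops(depth) reports one stop per frame, each followed by a rethrow. A tag-specific catch would let such an exception pass through these frames without stopping.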
