-
Notifications
You must be signed in to change notification settings - Fork 1.1k
Description
Introduction
Have you ever wanted an interface (or abstract type) to be parameterized on a polymorphic variable to its implementer, but not its users? Have you ever wanted to parameterize over internal state, without revealing what that state contains to your users? This is what existential types allow you to do.
Let's look at an example.
interface class IDataFlow<TLocalState>
{
...
TLocalState Start { get; }
void Next(TLocalState);
List<string> GetCurrentDataFlowDiagnostics();
}What does this type do? Let's say it walks a data flow tree and produces diagnostics. It does so incrementally -- the user can step through the tree by first calling Start, then moving forward by calling Next and passing the current state. They can't actually view the current state represented by TLocalState, but that's OK, they don't need to! They can view the diagnostics at any point, do some computation, then decide to continue or quit. In this sense, TLocalState is a black box to the user.
However, TLocalState is anything but a black box to an implementer. They care very much that this type is generic; different data flow analyses may have wildly different state implementations depending on what they're tracking. Moreover, since they provide the state, it's not a black box to them, they'll have a concrete type in hand. For instance,
class DefiniteAssignment : IDataFlow<BitVector> { ... }The implementer of DefiniteAssignment would be able to use BitVector statically in this implementation.
But this presents a problem. This is how a user interacts with IDataFlow:
class C
{
void DoDataFlowAnalysis<T>(IDataFlow<T> df)
{ ... }
}Note that the worker method must be generic, even though the type parameter is useless to them. Moreover, they have to flow this parameter around through their whole program until the entry point where they actually pass a specific data flow analysis. And if they have multiple methods that do different analyses, they have to do this for each one, on every method.
So what would we rather have? Probably something like this:
interface IDataFlow<protected TLocalState> { ... }
class DefiniteAssignment : IDataFlow<BitVector> { ... }
class C
{
void M(IDataFlow df) { ... }
}Now TLocalState is an existential type.
Details
Of course, the question is what is this and how does it work?
Now may be the time for a quick digression into theory to explain the background of existential types. The name comes from logic, where you can "existentially quantify" a variable, like there exists an x such that x^3 == 9. Similarly, you can also "universally quantify" a variable: for all x, x * 0 == 0.
Due to the Curry-Howard correspondence, the systems of logic and the typing calculus have parallels with one another. In the case of the C# type system, universal quantification equates to generic polymorphism. When we declare a method M<T>, this could be seen as the declaration of a function mapping from all possible Ts to a specialized M for each T. In other words, we're quantifying the method M for all possible Ts.
If we already have an analogy for universal quantification, there's a question of what existential quantification looks like. In the previous example, if we said there's an instance of M<T> for all Ts, existential quantification would say that there's a type U<T> for some specific T. Existential types are a way of working with a type variable that's set to something but you don't know what it is.
With that brief divergence done, we should now talk about how this feature would be implemented. For this, I'm going to turn to a simpler example. Say we have a counter interface:
interface ICounter<protected T>
{
T Start { get; }
void Next(T current);
bool Done { get; }
}Here's how we could use it:
void M(ICounter ic)
{
var x = ic.Start;
while (!ic.Done)
{
x = ic.Next(x);
}
}This immediately provokes some questions, most importantly, what is the type of x? The answer is, it's ic.T. This is some type that exists on ic. Note, it doesn't exist on ICounter -- it's not static. You can't take another instance of ICounter and unify the two T's. However, it does unify with the other T on ic; the value returned from ic.Next is the same type as the one returned from ic.Start. How exactly we want to represent this in the language is an open question. Here are the specific problems to solve:
- Will we have special syntax for this type?
- Do we need to introduce a binding in some way?
- The type clearly can't be returned or assigned to a field as is, because there is no way to mention it, so is it closer to an anonymous type?
Once we decide how to use it on the interface, we also have to decide how to implement that interface. In this case, it seems little different from an ordinary type parameter:
class Counter : ICounter<int>
{
int Start => 0;
int Next(int current) => current + 1;
bool Done => current == 42;
}Lowering
If that is how the user sees it, we also have to deal with how the CLR sees it. Luckily, we have a theorem in logic that says we can transform existential types into an encoding of universal types. I've used that theorem as a base for a transformation for C#.
Lowering from above code:
using System;
interface ICounter<T>
{
T Start { get; }
T Next(T current);
bool Done(T current);
}
class Counter : ICounter<int>, ICounterWrapper
{
public int Start => 0;
public int Next(int current) => current + 1;
public bool Done(int current) => current == 42;
void ICounterWrapper.Unwrap<TWitness>(TWitness witness)
=> witness.Invoke(this);
}
interface ICounterWitness
{
void Invoke<T>(ICounter<T> counter);
}
interface ICounterWrapper
{
void Unwrap<TWitness>(TWitness w)
where TWitness : ICounterWitness;
}
struct MWitness : ICounterWitness
{
public void Invoke<T>(ICounter<T> ic)
{
var x = ic.Start;
while (!ic.Done(x))
{
x = ic.Next(x);
}
}
}
class C
{
void M(ICounterWrapper wrapper)
{
wrapper.Unwrap(new MWitness());
}
}In this transformation, there are two "intermediate" interfaces that are created to transform the existential type production into a universal type production. In addition, the code which uses ICounter<protected T> has been lowered into a synthesized struct, MWitness, that executes the given code. The scope of code currently hoisted into the struct is the whole method in this example, but it it's an open question what the proper hoisting scope is -- and how this interacts with other code hoisting, like nested functions, async methods, and iterator methods.
Similarly, there are other questions like how this interacts with other language and runtime features, especially reflection, but I believe the feature is at least possible to implement on the current CLR.