Lightweight extension types #1426

Closed
@eernstg

[Jun 16 2021: Note the proposal for extension types and the newer proposal for views.]

[Feb 24th 2021: This issue is getting too long. Please use separate issues to discuss subtopics of this feature.]

[Editing: Note that updates are described at the end, search for 'Revisions'.]

Cf. #40, #42, and #1474, this issue contains a proposal for how to support static extension types in Dart as a minimal enhancement of the static extension methods that Dart already supports.

In this proposal, a static extension type is a zero-cost abstraction mechanism that allows developers to replace the set S_instance of available operations on a given object o (that is, the instance members of its type) by a different set S_extension of operations (the members declared by the specific extension type).

One possible perspective is that an extension type corresponds to an abstract data type: There is an underlying representation, but we wish to restrict the access to that representation to a set of operations that are completely independent of the operations available on the representation. In other words, the extension type ensures that we only work with the representation in specific ways, even though the representation itself has an interface that allows us to do many other (wrong) things.

It would be straightforward to achieve this by writing a class C with members S_extension as a wrapper, and working on the wrapper object new C(o) rather than accessing o and its methods directly.

However, creation of wrapper objects takes time and space, and in the case where we wish to work on an entire data structure we'd need to wrap each object as we navigate the data structure. For instance, we'd need to wrap every node in a tree if we wish to traverse a tree and maintain the discipline of using S_extension on each node we visit.

In contrast, the extension type mechanism is zero-cost in the sense that it does not use a wrapper object; it enforces the desired discipline statically.
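
For comparison, the wrapper-class alternative can be sketched in plain Dart roughly as follows. This is only an illustration: the names `Node`, `NodeView`, and `sum` are hypothetical and not part of the proposal. Every step of the traversal allocates a fresh wrapper object, which is the cost that the extension type mechanism is intended to avoid.

```dart
// A plain tree node whose full interface we would like to restrict.
class Node {
  int value;
  final List<Node> children;
  Node(this.value, [this.children = const []]);
}

// Hypothetical wrapper exposing only the operations we want to allow.
class NodeView {
  final Node _node;
  NodeView(this._node);

  int get value => _node.value; // Read-only access, no setter.

  // Each child has to be wrapped in a new object as we navigate.
  Iterable<NodeView> get children => _node.children.map((n) => NodeView(n));
}

// Sums all values while only ever touching nodes through the wrapper.
int sum(NodeView view) => view.children
    .fold<int>(view.value, (total, child) => total + sum(child));

void main() {
  var tree = Node(1, [Node(2), Node(3, [Node(4)])]);
  print(sum(NodeView(tree))); // One wrapper is allocated per visited node.
}
```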

Examples

extension ListSize<X> on List<X> {
  int get size => length;
  X front() => this[0];
}

void main() {
  ListSize<String> xs = <String>['Hello']; // OK, upcast.
  print(xs); // OK, `toString()` available on Object?.
  print("Size: ${xs.size}. Front: ${xs.front()}"); // Available members.
  xs[0]; // Error, no `operator []`.

  List<ListSize<String>> ys = [xs]; // OK.
  List<List<String>> ys2 = ys; // Error, downcast.
  ListSize<ListSize<Object>> ys3 = ys; // OK.
  ys[0].front(); // OK.
  ys3.front().front(); // OK.
  ys as List<List<String>>; // `ys` is promoted, succeeds at run time.
  // We may wish to lint promotion out of extension types.
}

A major application would be generated extension types, handling the navigation of dynamic object trees (such as JSON, using something like numeric types, String, List<dynamic>, Map<String, dynamic>), with static type dynamic, but assumed to satisfy a specific schema.

Here's a tiny core of that, based on nested List<dynamic> with numbers at the leaves:

extension TinyJson on Object? {
  Iterable<num> get leaves sync* {
    var self = this;
    if (self is num) {
      yield self;
    } else if (self is List<dynamic>) {
      for (Object? element in self) {
        yield* element.leaves;
      }
    } else {
      throw "Unexpected object encountered in TinyJson value";
    }
  }
}

void main() {
  TinyJson tiny = <dynamic>[<dynamic>[1, 2], 3, <dynamic>[]];
  print(tiny.leaves);
}

Proposal

Syntax

This proposal does not introduce new syntax.

Note that the enhancement sections below do introduce new syntax.

Static analysis

Assume that E is an extension declaration of the following form:

extension Ext<X1 extends B1, .. Xm extends Bm> on T { // T may contain X1 .. Xm.
  ... // Members
}

It is then allowed to use Ext<S1, .. Sm> as a type: It can occur as the declared type of a variable or parameter, as the return type of a function or getter, as a type argument in a type, or as the on-type of an extension.

In particular, it is allowed to create a new instance where one or more extension types occur as type arguments.

When m is zero, Ext<S1, .. Sm> simply stands for Ext, a non-generic extension. When m is greater than zero, a raw occurrence Ext is treated like a raw type: Instantiation to bound is used to obtain the omitted type arguments.

We say that the static type of said variable, parameter, etc. is the extension type Ext<S1, .. Sm>, and that its static type is an extension type.

If e is an expression whose static type is the extension type Ext<S1, .. Sm> then a member access like e.m() is treated as Ext<S1, .. Sm>(e as T).m() where T is the on-type corresponding to Ext<S1, .. Sm>, and similarly for instance getters and operators. This rule also applies when a member access implicitly has the receiver this.

That is, when the type of an expression is an extension type, all method invocations on that expression will invoke an extension method declared by that extension, and similarly for other member accesses. In particular, we cannot invoke an instance member when the receiver type is an extension type.
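
The form `Ext<S1, .. Sm>(e as T).m()` used in this rule is the explicit extension application syntax that Dart already supports for plain extensions. A small sketch in today's Dart, reusing `ListSize` from the examples (the helper `describe` is hypothetical), spells out by hand what a member access on an extension-typed receiver would amount to:

```dart
extension ListSize<X> on List<X> {
  int get size => length;
  X front() => this[0];
}

// Writes out what `e.size` and `e.front()` would mean under the proposal
// if `e` had the static type `ListSize<String>`.
String describe(Object? e) {
  var size = ListSize<String>(e as List<String>).size;
  var front = ListSize<String>(e).front(); // `e` was promoted by the cast.
  return 'Size: $size. Front: $front';
}

void main() {
  print(describe(<String>['Hello', 'World']));
}
```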

For the purpose of checking assignability and type parameter bounds, an extension type Ext<S1, .. Sm> with type parameters X1 .. Xm and on-type T is considered to be a proper subtype of Object?, and a proper supertype of [S1/X1, .. Sm/Xm]T.

That is, the underlying on-type can only be recovered by an explicit cast, and there are no non-trivial supertypes. So an expression whose type is an extension type is in a sense "in prison", and we can only obtain a different type for it by forgetting everything (going to a top type), or by means of an explicit cast.

When U is an extension type, it is allowed to perform a type test, o is U, and a type check, o as U. Promotion of a local variable x based on such type tests or type checks shall promote x to the extension type.

Note that promotion only occurs when the type of o is a top type. If o already has a non-top type which is a subtype of the on-type of U then we'd use a fresh variable U o2 = o; and work with o2.

There is no change to the type of this in the body of an extension E: It is the on-type of E. Similarly, extension methods of E invoked in the body of E are subject to the same treatment as previously, which means that extension methods of the enclosing extension can be invoked implicitly, and it is even the case that extension methods are given higher priority than instance methods on this, also when this is implicit.
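
The priority of extension members when this is implicit can already be observed with plain extensions in today's Dart, because unqualified names in an extension body are looked up lexically. The following is a minimal sketch; the extension `Guarded` and its members are hypothetical names chosen for illustration.

```dart
extension Guarded on List<int> {
  // Deliberately has the same name as the instance getter `List.first`.
  int get first => -1;

  // Unqualified `first` is resolved lexically: the extension getter wins,
  // even though `this` also has an instance getter named `first`.
  int get implicitFirst => first;
}

void main() {
  var xs = [10, 20, 30];
  print(xs.implicitFirst); // Uses the extension getter.
  print(xs.first); // On an ordinary receiver the instance getter wins.
}
```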

Dynamic semantics

At run time, for a given instance o typed as an extension type U, there is no reification of U associated with o.

By soundness, the run-time type of o will be a subtype of the on-type of U.

For a given instance of a generic type G<.. U ..> where U is an extension type, the run-time representation of the generic type contains a representation of the on-type corresponding to U at the location where the static type has U. Similarly for function types.

This implies that void Function(Ext) is represented as void Function(T) at run-time when Ext is an extension with on-type T. In other words, it is possible to have a variable of type void Function(T) that refers to a function object of type void Function(Ext), which seems to be a soundness violation. However, we consider such types to be the same type at run time, which is in any case the finest distinction that we can maintain. There is no soundness issue, because the added discipline of an extension type is voluntary: the program is still sound as long as we treat the underlying object according to the on-type.

A type test, o is U, and a type check, o as U, where U is an extension type, are performed at run time as a type test and type check on the corresponding on-type.
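
The extension types that later shipped in Dart 3.3 use a different declaration syntax from the one in this proposal, but they are likewise erased to their representation type at run time. Under that assumption, they can serve as a rough, runnable illustration of the dynamic semantics described in this section. The names `Age`, `years`, and `printAge` are hypothetical.

```dart
// Dart 3.3+ syntax, NOT the syntax proposed in this issue.
extension type Age(int years) {
  Age get next => Age(years + 1);
}

void printAge(Age age) => print(age.years);

void main() {
  Object ages = <Age>[Age(41).next];
  Object callback = printAge;

  print(ages is List<int>); // The type argument is represented as `int`.
  print(callback is void Function(int)); // Likewise for function types.
  print(ages is List<Age>); // Performed as a test against `List<int>`.
}
```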

Enhancements

The previous section outlines a core proposal. The following sections introduce a number of enhancements that were discussed in the comments on this issue.

Prevent implicit invocations: Keyword 'type'.

Consider the type int. This type is likely to be used as the on-type of many different extension types, because it allows a very lightweight object to play the role of a value with a specific interpretation (say, an Age in years or a Width in pixels). Different extension types are not assignable to each other, so we'll offer a certain protection against inconsistent interpretations.

If we have many different extension types with the same or overlapping on-types, they may be impractical to work with: lots of extension methods are applicable to any given expression of that on-type, even though they are not intended to be used on arbitrary values of that type. Each of them should only be used when the associated interpretation is valid.
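
The problem can be reproduced with plain extensions in today's Dart. In this sketch, `AgeOps` and `WidthOps` are hypothetical extensions that stand for two unrelated interpretations of int; both sets of members are applicable to every int expression.

```dart
// Two hypothetical interpretations of `int`, written as plain extensions.
extension AgeOps on int {
  int get nextBirthday => this + 1;
}

extension WidthOps on int {
  int get halfWidth => this ~/ 2;
}

void main() {
  int ageInYears = 42;
  print(ageInYears.nextBirthday); // Intended use.
  print(ageInYears.halfWidth); // Also accepted, although it is meaningless for an age.
}
```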

So we need to support the notion of an extension type whose methods are never invoked implicitly. One very simple way to achieve this is to use a keyword, e.g., type. The intuition is that an 'extension type' is used as a declared type, and it has no effect on an expression whose static type matches the on-type. Here's the rule:

An extension declaration may start with extension type rather than extension. Such an extension is not applicable for any implicit extension method invocations.

<extensionDeclaration> ::=
    'extension' 'type'? <typeIdentifier> <typeParameters>? 'on' <type>
    '{' (<metadata> <extensionMemberDefinition>)* '}'

For example:

extension type Age on int {
  Age get next => this + 1;
}

void main() {
  int i = 42;
  i.next; // Error, no such member.
  Age age = 42;
  age.next; // OK.
}

Allow instance member access: show, hide.

The core proposal in this issue disallows invocations of instance methods of the on-type of a given extension type. This may be helpful, especially in the situation where the main purpose of the extension type is to ensure that the underlying data is processed in a particular, disciplined manner, whereas the on-type allows for many other operations (that may violate some invariants that we wish to maintain).

However, it may also be useful to support invocation of some or all instance members on a receiver whose type is an extension type. For instance, there may be some read-only methods that we can safely call on the on-type, because they won't violate any invariants associated with the extension type. We address this need by introducing hide and show clauses on extension types.

An extension declaration may optionally have a show and/or a hide clause after the on clause.

<extensionDeclaration> ::=
    'extension' 'type'? <typeIdentifier> <typeParameters>?
        'on' <type> <extensionShowHide>
    '{' (<metadata> <extensionMemberDefinition>)* '}'

<extensionShowHide> ::= <extensionShow>? <extensionHide>?

<extensionShow> ::= 'show' <extensionShowHideList>
<extensionHide> ::= 'hide' <extensionShowHideList>

<extensionShowHideList> ::= <extensionShowHideElement> |
    <extensionShowHideList> ',' <extensionShowHideElement>

<extensionShowHideElement> ::= <type> | <identifier>

We use the phrase extension show/hide part, or just show/hide part when no doubt can arise, to denote a phrase derived from <extensionShowHide>. Similarly, an <extensionShow> is known as an extension show clause, and an <extensionHide> is known as an extension hide clause, similarly abbreviated to show clause and hide clause.

The show/hide part specifies which instance members of the on-type are available for invocation on a receiver whose type is the given extension type.

A compile-time error occurs if an extension does not have the type keyword, and it has a hide or a show clause.

If the show/hide part is empty, no instance members except the ones declared for Object? can be invoked on a receiver whose static type is the given extension type.

If the show/hide part is a show clause listing some identifiers and types, invocation of an instance member is allowed if its basename is one of the given identifiers, or it is the name of a member of the interface of one of the types. Instance members declared for Object can also be invoked.

If the show/hide part is a hide clause listing some identifiers and types, invocation of an instance member is allowed if it is in the interface of the on-type and not among the given identifiers, nor in the interface of the specified types.

If the show/hide part is a show clause followed by a hide clause then the set of available instance members is computed by first considering the show clause as described above, and then removing instance members from that set based on the hide clause as described above.

A compile-time error occurs if a hide or show clause contains an identifier which is not the basename of an instance member of the on-type. A compile-time error occurs if a hide or show clause contains a type which is not among the types that are implemented by the on-type of the extension.

A type in a hide or show clause may be raw (that is, an identifier or qualified identifier denoting a generic type, but no actual type arguments). In this case the omitted type arguments are determined by the corresponding superinterface of the on-type.

For example:

extension type MyInt on int show num, isEven hide floor {
  int get twice => 2 * this;
}

void main() {
  MyInt m = 42;
  m.twice; // OK, in the extension type.
  m.isEven; // OK, a shown instance member.
  m.ceil(); // OK, a shown instance member.
  m.toString(); // OK, an `Object?` member.
  m.floor(); // Error, hidden.
}

Invariant enforcement through introduction: Protected extension types

In some cases, it may be convenient to be able to create a large object structure with no language-level constraints imposed, and later work on that object structure using an extension type. For instance, a JSON value could be modeled by an object structure containing instances of something like int, bool, String, List<dynamic>, and Map<String, dynamic>, and there may be a schema which specifies a certain regularity that this object structure should have. In this case it makes sense to use the approach of the original proposal in this issue: The given object structure is created (perhaps by a general purpose JSON deserializer) without any reference to the schema, or the extension type. Later on the object structure is processed, using an extension type which corresponds to the schema.

However, in other cases it may be helpful to constrain the introduction of objects of the given extension types, such that it is known from the outset that if an expression has a type U which is an extension type then it was guaranteed to have been given that type in a situation where it satisfied some invariants. If the underlying representation object (structure) is mutable, the extension type members should be written in such a way that they preserve the given invariants.

We introduce the notion of extension type constructors to handle this task.

An extension declaration with the type keyword can start with the keyword protected. In this case we say that it is a protected extension type. A protected extension type can declare one or more non-redirecting factory constructors. We use the phrase extension type constructor to denote such constructors.

An instance creation expression of the form Ext<T1, .. Tk>(...) or Ext<T1, .. Tk>.name(...) is used to invoke these constructors, and the type of such an expression is Ext<T1, .. Tk>.

During static analysis of the body of an extension type constructor, the return type is considered to be the on-type of the enclosing extension type declaration.

In particular, it is a compile-time error if it is possible to reach the end of an extension type constructor without returning anything.

A protected extension type is a proper subtype of the top types and a proper supertype of Never.

In particular, there is no subtype relationship between a protected extension type and the corresponding on-type.

When E (respectively E<X1, .. Xk>) is a protected extension type, it is a compile-time error to perform a downcast or promotion where the target type is E (respectively E<T1, .. Tk>).

The rationale is that an extension type that justifies any constructors will need to maintain some invariants, and hence it is not helpful to allow implicit introduction of any value of that type with no enforcement of the invariants at all.

For example:

protected extension type nat on int {
  factory nat(int value) =>
      value >= 0 ? value : throw "Attempt to create an invalid nat";
}

void main() {
  nat n1 = 42; // Error.
  var n2 = nat(42); // OK at compile time and at run time.
}

The run-time representation of a type argument which is a protected extension type E (respectively E<T1, .. Tk>) is an identification of E (respectively E<T1, .. Tk>).

In particular, it is not the same as the run-time representation of the corresponding on-type. This is necessary in order to maintain that the on-type and the protected extension type are unrelated.

For example:

class IntBox {
  int i;
  IntBox(this.i);
}

protected extension type EvenIntBox on IntBox {
  factory EvenIntBox(int i) => i % 2 == 0 ? IntBox(i) : throw "Invalid EvenIntBox";
  void next() => this.i += 2;
}

void main() {
  var evenIntBox = EvenIntBox(42);
  evenIntBox.next(); // Methods of `EvenIntBox` maintain the invariant.
  var intBox = evenIntBox as IntBox; // OK statically and dynamically.
  intBox.i++; // Invariant of `evenIntBox` violated!

  var evenIntBoxes = [evenIntBox]; // Type `List<EvenIntBox>`.
  evenIntBoxes[0].next(); // Elements typed as `EvenIntBox`, maintain invariant.
  List<IntBox> intBoxes = evenIntBoxes; // Compile-time error.
  intBoxes = evenIntBoxes as dynamic; // Run-time error.
}

Boxing

It may be helpful to equip each extension type with a companion class whose instances have a single field holding an instance of the on-type, so it's a wrapper with the same interface as the extension type.

Let E be an extension type with keyword type. The declaration of E is implicitly accompanied by a declaration of a class CE with the same type parameters and members as E, subclass of Object, and with a final field whose type is the on-type of E, and with a single argument constructor setting that field. The class can be denoted in code by E.class. An implicitly induced getter E.class get box returns an object that wraps this.

In the case where it would be a compile-time error to declare such a member named box, the member is not induced.

The latter rule helps avoid conflicts in situations where box is a non-hidden instance member, and it allows developers to write their own declaration of box if needed.
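
A hand-written approximation of the boxing mechanism in today's Dart could look as follows. This is a sketch under stated assumptions: the plain extension `Age` stands in for a hypothetical `extension type Age on int`, the class `AgeBox` plays the role of the implicit companion class `Age.class`, and the field name `unboxed` is invented for the example.

```dart
// Plain extension standing in for a hypothetical `extension type Age on int`.
extension Age on int {
  int get next => this + 1;

  // Hand-written counterpart of the induced `box` getter.
  AgeBox get box => AgeBox(this);
}

// Hand-written counterpart of the implicit companion class `Age.class`.
class AgeBox {
  final int unboxed; // Hypothetical field name.
  AgeBox(this.unboxed);

  int get next => Age(unboxed).next; // Same member as the extension.
}

void main() {
  Object boxed = 42.box; // A real wrapper object with its own run-time type.
  print(boxed is AgeBox);
  print(boxed is int);
  print((boxed as AgeBox).next);
}
```

Unlike the extension type itself, the boxed value keeps its interpretation at run time, at the cost of one allocation per wrapped object.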

Non-object entities

If we introduce any non-object entities in Dart (that is, entities that cannot be assigned to a variable of type Object?, e.g., external C / JavaScript / ... entities, or non-boxed tuples, etc), then we may wish to allow for extension types whose on-type is a non-object type.

This should not cause any particular problems: If the on-type is a non-object type then the extension type will not be a subtype of Object.

Discussion

It would be possible to reify extension types when they occur as type arguments of a generic type.

This might help ensure that the associated discipline of the extension type is applied to the elements in, say, a list, even in the case where that list is obtained under the type dynamic, and a type test or type cast is used to confirm that it is a List<U> where U is an extension type.

However, this presumably implies that the cast to a plain List<T> where T is the on-type corresponding to U should fail; otherwise the protection against accessing the elements using the underlying on-type will easily be violated. Moreover, even if we do make this cast fail then we could cast each element in the list to T, thus still accessing the elements using the on-type rather than the more disciplined extension type U.

We cannot avoid the latter if there is no run-time representation of the extension type in the elements in the list, and that is assumed here: For example, if we have an instance of int, and it is accessed as extension MyInt on int, the dynamic representation will be a plain int, and not some wrapped entity that contains information that this particular int is viewed as a MyInt. It seems somewhat inconsistent if we maintain that a List<MyInt> cannot be viewed as a List<int>, but a MyInt can be viewed as an int.

As for promotion, we could consider "promoting to a supertype" when that type is an extension type: Assume that U is an extension type with on-type T, and the type of a promotable local variable x is T or a subtype thereof; x is U could then demote x to have type U, even though is tests normally do not demote. The rationale would be that the treatment of x as a U is conceptually more "informative" and "strict" than the treatment of x as a T, which makes it somewhat similar to a downcast.

Note that we can use extension types to handle void:

extension void on Object? {}

This means that void is an "epsilon-supertype" of all top types (it's a proper supertype, but just a little bit). It is also a subtype of Object?, of course, so that creates an equivalence class of "equal" types spelled differently. That's a well-known concept today, so we can handle that (and it corresponds to the dynamic semantics).

This approach does not admit any member accesses for a receiver of type void, and it isn't assignable to anything else without a cast. Just like void of today.

However, compared to the treatment of today, we would get support for voidness preservation, i.e., it would no longer be possible to forget voidness without a cast in a number of higher order situations:

List<Object?> objects = <void>[]; // Error.

void f(Object? o) { print(o); }
Object? Function(void) g = f; // Error, for both types in the signature.
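
For contrast, the following sketch shows the current treatment, where void and Object? are both top types and the two assignments marked as errors above are accepted without a cast. It is meant as an illustration of the status quo only.

```dart
void f(Object? o) {
  print(o);
}

void main() {
  // Both assignments are accepted today: voidness is forgotten without a cast.
  List<Object?> objects = <void>[];
  Object? Function(void) g = f;

  objects.add('no longer void');
  print(objects);
  print(g(1)); // The result of a `void` function becomes observable.
}
```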

Revisions

  • Feb 8, 2021: Introduce 'protected' extension type terminology. Change the proposed subtype relationship for protected extension types. Remove proposal 'alternative 1' where protected extension types are completely unrelated to other types. Add reification of protected extension types as part of the proposal (making 'alternative 2' part of the section on protected extension types).

  • Feb 5, 2021: Add some enhancement mechanism proposals, based on the discussion below: Keyword type prevents implicit invocations; construction methods; show/hide; non-objects.

  • Feb 1, 2021: Initial version.
