-
Notifications
You must be signed in to change notification settings - Fork 93
Add Support for Indexers and Ranges #605
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
[I raised the questions below on https://github.com/dotnet/docs/issues/30512, and got replies, which I need to factor in to this Draft PR.] I'm transforming this proposal into a Draft PR for the C# Standard, and have several questions: Q1. Optional Members in Index and Range: The proposal suggests that some members in these types are optional, such as Range's StartAt, EndAt, and All. Which members are required for these types? Why not require all those provided by MS's implementation? Specifically, the MS proposal states:
MS's feedback on my Q was:
As a result, I'm omitting any mention of the optional members Q2. Range indexer setter: What if anything should we say about the implicit and explicit setter for a Range indexer? Certainly, one can define a setter for a user-defined type; however, it is not obvious as to what such a setter would do, especially since it must be used on the left-hand side of assignment taking a right-hand side of the same type as the index returns. In the case of type BitArray that would mean something like ba1[range1] = ba2, or perhaps ba1[range1] = ba2[range2]. As far as I can determine, the operations one might like to implement using such a setter are probably best implemented via a named method. In any event, for a compiler-generated Range indexer, attempting to use its setter results in the error message “CS0131 The left-hand side of an assignment must be a variable, property or indexer,” which suggests the generated indexer has no setter. If that is the case, we should say that, perhaps by stressing that the result of a Range indexer is not a variable, so as such, it can't be used on the lhs of an assignment. MS's feedback on my Q was:
I chose to say nothing about the setter, which means that attempting to use such a setter results in unspecified behavior. |
When adapting the MS proposal, I invented the term indexable sequence. The rationale for that name choice follows. Various MS-hosted on-line pages use the term sequence. This word is already used quite a bit in the C# spec, in both a general sense as well as being defined in the context of query expressions. §11.17 Query expressions|§11.17.1 General states:
That definition is not applicable to indexes and ranges! The MS-provided proposal uses collection; however, that implies enumerable support, which is not required by indexes and ranges. (BTW, although it is used a lot in the C# spec, the term collection is not defined!) As such, rather than overload an existing term or invent a completely different one, I came up with indexable sequence. |
The MS proposal, section “Implicit Range support,” provided a very detailed discussion of how to transform a pair of Indexes into a call to |
c810a4a
to
867ba90
Compare
|
||
An index is represented by a read-only variable of the value type `System.Index`. A range is represented by a read-only variable of value type `System.Range`, which contains a start and end index. A slice of an array is represented by a (possibly empty) array. The representation of a slice of a user-defined type is determined by the implementer of that type. | ||
|
||
For an indexable sequence of length *N*, elements can be accessed using indexes 0 through *N-1*, which are relative to the start. Elements can also be accessed relative to the end via the `^` index-from-end operator (§index-from-end-operator). `^0` denotes the (non-existent) element just beyond the end. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The definition of indexable sequence, both original and revised versions, do not fix the type of the index. Here the type is fixed to being an integer number (it does not state the actual type).
- The value of `P` is checked to be valid. If the value of `P` is `null`, a `System.NullReferenceException` is thrown and no further steps are executed. | ||
- The value of each expression in the *argument_list* is checked against the actual bounds of each dimension of the array instance referenced by `P`. If one or more values are out of range, a `System.IndexOutOfRangeException` is thrown and no further steps are executed. | ||
- The location of the array element given by the index expression(s) is computed, and this location becomes the result of the array access. | ||
- The index expressions of the *argument_list* are evaluated in order, from left to right. Following evaluation of each index expression, for expressions not having type `System.Index` or `System.Range`, an implicit conversion ([§10.2](conversions.md#102-implicit-conversions)) to one of the following types is performed: `int`, `uint`, `long`, `ulong`. The first type in this list for which an implicit conversion exists is chosen. For instance, if the index expression is of type `short` then an implicit conversion to `int` is performed, since implicit conversions from `short` to `int` and from `short` to `long` are possible. For an index expression having type `System.Index`, the Index is transformed into the corresponding `int` index using `System.Index.GetOffset`, which calculates the offset from the start of the collection using the specified collection length. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How does this mesh with the “behave as if” behaviour specified in the new clause Implicit Index support (classes.md, line 5594)?
This version mentions GetOffset
, but are we requiring that it be called or that an implementation may “behave as if” it is? Elsewhere in the Standard there are passages where behaviour is defined to be the “equivalent of executing the following code” (or words to that effect), would that pattern be better here?
|
||
- For an index expression not having type `System.Range`: | ||
- If evaluation of an index expression, the subsequent implicit conversion, or Index transformation causes an exception, then no further index expressions are evaluated, and no further steps are executed. | ||
- The value of `P` is checked to be valid. If the value of `P` is `null`, a `System.NullReferenceException` is thrown and no further steps are executed. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This check on P
being valid is occurring after the conversion of Index
to int
which requires P
…
- The value of `P` is checked to be valid. If the value of `P` is `null`, a `System.NullReferenceException` is thrown and no further steps are executed. | ||
- The value of each expression in the *argument_list* is checked against the actual bounds of each dimension of the array instance referenced by `P`. If one or more values are out of range, a `System.IndexOutOfRangeException` is thrown and no further steps are executed. | ||
- The location of the array element given by the index expression(s) is computed, and this location becomes the result of the array access. | ||
- The index expressions of the *argument_list* are evaluated in order, from left to right. Following evaluation of each index expression, for expressions not having type `System.Index` or `System.Range`, an implicit conversion ([§10.2](conversions.md#102-implicit-conversions)) to one of the following types is performed: `int`, `uint`, `long`, `ulong`. The first type in this list for which an implicit conversion exists is chosen. For instance, if the index expression is of type `short` then an implicit conversion to `int` is performed, since implicit conversions from `short` to `int` and from `short` to `long` are possible. For an index expression having type `System.Index`, the Index is transformed into the corresponding `int` index using `System.Index.GetOffset`, which calculates the offset from the start of the collection using the specified collection length. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the valiant attempt to interweave Index
and Range
is currently a bit hard to follow, well at least I've had to keep re-untangling it as I check the semantics. I think separating out integral, Index
and Range
cases a bit more would help, I might back it back before the meeting to make a suggestion…
An implementation shall behave as if it provides a non-virtual instance indexer member with a single parameter of type `System.Index` for any type that meets the following criteria: | ||
- The type is countable ([§15.7.1](classes.md#1571-general)). | ||
- The type has an accessible instance indexer taking an argument of type `int` as its only argument. | ||
- The type does not have an accessible instance indexer taking a `System.Index` as its only argument, or as its first argument with the remaining arguments being optional. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think by this list, the following should not error, but it does:
var dict = new X();
dict[^1] = "";
class X : Dictionary<int, string>
{
}
Co-authored-by: Nigel-Ecma <[email protected]>
Co-authored-by: Nigel-Ecma <[email protected]>
For an array access, the *primary_no_array_creation_expression* of the *element_access* shall be a value of an *array_type*. Furthermore, the *argument_list* of an array access shall not contain named arguments. The number of expressions in the *argument_list* shall be the same as the rank of the *array_type*, and each expression shall be of type `int`, `uint`, `long`, or `ulong`, or shall be implicitly convertible to one or more of these types. | ||
|
||
The result of evaluating an array access is a variable of the element type of the array, namely the array element selected by the value(s) of the expression(s) in the *argument_list*. | ||
For single-dimension array access the argument expression can also be of type `System.Index` or `System.Range`, or implicitly convertible to one of these. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here's an unexpected example:
public struct Int32OrRange
{
public static implicit operator int(Int32OrRange index) => 0;
public static implicit operator Range(Int32OrRange index) => new Range(0, 0);
}
string[] array = { };
Int32OrRange x = new();
var result = array[x];
That compiles, and uses the conversion to int. If you remove the conversion, it still compiles, and uses the conversion to Range. I'd have expected the first to compile because neither int
nor Range
is better than the other.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I checked with the compiler team. The roslyn implementation ranks single argument indexers in the following order:
- int
- uint
- Int64
- UInt64
- Index
- Range
I wouldn't expect that to change, so we'll have to come up with normative rules that match that behavior.
Co-authored-by: Joseph Musser <[email protected]>
- `s .. e` is transformed by the implementation to `new System.Range(s, e)`. | ||
- `s ..` is transformed by the implementation to `new System.Range(s, ^0)`. | ||
- `.. e` is transformed by the implementation to `new System.Range(0, e)`. | ||
- `..` is transformed by the implementation to `new System.Range(0, ^0)`. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
“transformed by the implementation” is wrong here as:
- there is no requirement that the implementation does this transformation (and Roslyn does not always do so); and
- the Standard uses phrasing like “has the same meaning as” or “is equivalent to” (e.g. §12.8.8, §13.9.5)
@Nigel-Ecma IIRC you were going to propose a new structure for this (in a new PR, I believe?) - do you think you'll have anything ready in time for us to review by the next meeting? Do let me know if I'm completely wrong on this. |
@jskeet – You are correct, but I doubt I’ll have anything ready for the next meeting on this one. |
For the next TG2 call agenda, I propose the following two 15-minute sessions, in order, for this PR, even if Nigel won't have a formal presentation ready:
|
Removing the "meeting: priority" label as #1369 is now the proposal. |
Closed in favor of #1369 |
Before you look at the detailed edits, it likely will be useful to read the following: