Add support for Index & Range #1369

Open
wants to merge 4 commits into
base: draft-v8
Binary file modified .github/workflows/dependencies/GrammarTestingEnv.tgz
Binary file not shown.
4 changes: 4 additions & 0 deletions standard/README.md
@@ -784,6 +784,10 @@
- [§23.8.3](unsafe-code.md#2383-fixed-size-buffers-in-expressions) Fixed-size buffers in expressions
- [§23.8.4](unsafe-code.md#2384-definite-assignment-checking) Definite assignment checking
- [§23.9](unsafe-code.md#239-stack-allocation) Stack allocation
- [§24](ranges.md#24-ranges-and-slicing) Ranges and Slicing
- [§24.1](ranges.md#241-general) General
- [§24.2](ranges.md#242-the-index-type) The Index type
- [§24.3](ranges.md#243-the-range-type) The Range type
- [§A](grammar.md#annex-a-grammar) Grammar
- [§A.1](grammar.md#a1-general) General
- [§A.2](grammar.md#a2-lexical-grammar) Lexical grammar
4 changes: 3 additions & 1 deletion standard/arrays.md
@@ -112,7 +112,9 @@ Elements of arrays created by *array_creation_expression*s are always initialize

## 17.4 Array element access

Array elements are accessed using *element_access* expressions ([§12.8.12.2](expressions.md#128122-array-access)) of the form `A[I₁, I₂, ..., Iₓ]`, where `A` is an expression of an array type and each `Iₑ` is an expression of type `int`, `uint`, `long`, `ulong`, or can be implicitly converted to one or more of these types. The result of an array element access is a variable, namely the array element selected by the indices.
Array elements are accessed using the *array access* variant of *element_access* expressions ([§12.8.12.2](expressions.md#128122-array-access)) of the form `A[I₁, I₂, ..., Iₓ]`, where `A` is an expression of an array type and each `Iₑ` is an expression of type `int`, `uint`, `long`, `ulong`, or can be implicitly converted to one or more of these types. The result of an array access is a variable reference (§9.5) to the array element selected by the indices.

Array elements of single-dimensional arrays can also be accessed using an array access expression where the sole index, `I₁`, is an expression of type `Index`, `Range`, or can be implicitly converted to one or both of these types. If `I₁` is of type `Index`, or has been implicitly converted to it, then the result of the array access is a variable reference to the array element selected by the index value. If `I₁` is of type `Range`, or has been implicitly converted to it, then the result of the element access is a new array formed from a shallow copy of the array elements with indices in the `Range` value, maintaining the element order.

The elements of an array can be enumerated using a `foreach` statement ([§13.9.5](statements.md#1395-the-foreach-statement)).

3 changes: 2 additions & 1 deletion standard/clauses.json
@@ -32,7 +32,8 @@
"attributes.md"
],
"UnsafeClauses": [
"unsafe-code.md"
"unsafe-code.md",
"ranges.md"
],
"Annexes": [
"grammar.md",
203 changes: 168 additions & 35 deletions standard/expressions.md

Large diffs are not rendered by default.

10 changes: 5 additions & 5 deletions standard/lexical-structure.md
@@ -1010,11 +1010,11 @@ Punctuators are for grouping and separating.

```ANTLR
operator_or_punctuator
: '{' | '}' | '[' | ']' | '(' | ')' | '.' | ',' | ':' | ';'
| '+' | '-' | ASTERISK | SLASH | '%' | '&' | '|' | '^' | '!' | '~'
| '=' | '<' | '>' | '?' | '??' | '::' | '++' | '--' | '&&' | '||'
| '->' | '==' | '!=' | '<=' | '>=' | '+=' | '-=' | '*=' | '/=' | '%='
| '&=' | '|=' | '^=' | '<<' | '<<=' | '=>' | '??='
: '{' | '}' | '[' | ']' | '(' | ')' | '.' | ',' | ':' | ';'
| '+' | '-' | '*' | '/' | '%' | '&' | '|' | '^' | '!' | '~'
| '=' | '<' | '>' | '?' | '??' | '::' | '++' | '--' | '&&' | '||'
| '->' | '==' | '!=' | '<=' | '>=' | '+=' | '-=' | '*=' | '/=' | '%='
| '&=' | '|=' | '^=' | '<<' | '<<=' | '=>' | '??=' | '..'
;

right_shift
291 changes: 291 additions & 0 deletions standard/ranges.md
@@ -0,0 +1,291 @@
# 24 Extended Indexing and Slicing

> <span style="color:red">**Review Note:** This new chapter, currently (§24), is placed here temporarily to avoid text changes due to renumbering occurring in chapters & clauses otherwise unaffected by the PR. Its final placement is not yet determined; however, between the Arrays ([§17](arrays.md#17-arrays)) and Interfaces ([§18](interfaces.md#18-interfaces)) chapters might be suitable – other placements can be suggested during review. It can be relocated later with just a simple edit to `clauses.json`.</span>

## 24.1 General

This chapter introduces a model for *extended indexable* and *sliceable* *collection* types built on:

- The types introduced in this chapter, `System.Index` (§24.2) and `System.Range` (§24.3);
Contributor

Should these list items all start with lower-case letters? (Ditto later lists.)

Contributor Author

@jskeet – Let’s defer to @RexJaeschke on this one. This particular list is a collection of definitions, each of which is (in the next commit, as I got a couple wrong) a sentence, and so ends in a full stop. Some have sub-bullets prefixed by a colon and separated by semicolons (first-level) or commas (second-level); these sub-bullets therefore start with lowercase. So there is a sort of logic to it…

Contributor

@RexJaeschke, Jul 16, 2025

Our rule is to begin a numbered or unnumbered list item with an uppercase letter. Exceptions are:

  1. If the first word is fenced as bold, italic, or bold-italic.
  2. If the first word is fenced as a code fragment.
  3. Items in the list of terms in the "Terms and definitions" clause all begin with lowercase letters, as that is their usual spelling.

I do see, however, that we have violations of that approach in a number of existing clauses.

- The pre-defined unary `^` (§hat-operator) and binary `..` (§range-operator) operators; and
- The *element_access* expression.

Under the model a type is classified as:

- a *collection* if it represents a group of *element*s;
- an *extended indexable* collection if it supports an *element_access* expression which has a single argument expression of type `Index` and which returns and/or sets a single element of the type, either by value or by reference; and
- an *extended sliceable* collection if it supports an *element_access* expression which has a single argument expression of type `Range` and which returns a *slice* of the elements of the type by value.

> *Note*: The model does not require that a slice, unlike an element, of the type can be set, but a type may support it as an extension of the model. *end note*

The model is supported for single-dimensional arrays (§12.8.12.2) and strings (§string-access).
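
As an informal illustration of this built-in support, the following sketch indexes and slices an array and a string. The variable names are illustrative only, and the printed results assume the behaviour of current .NET implementations.

```CSharp
using System;

int[] numbers = { 10, 20, 30, 40, 50 };

int last = numbers[^1];         // from-end index: the last element
int[] middle = numbers[1..^1];  // range: a new array holding 20, 30, 40

middle[0] = -1;                 // the slice is a copy, so numbers[1] is unchanged

string word = "slicing";
char lastChar = word[^1];       // from-end index into a string
string head = word[..5];        // range with the start operand omitted

Console.WriteLine(last);                      // 50
Console.WriteLine(string.Join(",", middle));  // -1,30,40
Console.WriteLine(numbers[1]);                // 20
Console.WriteLine($"{lastChar} {head}");      // g slici
```

Writing to `middle` leaves `numbers` untouched because a `Range` access on an array produces a new array rather than a view of the original.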

The model can be supported by any class, struct or interface type which provides appropriate indexers (§15.9) which implement the model semantics.

Implicit support for the model is provided for types which do not directly support it but which provide a certain *pattern* of members (§24.4). This support is pattern-based, rather than semantic-based, as the semantics of the type members upon which it is based are *assumed* – the language does not enforce, or check, the semantics of these type members.

### 24.1.1 Definitions

For the purposes of this chapter the following terms are defined:

- A ***collection*** is a type which represents a group of ***element***s.
- A ***countable*** collection is one which provides a ***countable property***: an `int`-valued instance property whose value is the number of elements currently in the group. This property must be named either `Length` or `Count`; the former is chosen if both exist.
- A ***sequence*** or ***indexable*** type is a collection:
Contributor

Do we need to include "sequence" here? I tend to think of IEnumerable<T> as a sequence, without the ability to do anything other than iterate over it. If we're primarily interested in indexing, can we just use "indexable type"?

Contributor Author

I’ll leave this one for discussion. IIRC sequence came up in the source material and/or the PR of terms? I can see that in this clause, where there are indexers and a type Index, using sequence does reduce the number of index-based terms causing confusion, but that doesn’t mean we have to use it. These definitions are prefixed “For the purposes of this clause the following terms are defined” for this sort of reason, which leaves open a possible revision of terms later.

  - which is countable;
  - where every element can be accessed using an *element_access* expression with a single required `int` argument, the ***from-start index*** (additional optional arguments are allowed);
  - a sequence is ***modifiable*** if every element can also be set using an *element_access* expression;
  - an element’s from-start index is the number of elements before it in the sequence; for a sequence containing *N* elements:
    - the first and last elements have indices of 0 and *N*-1 respectively, and
    - the ***past-end*** index, an index which represents a hypothetical element after the last one, has the value *N*;
- A ***from-end index*** represents an element’s position within a sequence relative to the last element. For a sequence containing *N* elements the first, last and past-end indices are *N*, 1 and 0 respectively.
- A ***range*** is a contiguous run of zero or more indices starting at any index within a sequence.
- A ***slice*** is the collection of elements within a range.
- A ***sliceable*** collection is one which:
  - is countable;
  - provides a method `Slice` which takes two `int` parameters specifying a range, being a starting index and a count of elements respectively, and returns a new slice constructed from the elements in the range.

The above definitions are extended for uses of `Index` and `Range` as follows:

- A type is also a *sequence* if an *element_access* expression taking a single required `Index` argument, rather than an `int` argument, is supported. Where a distinction is required the type is termed ***extended indexable***.
- A type is also *sliceable* if an *element_access* expression taking a single required `Range` argument, rather than a `Slice` method, is supported. Where a distinction is required the type is termed ***extended sliceable***.

Whether a type is classified as countable, indexable, or sliceable is subject to the constraints of member accessibility (§7.5) and therefore dependent on where the type is being used.

> *Example*: A type where the countable property and/or the indexer are `protected` is only a sequence to members of itself and any derived types. *end example*

The required members for a type to qualify as a sequence or sliceable may be inherited.

> *Example*: In the following code
>
> ```CSharp
> public class A
> {
>     public int Length { get { … } }
> }
>
> public class B : A
> {
>     public int this[int index] { … }
> }
>
> public class C : B
> {
>     public int[] Slice(int index, int count) { … }
> }
> ```
>
> The type `A` is countable, `B` is a sequence, and `C` is sliceable and a sequence.
>
> *end example*

*Note*:

- A type can be sliceable without being indexable due to the lack of an (accessible) indexer.
- A type must be countable in order to be sliceable and/or indexable.
- While the elements of a sequence are ordered by *position* within the sequence the elements themselves need not be ordered by their value, or even orderable.

*end note*

## 24.2 The `Index` type

The `System.Index` type represents an *abstract* index which is either a *from-start index* or a *from-end index*.

```CSharp
public readonly struct Index : IEquatable<Index>
{
    public int Value { get; }
    public bool IsFromEnd { get; }

    public Index(int value, bool fromEnd = false);

    public static implicit operator Index(int value);
    public int GetOffset(int length);
    public bool Equals(Index other);
}
```

`Index` values are constructed from an `int`, specifying the non-negative offset, and a `bool`, indicating whether the offset is from the end (`true`) or start (`false`). If the specified offset is negative an `ArgumentOutOfRangeException` is thrown.

> *Example*
>
> ```CSharp
> Index first = new Index(0, false); // first element index
> var last = new Index(1, true); // last element index
> var past = new Index(0, true); // past-end index
>
> Index invalid = new Index(-1); // throws ArgumentOutOfRangeException
> ```
>
> *end example*

There is an implicit conversion from `int` to `Index` which produces from-start indices, and a language-defined unary operator `^` (§hat-operator) from `int` to `Index` which produces from-end indices.

> *Example*
>
> Using implicit conversions and the unary `^` operator the above examples may be written:
>
> ```CSharp
> Index first = 0; // first element index
> var last = ^1; // last element index
> var past = ^0; // past-end index
> ```
>
> *end example*

The method `GetOffset` converts from an abstract `Index` value to a concrete `int` index value for a sequence of the specified `length`.
Contributor

Possibly put these three into the same paragraph?

Contributor Author

@jskeet – In the next commit I’ve put the first two in the same para, I think that this method doesn’t sanity check should stand out. Good compromise?


If the `Index` value, `I`, is from-end this method returns the same value as `length - I.Value`, otherwise it returns the same value as `I.Value`.

This method does **not** check that the return value is in the valid range of `0` through `length-1` inclusive.

> *Note:* No checking is specified as the expected use of the result is to index into a sequence with `length` elements, and that indexing operation is expected to perform the appropriate checks. *end note*
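
The absence of checking can be seen in a short sketch such as the following, which assumes the .NET implementation of `System.Index`; the variable names are illustrative only. `GetOffset` returns out-of-range values without complaint, and it is the subsequent array access that rejects them.

```CSharp
using System;

int length = 5;

Index inRange = ^2;
Index beforeStart = ^7;  // more than `length` elements from the end
Index afterEnd = 9;      // beyond the past-end index

Console.WriteLine(inRange.GetOffset(length));      // 3
Console.WriteLine(beforeStart.GetOffset(length));  // -2, no exception thrown
Console.WriteLine(afterEnd.GetOffset(length));     // 9, no exception thrown

int[] data = new int[length];
bool rejected = false;
try
{
    _ = data[beforeStart];  // the array access performs the range check
}
catch (IndexOutOfRangeException)
{
    rejected = true;
}
Console.WriteLine(rejected);                       // True
```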

`Index` implements `IEquatable<Index>` and values may be compared for equality. However `Index` values are not ordered and no other comparison operations are provided.
Contributor

Should we note that two Index values can resolve to the same concrete int index value when applied to the same countable collection, but still be unequal due to one being from-start and the other being from-end? Maybe it's obvious enough. (Or maybe we just state that equality is based on both the Value and IsFromEnd properties being equal.)

Contributor Author

@jskeet – Hmm, it’s comparing abstract indices so it should be obvious… but if you’d like to craft a Note I see no harm. Or I could spell it out as you suggest. Another one for the meeting.


> *Note:* `Index` values are unordered as they are abstract indices; it is in general impossible to determine whether a from-end index comes before or after a from-start index without reference to a sequence length. Once converted to concrete indices, e.g. by `GetOffset`, those concrete indices are comparable. *end note*
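
The following sketch illustrates both points. It assumes the .NET implementation of `System.Index`, in which equality compares both `Value` and `IsFromEnd`; the text above does not itself state this, and the variable names are illustrative only.

```CSharp
using System;

Index fromStart = 4;
Index fromEnd = ^1;

// As abstract values the two indices are unequal.
Console.WriteLine(fromStart.Equals(fromEnd));  // False

// Their relative position depends on the sequence length they are applied to.
foreach (int length in new[] { 3, 5, 8 })
{
    int a = fromStart.GetOffset(length);
    int b = fromEnd.GetOffset(length);
    Console.WriteLine($"length {length}: {a} vs {b}");
}
// length 3: 4 vs 2
// length 5: 4 vs 4
// length 8: 4 vs 7
```

For a length of 5 both values resolve to the same concrete index even though they compare unequal, and for other lengths either one may come first.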

`Index` values may be directly used in the *argument_list* of an *element_access* expression (§12.8.12):

- which is statically bound (§12.3.2) and:
  - it is an array access and the target is a single-dimensional array (§12.8.12.2);
  - it is a string access (§string-access); or
  - it is an indexer access and the target type conforms to a sequence pattern for which implicit `Index` support is specified (§24.4.2).
- which is statically or dynamically bound (§12.3.2) and:
  - it is an indexer access and the target type has an indexer with parameters of `Index` type (§12.8.12.3).

## 24.3 The `Range` type

The `System.Range` type represents the abstract range of `Index`es from a `Start` index up to, but not including, an `End` index.

```CSharp
public readonly struct Range : IEquatable<Range>
{
    public Index Start { get; }
    public Index End { get; }

    public Range(Index start, Index end);

    public (int Offset, int Length) GetOffsetAndLength(int length);
    public bool Equals(Range other);
}
```

`Range` values are constructed from two `Index` values.

> *Example*
>
> The following examples use the implicit conversion from `int` to `Index` (introduced above) and the `^` (§hat-operator) operator to create the `Index` values for each `Range`:
>
> ```CSharp
> var firstQuad = new Range(0, 4);  // the indices from `0` to `3`
>                                   // int values implicitly convert to `Index`
> var nextQuad = new Range(4, 8);   // the indices from `4` to `7`
> var wholeSeq = new Range(0, ^0);  // the indices from `0` to `N-1` where `N` is the
>                                   // length of the sequence wholeSeq is used with
> var dropFirst = new Range(1, ^0); // the indices from `1` to `N-1`
> var dropLast = new Range(0, ^1);  // the indices from `0` to `N-2`
> ```
>
> *end example*

The language-defined operator `..` (§range-operator) creates a `Range` value from `Index` values.

> *Example*
>
> Using the `..` operator, the above examples may be written:
>
> ```CSharp
> var firstQuad = 0..4; // the indices from `0` to `3`
> var nextQuad = 4..8; // the indices from `4` to `7`
> var wholeSeq = 0..^0; // the indices from `0` to `N-1`
> var dropFirst = 1..^0; // the indices from `1` to `N-1`
> var dropLast = 0..^1; // the indices from `0` to `N-2`
> ```
>
> *end example*

The operands of `..` are optional; the first defaults to `0` and the second defaults to `^0`.

> *Example*
>
> Four of the above examples can also be written:
>
> ```CSharp
> var firstQuad = ..4; // the indices from `0` to `3`
> var wholeSeq = ..; // the indices from `0` to `N-1`
> var dropFirst = 1..; // the indices from `1` to `N-1`
> var dropLast = ..^1; // the indices from `0` to `N-2`
> ```
>
> *end example*

The method `GetOffsetAndLength` converts an abstract `Range` value to a tuple value consisting of a concrete `int` index and a number of elements, applicable to a sequence with `length` elements. If the `Range` value is invalid with respect to a sequence with `length` elements this method throws `ArgumentOutOfRangeException`.

> *Example*
>
> Using the variables defined above with `GetOffsetAndLength(6)`:
>
> ```CSharp
> var (ix0, len0) = firstQuad.GetOffsetAndLength(6); // ix0 = 0, len0 = 4
> var (ix1, len1) = nextQuad.GetOffsetAndLength(6); // throws ArgumentOutOfRangeException
> // as range crosses sequence end
> var (ix2, len2) = wholeSeq.GetOffsetAndLength(6); // ix2 = 0, len2 = 6
> var (ix3, len3) = dropFirst.GetOffsetAndLength(6); // ix3 = 1, len3 = 5
> var (ix4, len4) = dropLast.GetOffsetAndLength(6); // ix4 = 0, len4 = 5
> ```
>
> *end example*

`Range` implements `IEquatable<Range>` and values may be compared for equality. However `Range` values are not ordered and no other comparison operations are provided.

> *Note:* `Range` values are unordered both as they are abstract and as there is no unique ordering relation. Once converted to a concrete start and length, e.g. by `GetOffsetAndLength`, an ordering relation could be defined. *end note*

`Range` values can be directly used in the *argument_list* of an *element_access* expression (§12.8.12):

- which is statically bound (§12.3.2) and:
  - it is an array access and the target is a single-dimensional array (§12.8.12.2);
  - it is a string access (§string-access); or
  - it is an indexer access (§12.8.12.3) and the target type conforms to a sequence pattern for which implicit `Range` support is specified (§24.4.3).
- which is statically or dynamically bound (§12.3.2) and:
  - it is an indexer access and the target type has an indexer with parameters of `Range` type (§12.8.12.3).

## 24.4 Pattern-based implicit support for `Index` and `Range`

### 24.4.1 General

If a statically bound (§12.3.2, §12.8.12.1) *element_access* expression (§12.8.12) of the form `E[A]`, where `E` has type `T` and `A` is a single expression implicitly convertible at compile-time to `Index` or `Range`, fails to be identified as:

- an array access (§12.8.12.2),
- a string access (§string-access), or
- an indexer access (§12.8.12.3), as `T` provides no suitable accessible indexer,

then pattern-based implicit support for the expression is provided if `T` conforms to a pattern. If `T` does not conform to the pattern then a compile-time error occurs.

### 24.4.2 Implicit `Index` Support

If in any context a statically bound (§12.3.2, §12.8.12.1) *element_access* expression (§12.8.12) of the form `E[A]`, where `E` has type `T` and `A` is a single expression implicitly convertible to `Index`, is not valid (§24.4.1), then if in the same context:

- `T` provides accessible members qualifying it as a *sequence* (§24.1.1); and
- the expression `E[0]` is valid

then the expression `E[A]` shall be implicitly supported.

Without otherwise constraining implementations of this Standard the order of evaluation of the expression shall be equivalent to:

1. `E` is evaluated;
2. `A` is evaluated;
3. the countable property of `T` is evaluated, if required by the implementation;
4. the get or set accessor of the `int` based indexer of `T` that would be used by `E[0]` in the same context is invoked.
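
As an informal sketch of this support, `System.Collections.Generic.List<T>` in current .NET declares a `Count` property and an `int` indexer but no indexer taking `Index`, so the accesses below rely on the pattern. The comments show one possible equivalent form, not a required lowering, and the variable names are illustrative only.

```CSharp
using System;
using System.Collections.Generic;

var list = new List<string> { "a", "b", "c", "d" };

string last = list[^1];                  // behaves like list[list.Count - 1]
string sameLast = list[list.Count - 1];

list[^2] = "C";                          // the set accessor is reached the same way

Index i = ^4;
string first = list[i];                  // behaves like list[i.GetOffset(list.Count)]

Console.WriteLine($"{last} {sameLast} {first}");  // d d a
Console.WriteLine(string.Join(",", list));        // a,b,C,d
```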

### 24.4.3 Implicit `Range` Support

If in any context a statically bound (§12.3.2, §12.8.12.1) *element_access* expression (§12.8.12) of the form `E[A]`, where `E` has type `T` and `A` is a single expression implicitly convertible to `Range`, is not valid (§24.4.1), then if in the same context:

- `T` provides accessible members qualifying it as both *countable* and *sliceable* (§24.1.1)

then the expression `E[A]` shall be implicitly supported.

Without otherwise constraining implementations of this Standard the order of evaluation of the expression shall be equivalent to:

1. `E` is evaluated;
2. `A` is evaluated;
3. the countable property of `T` is evaluated, if required by the implementation;
4. the `Slice` method of `T` is invoked.
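
As an informal sketch of this support, `System.Span<T>` in current .NET declares a `Length` property and a `Slice(int, int)` method but no indexer taking `Range`, so the `Range` access below relies on the pattern. The explicit `Slice` call shows one possible equivalent form, not a required lowering, and the variable names are illustrative only.

```CSharp
using System;

int[] backing = { 0, 1, 2, 3, 4, 5 };
Span<int> span = backing;

Span<int> viaRange = span[1..^1];                     // implicit Range support
Span<int> viaSlice = span.Slice(1, span.Length - 2);  // equivalent explicit call

Console.WriteLine(string.Join(",", viaRange.ToArray()));  // 1,2,3,4
Console.WriteLine(string.Join(",", viaSlice.ToArray()));  // 1,2,3,4
```

What the `Range` access returns is whatever the type's `Slice` method returns; for `Span<T>` that is a view over the same storage, whereas a `Range` access on an array yields a copy.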