carbon-language · zygoloid · Mar 30, 2021 · Apr 1, 2021 · Apr 1, 2021 · fowles
@@ -66,5 +66,6 @@ request:
 -   [0253 - 2021 Roadmap](p0253.md)
     -   [0253 - Decision](p0253_decision.md)
 -   [0285 - if/else](p0285.md)
+-   [0423 - Evolution strategies](p0423.md)
 
 <!-- endproposals -->
@@ -0,0 +1,361 @@
+# Evolution strategies
+
+<!--
+Part of the Carbon Language project, under the Apache License v2.0 with LLVM
+Exceptions. See /LICENSE for license information.
+SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+-->
+
+[Pull request](https://github.com/carbon-language/carbon-lang/pull/423)
+
+<!-- toc -->
+
+## Table of contents
+
+-   [Problem](#problem)
+-   [Background](#background)
+    -   [Example: lexical structure](#example-lexical-structure)
+    -   [Example: interfaces](#example-interfaces)
+-   [Proposal](#proposal)
+-   [Details](#details)
+    -   [Strategy: point change with transparent migration](#strategy-point-change-with-transparent-migration)
+        -   [Summary](#summary)
+        -   [Details](#details-1)
+        -   [Timeline](#timeline)
+        -   [Applicability](#applicability)
+        -   [Advantages](#advantages)
+        -   [Disadvantages](#disadvantages)
+        -   [Example](#example)
+    -   [Strategy: incremental change](#strategy-incremental-change)
+        -   [Summary](#summary-1)
+        -   [Details](#details-2)
+        -   [Timeline](#timeline-1)
+        -   [Applicability](#applicability-1)
+        -   [Advantages](#advantages-1)
+        -   [Disadvantages](#disadvantages-1)
+        -   [Example](#example-1)
+    -   [Guidance](#guidance)
+    -   [Consequences](#consequences)
+-   [Alternatives considered](#alternatives-considered)
+    -   [Non-strategy: simultaneous migration](#non-strategy-simultaneous-migration)
+
+<!-- tocstop -->
+
+## Problem
+
+Carbon aims to support language evolution. From the language goals:
+
+> _Support maintaining and evolving the language itself for decades._ We will
+> not get the design of most language features correct on our first, second, or
+> 73rd try. As a consequence, there must be a built-in plan and ability to move
+> Carbon forward at a reasonable pace and with a reasonable cost.
+> Simultaneously, an evolving language must not leave software behind to
+> languish, but bring software forward. This requirement should not imply
+> compatibility, but instead some migratability, likely tool-assisted.
+
+However, the specifics of how this migration will work have not been
+established, and having an idea of how evolutionary changes will be made is
+necessary in order to design the language to accommodate such changes.
+
+## Background
+
+### Example: lexical structure
+
+We expect the lexical structure of the Carbon language to change over time, in
+various ways. For example:
+
+-   New kinds of tokens might be added, such as regular expression literals.
+-   New tokens of existing kinds might be added, such as new keywords or new
+    operators.
+-   Existing character sequences might be split into tokens differently. For
+    example, if a `<-` token is added, the expression `x<-3` might form a
+    different token sequence.
+
+The Carbon philosophy is to evolve towards the best language Carbon can be,
+rather than compromising for compatibility, so we should assume that we will
+sometimes want to make lexical changes that affect a large amount of existing
+code.
+
+There are choices we could make now that would make anticipated lexical
+extensions easier. For example, we could require that all sequences of
+operator-like characters are always lexed as a single operator token, even if
+that token is meaningless, and that would allow us to add operators in the
+future as a non-breaking change.
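
To make the `x<-3` case concrete, here is a minimal Python sketch of a longest-match operator lexer. The `lex` function and the operator sets passed to it are hypothetical illustrations, not Carbon's actual lexer; the sketch only shows how adding a `<-` token changes the token sequence of unchanged source text.

```python
import re

def lex(source, operators):
    """Split source into tokens, preferring the longest known operator."""
    source = source.rstrip()
    # Try longer operators first so that `<-` wins over `<` once it exists.
    ordered = sorted(operators, key=len, reverse=True)
    pattern = re.compile(
        r"\s*(?:(?P<word>[A-Za-z_]\w*|\d+)|(?P<op>"
        + "|".join(re.escape(op) for op in ordered)
        + r"))"
    )
    tokens, pos = [], 0
    while pos < len(source):
        match = pattern.match(source, pos)
        if match is None:
            raise ValueError(f"cannot lex at offset {pos}")
        tokens.append(match.group("word") or match.group("op"))
        pos = match.end()
    return tokens

# Same source text, lexed before and after a hypothetical `<-` token is added.
before = lex("x<-3", {"<", "-"})
after = lex("x<-3", {"<", "-", "<-"})
```

Under these assumptions, `before` splits the text into `x`, `<`, `-`, `3`, while `after` yields `x`, `<-`, `3`. A rule that always lexes a run of operator-like characters as one token would reject `x<-3` up front, which is why adding `<-` later would not change the meaning of any valid program.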
+
+### Example: interfaces
+
+The set of methods on an interface should be expected to change over time. If a
+method is added with no evolution strategy in mind, existing implementations
+will initially not implement it, meaning they no longer conform to the
+interface; if we permit such types to conform to the interface regardless, then
+users of the interface risk calling a method that is not actually implemented.
+
+In order to allow for Carbon code to evolve, we need to provide a path by which
+such evolution can occur.
+
+## Proposal
+
+This proposal presents a collection of concrete strategies for making changes to
+the language and to libraries, along with basic guidance for when to use which
+strategy, and how to design language features to minimize evolutionary problems.
+The list in this proposal is not intended to be exhaustive, but is instead
+intended to provide a baseline set of approved strategies.
+
+I propose the creation of a new Principle document based on the contents of this
+document. In addition, some further minor changes to course-correct prior
+proposals are given in the [Consequences](#consequences) section below.
+
+## Details
+
+### Strategy: point change with transparent migration
+
+#### Summary
+
+-   Simultaneously make a change and provide a correct and fast migration tool.
+-   Builds of an un-migrated package perform a migration to temporary files and
+    then build the resulting migrated package.
+-   Package maintainers run the migration tool and check in the result,
+    including a marker to say the package has been migrated, when they're ready.
+
+#### Details
+
+This strategy allows Carbon sources to adopt changes at their own pace, within
+reason, by permitting un-migrated and migrated source files to coexist in the
+same build. Some state would be tracked in the package configuration file(s) to
+indicate which migrations have already been performed.
+
+Because migration is performed transparently as part of a build, the toolchain
+never sees unmigrated source code; as far as it is concerned, all input source
+code is written in the latest language using the latest interfaces.
+
+As with regular build actions, migration of dependency packages can be cached,
+so the cost of performing the migration is only paid when updating the
+dependency, not on every build.
+
+#### Timeline
+
+A language change would progress as follows:
+
+-   At time T-1, the Carbon toolchain does not support the new language feature,
+    and Carbon packages do not indicate they have been migrated to use it.
+-   At time T, the Carbon toolchain introduces support for the new feature. All
+    existing code continues to build by way of an implicit auto-upgrade.
+-   At time T+X, a package migrates to the new version and performs a release.
+    Dependent packages continue to build with Carbon toolchains from time T
+    onwards, but earlier toolchains no longer work.
+-   At time T+Y, once the Carbon ecosystem has largely migrated, the Carbon
+    toolchain removes the automigration support. This may be months or years
+    later.
+
+Note that under this model, new features can be used as soon as they are
+implemented, but doing so imposes downstream constraints on acceptable toolchain
+versions.
+
+#### Applicability
+
+This approach is only applicable if a migration tool can be built that is both
+correct in all cases and acceptably fast. It is unlikely to be acceptable for
+the sequence of migration steps performed on a package to substantially slow
+down the build of that package. However, this will likely cover all lexical
+changes, most syntactic changes, and also many semantic changes where the old
+semantics can be recovered by different syntax.
+
+We should make the facilities of this approach available to user code, by
+allowing a package to expose automigration tools that will be transparently
+applied to its dependents.
+
+#### Advantages
+
+-   New functionality can be provided and adopted with no delay.
+-   The timeframe for adopting a change is very loose.
+-   There is no required ordering between a package adopting a change and its
+    dependents adopting the change.
+-   There is no need to make language changes to prepare for this strategy,
+    beyond ensuring that all existing code can be automatically migrated.
+
+#### Disadvantages
+
+-   Build-time diagnostics and runtime semantics will reflect the result of the
+    migration tool, which may be surprising when relating diagnostics or
+    behavior back to the original source of an un-migrated package. For example,
+    source snippets in diagnostics may refer to code that doesn't match the
+    original source, and debug information may refer to generated files instead
+    of originals.
+-   Migration tools may not work correctly on invalid code, such as code under
+    active development, potentially resulting in build errors that are unrelated
+    to any source errors, and potentially surprising output from tooling. For
+    example, after a language syntax change, an autocomplete tool may suggest
+    completions using the new language syntax even when editing an unmigrated
+    source file.
+-   If a change is released with an incorrect migration tool, builds may break.
+    This is somewhat different from the expected fragility of new compiler
+    features, because unchanged code is expected to be affected more frequently.
+
+Most of the disadvantages can be mitigated by ensuring that packages under
+maintenance are migrated early.
+
+#### Example
+
+We decide that we want to replace `var type name` with `var name : type`. A
+migration tool is built to perform the refactoring, and the toolchain is updated
+to parse the new syntax instead of the old syntax. The updated toolchain and
+migration tool are released together.
+
+All subsequent builds using the new toolchain first migrate the source code to
+the new syntax, and then pass it to the new toolchain, which only understands
+the new syntax.
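
As a rough illustration of such a migration tool, the following Python sketch rewrites the old declaration form with a regular expression. The function name `migrate_var_syntax` is hypothetical, and the pattern assumes that both the type and the name are simple identifiers; a real tool would need to work on parsed source so that comments, string literals, and complex type expressions are handled correctly.

```python
import re

# Toy pattern for `var <type> <name>`; both parts must be plain identifiers.
OLD_VAR = re.compile(
    r"\bvar\s+(?P<type>[A-Za-z_]\w*)\s+(?P<name>[A-Za-z_]\w*)"
)

def migrate_var_syntax(source):
    """Rewrite `var type name` declarations as `var name : type`."""
    return OLD_VAR.sub(
        lambda m: f"var {m.group('name')} : {m.group('type')}", source
    )

migrated = migrate_var_syntax("var Int x = 3;")
```

The rewritten form no longer matches the old pattern, so running the sketch on already-migrated text leaves it unchanged.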
+
+### Strategy: incremental change
+
+#### Summary
+
+-   Make step-by-step progress, alternating between making a change that is
+    compatible with current usage and updating current usage to avoid removed
+    functionality and adopt added functionality.
+-   Changes that modify the meaning of existing code may require several such
+    steps.
+
+#### Details
+
+In this approach, we avoid making backwards-incompatible changes immediately.
+Instead, every backwards-incompatible change has a transition period in which we
+expect Carbon source to be migrated. The backwards-incompatible change is then
+only made once the transition period has elapsed.
+
+We divide the change up into a sequence of steps, where each step is one of the
+following:
+
+-   An _addition_, which strictly increases the set of valid input programs
+    without changing the meaning of any program already in the set. For example,
+    this might include recognizing a new token that was previously invalid.
+-   A _removal_, which strictly decreases the set of valid input programs
+    without changing the meaning of any program that remains in the set.
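
The two kinds of step can be stated as set relations. The Python sketch below models a language version as a mapping from each valid program to its meaning; the `classify_step` function and this representation are assumptions made for illustration only.

```python
def classify_step(old, new):
    """Classify a change given maps from valid program text to meaning."""
    shared = old.keys() & new.keys()
    # Any program valid in both versions must keep its meaning.
    if any(old[p] != new[p] for p in shared):
        return "meaning change"
    if old.keys() < new.keys():
        return "addition"  # strictly more valid programs
    if new.keys() < old.keys():
        return "removal"  # strictly fewer valid programs
    if old.keys() == new.keys():
        return "no change"
    return "mixed"  # adds some programs and removes others

v1 = {"a": 1, "b": 2}
```

A step classified as "meaning change" or "mixed" is not a single addition or removal, and under this strategy would need to be split into a sequence of steps that are.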
+
+Additions are performed directly, with no transition period required. Removals
+are performed by announcing the intent to remove, introducing diagnostic
+messages for uses of functionality that is pending removal, producing tools to
+transition uses of the removed functionality, and then after a suitable
+transition time, performing the removal.
+
+In order to navigate from the current state to the desired end state by a
+sequence of additions and removals, intermediate scaffolding functionality that
+is present in neither state may be necessary. For example, when changing the
+meaning of a function parameter, it may be necessary to temporarily add a
+scaffolding function with a new name, migrate some or all existing callers to
+the new function, change the original function, and then migrate back.
+
+#### Timeline
+
+A language addition would progress as follows:
+
+-   At time T-1, the Carbon toolchain does not support the new language feature.
+-   At time T, the Carbon toolchain supports the new feature, and source code
+    can start to use it.
+
+Library additions would follow a similar path, with the change being made in the
+library rather than in the toolchain.
+
+Use of an added feature imposes a version constraint: once a package uses a
+feature, anyone compiling it or its dependents would need a suitably recent
+version of the toolchain or the package introducing the change.
+
+A language removal would progress as follows:
+
+-   At time T, the intent to remove the feature is announced, and the Carbon
+    toolchain starts producing warnings when encountering uses of the feature.
+    Over subsequent releases, the severity of these warnings increases.
+-   At time T+K, the feature is removed from the Carbon toolchain.
+
+Library removals would follow a similar path, with the change being made in the
+library rather than in the toolchain; this necessitates there being a mechanism
+by which library authors can request diagnostics for use of certain
+functionality.
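
As an analogy for the library-side diagnostic mechanism, Python's standard `warnings` module lets a library flag uses of functionality that is pending removal. The `pending_removal` decorator and `old_api` function below are hypothetical names; this is a sketch of the general idea, not a proposed Carbon design.

```python
import functools
import warnings

def pending_removal(removed_in):
    """Mark a library function as scheduled for removal (hypothetical helper)."""
    def decorate(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            # Warn at the caller's location, then behave exactly as before.
            warnings.warn(
                f"{func.__name__} is pending removal in {removed_in}",
                DeprecationWarning,
                stacklevel=2,
            )
            return func(*args, **kwargs)
        return wrapper
    return decorate

@pending_removal(removed_in="2.0")
def old_api(x):
    return x + 1
```

Callers keep working during the transition period, and the severity can be raised over time (for example, by having builds treat the warning as an error) before the function is finally deleted.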
+
+Note that removals take time before they become active under this model. If we
+can anticipate such changes and prepare for them, we can in some cases avoid the
+need for the first step and the scaffolding feature.
+
+#### Applicability
+
+This approach is applicable to most -- or perhaps all -- changes, but may
+require multiple steps for certain kinds of change, requiring a long time for a
+migration to complete.
+
+#### Advantages
+
+-   At every stage, all source code across all packages is written using the
+    same language rules and the same library interfaces.
+-   The code being built and run is exactly the code in the source files.
+-   This strategy has wide applicability.
+
+#### Disadvantages
+
+-   Changes in which a removal must complete before some addition is performed
+    can potentially take a long time when following this strategy.
+-   Introducing and removing scaffolding requires additional work that is not
+    fundamental to the change being made.
+-   Manual migration will be required in some cases.
+
+#### Example
+
+In order to support changes to an interface, we allow newly-added methods to be
+marked as `upcoming`. This indicates that the method is not required, and indeed
+cannot be called (except by other `upcoming` functionality), but can be
+implemented. Then the addition of an interface method can be staged as follows:
-implemented. Then the addition of an interface method can be staged as follows:
+implemented. Then the addition of an interface method for which no default implementation is possible can be staged as follows:
+
+-   A method is introduced, declared `upcoming`. This is an addition, as
+    strictly more programs become valid.
+-   The intent to remove the `upcoming` marker is announced -- in this case,
+    implicitly, as all `upcoming` markers indicate an intent to remove the
+    marker. The removal period for this `upcoming` marker begins.
+-   Over time, the method is implemented by all implementers of the interface.
+-   The `upcoming` marker is removed. This is a removal, as it results in
+    strictly fewer programs being valid.
+-   Once the removal is complete, the new method can be used. This is an
+    addition, which in this instance occurs concurrently with the completion of
+    the removal phase and the removal of the `upcoming` marker.
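
The staging above can be modeled with a small Python sketch. The `Interface` type, the helper functions, and the method names `Draw` and `Resize` are hypothetical; the model only tracks which methods are required and which are marked `upcoming`.

```python
from dataclasses import dataclass, field

@dataclass
class Interface:
    required: set
    upcoming: set = field(default_factory=set)

def conforms(impl_methods, iface):
    # `upcoming` methods may be implemented but are not required.
    return iface.required <= impl_methods

def callable_methods(iface):
    # Ordinary code may only call methods that every implementer provides.
    return set(iface.required)

def finalize(iface, name):
    """Remove the `upcoming` marker: the method becomes required and callable."""
    return Interface(iface.required | {name}, iface.upcoming - {name})

stage1 = Interface({"Draw"}, {"Resize"})  # method introduced as `upcoming`
stage2 = finalize(stage1, "Resize")       # marker removed
old_impl = {"Draw"}
new_impl = {"Draw", "Resize"}
```

In this model, both implementations conform during `stage1` but `Resize` cannot be called; after `finalize`, only the updated implementation conforms and `Resize` becomes callable.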
+
+### Guidance
+
+The primary driver for any change should be the intended end state. While the
+migration path to a goal should be a consideration, and may sway our decision
+between options that otherwise provide similar value, we should prefer using
+more expensive migration strategies over selecting an inferior end state.
+
+When a choice of strategies is available, purely additive changes should be
+preferred over point changes, and point changes should be preferred over
+incremental changes.
+
+Language features should, where possible, be designed to reduce the necessity of
+incremental changes for anticipated future evolution. For example, if the
+spelling of an identifier is visible through reflection, then adding a keyword
+may require use of the incremental strategy to rename existing uses, as a
+fully-correct migration tool can't be built in general. However, if a raw
+identifier syntax is introduced, then the same change can be a point change,
+where the migration tool replaces all existing uses with semantically-identical
+raw identifiers.
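
A minimal Python sketch of the raw-identifier point change follows. The `r#` prefix is borrowed from Rust purely as an assumption, since this proposal does not specify a spelling, and `escape_new_keyword` is a hypothetical name. The sketch works on plain text and would need a real lexer to skip comments and string literals.

```python
import re

# Assumed raw-identifier spelling; the actual Carbon syntax may differ.
RAW_PREFIX = "r#"

def escape_new_keyword(source, keyword):
    """Rewrite identifier uses of `keyword` into raw-identifier form."""
    pattern = re.compile(
        # Skip occurrences that are already raw identifiers.
        r"(?<!" + re.escape(RAW_PREFIX) + r")\b" + re.escape(keyword) + r"\b"
    )
    return pattern.sub(lambda m: RAW_PREFIX + keyword, source)

escaped = escape_new_keyword("var Int await = awaiting + await;", "await")
```

Because the raw identifier names the same entity as before, the rewrite preserves meaning even where the spelling is observable, and applying it a second time leaves the text unchanged.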
+
+### Consequences
+
+We anticipate that all lexical changes can be accommodated by the point change
+strategy. Therefore there is no requirement to reserve any lexical space to
+prepare for future changes.
+
+Consequently, we will no longer require whitespace after the `//` introducing
+a comment, nor will we disallow decimal digits following a `\0` escape sequence.
+
+This strategy also subsumes the approach described in
+[proposal 93](https://github.com/carbon-language/carbon-lang/pull/93), with
+package-wide migration instead of file-at-a-time migration, leaving only the
+addition of raw identifier syntax. That syntax is still justified as a vehicle
+for ensuring both that correct migration is always possible and that
+identifiers that are keywords in Carbon but not keywords in C++ can be
+expressed.
+
+## Alternatives considered
+
+### Non-strategy: simultaneous migration
+
+A number of strategies that require making simultaneous changes to multiple
+packages, or to the toolchain and third-party packages, are possible. We
+consider such strategies to be untenable.