metamath · wlammen · Aug 27, 2025 · Aug 27, 2025 · Aug 27, 2025 · Aug 27, 2025
diff --git a/set.mm b/set.mm
@@ -596266,6 +596266,363 @@ rephrased as a mutual exclusivity of propositional expressions (no two of
       KLMNO $.
   $}
 
+$(
+=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
+  Bootstrapping set theory with classes
+=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
+$)
+
+  ${
+    $d x A $.  $d x B $.
+    $( A Metamath program is a text-based tool.  Its input consists of (mostly)
+       human-readable and editable text stored in *.mm files.  For automatic
+       processing, these files are structured in a precise way.  This enables
+       automated analysis of their contents, verification of proofs, proof
+       assistance, and the generation of output files - for example, the HTML
+       page of this dummy theorem.  A set.mm file contains many such
+       instructions serving these purposes.
+
+       The ideas presented here focus only on specific theorems: ~ df-cleq ,
+       ~ df-clel , and ~ df-clab .  Accordingly, only the technical concepts
+       necessary to explain these are introduced, along with a selection of
+       supporting theorems.
+
+       The above theorems are central to a bootstrapping process that
+       introduces objects (called classes) into set.mm.  We will first examine
+       how Metamath in general creates basic new ideas from scratch, and then
+       look at how these methods are applied specifically to classes.
+
+       This page provides a brief explanation of the general concepts behind
+       structured text such as that found in set.mm.  As noted above, set.mm is
+       written in a form that is readable by humans.  However, for Metamath
+       this is not sufficient: the system must also verify and process set.mm
+       automatically in order to establish a high degree of confidence in its
+       correctness.
+
+       When a text exhibits a sufficiently regular structure, its form can be
+       described by a set of syntax rules, known as a grammar.  Such rules are
+       composed of terminal symbols (fixed, literal elements) and non-terminal
+       symbols, which can recursively be expanded using the grammar's _rewrite_
+       rules.  A program component that applies a grammar to a text is called a
+       parser.  The parser decomposes the text into smaller components, called
+       _syntactic units_, while often maintaining contextual information.
+       These units are then handed over to subsequent components for further
+       processing, such as populating a database or generating output.
+
+       In these pages, we restrict our attention to strictly formatted material
+       consisting of formulas with logical and mathematical content.  These
+       syntactic units are embedded in higher-level structures such as
+       chapters, together with commands that, for example, control the HTML
+       output.
+
+       Conceptually, the parsing process can be viewed as consisting of two
+       stages.  The top-level stage applies a simple built-in grammar to
+       identify its structural units.  Each unit is a text region marked on
+       both sides with predefined tokens, beginning with the escape character
+       "$".  Text regions containing logical or mathematical formulas are then
+       passed to a second-stage parser, which applies a different grammar.
+       Unlike the first, this grammar is not built-in but is dynamically
+       constructed.
+
+       In what follows, we will ignore the first stage of parsing, since its
+       role is only to extract the relevant material embedded within text
+       primarily intended for human readers.
+
+       (Contributed by Wolf Lammen, 18-Aug-2025.) $)
+    ax-wl-cleq-0 $a |- ( A = B <-> A. x ( x e. A <-> x e. B ) ) $.
+  $}
+
+  ${
+    $d x A $.  $d x B $.
+    $( The parser that processes strictly formatted Metamath text with logical
+       or mathematical content constructs its grammar dynamically.  New syntax
+       rules are added on the fly, whenever they are needed to parse upcoming
+       formulas.  Only a minimal set of built-in rules - those required to
+       introduce new grammar rules - is predefined; everything else must be
+       supplied by set.mm itself as syntactic units identified by the
+       first-stage parser.  Text regions beginning with the tokens "$c" or
+       "$v", for example, provide such grammar extensions.
+
+       We will outline the extension process used when introducing an entirely
+       new concept that cannot build on any prior material.  As a simple
+       example, we will trace the steps involved in defining formulas, the
+       expressions used for hypotheses and theorems - a concept first-order
+       logic needs from the very beginning to articulate its ideas.
+
+       1.  Introduce a type code to mark formulas used in theorems.  The
+       constant "wff" is reserved for this purpose.  It abbreviates
+       "well-formed formula", a term that already suggests that such formulas
+       must be syntactically valid and therefore parseable to enable automatic
+       processing.
+
+       2.  Introduce variables with unique names such as ` ph ` , ` ps ` , ...,
+       intended to represent formulas.  In grammar terms, they correspond to
+       non-terminal symbols.  These variables are marked with the type "wff",
+       ensuring that only well-formed formulas can be assigned to them.
+       Variables are fundamental in Metamath, as they enable substitution
+       during proofs: variables can be consistently replaced by arbitrary
+       formulas of type wff.  In grammar terms, this corresponds to applying a
+       rewrite rule.  At this step, however, we can only substitute one
+       variable for another, which is sufficient only for very elementary
+       theorems such as ~ idi .
+
+       3.  Add rewrite rules to the grammar to enable the construction of
+       formulas.  Each such rule is assigned the type "wff".  Typically, a
+       rewrite rule describes an operator (a constant in the grammar) applied
+       to one or more variables, possibly of different types (though at this
+       stage we only have "wff").  These variables are non-terminal elements.
+       To allow for complex formulas, each variable can be consistently
+       replaced by any wff - whether involving the same operator or different
+       ones introduced by other rewrite rules.  A replacement is not mandatory,
+       but if it introduces variables again, some may recursively be replaced.
+
+       If an operator takes two variables of type wff, it is called a binary
+       connective in logic.  The first such operator we encounter is
+       ` ( ph -> ps ) ` .  Based on its token, its intended meaning is material
+       implication, though this interpretation is not fixed from the outset.
+
+       4.  Once the biconditional connective is available for formulas, new
+       connectives can be defined by giving replacement formulas that rely
+       solely on previously introduced material.
+
+       At the very beginning, however, this is not possible for well-formed
+       formulas, since little or no prior material exists.  Instead, the
+       semantics of an expression such as ` ( ph -> ps ) ` are progressively
+       constrained by axioms - that is, theorems without proof.  The first such
+       axiom for material implication is ~ ax-mp , with additional axioms
+       following later.
+
+       (Contributed by Wolf Lammen, 21-Aug-2025.) $)
+    ax-wl-cleq-1 $a |- ( A = B <-> A. x ( x e. A <-> x e. B ) ) $.
+  $}
+
+  ${
+    $d x A $.  $d x B $.
+    $( In ~ ax-wl-cleq-1 we saw how the basic notion of a well-formed formula
+       is introduced in set.mm.  A similar process is used to add the notion of
+       an object to Metamath.  This process is somewhat more complex, since two
+       variants are established in parallel: sets and the broader notion of
+       classes, which encompass sets (see 2c. below).  Classes, as it will turn
+       out, serve mainly as a convenient shorthand to simplify formulas and
+       their proofs.  Ultimately, only sets - some of the classes - are
+       intended to exist as actual objects within set theory.
+
+       In the early stages of set.mm it is not even clear that objects must be
+       sets at all.  The universe under consideration could, in principle,
+       consist of anything - vegetables, text strings, and so on.  For this
+       reason, the type code "setvar", used for objects, is something of a
+       misnomer.  Its final meaning - and the name that goes with it - only
+       becomes justified in later developments.
+
+       We will revisit the four basic steps presented in ~ ax-wl-cleq-1 , this
+       time focusing on objects and paying special attention to the additional
+       complexities that arise from extending sets to classes.
+
+       1a.  Reserve a type code for classes, specifically the grammar constant
+       "class".
+
+       1b.  Reserve a type code for sets, "setvar".  The name itself indicates
+       that this type code will never be assigned to a formula that actually
+       describes a specific set.
+
+       2a.  Introduce variables with unique names such as ` A ` , ` B ` , ...,
+       intended to represent classes.  These class variables are assigned the
+       type "class", ensuring that only formulas describing classes can be
+       associated with them.  Variables facilitate substitution during proofs:
+       a class variable can be consistently replaced by any formula of type
+       class.
+
+       2b.  Variables with unique names such as ` a ` , ` b ` , ... are
+       intended to be of type "setvar".  During a substitution in a proof, a
+       variable of type "setvar" can only be replaced with another variable of
+       the same type.  No specific object can be assigned this type, even if it
+       is known to be a set.
+
+       3a.  Add the rewrite rule ~ cv to set.mm, so variables of type "setvar"
+       appear also have type "class".  In this way, a variable of "setvar" type
+       can serve as a substitute for a class variable.
+
+       3b.  Add the rewrite rules ~ wcel and ~ wceq to set.mm, allowing the
+       formulas ` A e. B ` and ` A = B ` to be valid wff.  Since variables of
+       type "setvar" can be substituted for class variables, ` x e. y ` and
+       ` x = y ` are also provably valid wff.  Only these specific formulas
+       play a role in the early stages of set.mm and are therefore treated as
+       primitive at that point.  The underlying universe may not yet contain
+       sets, and the notion of a class is not even be required.
+
+       There are additional mixed-type formulas, such as ` x e. A ` ,
+       ` A e. x ` , ` x = A ` , and ` A = x ` .  When the theory is eventually
+       narrowed down to sets and classes, the results from the early stages
+       remain valid and extend naturally to these mixed cases.  In discussions,
+       each case will sometimes be examined individually.
+
+       4.  In the early stages of set.mm, the formulas ` x = y ` and ` x e. y `
+       cannot be derived from earlier material, and definitions are not
+       possible.  Instead, their basic properties are established through
+       axioms, namely ~ ax-6 to ~ ax-9 , ~ ax-12 (in the form of ~ ax12v ), and
+       ax-ext .  A less prominent role plays ~ ax-13 , which becomes relevant
+       in results only focussing on minimal disjoint variable conditions.
+
+       (Contributed by Wolf Lammen, 25-Aug-2025.) $)
+    ax-wl-cleq-2 $a |- ( A = B <-> A. x ( x e. A <-> x e. B ) ) $.
+  $}
+
+  ${
+    $d x A $.  $d x B $.
+    $( At the point where ~ df-cleq is introduced, the foundations of set
+       theory with classes are being laid.  A central property of objects is
+       what elements they contain, expressed by the membership operator
+       ` e. ` .  Quantification ( A. x ) applies only to objects that a
+       variable of kind setvar can hold.  These objects will henceforth be
+       called sets.  Some classes may not be sets; these are called proper
+       classes.  It remains open at this stage whether membership can involve
+       them.
+
+       The formula given in ~ df-cleq (restated below) asserts that two classes
+       are equal if and only if they have exactly the same sets as elements.
+
+       Equality, however, is not automatically identical with
+       indistinguishability; this is the case only if equality is indeed the
+       finest-grained relation between classes.  Since quantification over
+       classes is not possible, no analogue of ~ ax12v can be formulated.  In
+       particular, classes may in principle contain additional elements beyond
+       what set variables can express.
+
+       A noteworthy special case is the mixed equation ` x = A ` .  Together
+       with implicit substitution, this admits a direct solution - namely
+       theorem ~ ceqsal1t .  Remarkably, this theorem is valid without assuming
+       any axioms about equality between classes.
+
+       Theorem ~ df-cleq is already partly derivable if all class variables are
+       replaced by variables of type "setvar".  In that case, it reduces to an
+       instance of ~ axextb .  If one or both class variables are retained,
+       however, both sides of the biconditional involve well-formed formulas
+       whose meaning is not yet defined.
+
+       Up to this point, the only material involving class variables consists
+       of the syntax definitions ~ wceq and ~ wcel .  Axioms are therefore
+       required to progressively refine the semantics of classes until provable
+       results coincide with our intended conception of set theory.  This
+       refinement process is explained in Step 4 of ~ ax-wl-cleq-1 .
+
+       From this perspective, ~ df-cleq is in fact an axiom in disguise and
+       would more appropriately be named ax-cleq.
+
+       This axiom alone already suffices to prove basic properties of equality:
+       transivity ( ~ eqtr ), reflexivity ( ~ eqid ) and symmetry ( ~ eqcom ).
+       Moreover, the class-level version of ~ ax-ext (the backward direction of
+       ~ df-cleq ) holds.
+
+       The mixed type equality ` x = A ` deserves closer attention.  Since ~ cv
+       allows us to regard ` x ` as a class, the equation ` x = A ` states that
+       ` x ` and ` A ` have the same sets.  Because ` x ` only holds sets, the
+       conclusion is that - assuming equality is indeed the finest-grained
+       relation - ` A ` must be indistinguishable from that set.
+
+       At first glance, one might think that ` A = B ` is defined by the
+       right-hand side of the biconditional.  This would make ` x e. A ` , i.e.
+       membership of a set in a class, the more primitive concept, from which
+       equality of classes could be derived.  Such a viewpoint would be
+       coherent if the properties of membership could be fully determined by
+       other axioms.  In practice, however, the more direct and fruitful
+       approach is not to construct class equality from membership, but to
+       treat equality itself as axiomatic.
+
+       (Contributed by Wolf Lammen, 25-Aug-2025.) $)
+    ax-wl-cleq $a |- ( A = B <-> A. x ( x e. A <-> x e. B ) ) $.
+  $}
+
+  ${
+    $d x A $.  $d x B $.
+    $( The formula in ~ df-clel (restated below) essentially states that only
+       sets can be members of classes.  This restriction refines class
+       membership in such a way that equal classes become genuinely
+       indistinguishable (see ~ ax-wl-cleq ), if only membership matters.
+
+       But when is a class itself a set?
+
+       We use equality to express this.  If a class ` A ` has exactly the same
+       sets as some set variable ` x ` , then by ~ df-cleq the class and the
+       set are equal.  Assuming again that equality captures all relevant
+       properties of classes, the formula ` x = A ` is the decisive criterion
+       showing that ` A ` is in fact a set.
+
+       Equivalently, it suffices to require the existence of such a set
+       variable: ` E. x x = A ` .
+
+       From ~ df-cleq and ~ df-clel we can derive the class-level counterparts
+       of ~ ax-8 (see ~ eleq2 ) and ~ ax-9 (see ~ eleq1 ).  This demonstrates
+       again that equal classes cannot be distinguished by the membership
+       relation.
+
+       From this perspective, ~ df-clel is in fact an axiom in disguise and
+       would more appropriately be named ax-clel.
+
+       If in ~ df-clel we replace the class variable ` A ` with a set variable
+       ` z ` , the auxiliary variable ` x ` can be eliminated, leaving only the
+       trivial result ` ( z e. B <-> z e. B ) ` .  Thus, ~ df-clel by itself
+       does not determine when a set is a member of a class.
+
+       In total our axiomization leaves open what ` x e. A ` ( ` x ` , ` A `
+       disjoint) or ` x e. B ` means.  All other fundamental formulas in set
+       theory with classes ( ` A ` not a set variable, ` A e. B ` , ` x = B `
+       ` A = B ` ) can be reduced solely on the basic formulas ` x e. A ` or
+       ` x e. B ` .
+
+       If an axiomatization leaves such a fundamental formula like ` x e. A `
+       unspecified, we could in principle define it bi-conditionally by any
+       formula we like - for example, the trivial ` T. ` .  This, however, is
+       not the approach we take.  Instead, an appropriate class builder such as
+       ~ df-clab fills this gap.
+
+       (Contributed by Wolf Lammen, 26-Aug-2025.) $)
+    ax-wl-clel $a |- ( A e. B <-> E. x ( x = A /\ x e. B ) ) $.
+  $}
+
+  $( We now introduce the notion of class abstraction, which allows us to
+     describe more specific classes, in contrast to class variables that can
+     stand for any class indiscriminately.
+
+     A new syntactic form is introduced for class abstractions,
+     ` { y | ph } ` , read as "the class of sets ` y ` such that
+     ` ph ( y ) ` ".  This form is assigned the type "class" in ~ cab , so it
+     can consistently substitute for a class variable during the replacement
+     phase of syntactic construction.
+
+     Later, when defining specific classes (the definiendum), we will usually
+     rely on equality, with the definiens appearing on the right-hand side.
+
+     However, at this early stage of set theory with classes, we cannot yet do
+     this: the right-hand side of an equation may only contain a class or set
+     variable, which is too restrictive to allow meaningful definitions.
+
+     Instead, the formula in ~ df-clab (restated below) is used to characterize
+     class abstraction in one particular situation: by specifying which sets
+     actually belong to it.  Since ~ ax-wl-clel already established that the
+     axiomatization of classes determines their behavior entirely in terms of
+     the fundamental expression ` x e. A ` , the definition ` x e. { y | ph } `
+     automatically fixes both equality and membership relations involving class
+     abstractions and other forms of classes.
+
+     The Metamath definition checker usually requires the specific
+     pattern " ` |- { x | ph } = ... ` " for a formula to be accepted as a
+     proper definition.  Although ~ df-clab does not follow this pattern, it
+     still allows the elimination of class abstractions.  This eliminability is
+     the essential property of a definition, and thus ~ df-clab can
+     legitimately be regarded as one.  The inability to define class
+     abstractions directly by an equation is only a limitation if one intends
+     to introduce further primitive operations on classes besides equality and
+     membership.  In such cases, one must separately ensure that class
+     abstraction remains compatible with those extensions.  In the present
+     framework, however, set theory with classes reduces entirely to the two
+     fundamental notions: membership and equality.
+
+     For further material on the elimination of class abstractions, see BJ's
+     work beginning with ~ eliminable1 .
+
+     (Contributed by Wolf Lammn, 28-Aug-2025.) $)
+  wl-df-clab $p |- ( x e. { y | ph } <-> [ x / y ] ph ) $=
+    ( df-clab ) ABCD $.
 
 $(
 =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=