
Theorem conventions 26650
Description:

Here are some of the conventions we use in the Metamath Proof Explorer (aka "set.mm"), and how they correspond to typical textbook language (skipping the many cases where they are identical). For conventions related to labels, see conventions-label 26651.

  • Notation. Where possible, the notation attempts to conform to modern conventions, with variations due to our choice of the axiom system or to make proofs shorter. However, our notation is strictly sequential (left-to-right). For example, summation is written in the form Σ𝑘 ∈ 𝐴 𝐵 (df-sum 14265) which denotes that index variable 𝑘 ranges over 𝐴 when evaluating 𝐵. Thus, Σ𝑘 ∈ ℕ (1 / (2↑𝑘)) = 1 means 1/2 + 1/4 + 1/8 + ... = 1 (geoihalfsum 14453). The notation is usually explained in more detail when first introduced.
  • Axiomatic assertions ($a). All axiomatic assertions ($a statements) starting with " ⊢ " have labels starting with "ax-" (axioms) or "df-" (definitions). A statement with a label starting with "ax-" corresponds to what is traditionally called an axiom. A statement with a label starting with "df-" introduces new symbols or a new relationship among symbols that can be eliminated; they always extend the definition of a wff or class. Metamath blindly treats $a statements as new given facts but does not try to justify them. The mmj2 program will justify the definitions as sound as discussed below, except for 4 definitions (df-bi 196, df-cleq 2603, df-clel 2606, df-clab 2597) that require a more complex metalogical justification by hand.
  • Proven axioms. In some cases we wish to treat an expression as an axiom in later theorems, even though it can be proved. For example, we derive the postulates or axioms of complex arithmetic as theorems of ZFC set theory. For convenience, after deriving the postulates, we reintroduce them as new axioms on top of set theory. This lets us easily identify which axioms are needed for a particular complex number proof, without the obfuscation of the set theory used to derive them. For more, see mmcomplex.html. When we wish to use a previously proven assertion as an axiom, our convention is to use the regular "ax-NAME" label naming convention for the axiom, but to precede it with a proof of the same statement labeled "axNAME". An example is the complex arithmetic axiom ax-1cn 9873, proven by the preceding theorem ax1cn 9849. The metamath.exe program will warn if an axiom does not match the preceding theorem that justifies it, when the names match in this way.
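    For illustration, here is roughly how such a pair appears in the database source (a schematic sketch: the explanatory comments are shortened and the compressed proof is elided, but the statements shown are the actual ones for ax1cn 9849 and ax-1cn 9873):
        $( First the statement is proved from the construction of the complex numbers. $)
        ax1cn $p |- 1 e. CC $= ... $.
        $( Then the same statement is restated as an axiom; later proofs reference only this form. $)
        ax-1cn $a |- 1 e. CC $.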
  • Definitions (df-...). We encourage definitions to include hypertext links to proven examples.
  • Statements with hypotheses. Many theorems and some axioms, such as ax-mp 5, have hypotheses that must be satisfied in order for the conclusion to hold, in this case min and maj. When presented in summarized form such as in the Theorem List (click on "Nearby theorems" on the ax-mp 5 page), the hypotheses are connected with an ampersand and separated from the conclusion with a big arrow, such as in " ⊢ 𝜑 & ⊢ (𝜑 → 𝜓) ⇒ ⊢ 𝜓". These symbols are _not_ part of the Metamath language but are just informal notation meaning "and" and "implies".
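    For example, ax-mp 5 appears in the database source essentially as follows (schematic; its descriptive comment is omitted):
        ${
          min $e |- ph $.
          maj $e |- ( ph -> ps ) $.
          ax-mp $a |- ps $.
        $}
    The summarized form " ⊢ 𝜑 & ⊢ (𝜑 → 𝜓) ⇒ ⊢ 𝜓" is just a compact rendering of these two $e hypotheses and the $a conclusion.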
  • Discouraged use and modification. If something should only be used in limited ways, it is marked with "(New usage is discouraged.)". This is used, for example, when something can be constructed in more than one way, and we do not want later theorems to depend on that specific construction. This marking is also used if we want later proofs to use proven axioms. For example, we want later proofs to use ax-1cn 9873 (not ax1cn 9849) and ax-1ne0 9884 (not ax1ne0 9860), as these are proven axioms for complex arithmetic. Thus, both ax1cn 9849 and ax1ne0 9860 are marked as "(New usage is discouraged.)". In some cases a proof should not normally be changed, e.g., when it demonstrates some specific technique. These are marked with "(Proof modification is discouraged.)".
  • New definitions infrequent. Typically, we are minimalist when introducing new definitions; they are introduced only when a clear advantage becomes apparent for reducing the number of symbols, shortening proofs, etc. We generally avoid the introduction of gratuitous definitions because each one requires associated theorems and additional elimination steps in proofs. For example, we use < and ≤ for inequality expressions, and use ((sin‘(i · 𝐴)) / i) instead of (sinh‘𝐴) for the hyperbolic sine.
  • Minimizing axioms and the axiom of choice. We prefer proofs that depend on fewer and/or weaker axioms, even if the proofs are longer. In particular, we prefer proofs that do not use the axiom of choice (df-ac 8822) where such proofs can be found. The axiom of choice is widely accepted, and ZFC is the most commonly-accepted fundamental set of axioms for mathematics. However, there have been and still are some lingering controversies about the Axiom of Choice. Therefore, where a proof does not require the axiom of choice, we prefer that proof instead. E.g., our proof of the Schroeder-Bernstein Theorem (sbth 7965) does not use the axiom of choice. In some cases, the weaker axiom of countable choice (ax-cc 9140) or axiom of dependent choice (ax-dc 9151) can be used instead. Similarly, any theorem in first order logic (FOL) that contains only set variables that are all mutually distinct, and has no wff variables, can be proved *without* using ax-10 2006 through ax-13 2234, by invoking ax10w 1993 through ax13w 2000. We encourage proving theorems *without* ax-10 2006 through ax-13 2234 and moving them up to the ax-4 1728 through ax-9 1986 section.
  • Alternative (ALT) proofs. If a different proof is significantly shorter or clearer but uses more or stronger axioms, we prefer to make that proof an "alternative" proof (marked with an ALT label suffix), even if this alternative proof was formalized first. We then make the proof that requires fewer axioms the main proof. This has the effect of reducing (over time) the number and strength of axioms used by any particular proof. There can be multiple alternatives if it makes sense to do so. Alternative (*ALT) theorems should have "(Proof modification is discouraged.) (New usage is discouraged.)" in their comment and should follow the main statement, so that people reading the text in order will see the main statement first. The alternative and main statement comments should use hyperlinks to refer to each other (so that a reader of one will become easily aware of the other).
  • Alternative (ALTV) versions. If a theorem or definition is an alternative/variant of an already existing theorem or definition, its label should have the same name with the suffix ALTV. Such alternatives should be temporary only, until it is decided which alternative should be used in the future. Alternative (*ALTV) theorems or definitions are usually contained in mathboxes. Their comments need not contain "(Proof modification is discouraged.) (New usage is discouraged.)". Alternative statements should follow the main statement, so that people reading the text in order will see the main statement first.
  • Old (OLD) versions or proofs. If a proof, definition, axiom, or theorem is going to be removed, we often stage that change by first renaming its label with an OLD suffix (to make it clear that it is going to be removed). Old (*OLD) statements should have "(Proof modification is discouraged.) (New usage is discouraged.)" and "Obsolete version of ~ xxx as of dd-mmm-yyyy." (not enclosed in parentheses) in the comment. An old statement should follow the main statement, so that people reading the text in order will see the main statement first. This typically happens when a shorter proof of an existing theorem is found: the existing theorem is kept as an *OLD statement for one year. When a proof is shortened automatically (using Metamath's minimize_with command), then it is not necessary to keep the old proof, nor to add credit for the shortening.
  • Variables. Propositional variables (variables for well-formed formulas or wffs) are represented with lowercase Greek letters and are normally used in this order: 𝜑 = phi, 𝜓 = psi, 𝜒 = chi, 𝜃 = theta, 𝜏 = tau, 𝜂 = eta, 𝜁 = zeta, and 𝜎 = sigma. Individual setvar variables are represented with lowercase Latin letters and are normally used in this order: 𝑥, 𝑦, 𝑧, 𝑤, 𝑣, 𝑢, and 𝑡. Variables that represent classes are often represented by uppercase Latin letters: 𝐴, 𝐵, 𝐶, 𝐷, 𝐸, and so on. There are other symbols that also represent class variables and suggest specific purposes, e.g., 0 for poset zero (see p0val 16864) and connective symbols such as + for some group addition operation. (See prdsplusgval 15956 for an example of the use of +). Class variables are selected in alphabetical order starting from 𝐴 if there is no reason to do otherwise, but many assertions select different class variables or a different order to make their intended meaning clearer.
  • Turnstile. "⊢", meaning "It is provable that," is the first token of all assertions and hypotheses that aren't syntax constructions. This is a standard convention in logic. For us, it also prevents any ambiguity with statements that are syntax constructions, such as "wff ¬ 𝜑".
  • Biconditional (↔). There are basically two ways to maximize the effectiveness of biconditionals (↔): you can either have one-directional simplifications of all theorems that produce biconditionals, or you can have one-directional simplifications of theorems that consume biconditionals. Some tools (like Lean) follow the first approach, but set.mm follows the second approach. Practically, this means that in set.mm, for every theorem that uses an implication in the hypothesis, like ax-mp 5, there is a corresponding version with a biconditional or a reversed biconditional, like mpbi 219 or mpbir 220. We prefer this second approach because the number of duplications in the second approach is bounded by the size of the propositional calculus section, which is much smaller than the number of possible theorems in all later sections that produce biconditionals. So although theorems like biimpi 205 are available, in most cases there is already a theorem that combines it with your theorem of choice, like mpbir2an 957, sylbir 224, or 3imtr4i 280.
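    For example, here are the three variants side by side, using the informal "&"/"⇒" summary notation described above (these are the actual statements of ax-mp 5, mpbi 219, and mpbir 220, written in ASCII):
        ax-mp:  |- ph  &  |- ( ph -> ps )   =>  |- ps
        mpbi:   |- ph  &  |- ( ph <-> ps )  =>  |- ps
        mpbir:  |- ps  &  |- ( ph <-> ps )  =>  |- ph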
  • Substitution. "[𝑦 / 𝑥]𝜑" should be read "the wff that results from the proper substitution of 𝑦 for 𝑥 in wff 𝜑." See df-sb 1868 and the related df-sbc 3403 and df-csb 3500.
  • Is-a-set. "𝐴 ∈ V" should be read "Class 𝐴 is a set (i.e. exists)." This is a convention based on Definition 2.9 of [Quine] p. 19. See df-v 3175 and isset 3180. However, instead of using 𝐼 ∈ V in the antecedent of a theorem for some variable 𝐼, we now prefer to use 𝐼 ∈ 𝑉 (or another variable if 𝑉 is not available) to make it more general. That way we can often avoid needing extra uses of elex 3185 and syl 17 in the common case where 𝐼 is already a member of something. For hypotheses ($e statements) of theorems (mostly in inference form), however, 𝐴 ∈ V is used rather than 𝐴 ∈ 𝑉 (e.g. difexi 4736). This is because 𝐴 ∈ V is almost always satisfied using an existence theorem stating "... ∈ V", and a hard-coded V in the $e statement saves a couple of syntax building steps that would otherwise substitute V into 𝑉. Notice that this does not hold for hypotheses of theorems in deduction form: here (𝜑 → 𝐴 ∈ 𝑉) should still be used rather than (𝜑 → 𝐴 ∈ V).
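    For example, compare the antecedent form difexg with the inference form difexi 4736, sketched here in ASCII (the hypothesis label follows the usual ".1" convention and should be checked against the database; the proofs are elided):
        difexg $p |- ( A e. V -> ( A \ B ) e. _V ) $= ... $.
        ${
          difexi.1 $e |- A e. _V $.
          difexi $p |- ( A \ B ) e. _V $= ... $.
        $}
    In a deduction-form theorem, by contrast, the corresponding hypothesis would read $e |- ( ph -> A e. V ) $., keeping the class variable V.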
  • Converse. "◡𝑅" should be read "converse of (relation) 𝑅" and is the same as the more standard notation R^{-1} (the standard notation is ambiguous). See df-cnv 5046. This can be used to define a subset, e.g., df-tan 14641 notates "the set of values whose cosine is a nonzero complex number" as (◡cos “ (ℂ ∖ {0})).
  • Function application. "(𝐹‘𝑥)" should be read "the value of function 𝐹 at 𝑥" and has the same meaning as the more familiar but ambiguous notation F(x). For example, (cos‘0) = 1 (see cos0 14719). The left apostrophe notation originated with Peano and was adopted in Definition *30.01 of [WhiteheadRussell] p. 235, Definition 10.11 of [Quine] p. 68, and Definition 6.11 of [TakeutiZaring] p. 26. See df-fv 5812. In the ASCII (input) representation there are spaces around the grave accent; the accent is single when used directly in math strings and doubled within comments.
  • Infix and parentheses. When a function that takes two classes and produces a class is applied as part of an infix expression, the expression is always surrounded by parentheses (see df-ov 6552). For example, the + in (2 + 2); see 2p2e4 11021. Function application is itself an example of this. Similarly, predicate expressions in infix form that take two or three wffs and produce a wff are also always surrounded by parentheses, such as (𝜑 → 𝜓), (𝜑 ∨ 𝜓), (𝜑 ∧ 𝜓), and (𝜑 ↔ 𝜓) (see wi 4, df-or 384, df-an 385, and df-bi 196 respectively). In contrast, a binary relation (which compares two _classes_ and produces a _wff_) applied in an infix expression is _not_ surrounded by parentheses. This includes set membership 𝐴 ∈ 𝐵 (see wel 1978), equality 𝐴 = 𝐵 (see df-cleq 2603), subset 𝐴 ⊆ 𝐵 (see df-ss 3554), and less-than 𝐴 < 𝐵 (see df-lt 9828). For the general definition of a binary relation in the form 𝐴𝑅𝐵, see df-br 4584. For example, 0 < 1 (see 0lt1 10429) does not use parentheses.
  • Unary minus. The symbol - is used to indicate a unary minus, e.g., -1. It is specially defined because it is so commonly used. See cneg 10146.
  • Function definition. Functions are typically defined by first defining the constant symbol (using $c) and declaring that it is a class via a syntax axiom labeled cNAME (e.g., ccos 14634). The function itself is then defined by a definition labeled df-NAME; definitions are typically given using the maps-to notation (e.g., df-cos 14640). Typically, there are other proofs such as its closure labeled NAMEcl (e.g., coscl 14696), its function application form labeled NAMEval (e.g., cosval 14692), and at least one simple value (e.g., cos0 14719).
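    Schematically, the pattern looks like this for a hypothetical function "Foo" (every label and right-hand side below is a placeholder for illustration, not an actual set.mm statement):
        $c Foo $.                                                   $( Declare the new constant symbol. $)
        cfoo $a class Foo $.                                        $( Syntax axiom: Foo is a class. $)
        df-foo $a |- Foo = ( x e. CC |-> <expression in x> ) $.     $( Definition using maps-to. $)
        foocl $p |- ( A e. CC -> ( Foo ` A ) e. CC ) $= ... $.      $( Closure, "NAMEcl". $)
        fooval $p |- ( A e. CC -> ( Foo ` A ) = <value at A> ) $= ... $.   $( Value, "NAMEval". $)
        foo0 $p |- ( Foo ` 0 ) = 0 $= ... $.                        $( A simple value. $)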
  • Factorial. The factorial function is traditionally a postfix operation, but we treat it as a normal function applied in prefix form, e.g., (!‘4) = 24 (df-fac 12923 and fac4 12930).
  • Unambiguous symbols. A given symbol has a single unambiguous meaning in general. Thus, where the literature might use the same symbol with different meanings, here we use different (variant) symbols for different meanings. These variant symbols often have suffixes, subscripts, or underlines to distinguish them. For example, here "0" always means the value zero (df-0 9822), while "0g" is the group identity element (df-0g 15925), "0." is the poset zero (df-p0 16862), "0𝑝" is the zero polynomial (df-0p 23243), "0vec" is the zero vector in a normed complex vector space (df-0v 26837), and "0" is a class variable for use as a connective symbol (this is used, for example, in p0val 16864). There are other class variables used as connective symbols where traditional notation would use ambiguous symbols, including "1", "+", and several others. These symbols are very similar to traditional notation, but because they are different symbols they eliminate ambiguity.
  • ASCII representation of symbols. We must have an ASCII representation for each symbol. We generally choose short sequences, ideally digraphs, and generally choose sequences that vaguely resemble the mathematical symbol. Here are some of the conventions we use when selecting an ASCII representation.
    We generally do not include parentheses inside a symbol because that confuses text editors (such as emacs). Greek letters for wff variables always use the first two letters of their English names, making them easy to type and easy to remember. Symbols that almost look like letters, such as ∀, are often represented by that letter followed by a period. For example, "A." is used to represent ∀, "e." is used to represent ∈, and "E." is used to represent ∃. Single letters are now always variable names, so constants that are often shown as single letters are now typically preceded with "_" in their ASCII representation; for example, "_i" is the ASCII representation for the imaginary unit i. A script font constant is often the letter preceded by "~" meaning "curly", such as "~P" to represent the power class 𝒫. (A short example combining several of these conventions appears at the end of this item.)
    Originally, all setvar and class variables used only single letters a-z and A-Z, respectively. A big change in recent years was to allow the use of certain symbols as variable names to make formulas more readable, such as a variable representing an additive group operation. The convention is to take the original constant token (in this case "+", which means complex number addition) and put a period in front of it to obtain the ASCII representation of the variable, ".+", shown as +, which can be used instead of, say, the letter "P" that had to be used before.
    Choosing tokens for more advanced concepts that have no standard symbols but are represented by words in books is hard. A few are reasonably obvious, like "Grp" for group and "Top" for topology, but often they seem to end up being either too long or too cryptic. It would be nice if the math community came up with standardized short abbreviations for English math terminology, like they have more or less done with symbols, but that probably won't happen any time soon.
    Another informal convention that we've somewhat followed, and that is also not uncommon in the literature, is to start tokens with a capital letter for collection-like objects and lower case for function-like objects. For example, we have the collections On (ordinal numbers), Fin, Prime, Grp, and we have the functions sin, tan, log, sup. Predicates like Ord and Lim also tend to start with upper case, but in a sense they are really collection-like, e.g. Lim indirectly represents the collection of limit ordinals, but it can't be an actual class since not all limit ordinals are sets. This initial capital vs. lower case letter convention is sometimes ambiguous. In the past there's been a debate about whether domain and range are collection-like or function-like, thus whether we should use Dom, Ran or dom, ran. Both are used in the literature. In the end dom, ran won out for aesthetic reasons (Norm Megill simply felt they looked nicer).
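    To illustrate several of the ASCII conventions above at once, here is a hand-made math string (not a statement in the database) in ASCII source form together with its rendering:
        ASCII source:  |- A. x e. ~P A E. y e. CC ( F ` x ) = ( _i x. y )
        Rendering:     ⊢ ∀𝑥 ∈ 𝒫 𝐴 ∃𝑦 ∈ ℂ (𝐹‘𝑥) = (i · 𝑦)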
  • Typography conventions. Class symbols for functions (e.g., abs, sin) should usually not have leading or trailing blanks in their HTML/LaTeX representation. This is in contrast to class symbols for operations (e.g., gcd, sadd, eval), which usually do include leading and trailing blanks in their representation. If a class symbol is used for a function as well as an operation (according to the definition df-ov 6552, each operation value can be written as the function value of an ordered pair), the convention for its primary usage should be used, e.g. (iEdg‘𝐺) versus (𝑉iEdg𝐸) for the edges of a graph 𝐺 = ⟨𝑉, 𝐸⟩.
  • Number construction independence. There are many ways to model complex numbers. After deriving the complex number postulates we reintroduce them as new axioms on top of set theory. This lets us easily identify which axioms are needed for a particular complex number proof, without the obfuscation of the set theory used to derive them. This also lets us be independent of the specific construction, which we believe is valuable. See mmcomplex.html for details. Thus, for example, we don't allow the use of ∅ ∉ ℂ, as handy as that would be, because that would be construction-specific. We want proofs about ℂ to be independent of whether or not ∅ ∈ ℂ.
  • Minimize hypotheses (except for construction independence and number theorem domains). In most cases we try to minimize hypotheses, that is, we eliminate or reduce what must be true to prove something, so that the proof is more general and easier to use. There are exceptions. For example, we intentionally add hypotheses if they help make proofs independent of a particular construction (e.g., the construction of the complex numbers ℂ). We also intentionally add hypotheses for many real and complex number theorems to expressly state their domains even when they aren't strictly needed. For example, we could show that (𝐴 < 𝐵 → 𝐵 ≠ 𝐴) without any other hypotheses, but in practice we also require proving at least some domain memberships (e.g., see ltnei 10040). Here are the reasons, as discussed in https://groups.google.com/g/metamath/c/2AW7T3d2YiQ/m/iSN7g87t3ikJ:
    1. Having the hypotheses immediately shows the intended domain of applicability (is it ℝ, ℝ*, ω, or something else?), without having to trace back to definitions.
    2. Having the hypotheses restricts the theorem's use to the intended domain, which is generally desirable.
    3. Without the hypotheses, the behavior depends on the accidental behavior of definitions outside their domains, so the theorems become non-portable and "brittle".
    4. Only a few theorems can have their hypotheses removed in this fashion due to happy coincidences for our particular set-theoretical definitions. The poor user (especially a novice learning real number arithmetic) is going to be confused not knowing when hypotheses are needed and when they are not. For someone who hasn't traced back the set-theoretical foundations of the definitions, it is seemingly random and isn't intuitive at all.
    5. The consensus of opinion of people on this group seemed to be against doing this.
  • Natural numbers. There are different definitions of "natural" numbers in the literature. We use ℕ (df-nn 10898) for the set of positive integers starting from 1, and ℕ0 (df-n0 11170) for the set of nonnegative integers starting at zero.
  • Decimal numbers. Numbers larger than nine are often expressed in base 10 using the decimal constructor df-dec 11370, e.g., 4001 (see 4001prm 15690 for a proof that 4001 is prime).
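    For instance, in this version of the database df-dec 11370 defines ;𝐴𝐵 as ((10 · 𝐴) + 𝐵), so the numeral 4001 is entered in ASCII as "; ; ; 4 0 0 1" and unfolds as follows (a worked sketch of the nesting, not a statement from the database):
        ; ; ; 4 0 0 1
          = ( ( 10 x. ; ; 4 0 0 ) + 1 )
          = ( ( 10 x. ( ( 10 x. ; 4 0 ) + 0 ) ) + 1 )
          = ( ( 10 x. ( ( 10 x. ( ( 10 x. 4 ) + 0 ) ) + 0 ) ) + 1 )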
  • Theorem forms. We will use the following descriptive terms to categorize theorems:
    • A theorem is in "closed form" if it has no $e hypotheses (e.g., unss 3749). The term "tautology" is also used, especially in propositional calculus. This form was formerly called "theorem form" or "closed theorem form".
    • A theorem is in "deduction form" (or is a "deduction") if it has one or more $e hypotheses, and the hypotheses and the conclusion are implications that share the same antecedent. More precisely, the conclusion is an implication with a wff variable as the antecedent (usually 𝜑), and every hypothesis ($e statement) is either:
      1. an implication with the same antecedent as the conclusion, or
      2. a definition. A definition can be for a class variable (this is a class variable followed by =, e.g. the definition of 𝐷 in lhop 23583) or a wff variable (this is a wff variable followed by ↔); class variable definitions are more common.
      In practice, a proof of a theorem in deduction form will also contain many steps that are implications where the antecedent is either that wff variable (usually 𝜑) or is a conjunction (𝜑 ∧ ...) including that wff variable (𝜑). E.g. a1d 25, unssd 3751.
    • A theorem is in "inference form" (or is an "inference") if it has one or more $e hypotheses, but is not in deduction form, i.e. there is no common antecedent (e.g., unssi 3750).
    Any theorem whose conclusion is an implication has an associated inference, whose hypotheses are the hypotheses of that theorem together with the antecedent of its conclusion, and whose conclusion is the consequent of that conclusion. When both theorems are in set.mm, then the associated inference is often labeled by adding the suffix "i" to the label of the original theorem (for instance, con3i 149 is the inference associated with con3 148). The inference associated with a theorem is easily derivable from that theorem by a simple use of ax-mp 5. The other direction is the subject of the Deduction Theorem discussed below. We may also use the term "associated inference" when the above process is iterated. For instance, syl 17 is an inference associated with imim1 81 because it is the inference associated with imim1i 61 which is itself the inference associated with imim1 81.
    "Deduction form" is the preferred form for theorems because this form allows us to easily use the theorem in places where (in traditional textbook formalizations) the standard Deduction Theorem (see below) would be used. We call this approach "deduction style". In contrast, we usually avoid theorems in "inference form" when that would end up requiring us to use the deduction theorem.
    Deductions have a label suffix of "d", especially if there are other forms of the same theorem (e.g., pm2.43d 51). The labels for inferences usually have the suffix "i" (e.g., pm2.43i 50). The labels of theorems in "closed form" would have no special suffix (e.g., pm2.43 54). When an inference is converted to a theorem by eliminating an "is a set" hypothesis, we sometimes suffix the closed form with "g" (for "more general") as in uniex 6851 vs. uniexg 6853.
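    The three forms can be seen side by side in the pm2.43 family mentioned above, sketched here in ASCII (the statements are the actual ones; the hypothesis labels follow the usual ".1" convention and the proofs are elided):
        $( Closed form. $)
        pm2.43 $p |- ( ( ph -> ( ph -> ps ) ) -> ( ph -> ps ) ) $= ... $.
        ${
          pm2.43i.1 $e |- ( ph -> ( ph -> ps ) ) $.
          $( Inference form. $)
          pm2.43i $p |- ( ph -> ps ) $= ... $.
        $}
        ${
          pm2.43d.1 $e |- ( ph -> ( ps -> ( ps -> ch ) ) ) $.
          $( Deduction form: the hypothesis and the conclusion share the antecedent ph. $)
          pm2.43d $p |- ( ph -> ( ps -> ch ) ) $= ... $.
        $}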
  • Deduction theorem. The Deduction Theorem is a metalogical theorem that provides an algorithm for constructing a proof of a theorem from the proof of its corresponding deduction (its associated inference). See for instance Theorem 3 in [Margaris] p. 56. In ordinary mathematics, no one actually carries out the algorithm, because (in its most basic form) it involves an exponential explosion of the number of proof steps as more hypotheses are eliminated. Instead, in ordinary mathematics the Deduction Theorem is invoked simply to claim that something can be done in principle, without actually doing it. For more details, see mmdeduction.html. The Deduction Theorem is a metalogical theorem that cannot be applied directly in Metamath, and the explosion of steps would be a problem anyway, so alternatives are used. One alternative we use sometimes is the "weak deduction theorem" dedth 4089, which works in certain cases in set theory. We also sometimes use dedhb 3343. However, the primary mechanism we use today for emulating the deduction theorem is to write proofs in deduction form (aka "deduction style") as described earlier; the prefixed antecedent "𝜑 →" mimics the context in a deduction proof system. In practice this mechanism works very well. This approach is described in the deduction form and natural deduction page mmnatded.html; a list of translations for common natural deduction rules is given in natded 26652.
  • Recursion. We define recursive functions using various "recursion constructors". These allow us to define, with compact direct definitions, functions that are usually defined in textbooks with indirect self-referencing recursive definitions. This produces compact definitions and much simpler proofs, and greatly reduces the risk of creating unsound definitions. Examples of recursion constructors include recs(𝐹) in df-recs 7355, rec(𝐹, 𝐼) in df-rdg 7393, seq𝜔(𝐹, 𝐼) in df-seqom 7430, and seq𝑀( + , 𝐹) in df-seq 12664. These have characteristic function 𝐹 and initial value 𝐼. (Σg in df-gsum 15926 isn't really designed for arbitrary recursion, but you could do it with the right magma.) The logically primary one is df-recs 7355, but for the "average user" the most useful one is probably df-seq 12664, provided that a countable sequence is sufficient for the recursion.
  • Extensible structures. Mathematics includes many structures such as ring, group, poset, etc. We define an "extensible structure" which is then used to define group, ring, poset, etc. This allows theorems from more general structures (groups) to be reused for more specialized structures (rings) without having to reprove them. See df-struct 15697.
  • Undefined results and "junk theorems". Some expressions are only expected to be meaningful in certain contexts. For example, consider Russell's definite description binder iota, where (℩𝑥𝜑) is meant to be "the 𝑥 such that 𝜑" (where 𝜑 typically depends on 𝑥). What should that expression produce when there is no such 𝑥? In set.mm we primarily use one of two approaches. One approach is to make the expression evaluate to the empty set whenever the expression is being used outside of its expected context. While not perfect, it makes it a bit more clear when something is undefined, and it has the advantage that it makes more things equal outside their domain, which can remove hypotheses when you feel like exploiting these so-called junk theorems. Note that Quine does this with iota (his definition of iota evaluates to the empty set when there is no unique value of 𝑥). Quine has no problem with that and we don't see why we should, so we define iota exactly the same way that Quine does. The main place where you see this being systematically exploited is in "reverse closure" theorems like 𝐴 ∈ (𝐹‘𝐵) → 𝐵 ∈ dom 𝐹, which is useful when 𝐹 is a family of sets (by this we mean it is a set of sets even in a type-theoretic interpretation). The second approach uses "(New usage is discouraged.)" to prevent unintentional uses of certain properties. For example, you could define some construct df-NAME whose usage is discouraged, and prove only the specific properties you wish to use (and add those proofs to the list of permitted uses of "discouraged" information). From then on, you can only use those specific properties without a warning. Other approaches often have hidden problems. For example, you could try to "not define undefined terms" by creating definitions like ${ $d 𝑦 𝑥 $. $d 𝑦 𝜑 $. df-iota $a ⊢ (∃!𝑥𝜑 → (℩𝑥𝜑) = {𝑥 ∣ 𝜑}) $. $}. This will be rejected by the definition checker, but the bigger theoretical reason to reject this axiom is that it breaks equality: the metatheorem (𝑥 = 𝑦 → P(x) = P(y)) fails to hold if definitions don't unfold without some assumptions. (That is, iotabidv 5789 is no longer provable and must be added as an axiom.) It is important for every syntax constructor to satisfy equality theorems *unconditionally*, e.g., expressions like (1 / 0) = (1 / 0) should not be rejected. This is forced on us by the context-free term language, and anything else requires a lot more infrastructure (e.g., a type checker) to support without making everything else more painful to use. Another approach would be to try to make nonsensical statements syntactically invalid, but that can create its own complexities; in some cases that would make parsing itself undecidable. In practice this does not seem to be a serious issue. No one does these things deliberately in "real" situations, and some knowledgeable people (such as Mario Carneiro) have never seen this happen accidentally. Norman Megill doesn't agree that these "junk" consequences are necessarily bad anyway, and they can significantly shorten proofs in some cases. This database would be much larger if, for example, we had to condition fvex 6113 on the argument being in the domain of the function. It is impossible to derive a contradiction from sound definitions (i.e. definitions that pass the definition check), assuming ZFC is consistent, and he doesn't see the point of all the extra busy work and huge increase in set.mm size that would result from restricting *all* definitions.
So instead of implementing a complex system to counter a problem that does not appear to occur in practice, we use a significantly simpler set of approaches.
  • Organizing proofs. Humans have trouble understanding long proofs. It is often preferable to break longer proofs into smaller parts (just as with traditional proofs). In Metamath this is done by creating separate proofs of the separate parts. A proof with the sole purpose of supporting a final proof is a lemma; the naming convention for a lemma is the final proof's name followed by "lem", and a number if there is more than one. E.g., sbthlem1 7955 is the first lemma for sbth 7965. Also, consider proving reusable results separately, so that others will be able to easily reuse that part of your work.
  • Limit proof size. It is often preferable to break longer proofs into smaller parts, just as you would do with traditional proofs. One reason is that humans have trouble understanding long proofs. Another reason is that it's generally best to prove reusable results separately, so that others will be able to easily reuse them. Finally, the "minimize" routine can take much longer with very long proofs. We encourage proofs to be no more than 200 essential steps, and generally no more than 500 essential steps, though these are simply guidelines and not hard-and-fast rules. Much smaller proofs are fine! We also acknowledge that some proofs, especially autogenerated ones, should sometimes not be broken up (e.g., because breaking them up might be useless and inefficient due to many interconnections and reused terms within the proof). In Metamath, breaking up longer proofs is done by creating multiple separate proofs of separate parts. A proof with the sole purpose of supporting a final proof is a lemma; the naming convention for a lemma is the final proof's name followed by "lem", and a number if there is more than one. E.g., sbthlem1 7955 is the first lemma for sbth 7965.
  • Hypertext links. We strongly encourage comments to have many links to related material, with accompanying text that explains the relationship. These can help readers understand the context. Links to other statements, or to HTTP/HTTPS URLs, can be inserted in ASCII source text by prepending a space-separated tilde (e.g., " ~ df-prm " results in " df-prm 15224"). When metamath.exe is used to generate HTML it automatically inserts hypertext links for syntax used (e.g., every symbol used), every axiom and definition depended on, the justification for each step in a proof, and to both the next and previous assertion.
  • Hypertext links to section headers. Some section headers have text under them that describes or explains the section. However, this text is not part of the description of any axiom or theorem, and there is no way to link to it directly. To provide for this, section headers with accompanying text (indicated with "*" prefixed to mmtheorems.html#mmdtoc entries) have an anchor in mmtheorems.html whose name is the first $a or $p statement that follows the header. For example, there is a glossary under the section heading called GRAPH THEORY. The first $a or $p statement that follows is cuhg 25819, which you can see two lines down. To reference it we link to the anchor using a space-separated tilde followed by the space-separated link mmtheorems.html#cuhg, which will become the hyperlink mmtheorems.html#cuhg. Note that no theorem label in set.mm is allowed to begin with "mm" (enforced by "verify markup" in the metamath program). Whenever the software sees a tilde reference beginning with "http:", "https:", or "mm", the reference is assumed to be a link to something other than a statement label, and the tilde reference is used as is. This can also be useful for relative links to other pages such as mmcomplex.html.
  • Bibliography references. Please include a bibliographic reference to any external material used. A name in square brackets in a comment indicates a bibliographic reference. The full reference must be of the form KEYWORD IDENTIFIER? NOISEWORD(S)* [AUTHOR(S)] p. NUMBER - note that this is a very specific form that requires a page number. There should be no comma between the author reference and the "p." (a constant indicator). Whitespace, comma, period, or semicolon should follow NUMBER. An example is "Theorem 3.1 of [Monk1] p. 22". The KEYWORD, which is not case-sensitive, must be one of the following: Axiom, Chapter, Compare, Condition, Corollary, Definition, Equation, Example, Exercise, Figure, Item, Lemma, Lemmas, Line, Lines, Notation, Part, Postulate, Problem, Property, Proposition, Remark, Rule, Scheme, Section, or Theorem. The IDENTIFIER is optional, as in for example "Remark in [Monk1] p. 22". The NOISEWORD(S) are zero or more from the list: from, in, of, on. The AUTHOR(S) must be present in the file identified with the htmlbibliography assignment (e.g., mmset.html) as a named anchor (NAME=). If there is more than one document by the same author(s), add a numeric suffix (as shown here). The NUMBER is a page number, and may be any alphanumeric string such as an integer or Roman numeral. Note that we _require_ page numbers in comments for individual $a or $p statements. We allow names in square brackets without page numbers (a reference to an entire document) in heading comments. If this is a new reference, please also add it to the "Bibliography" section of mmset.html. (The file mmbiblio.html is automatically rebuilt, e.g., using the metamath.exe "write bibliography" command.)
  • Acceptable shorter proofs. Shorter proofs are welcome, and any shorter proof we accept will be acknowledged in the theorem's description. However, in some cases a proof may be "shorter" or not depending on how it is formatted. This section provides general guidelines.

    Usually we automatically accept shorter proofs that (1) shorten the set.mm file (with compressed proofs), (2) reduce the size of the HTML file generated with SHOW STATEMENT xx / HTML, (3) use only existing, unmodified theorems in the database (the order of theorems may be changed, though), and (4) use no additional axioms. Usually we will also automatically accept a _new_ theorem that is used to shorten multiple proofs, if the total size of set.mm (including the comment of the new theorem, not including the acknowledgment) decreases as a result.

    In borderline cases, we typically place more importance on the number of compressed proof steps and less on the length of the label section (since the names are in principle arbitrary). If two proofs have the same number of compressed proof steps, we will typically give preference to the one with the smaller number of different labels, or if these numbers are also the same, to the proof that happens to have fewer characters when label lengths are included.

    A few theorems have a longer proof than necessary in order to avoid the use of certain axioms, for pedagogical purposes, or for other reasons. These theorems will (or should) have a "(Proof modification is discouraged.)" tag in their description. For example, idALT 23 shows a proof directly from the axioms. Shorter proofs for such cases won't be accepted, of course, unless the criteria described above continue to be satisfied.

  • Input format. The input is in ASCII with two-space indents. Tab characters are not allowed. Use embedded math comments or HTML entities for non-ASCII characters (e.g., "&eacute;" for "é").
  • Information on syntax, axioms, and definitions. For a hyperlinked list of syntax, axioms, and definitions, see mmdefinitions.html. If you have questions about a specific symbol or axiom, it is best to go directly to its definition to learn more about it. The generated HTML for each theorem and axiom includes hypertext links to each symbol's definition.
  • Reserved symbols: 'LETTER. Some symbols are reserved for potential future use. Symbols with the pattern 'LETTER are reserved for possibly representing characters (this is somewhat similar to Lisp). We would expect '\n to represent newline, 'sp for space, and perhaps '\x24 for the dollar character.
  • Language and spelling. It is preferred to use American English for comments and symbols, e.g. we use "neighborhood" instead of the British English "neighbourhood". An exception is the word "analog", which can be either a noun or an adjective. Furthermore, "analog" has the confounding meaning "not digital", whereas "analogue" is often used, also in American English, in the sense of something that bears analogy to something else. Therefore, "analogue" is used for the noun and "analogous" for the adjective in set.mm.
  • Comments and layout. As for formatting of the file set.mm, and in particular formatting and layout of the comments, the foremost rule is consistency. The first sections of set.mm, in particular Part 1 "Classical first-order logic with equality", can serve as a model for contributors. Some formatting rules are enforced when using the Metamath program's "WRITE SOURCE" command with the "REWRAP" option. Here are a few other rules, which are not enforced, but that we try to follow:
    • The file set.mm should have a double blank line before each section header, and at no other places. In particular, there are no triple blank lines. If there is a "@( Begin $[ ... $] @)" comment (where "@" is actually "$") before the section header, then the double blank line should go before that comment.
    • The header comments should be spaced as those of Part 1, namely, with a blank line before and after the comment, and an indentation of two spaces.
    • Header comments are not rewrapped by the Metamath program [as of 24-Oct-2021], but similar spacing and wrapping should be used as for other comments: double spaces after a period ending a sentence, line wrapping with line width of 79, and no trailing spaces at the end of lines.


The challenge of varying mathematical conventions

We try to follow mathematical conventions, but in many cases different texts use different conventions. In those cases we pick some reasonably common convention and stick to it. We have already mentioned that the term "natural number" has varying definitions (some start from 0, others start from 1), but that is not the only such case. A useful example is the set of metavariables used to represent arbitrary well-formed formulas (wffs). We use an open phi, φ, to represent the first arbitrary wff in an assertion with one or more wffs; this is a common convention and this symbol is easily distinguished from the empty set symbol. That said, it is impossible to please everyone or simply "follow the literature" because there are many different conventions for a variable that represents any arbitrary wff. To demonstrate the point, here are some conventions for variables that represent an arbitrary wff and some texts that use each convention:
  • open phi φ (and so on): Tarski's papers, Rasiowa & Sikorski's The Mathematics of Metamathematics (1963), Monk's Introduction to Set Theory (1969), Enderton's Elements of Set Theory (1977), Bell & Machover's A Course in Mathematical Logic (1977), Jech's Set Theory (1978), Takeuti & Zaring's Introduction to Axiomatic Set Theory (1982).
  • closed phi ϕ (and so on): Levy's Basic Set Theory (1979), Kunen's Set Theory (1980), Paulson's Isabelle: A Generic Theorem Prover (1994), Huth and Ryan's Logic in Computer Science (2004/2006).
  • Greek α, β, γ: Duffy's Principles of Automated Theorem Proving (1991).
  • Roman A, B, C: Kleene's Introduction to Metamathematics (1974), Smullyan's First-Order Logic (1968/1995).
  • script A, B, C: Hamilton's Logic for Mathematicians (1988).
  • italic A, B, C: Mendelson's Introduction to Mathematical Logic (1997).
  • italic P, Q, R: Suppes's Axiomatic Set Theory (1972), Gries and Schneider's A Logical Approach to Discrete Math (1993/1994), Rosser's Logic for Mathematicians (2008).
  • italic p, q, r: Quine's Set Theory and Its Logic (1969), Kuratowski & Mostowski's Set Theory (1976).
  • italic X, Y, Z: Dijkstra and Scholten's Predicate Calculus and Program Semantics (1990).
  • Fraktur letters: Fraenkel et al.'s Foundations of Set Theory (1973).


Distinctness or freeness

Here are some conventions that address distinctness or freeness of a variable:
  • Ⅎ𝑥𝜑 is read "𝑥 is not free in (wff) 𝜑"; see df-nf 1701 (whose description has some important technical details). Similarly, Ⅎ𝑥𝐴 is read "𝑥 is not free in (class) 𝐴"; see df-nfc 2740.
  • "$d x y $." should be read "Assume x and y are distinct variables."
  • "$d x 𝜑 $." should be read "Assume x does not occur in phi $." Sometimes a theorem is proved using 𝑥𝜑 (df-nf 1701) in place of "$d 𝑥𝜑 $." when a more general result is desired; ax-5 1827 can be used to derive the $d version. For an example of how to get from the $d version back to the $e version, see the proof of euf 2466 from df-eu 2462.
  • "$d x A $." should be read "Assume x is not a variable occurring in class A."
  • "$d x A $. $d x ps $. $e |- (𝑥 = 𝐴 → (𝜑𝜓)) $." is an idiom often used instead of explicit substitution, meaning "Assume psi results from the proper substitution of A for x in phi."
  • " (¬ ∀𝑥𝑥 = 𝑦 → ..." occurs early in some cases, and should be read "If x and y are distinct variables, then..." This antecedent provides us with a technical device (called a "distinctor" in Section 7 of [Megill] p. 444) to avoid the need for the $d statement early in our development of predicate calculus, permitting unrestricted substitutions as conceptually simple as those in propositional calculus. However, the $d eventually becomes a requirement, and after that this device is rarely used.

There is a general technique to replace a $d x A or $d x ph condition in a theorem with the corresponding Ⅎ𝑥𝐴 or Ⅎ𝑥𝜑; here it is. Assume you have a theorem T[x, A] proved with the condition $d x A, and you wish to prove Ⅎ𝑥𝐴 ⇒ T[x, A]. You apply the theorem substituting 𝑦 for 𝑥 and 𝐴 for 𝐴, where 𝑦 is a new dummy variable, so that $d y A is satisfied. You obtain T[y, A], and apply chvar to obtain T[x, A] (or just use mpbir 220 if T[x, A] binds 𝑥). The side goal is (𝑥 = 𝑦 → (T[y, A] ↔ T[x, A])), where you can use equality theorems, except that when you get to a bound variable you use a non-dv bound variable renamer theorem like cbval 2259. The section mmtheorems32.html#mm3146s also describes the metatheorem that underlies this.

Standard Metamath verifiers do not distinguish between axioms and definitions (both are $a statements). In practice, we require that definitions (1) be conservative (a definition should not allow an expression that previously qualified as a wff but was not provable to become provable) and (2) be eliminable (there should exist an algorithmic method for converting any expression using the definition into a logically equivalent expression that previously qualified as a wff). To ensure this, we have additional rules on almost all definitions ($a statements with a label that does not begin with ax-). These additional rules are not applied in a few cases where they are too strict (df-bi 196, df-clab 2597, df-cleq 2603, and df-clel 2606); see those definitions for more information. These additional rules for definitions are checked by at least mmj2's definition check (see mmj2 master file mmj2jar/macros/definitionCheck.js). This definition check relies on the database being very much like set.mm, down to the names of certain constants and types, so it cannot apply to all Metamath databases... but it is useful in set.mm. In this definition check, a $a statement with a given label and typecode passes the test if and only if it respects the following rules (these rules require that we have an unambiguous tree parse, which is checked separately):

  1. The expression must be a biconditional or an equality (i.e. its root-symbol must be ↔ or =). If the proposed definition passes this first rule, we then define its definiendum as its left hand side (LHS) and its definiens as its right hand side (RHS). We define the *defined symbol* as the root-symbol of the LHS. We define a *dummy variable* as a variable occurring in the RHS but not in the LHS. Note that the "root-symbol" is the root of the considered tree; it need not correspond to a single token in the database (e.g., see w3o 1030 or wsb 1867).
  2. The defined expression must not appear in any statement between its syntax axiom (the $a statement introducing its syntax) and its definition, and the defined expression must not be used in its definiens. See df-3an 1033 for an example where the same symbol is used in different ways (this is allowed).
  3. No two variables occurring in the LHS may share a disjoint variable (DV) condition.
  4. All dummy variables are required to be disjoint from any other (dummy or not) variable occurring in this labeled expression.
  5. Either (a) there must be no non-setvar dummy variables, or (b) there must be a justification theorem. The justification theorem must be of the form ( definiens root-symbol definiens' ) where definiens' is the definiens with all dummy variables replaced with other unused dummy variables of the same type. Note that the root-symbol is ↔ or =, and that setvar variables are simply variables with the setvar typecode.
  6. One of the following must be true: (a) there must be no setvar dummy variables, (b) there must be a justification theorem as described in rule 5, or (c) if there are setvar dummy variables, every one must not be free. That is, it must be true that (𝜑 → ∀𝑥𝜑) for each setvar dummy variable 𝑥, where 𝜑 is the definiens. We use two different tests for non-freeness; one must succeed for each setvar dummy variable 𝑥. The first test requires that the setvar dummy variable 𝑥 be syntactically bound (this is sometimes called the "fast" test, and it implies that we must track binding operators). The second test requires a successful search for a directly-stated proof of (𝜑 → ∀𝑥𝜑). Part (c) of this rule is how most setvar dummy variables are handled.

Rule 3 may seem unnecessary, but it is needed. Without this rule, you could define something like
    cbar $a wff Foo x y $.
    ${ $d x y $.  df-foo $a |- ( Foo x y <-> x = y ) $. $}
and now "Foo x x" is not eliminable; there is no way to prove that it means anything in particular, because the definitional theorem that is supposed to be responsible for connecting it to the original language wants nothing to do with this expression, even though it is well formed.

A justification theorem for a definition (if used this way) must be proven before the definition that depends on it. One example of a justification theorem is vjust 3174. The definition df-v 3175, V = {𝑥 ∣ 𝑥 = 𝑥}, is justified by the justification theorem vjust 3174, {𝑥 ∣ 𝑥 = 𝑥} = {𝑦 ∣ 𝑦 = 𝑦}. Another example of a justification theorem is trujust 1477; the definition df-tru 1478, (⊤ ↔ (∀𝑥𝑥 = 𝑥 → ∀𝑥𝑥 = 𝑥)), is justified by trujust 1477, ((∀𝑥𝑥 = 𝑥 → ∀𝑥𝑥 = 𝑥) ↔ (∀𝑦𝑦 = 𝑦 → ∀𝑦𝑦 = 𝑦)).

Here is more information about our processes for checking and contributing to this work:

  • Multiple verifiers. This entire file is verified by multiple independently-implemented verifiers when it is checked in, giving us extremely high confidence that all proofs follow from the assumptions. The checkers also check for various other problems such as overly long lines.
  • Maximum text line length is 79 characters. You can fix comment line length by running the script scripts/rewrap or the commands metamath 'read set.mm' 'save proof */c/f' 'write source set.mm/rewrap' quit. As a general rule, a math string in a comment should be surrounded by backquotes on the same line, and if it is too long it should be broken into multiple adjacent math strings on multiple lines. Those commands don't modify the math content of statements. In statements themselves, we try to break lines before the outermost important connective (not including the typecode and perhaps not the antecedent). For examples, see sqrtmulii 13974 and absmax 13917.
  • Discouraged information. A separate file named "discouraged" lists all discouraged statements and uses of them, and this file is checked. If you change the use of discouraged things, you will need to change this file. This makes it obvious when there is a change to anything discouraged (triggering further review).
  • LRParser check. Metamath verifiers ensure that $p statements follow from previous $a and $p statements. However, by itself the Metamath language permits certain kinds of syntactic ambiguity that we choose to avoid in this database. Thus, we require that this database unambiguously parse using the "LRParser" check (implemented by at least mmj2). (For details, see mmj2 master file src/mmj/verify/LRParser.java). This check counters, for example, a devious ambiguous construct developed by saueran at oregonstate dot edu posted on Mon, 11 Feb 2019 17:32:32 -0800 (PST) based on creating definitions with mismatched parentheses.
  • Proposing specific changes. Please propose specific changes as pull requests (PRs) against the "develop" branch of set.mm, at: https://github.com/metamath/set.mm/tree/develop
  • Community. We encourage anyone interested in Metamath to join our mailing list: https://groups.google.com/forum/#!forum/metamath.

(Contributed by DAW, 27-Dec-2016.) (New usage is discouraged.)

Hypothesis
Ref             Expression
conventions.1   ⊢ 𝜑

Assertion
Ref             Expression
conventions     ⊢ 𝜑

Proof of Theorem conventions
Step   Hyp   Ref             Expression
1            conventions.1   ⊢ 𝜑