Skip to content

Scoping

Nathan Carter edited this page Mar 14, 2021 · 4 revisions

This is the third of the four main algorithms discussed on the Algorithms page.

Before explaining how it works, we must begin with several definitions.

Declarations

Definition (Declaration): A Declaration is a type of Output Expression that attempts to declare a set of identifiers as having their scope begin at the point in the Output Tree where the Declaration sits. Declarations can come in one of two types, constant declarations and variable declarations. These function differently in some validation and scoping algorithms, but both are intended to take previously undeclared identifiers and mark them as declared.

We do not fully specify what the interior configuration of a Declaration Structure must be at this point. We simply assume that they will contain at least enough information to specify the list of identifiers they are attempting to declare and whether they declare them as constants or variables. Other contents are also possible.

Definition (Valid/Invalid Declaration): A Declaration $D$ for an identifier $x$ is valid if $x$ was not declared by any Structure accessible to $D$. It is invalid otherwise.

Example 1

Let us consider an example Structure hierarchy containing several declarations.

  • The document root
    • An Output Structure $S_1$
      • An Output Structure $S_2$
      • A Declaration $S_3$ of identifiers $x,y,z$ as variables
      • An Output Structure $S_4$
      • A Declaration $S_5$ of the identifier $c$ as a constant
      • An Output Structure $S_6$
        • A Declaration $S_7$ of the identifiers $w,x$ as variables
    • A Declaration $S_8$ of the identifiers $w,x$ as constants

The Declaration of $x$ in $S_3$ is valid, because there are no declarations preceding it. Thus the scope of the variable $x$ begins at $S_3$ and continues to the end of $S_1$, including covering $S_3$, $S_4$, $S_5$, $S_6$, and $S_7$, but not $S_8$. The same holds for the declarations of $y$ and $z$ in $S_2$.

The Declaration of $c$ in $S_5$ is valid, because the declaration preceding it does not declare $c$, thus $c$ is undeclared immediately before $S_5$. The scope of the declaration of $c$ lasts from there to the end of $S_1$, thus including $S_5$, $S_6$, and $S_7$, but not $S_8$.

The Declaration of $w$ in $S_7$ is valid for the same reason as the previous. Its scope is only $S_7$ itself, since it has no later siblings.

The Declaration of $x$ in $S_7$ is invalid, because $S_3$ already declared $x$ and is accessible to $S_7$, written $S_3\prec S_7$.

Both declarations in $S_8$ are valid because, although the identifiers in question were declared in some earlier $S_i\ll S_8$, they were not declared in any $S_i\prec S_8$. The scope of each declaration is just $S_8$.

Explicit and Implicit

Definition (Explicit Declaration): A variable or constant declared by a Declaration Structure, as defined above, is said to be explicitly declared, in contrast to the new method of variable declaration we introduce below.

In particular, all declarations in Example 1 above are explicit.

The following definition assumes that Output Structures may bind variables, and that the usual definitions of free and bound variables apply in such Structure trees.

Definition (Implicit Declaration): A variable (not a constant) $x$ is implicitly declared in a Structure $S$ if there is an immediate child structure $C$ of $S$ that is an Output Expression, that contains $x$ free, and that is not in the scope of any Declaration Structure of $x$.

It may help to view implicit declarations as if they were invisible explicit declarations inserted as the first child of the parent structure $S$, that is, before the first sibling of the child $C$ that contains $x$ free.

Example 2

Consider the same Structure hierarchy in Example 1, but with three modifications, marked with arrows below.

  • The document root
    • An Output Structure $S_1$
      • An Output Expression $S_2$, the statement $\forall x,z>x$ ⬅️
      • A Declaration $S_3$ of identifiers $x,y,z$ as variables
      • An Output Expression $S_4$, the equation $2+k=15$ ⬅️
      • A Declaration $S_5$ of the identifier $c$ as a constant
      • An Output Structure $S_6$
        • A Declaration $S_7$ of the identifiers $w,x$ as variables
    • A Declaration $S_8$ of the identifiers $w,x$ as constants
    • An Output Expression $S_9$, $\forall x, x+1>x$ ⬅️

All Declarations would be marked valid or invalid in exactly the same way as they were in Example 1, with one exception: The declaration of $z$ in $S_3$ in Example 2 is invalid, because the appearance of $z$ free in $S_2$ implicitly declares $z$ as a variable throughout the entire interior of $S_1$, even before $S_2$ begins. Although $x$ also appears in $S_2$, it does not appear free. If $S_2$ had, instead, appeared after $S_3$, it would not have been an implicit declaration of $z$ because $z$ would already have been declared.

There is no change to how $S_8$ is validated because the only changes have been to scopes sitting inside $S_1$, and $S_8$ is outside of $S_1$. The expression $S_9$ does not contain any declarations or free variables.

Note that an implicit variable declaration can never be invalid. Furthermore, if a structure $S$ has a child $C$ containing $x$ free and a grandchild $G$ containing $x$ free, with $G$ not inside of $C$, then the scope of $x$ is the entire interior of $S$, regardless of whether $C\ll G$ or $G\ll C$. That is, implicit declarations never conflict with one another.

Definition (scope of an identifier): Given any identifier $x$ appearing in some structure $S$ in the OT, we can ask its scope. The result will be one of three things:

  1. If the identifier is bound, then there are two possibilities:
    • If the identifier is validly declared to be a constant by some Declaration $D\prec S$, then the scope of $x$ is that given by $D$ and the attempt in $S$ to bind $x$ is invalid.
    • Otherwise, $x$ is a variable and its scope is the scope of the quantifier that binds it.
  2. If the identifier is free, then there are two possibilities:
    • If it is validly declared by some Declaration $D\prec S$, then $x$ is either a variable or constant, as given by $D$, and its scope is also that given by $D$.
    • If it is not validly declared by some Declaration $D\prec S$, then by the definition of Implicit Declaration, it is implicitly declared and its scope is as given in that definition.

Marking Scopes

The Scoping phase takes as input a newly created Output Tree and marks each Structure within it in the following ways. (The example that follows may help illuminate these principles.)

  • If the Structure is not an Output Expression (and thus probably functions as some piece of document or proof structure), then it will be given an attribute called its "implicit declarations" list. This will contain the list of identifiers implicitly declared within that Structure. This list will often be empty, for Structures that contain no children that implicitly declare variables by using them free.
  • If the Structure is a Declaration, then it will be marked with a list of "declaration failures," that is, the subset of the identifiers that it invalidly declares.
  • If the Structure is an Expression, then it will be marked with a list of "binding failures," that is, the subset of the identifiers that it attempts to bind with a quantifier (or other binding expression) and yet that are explicitly declared as constants by some Declaration accessible to the Expression.

If we marked all scopes according to these rules in the Structure hierarchy shown in Example 2, we would get the following results. Here we use JSON-like notation for lists of identifiers.

Structure Implicit Declarations Declaration Failures Binding Failures
$S_1$ [ "z" ] [ ] [ ]
$S_2$ [ ] [ ] [ ]
$S_3$ [ ] [ "z" ] [ ]
$S_4$ [ ] [ ] [ ]
$S_5$ [ ] [ ] [ ]
$S_6$ [ ] [ ] [ ]
$S_7$ [ ] [ "x" ] [ ]
$S_8$ [ ] [ ] [ ]
$S_9$ [ ] [ ] [ "x" ]

We say that a structure "successfully declares" or "successfully" binds an identifier if it attempts to declare it and that attempt is not marked as a failure by the marking procedure given above.

We do not describe the marking algorithm in detail here, but there is a rather straightforward implementation of it. While that implementation is slightly fiddly, it does not require great invention to create.

Clone this wiki locally