#
TaPL Notes -- Chapter 3 Untyped Arithmetic Expressions

原文链接 https://hsfzxjy.github.io/2019-01-21-tapl-notes-untyped-arithmetic-expressions/

注：以下为加速网络访问所做的原文缓存，经过重新格式化，可能存在格式方面的问题，或偶有遗漏信息，请以原文为准。

This chapter develop a small language of numbers and booleans, serving as a straightforward vehicle for the introduction of serveral fundamental concepts like, abstract syntax tree, evaluation and runtime errors.

## Syntax

### BNF-like Definition

Terms of this language are defined as below:

```
t ::=
true
false
if t then t else t
0
succ t
pred t
iszero t
```

where `t`

is called **meta-variable**. At every point where the symbol `t`

appears, we may substitute any terms.

### Inductive Definition

The set of terms is the smallest set $T$ s.t.

- $\{0, true, false\} \subseteq T$
- if $t_1 \in T$, then ${succ\ t_1, pred\ t_1, iszero\ t_1} \subseteq T$
- if $t_1, t_2, t_3 \in T$, then $if\ t_1\ then\ t_2\ else\ t_3 \in T$

### Concrete Definition

For each natural number $i$, define a set $S_i$ as follows:

- $S_0 = \phi$
- $S_{i+1} = \{0, true, false\} \cup \{succ\ t_1, pred\ t_1, iszero\ t_1 | t_1 \in S_i\} \cup \{if\ t_1\ then\ t_2\ else\ t_3 | t_j \in S_i\}$
- $S = \cup_i S_i$

**Three definition above are equivalent.**

## Semantics

The semantics of an language is how its terms are evaluated. Three approaches had been proposed to formalize semantics:

**Operational Semantics (used here)**specifies the behavior of a programming language by defining a simple abstract machine for it. This machine is "abstract" in the sense that it uses the terms of the language as its machine code, rather than some low-level microprocessor instruction set. For simple languages, a*state*of the machine is just a term, and the machine's behavior is defined by a straition function that for each state, either gives the next state, or declares that the machine has halted. (字节码语义，Compare with languages like Python or Java)**Denotational Semantics**. The meaning of a term is taken to be some mathematical object, instead of a sequence of machine states. Giving denotational semantics for a language consists of finding a collection of semantic domains and then defining an interpretation function mapping terms into elements of these domains.**Axiomatic semantics**(omitted)

### Semantics with only Booleans

Firstly we consider a simpler situation where only booleans get involved.

The evaluation process can be summarized as three rules:

```
if true then t2 else t3 -> t2 [E-IFTRUE]
if false then t2 else t3 -> t3 [E-IFFALSE]
t1 -> t1' || if t1 then t2 else t3 -> if t1' else t2 then t3 [E-IF]
```

A rule consists of one *conclusion* and zero or more *premises*. For example, in rule `E-IF`

, `t1 -> t1'`

is the premise and `if t1 then t2 else t3 -> if t1' else t2 then t3`

the conclusion.

Note that in textbook premises are written above conclusion with a horizontal line in the middle, but here I use a notation that is more convenient to type in.

A subset of terms should be defined as *possible final results of evaluation*, which are called **values**. Here they are just the constants `true, false, 0`

.

Note that $\rightarrow$ can be viewed as a binary relation over $T$, i.e., a subset of $T \times T$.

The third rule

`E-IF`

specifies the evaluation order of an expression, i.e., clauses are always evaluated after their guards.

**DEFINITION instance of an inference rule** An instance of an inference rule is obtained by consistently replacing each metavariable by the same term in the rule's conclusion and all its premises (if any). e.g., `if true then true else (if false then false else false) -> true`

is an instance if `E-IFTRUE`

.

**DEFINITION satisfy** A rule is satisfied by a relation if, for each instance of the rule, either the conclusion is in the relation or **one of the premises is not** (to ensure evaluation can proceed).

**DEFINITION one-step evaluation relation** denoted as $\rightarrow$, is the smallest binary relation on terms satisfying the three rules. When the pair `(t, t')`

is in the relation, we say that `t -> t'`

is *derivable*. ("smallest" implies that `t -> t'`

is derivable iff it is justified by the rules).

**DEFINITION normal form** A term $t$ is in normal form if no evaluation rule applies to it.

Facts:

- Every value is in normal form.
- When only booleans involved, every term in normal form is a value.
- Every term can be evaluated in values.

### Semantics with Booleans and Numbers

Now we let numbers in. Several new rules should be added:

```
t1 -> t1' || succ t1 -> succ t1' [E-SUCC]
pred 0 -> 0 [E-PREDZERO]
pred (succ nv1) -> nv1 [E_PREDSUCC]
t1 -> t1' || pred t1 -> pred t1' [E-PRED]
iszero 0 -> true [E-ISZERO]
iszero (succ nv1) -> false [E-ISZEROSUCC]
t1 -> t1' || iszero t1 -> iszero t1'[E-ISZERO]
```

and new values:

```
v ::=
...
nv
nv ::=
0
succ nv # positive integer
```

**DEFINITION stuck term** A closed term is stuck if it is in normal form but not a value. e.g. `if 0 then true else true`

or `iszero false`

.

Stuckness gives us a simple notion of run-time error for this simple machine. Intuitively, it characterizes the situation where the operational semantics does not know what to do because the program has reached a "meaningless state".

Stuckness can be prevented by introducing a new term called $wrong$ and augment the opereational semantics with rules that explicitly generate $wrong$ ain all the situations where the present semantics gets stuck.

### Evaluation Theorems

If $t \rightarrow t1$ and $t \rightarrow t2$, then $t1 = t2$. (Can be proved by performing induction on the structure of

`t`

)

### Other Forms of Evaluation Relation

**DEFINITION mutli-step evaluation** $\rightarrow^*$ is the reflexive, transitive closure of one-step evaluation.

If $t \rightarrow^\star t1$ and $t \rightarrow^\star t2$, where $t1$ and $t2$ are in normal form, then $t1 = t2$.

For every term $t$ there is some normal form $t'$ s.t. $t \rightarrow^\star t'$.

**DEFINITION big-step evaluation** (omitted) formulates the notion of "this term evaluates to that final value".