Notes inspired from PLAI by Shriram Krishnamurthi. These are a bit all over the place, there is no real structure to them.
Credits to Benjamin C. Pierce for the banner
Introduction
The study of types and type systems serves to identify abuse of types before executing a program.
Currently, illegal expressions such as (+ 1337 (lambda (x) x))
get caught in the interpreter, and to get them caught in the parser wouldn’t take much.
But what about:
|
|
We cannot know what the user will input, thus we can’t know if it will halt with an error before running the program. This problem is related to the halting problem.
What is a type?
- A type is a property of a program that we can establish before running the program.
- Types capture the intuition that we would like to predict a program’s behavior without running it.
- Overall, types are labels for expressions in a language, and they record what kind of value that expression will return.
With just an interpreter, to reject (+ 3 (lambda (x) x))
we didn’t need to know what (lambda (x) x)
would return, we just needed to know that +
is a function that accepts two values of type number
and that (lambda (x) x)
is of type function
, thus +
shouldn’t accept the later.
Two competing forces of designing type systems
- Having more information makes it possible to draw richer conclusions about a program’s behavior, which would allow us to reject fewer valid programs and rejecting more invalid ones.
- Acquiring more information is difficult.
- It may place restrictions on the programming language (like Rust for example, it is hard to create self-referential data like graphs or trees).
- It may make the program more expensive to compute.
- It may force the user to annotate more parts of the program.
- It may hit the limits of computability (Halting problem, etc…).
Type Judgments
Type judgments are a collection of rules for type systems and describe how to determine the type of an expression.
Examples:
$n:\ number$, reading “any numeral $n$ is $number$.
$(lambda\ (i)\ b):\ function$, “any function (lambda (i) b) is $function$”
But we need a type for the identifier $i$, for this we use a type environment, it will capture all of our assumptions about types.
For the environment, we use the Greek letter Gamma ($\Gamma$), and a turnstile ($\vdash$). In general the form for such judgments is:
$$\Gamma \vdash e : t$$
Which reads: “$\Gamma$ proves that $e$ has type $t$”
Example:
$$\Gamma \vdash n:\ number$$
$$(lambda\ (i)\ b):\ function$$
$$\Gamma \vdash i: \ \Gamma (i)$$
(the last rule says that the type of identifier $i$ is whatever type it is bound to in the environment)
Rule form:
$$ \frac{\text{input judgments}} {\text{resulting judgment}} $$
Example: creating a lambda term
$$ \frac{\Gamma,a: \ \tau_1 \vdash b:\ \tau_2} {\Gamma\vdash(\lambda a : \tau_2.b): (\tau_1 \rightarrow \tau_2)} $$
Our language:
Addition (also other binary operations…):
$$ \frac{\Gamma\vdash l :\ number \qquad \Gamma \vdash r: \ number} {\Gamma\vdash (+ \ l \ r): \ number} $$
Function Application
$$ \frac{\Gamma\vdash f: \ function \qquad \Gamma\vdash a:\ \tau_a \quad \cdots} {\Gamma\vdash(f \ a): \ ???} $$
$\tau_a$ is the type of expression $a$, and we often use $\tau$ to name an unknown type.
Given these constraints, we must now define our new language:
|
|
An example expression for reference:
|
|
Which has an outer judgment judgment:
$$ \frac{\Gamma\vdash f:\ (\tau_1 \rightarrow \tau_2) \qquad \Gamma\vdash a:\ \tau_1} {\Gamma\vdash (f \ a):\ \tau_2} $$
Guarding against mistakes and cheating
How can we be sure that the programmer will not lie and annotate a function with a type that is wrong? We can implement type checking rules. Such as:
$$ \frac{\Gamma[i\leftarrow\tau_1]\vdash b:\ \tau_2} {\Gamma\vdash (lambda\ (i:\ \tau_1):\ \tau_2 \ b):\ (\tau_1\rightarrow\tau_2)} $$
This rule means that we will consume the programmer’s expression only if the body has type $\tau_2$ when we extend the environment ($\Gamma$ is the env) with $i$ bound to $\tau_1$
An important relationship between type judgments for function declaration and application:
- When typing the declaration, we assume the argument will have the right type and guarantee that the body or the result will have the promised type.
- When typing a function application we guarantee the argument has the type the function demands, and assume the result will have the type the function promises.
Stackable type judgments
Type judgments are indeed stackable, take this expression’s judgment as an example:
|
|
$$ \frac {0\vdash 2:\ number \qquad \cfrac{0\vdash 5:\ number \qquad 0\vdash 7:\ number} {0\vdash(+\ 5\ 7):\ number}} {0\vdash(+ \ 2 \ (+\ 5 \ 7 )):\ number} $$
This is a type judgment tree, where the leaves at the top are the axioms. We use $0$ as we have an empty environment, and neither we did extend it or other judgments used the extended environment.
Another judgment, this time with a function:
|
|
$$ \frac{\cfrac{\cfrac{[x\leftarrow number]\vdash x:\ number \qquad [x\leftarrow number]\vdash 3:\ number} {[x\leftarrow number]\vdash(+\ x\ 3):\ number}} {0\vdash(lambda\ (x:\ number):\ number\ (+\ x \ 3)):\ (number\rightarrow number)} \ 0\vdash 5:\ number} {0\vdash((lambda\ (x:\ number ):\ number\ (+\ x\ 3))\ 5):\ number} $$
In this expression, we extend the environment, but we do not evaluate it! x stays x, but we do know now that x is a number. We also go into the body of the $lambda$ even if the function is never applied.
Typing Control
Conditionals
We can expand the language with a conditional construct, but before doing that we would need to introduce a new primitive, the boolean:
|
|
Then, a type judgment for our if conditional will look like this:
$$ \frac{\Gamma\vdash c:\ boolean \qquad \Gamma\vdash t:\ \tau \qquad \Gamma\vdash e:\ \tau} {\Gamma\vdash(\text{if}\ \ c\ \ t\ \ e):\ \tau} $$
Recursion
Having conditionals will allow us to reach Turing-completeness by implementing recursion. But there is a catch, the language we are currently operating with is a subset of the simply-typed lambda calculus, which is said to be strongly normalizing (a program will always terminate). For this reason, we cannot implement recursion, as the most basic form of recursion is an infinite loop, the Omega Combinator: $((\lambda x.x\ x)\quad (\lambda x.x\ x))$. This is because the program will never even start evaluating, we would be infinitely annotating the program before it even runs.
Due to strong normalization, we must provide an explicit recursion construct:
|
|
$$ \frac{\Gamma[i\leftarrow\tau_i]\vdash b:\ \tau \qquad \Gamma[i\leftarrow\tau_i]\vdash v:\ \tau_i} {\Gamma\vdash(rec\ (i:\ \tau_i\ v)\ b):\ \tau} $$
Here we extend the environment twice, the extension to $b$ is the one that initiates the recursion, the extension to $v$ is the one that sustains it.
An implementation of our AST in OCaml
In addition to the AST, we also need book-keeping data about our types.
|
|
A Type-checker implementation in OCaml
Finally, we can implement a type-checker based on our judgments and AST.
|
|
Typing Data
Recursive types
To add recursive types to our language, we would need to let the user define their own types. Which would permit them to:
- Give names to new types.
- Introduce conditionally-defined types (variants).
- Permit recursive definitions
Lets consider a tree:
|
|
With both judgments for $node$ and $leaf$:
$$leaf:\ \rightarrow Tree$$ $$node:\ string \times Tree \times Tree \rightarrow Tree$$
To distinguish between the two variants, we would need a sort of type-case function, so lets define a template for that:
|
|
With judgment ($e_n$ represent the two resulting expressions):
$$ \frac{\Gamma\vdash t:\ Tree \qquad \Gamma\vdash e_1 :\ \tau\qquad \Gamma[n\leftarrow string,l\leftarrow Tree,r\leftarrow Tree]\vdash e_2:\ \tau} {\Gamma\vdash(\text{Tree-case } t\ ([leaf]\ e_1)\ ([node\ n\ l\ r]\ e_2)):\ \tau} $$
Type Soundness
We would like to know that the type system correctly predicted what the program would do. We call this property type soundness. This notion of type soundness is the bridge that connect types with execution.
- For all programs $p$, if the type checker assigns $p$ the type $\tau$, and if the semantics says that $p$ evaluates to a value $v$, then the type checker will also assign $v$ to the type $\tau$.
But this is only true when the program executes legal expressions. In the case of illegal ones we have a few choices:
- Return a value like
-1
- Diverge (aka. go into an infinite loop), which theoreticians like to do.
- Raise an exception, which is preferable.
But raising an exception will mean that the program will not terminate with a value, nor it will terminate. So we must redefine our statement:
- For all programs $p$, if the type of $p$ is $\tau$, $p$ will, if it terminates, either valuate to a value $v$ such that $v : \tau$, or raise one of a well-defined set of exceptions.
Type safety
- Type safety is the property that no primitive operation ever applies to values of the wrong type.
Type Soundness of popular languages
Many languages, like C and C++ do not have a sound type system (In C++ it does for object types, but not the types inherited from C). In fact, C has two type systems, one for primitives and the other defined by lengths bitstrings at level of execution.
Statically checked | Not statically checked | |
---|---|---|
Type safe | ML, Java | Scheme |
Type unsafe | C, C++ | assembly |
Strong typing
A phrase with a million interpretations, to many it just means that “the language has a type-checker”, to others “the language is sound”.
Avoid using this phrase
Explicit Polymorphism
Some functions, like length (which returns the length of a list), need to accept any type they are given. But our type constructs prohibit us to do so. In the case of length, we would need to write a length function for list of numbers, list of symbols…
We could add constructs to our semantics that would allow to create such polymorphisms, we will use the letter $\Lambda$ to denote such construct.
Example length function definition:
|
|
The Type Language
Our signature (type) language has changed, we now permit type variables. These variables are introduced by type procedures (Λ), and discharged by type applications. Now we can use this following notation:
$$ length : \forall\alpha.\ \ list(\alpha)\rightarrow\ number $$
Where every parameter $\forall$ is introduced by a type procedure $\Lambda$. A few other types for polymorphic functions:
$$ filter : \forall\alpha.\ \ list(\alpha)\ \times\ (\alpha \rightarrow boolean) \rightarrow list(\alpha) $$
$$ map : \forall\alpha,\beta.\ \ list(\alpha)\ \times\ (\alpha\rightarrow\beta)\rightarrow list(\beta). $$
Evaluation Semantics and Efficiency
This new semantic constructs need to run during program execution, which will mean that we need to keep types as ordinary values. This is inherently a bad idea, as a type is the abstraction of a value, and conceptually values and their abstractions shouldn’t live in the same universe. Additionally, it will slow down the program significantly.
However, we could build a type elaborator which will convert our type applications into complete expressions, which after being computed will result in an expression without any type variables. These expressions will be able to be type checked by our ordinary type checker. We can formalize this into the judgment:
$$ \frac{\Gamma\vdash e : \forall\alpha.\tau} {\Gamma\vdash e < \tau’ >: \tau[\alpha\leftarrow\tau’]} $$
In informal words: On encountering a type application, we substitute the quantified type with the type argument replacing the type variable.
The corresponding judgment for a type abstraction:
$$ \frac{\Gamma[\alpha]\vdash e : \tau} {\Gamma\vdash <\Lambda\ (\alpha)\ e>: \forall\alpha.\tau} $$
In informal words: We extend $\Gamma$ with a binding for the type variable $\alpha$, but leave the associated type unspecified so it is chosen nondeterministically. If the choice of type actually matters, then the program must not type-check.
Type Inference
In order to infer types we need to assign constraints to expressions in our programs.
Constrains relate different portions of the program by determining how they should be compatible for the program to execute without error. For each expression $n$ in the program’s syntax tree, we introduce a variable of the form $[[n]]$. So if the program has form (foo 1 2)
, we want to introduce variables for 1
, 2
and (foo 1 2)
. We use $[[\cdot]]$ as to represent the type of a node. (eg. for the previous expressions: $[[1]]$, $[[2]]$, $[[(foo\ 1\ 2)]]$)
A table of constraints:
expression at node | generated constraints |
---|---|
n | [[n]] = number |
true | [[true]] = boolean |
false | [[false]] = boolean |
(add1 e) | [[(add1 e)]] = number [[e]] = number |
(+ e1 e2) | [[(+ e1 e2)]] = number [[e1]] = number [[e2]] = number |
(zero? e) | [[(zero? e)]] = boolean [[e]] = number |
(ncons e1 e2) | [[(ncons e1 e2)]] = list(number) [[e1]] = number [[e2]] = list(number) |
(nfirst e) | [[(nfirst e)]] = number [[e]] = list(number) |
(nrest e) | [[(nfirst e)]] = list(number) [[e]] = list(number) |
(nempty? e) | [[(nempty? e)]] = boolean [[e]] = list(number) |
nempty | [[nempty]] = list(number) |
(if c t e) | [[(if c t e)]] = [[t]] [[(if c t e)]] = [[e]] [[c]] = boolean |
(lambda (x) b) | [[(lambda (x) b)]] = [[x]] -> [[b]] |
(f a) | [[(f a)]] = [[a]] -> [[(f a)]] |
Solving Type Constraints
Unification can help solve type constraints. Unification consumes a set of constrains and either
- Identifies inconsistencies amongst the constraints
- Generates a substitution that represents the solution of the constraints
The unification algorithm begins with an empty substitution, then pushes all the constraints onto a stack. When the stack is empty return the substitution, otherwise, pop the constrains X = Y off the stack:
- If X and Y are identical identifiers, do nothing.
- If X is an identifier, replace all occurances of X by Y both on the stack and in the substitution, and add X -> Y to the substitution.
- If Y is an identifier, replace all occurances of Y by X both on the stack and in the substitution, and add Y -> X to the substitution.
- If X is of the form C(X1, …, Xn) for some constructor C, and Y is of the form C(Y1, …, Yn), then push Xi = Xi for all
1 <= i <= n
onto the stack. - Otherwise, X and Y don’t unify. Report an error.