Bindings and References

Scope resolution can be a bit tricky. It creates issues which we need to resolve as straightforwardly as possible. The basic thing we need to determine is when an identifier is a binding occurrence for a local variable, and when it’s a reference to a global variable. We’re used to the situation where this is obvious from the syntax, because patterns are distinct from terms, or whatever. That’s no longer the case.

Where might variables be bound?

  • Left of a colon in a binder $\lambda\vx\hb\vS\TO\vT\quad$. The plan is to extend this from single variable bindings to patterns which are η-longer. eg. $\lambda (\lambda\vx\TO(\vf\: \vx;\vg\:\vx))\TO\cdots$. No other form of pattern can stand here, and it is always the case that the variables being bound are those which occur free in the pattern—what else? Incidentally, what we’ve got here are Miller patterns. The λ-bound variables in a pattern must occur exactly once in the arguments to the pattern variables, inside some canonical stuff. Proper definition in a bit, but it’s all checkable by the scoper, without recourse to type information. By the way, you can always infer the type of the free variables in a Miller pattern, given the type of the pattern and the bound variables, because they only abstract what has already been abstracted. More on this later.
  • In the type of a premise or a conclusion in a natural deduction rule. The idea is that if you haven’t heard of it, you must be quantifying over it implicitly, as in $\Rule{\vx\hab\vX\quad\vxs\hab\TC{List}\:\vX}{\vx\:\DC{::}\:\vxs\hab\TC{List}\:\vX}$. Two questions present themselves
    • Where do these quantifications occur? Previous practice was Randyist, putting the implict things as far right as we could get away with. A bug in Epigram 1 is that these variables don’t get shifted left as necessary by unification (James tripped over it a while ago, must insert link). Given the sorts of things which have shown up in the examples, I propose that we adopt the Jamesist alternative, which is to quantify things as far left as possible. I further propose that by way of sanity, we require the implicit arguments to be quantified at the far left—ie that one may only omit an initial segment of a signature, so no implicit argument may have a type depending on a previous explicit argument in that signature. Tough call: it’s one of those situations where there’s danger that the machine will be more subtle than its operators, so I think we should force weird things to be explicit. (Can of worms: should we require implicitly quantified variables to be free in Miller patterns guarded by constructors, so we have some guarantee that Miller unification will solve them? It might save a certain amount of ‘No I can’t guess $\vx$ and $\vy$ from $\vx\FN{+}\vy$’.)
    • What happens if the previously unheard of thing acquires a spatially prior binding at a later time? Here, current practice is to get out of the capture attempt by making the implicit quantifier explicit. Someone goes and defines X, so we get $\Rule{\vX\quad\vx\hab\vX\quad\vxs\hab\TC{List}\:\vX}{\vx\:\DC{::}\:\vxs\hab\TC{List}\:\vX}$. In order to do this, we have to allow ordinary bindings to shadow. The alternative is to rename.
  • Left-hand sides. Here, variables in patterns might very well be intended to refer to global things, nonlinearity is inevitable, and all sorts of stuff like that. I’m not sure I see much alternative to ‘if you mean a new thing, use a new name’. And once again we have to worry about capture. Situation: we wrote $\DC{0}\:\FN{+}\:\vy \cq \vy$, some fool defines $\vy$. Two ways out of it:
    • Rename, eg by adding enough primes to create a distinct thing.
    • Introduce some piece of punctuation which indicates newness. eg $\DC{0}\:\FN{+}\:\hat{}\vy \cq \vy$. This mark isn’t part of the name, it just overrides the default behaviour of referencing with shadowing, preserving the name. Somehow, I prefer this, as a kind of more deliberate version of adding punctuation until the identifier becomes distinct. I also feel that we have no business changing the punter’s name choices.

After all that rambling, I propose that we

  • Distinguish explicit bindings (Miller patterns left of : in sigs) from implicit bindings (from usage in types within rules and on the lhs for functions).
  • Adopt the convention that explicit bindings always shadow. It would be perverse if they did not.
  • Adopt the convention that implicit bindings require names which are fresh, either by being different, or by being marked explicitly as fresh (hence shadowing) with the sombrero.
  • Protect captured names with sombreros (either in a pattern or a rule), rather than explicit bindings (which can only happen in rules). So $\Rule{\vx\hab\hat{}\vX\quad\vxs\hab\TC{List}\:\vX}{\vx\:\DC{::}\:\vxs\hab\TC{List}\:\vX}$.

Does this seem reasonable?

2 Responses to “Bindings and References”

  1. Carl Witty Says:

    Introduce some piece of punctuation which indicates newness.

    Maybe the elaborator should always add this mark on binding occurrences? Otherwise, code saved on disk will be fragile against library additions.

  2. Conor Says:

    It may well be adding it to files saved on disk precisely to keep things stable. In the editor, I think that deliberately picking a new name should be enough. But it’s easy for the editor to recalculate ‘live’ whether or not the mark is needed, in response to edits which either introduce new things or α-convert old ones. A design objective here is that α-conversion is an edit which never even gets communicated to the elaborator.

Leave a Reply