Representing Concrete Syntax

We need a generic protocol for communicating and elaborating concrete syntax. This creates a problem. How specific should we be wrt the syntactic categories of Epigram? My inclination is not at all. We either need a generic technology which will adapt to arbitrary syntactic categories, or we need to flatten the syntactic categories. At the moment, I don’t feel like figuring out how to do the former. So I suggest we adopt some very general setup like this

type Concrete shed = (Mark, CTree shed)
data CTree shed
  = CShed shed
  | CNode Node [Concrete shed]

Remarks

  • It’s polymorphic in the contents of sheds, but the elaborator is only ever sent Concrete ()
  • The type Mark is some equality type which belongs to the UI: it is used to mark information coming back from the elaborator as concrete syntax is processed.
  • The type Node represents both the terminal and nonterminal symbols, whatever they may be. Blah code will need to interrogate this type in arbitrary ways, so things might get a touch existential. As in
    data Blah
      = ...
      | ∀α. Case (Node → Maybe α, [String]) (α → Blah) Blah

    The function tests if the node suits the success continuation. If so, the strings name the subnodes and the success continuation executes; if not, the failure case runs. Of course, the whole thing suspends if we have a shed rather than a node.

Parsing to this representation requires that we can generate suitable elements of Mark (I suspect this may involve refs.) The real exciting stuff in the parser is more likely to involve figuring names out—more on that anon.

Printing. Hmm.

3 Responses to “Representing Concrete Syntax”

  1. Cyril Konopko Says:

    Can you tell me, please, when that protocol will be created and published, so I can use it to develop my editor further?

  2. Conor Says:

    Working on it at the moment, so hopefully soon. But if you’re looking for a specific date, I just don’t know. There are several other things I’m also supposed to be working on right now. Hopefully we’ll have the Node type nailed down some more today. It’s mostly a question of coding up the grammar, with a couple of slightly sticky points—I’m demanding that the elaborator retains complete ignorance of what identifiers are used to name variables, so scope must be resolved by a parsing phase. The elaborator will be prepared to divulge which variables are in scope at any shed, but it won’t help with their names, because it won’t know them!

  3. Conor Says:

    Further to this…maybe I’ll make a new entry.

Leave a Reply