MIFFS Is Fun For Sums
by rich
An alphanumeric identifier is any sequence of letters, digits, primes (') and underscores (_) starting with a letter or prime.
A symbolic identifier is any non-empty sequence of the following characters: ! % & $ # + / : < = > ? @ \ ~ ` ^ | *
The following characters are always single-character tokens: - _ . , ( ) [ ] ;
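The three rules above can be sketched as a small tokenizer. This is only an illustration in Python, not the actual MIFFS lexer: the names `tokenize`, `TOKEN_RE` and the group labels are made up for the example, "letters" is assumed to mean ASCII letters, and numbers and strings are left out because they are not defined in this section.

```python
import re

# Character classes copied from the three rules above.
SYMBOL_CHARS = "!%&$#+/:<=>?@\\~`^|*"
SINGLE_CHARS = "-_.,()[];"

# Assumption: "letters" means ASCII letters only.
TOKEN_RE = re.compile(
    r"(?P<alnum>[A-Za-z'][A-Za-z0-9'_]*)"                  # letter or prime first
    r"|(?P<symbolic>[" + re.escape(SYMBOL_CHARS) + r"]+)"  # longest run of symbol characters
    r"|(?P<single>[" + re.escape(SINGLE_CHARS) + r"])"     # exactly one character
    r"|(?P<space>\s+)"
)

def tokenize(text):
    """Split text into (kind, lexeme) pairs, skipping whitespace."""
    tokens = []
    pos = 0
    while pos < len(text):
        m = TOKEN_RE.match(text, pos)
        if m is None:
            # Digits, quotes and so on land here: numbers and strings
            # are deliberately not handled in this sketch.
            raise ValueError(f"unexpected character {text[pos]!r} at offset {pos}")
        if m.lastgroup != "space":
            tokens.append((m.lastgroup, m.group()))
        pos = m.end()
    return tokens
```

With these rules, `a_b` comes out as one alphanumeric identifier, while a lone `_` is a single-character token because an identifier cannot start with an underscore. Likewise `a-b` is three tokens, since `-` is never part of an identifier.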
Thanks to the flexibility of a hand-coded arbitrary-lookahead recursive-descent parser (hey, if you didn't want technical details, why are you reading this page?), most language constructs can be re-used as value labels when this would not be ambiguous. MIFFS therefore has two classes of reserved words: always reserved and semi-reserved. The latter can be used as labels, but whenever they can be interpreted as keywords, they will be. Careful use of brackets can resolve any problem cases.
The following words always have special meanings (list separated by spaces):
_ ( ) [ ] , . ; if let fn true false
The following words have special meanings in some circumstances, but can also be used for identifiers where it would not cause ambiguity. (Words in brackets will always be treated as identifiers):
val = fun infix infixr nonfix baseunit :: andalso orelse if then else
fn => | in end
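The difference between the two classes can be sketched in a few lines of Python. This is an illustration only, not how the MIFFS parser is actually written: `token_role` and the `keyword_fits_here` flag are hypothetical names, the flag standing in for whatever the real parser knows about its current position. The word sets are copied from the two lists above; where a word appears in both lists, the sketch assumes the always-reserved reading wins.

```python
# Word lists copied from the two lists above.
ALWAYS_RESERVED = {"_", "(", ")", "[", "]", ",", ".", ";",
                   "if", "let", "fn", "true", "false"}
SEMI_RESERVED = {"val", "=", "fun", "infix", "infixr", "nonfix", "baseunit",
                 "::", "andalso", "orelse", "if", "then", "else",
                 "fn", "=>", "|", "in", "end"}

def token_role(word, keyword_fits_here):
    """Decide whether a word acts as a keyword or an identifier.

    keyword_fits_here is a hypothetical flag supplied by the parser:
    True when a keyword reading is grammatically possible at this point.
    """
    if word in ALWAYS_RESERVED:
        return "keyword"  # assumption: always-reserved takes priority
    if word in SEMI_RESERVED:
        # Semi-reserved: keyword whenever that reading is possible.
        return "keyword" if keyword_fits_here else "identifier"
    return "identifier"
```

For example, `token_role("then", True)` gives `"keyword"`, while `token_role("then", False)` gives `"identifier"`, which is the case where the word is free to be used as a label.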
I originally intended MIFFS to be strongly typed, but with automatic promotion. Along the way, I left out type checking, as it seems very hard to implement and of little benefit. I have found that the lack of type checking has made things easier in many cases. For the foreseeable future, MIFFS will remain untyped.
Having said that, every value must be represented internally, and must therefore have a type associated with it. For interest, and to more closely emulate ML, MIFFS reports the type of every value returned. Here are the types so far:
See built-in functions for a list of operators.
I have included the parse tree here for interest. I have used the standard notation, so "literal" is a literal string, [<token>] is an optional token and [<token>]... is 0 or more tokens. The <number>, <string> and <int> tokens are not defined here, but their implementation in the parser closely mirrors that of the Java language specification. The <id> token is defined in the Identifiers section above.
<input> ::= ( <defn> | <exp> ) [ ";" [ <input> ] ]
<defn> ::= "val" <at_pat> "=" <exp>
    | "fun" <varname>_1 <pattern>_1 "=" <exp>_1
      [ "|" <varname>_1 <pattern>_2 "=" <exp>_2 ]...
    | "infix" [<int>] <id> [<id>]...
    | "infixr" [<int>] <id> [<id>]...
    | "nonfix" [<int>] <id> [<id>]...
    | "baseunit" <id> [<id>]...
<pattern> ::= <at_pat> [ <pattern> ]
<at_pat> ::= <at_pat> "::" <at_pat>
    | <id>
    | "_"
    | "(" <at_pat> [ "," <at_pat> ]... ")"
    | "[" <at_pat> [ "," <at_pat> ]... "]"
    | <at_pat_const>
<at_pat_const> ::= "()" | "[]" | "true" | "false" | [ "-" ] <number>
<exp> ::= <exp2>
    | <exp2> "andalso" <exp>
    | <exp2> "orelse" <exp>
<exp2> ::= "if" <exp> "then" <exp> "else" <exp>
    | "fn" <pattern> "=>" <exp> [ "|" <pattern> "=>" <exp> ]...
    | "let" ( [ <defn> ] [ ";" ] )... "in" <exp> [ ";" <exp> ]... "end"
    | <applyexp>
//  | <exp2> : ty //TODO
// For some sufficiently large integer n, we define a family
// of BNF expressions, indexed from 0 to n inclusive:
<applyexp> ::= <apply[0]>
// for 0 <= i < n
<apply[i]> ::= <apply[i+1]> [ <val-infix-level-i> <apply[i+1]> ]...
<apply[n]> ::= <val> [ <val> ]...
<val> ::= <tuple> | <list> | <id> | <const>
<const> ::= "()" | "[]" | "true" | "false" | <number> | <string>
<tuple> ::= "(" <exp> ["," <exp>]... ")"
<list> ::= "[" <exp> ["," <exp>]... "]"
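The indexed <apply[i]> family may be easier to follow as code. Below is a minimal sketch in Python of how such a family can be parsed by recursive descent, with one recursion step per level. It is not the MIFFS parser itself: `parse_apply`, `INFIX_LEVEL` and `N` are hypothetical names, the operator table is invented for the example, values are restricted to bare identifiers, and both infix operators and function application are assumed to associate to the left (the grammar above does not say either way).

```python
# Hypothetical operator table: name -> level. Higher levels bind tighter,
# because <apply[i]> is built out of <apply[i+1]>.
INFIX_LEVEL = {"+": 0, "-": 0, "*": 1, "/": 1}
N = 2  # levels 0..N-1 hold infix operators; level N is plain application

def parse_apply(tokens, pos=0, level=0):
    """Parse <apply[level]> starting at tokens[pos]; return (tree, next_pos)."""
    if level == N:
        # <apply[n]> ::= <val> [ <val> ]...
        if pos >= len(tokens) or tokens[pos] in INFIX_LEVEL:
            raise SyntaxError(f"expected a value at token {pos}")
        tree = tokens[pos]
        pos += 1
        # Assumption: application nests to the left, so f x y is (f x) y.
        while pos < len(tokens) and tokens[pos] not in INFIX_LEVEL:
            tree = ("apply", tree, tokens[pos])
            pos += 1
        return tree, pos
    # <apply[i]> ::= <apply[i+1]> [ <val-infix-level-i> <apply[i+1]> ]...
    tree, pos = parse_apply(tokens, pos, level + 1)
    while pos < len(tokens) and INFIX_LEVEL.get(tokens[pos]) == level:
        op = tokens[pos]
        right, pos = parse_apply(tokens, pos + 1, level + 1)
        tree = (op, tree, right)  # assumption: left-associative
    return tree, pos
```

For example, `parse_apply(["f", "x", "+", "g", "y", "*", "z"])` groups the two applications first, then the `*`, then the `+`. Application sits at the deepest level, so it binds tighter than any infix operator.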