12.7 Common errors

ocamllex: transition table overflow, automaton is too big

The deterministic automata generated by ocamllex are limited to at most 32767 transitions. The message above indicates that your lexer definition is too complex and overflows this limit. This is commonly caused by lexer definitions that have separate rules for each of the alphabetic keywords of the language, as in the following example.

rule token = parse
    "keyword1"   { KWD1 }
  | "keyword2"   { KWD2 }
  | ...
  | "keyword100" { KWD100 }
  | ['A'-'Z' 'a'-'z'] ['A'-'Z' 'a'-'z' '0'-'9' '_'] * as id
               { IDENT id }

To keep the generated automata small, rewrite those definitions with a single general "identifier" rule, followed by a hash table lookup to separate keywords from identifiers:

{
  let keyword_table = Hashtbl.create 53
  let _ =
    List.iter (fun (kwd, tok) -> Hashtbl.add keyword_table kwd tok)
              [ "keyword1", KWD1;
                "keyword2", KWD2; ...
                "keyword100", KWD100 ]
}
rule token = parse
    ['A'-'Z' 'a'-'z'] ['A'-'Z' 'a'-'z' '0'-'9' '_'] * as id
               { try
                   Hashtbl.find keyword_table id
                 with Not_found ->
                   IDENT id }
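
The lookup step can also be tried on its own, outside of ocamllex. The following plain OCaml sketch is only an illustration: the token type, the two keywords and the function name classify are placeholders chosen for this example, and a real lexer would take its token type from the parser.

```ocaml
(* Hypothetical token type, standing in for the one the parser defines. *)
type token =
  | KWD1
  | KWD2
  | IDENT of string

(* Keyword table, filled once when the module is initialised. *)
let keyword_table : (string, token) Hashtbl.t = Hashtbl.create 53

let () =
  List.iter
    (fun (kwd, tok) -> Hashtbl.add keyword_table kwd tok)
    [ "keyword1", KWD1;
      "keyword2", KWD2 ]

(* Classify a lexeme that the general identifier rule has already matched:
   return the keyword token if the lexeme is in the table, IDENT otherwise. *)
let classify (id : string) : token =
  try Hashtbl.find keyword_table id
  with Not_found -> IDENT id
```

In this sketch the lookup is an exact, case-sensitive string comparison, so classify "keyword1" gives KWD1 while classify "Keyword1" gives IDENT "Keyword1". The automaton only has to recognise the identifier pattern, so its size no longer depends on the number of keywords.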

ocamllex: Position memory overflow, too many bindings

The deterministic automata generated by ocamllex maintain a table of positions inside the scanned lexer buffer. The size of this table is limited to at most 255 cells. This error should not show up in normal situations.