


ANTLR(1)                PCCTS Manual Pages               ANTLR(1)


NNAAMMEE
       antlr - ANother Tool for Language Recognition

SSYYNNTTAAXX
       aannttllrr [_o_p_t_i_o_n_s] _g_r_a_m_m_a_r___f_i_l_e_s

DDEESSCCRRIIPPTTIIOONN
       _A_n_t_l_r  converts  an  extended form of context-free grammar
       into a set of C  functions  which  directly  implement  an
       efficient  form  of  deterministic recursive-descent LL(k)
       parser.  Context-free grammars may be augmented with pred-
       icates  to  allow  semantics  to  influence  parsing; this
       allows a form  of  context-sensitive  parsing.   Selective
       backtracking  is  also  available  to handle non-LL(k) and
       even non-LALR(k) constructs.  _A_n_t_l_r also produces a  defi-
       nition  of  a  lexer  which can be automatically converted
       into C code for a DFA-based lexer by  _d_l_g.   Hence,  _a_n_t_l_r
       serves  a  function much like that of _y_a_c_c, however, it is
       notably more flexible and is more integrated with a  lexer
       generator (_a_n_t_l_r directly generates _d_l_g code, whereas _y_a_c_c
       and _l_e_x are given independent descriptions).  Unlike  _y_a_c_c
       which  accepts LALR(1) grammars, _a_n_t_l_r accepts LL(k) gram-
       mars in an extended BNF notation -- which  eliminates  the
       need for precedence rules.
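
       As a rough illustration of what "a set of C functions
       which directly implement a recursive-descent parser" means,
       the sketch below is a hand-written parser with one function
       per rule and one symbol of lookahead (k = 1).  It is not
       antlr output; the grammar, the function names (la1, term,
       expr, parse_expr) and the use of single characters as
       tokens are all assumptions made for the example.

```c
#include <ctype.h>

/* Hypothetical hand-written parser for the grammar
 *     expr : term ( '+' term )* ;
 *     term : DIGIT ;
 * One C function per rule; every decision inspects one character of
 * lookahead.  Names are illustrative and are not antlr output. */

static const char *input;               /* remaining input */

/* Current lookahead symbol. */
static int la1(void) { return (unsigned char)*input; }

/* term : DIGIT ;  returns 1 on success and stores the digit's value. */
static int term(int *value)
{
    if (!isdigit(la1())) return 0;
    *value = *input++ - '0';
    return 1;
}

/* expr : term ( '+' term )* ; */
static int expr(int *value)
{
    int rhs;
    if (!term(value)) return 0;
    while (la1() == '+') {              /* LL(1) decision: loop or exit */
        input++;                        /* consume '+' */
        if (!term(&rhs)) return 0;
        *value += rhs;
    }
    return 1;
}

/* Parse a whole string; returns 1 and sets *value if s is a valid expr. */
int parse_expr(const char *s, int *value)
{
    input = s;
    return expr(value) && la1() == '\0';
}
```

       Because the loop in expr encodes repetition directly, the
       sketch needs no precedence or associativity declarations
       for this small grammar.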

       Like  _y_a_c_c grammars, _a_n_t_l_r grammars can use automatically-
       maintained symbol attribute values  referenced  as  dollar
       variables.   Further,  because  _a_n_t_l_r  generates  top-down
       parsers, arbitrary values may  be  inherited  from  parent
       rules (passed like function parameters).  _A_n_t_l_r also has a
       mechanism for creating and  manipulating  abstract-syntax-
       trees.

       There are various other niceties in antlr, including the
       ability to spread one grammar over multiple files or even
       to put multiple grammars in a single file, the ability to
       generate a version of the grammar with actions stripped
       out (for documentation purposes), and lots more.

OOPPTTIIOONNSS
       --cckk _n  Use  up  to  _n symbols of lookahead when using com-
              pressed  (linear  approximation)  lookahead.   This
              type  of  lookahead is very cheap to compute and is
              attempted before full LL(k) lookahead, which is  of
              exponential  complexity in the worst case.  In gen-
              eral, the compressed lookahead can be  much  deeper
              (e.g,  --cckk  1100) than the full lookahead (which usu-
              ally must be less than 4).
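
              One way to picture the difference, offered as a
              sketch of the general idea rather than of antlr's
              analysis code: full LL(2) lookahead remembers which
              exact token pairs can begin an alternative, while
              the compressed form keeps one token set per depth
              and checks each depth independently.  The token
              numbers and function names below are hypothetical.

```c
/* Hypothetical token numbers, for illustration only. */
enum { TOK_A = 1, TOK_B = 2 };

/* Full LL(2) test: the pair (t1, t2) must be one of the exact
 * sequences that can begin the alternative, here "A B" or "B A". */
int full_ll2(int t1, int t2)
{
    return (t1 == TOK_A && t2 == TOK_B) ||
           (t1 == TOK_B && t2 == TOK_A);
}

/* Compressed test: one token set per depth, checked independently.
 * Depth 1 may be A or B; depth 2 may be A or B. */
int compressed_ck2(int t1, int t2)
{
    int depth1_ok = (t1 == TOK_A || t1 == TOK_B);
    int depth2_ok = (t2 == TOK_A || t2 == TOK_B);
    return depth1_ok && depth2_ok;
}
```

              The compressed test is cheaper because it stores k
              sets instead of a set of k-tuples, but it is weaker:
              in this sketch it also accepts the pair "A A", which
              the full test rejects.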

       --CCCC    Generate C++ output from both ANTLR and DLG.

       --ccrr    Generate a cross-reference for all rules.  For each
              rule,  print  a list of all other rules that refer-
              ence it.



ANTLR                       April 1994                          1





ANTLR(1)                PCCTS Manual Pages               ANTLR(1)


       --cctt    Do not make copies of tokens passed to  the  parser
              in  C++  mode (default=to copy).  When using DLG in
              conjunction with ANTLR, you will always want  ANTLR
              to  make  copies because DLG only has space for one
              AANNTTLLRRTTookkeenn (which is passed  to  the  scanner  with
              sseettTTookkeenn);  this  address  is  always returned and,
              hence, without copies, all $-variables would  point
              to the same AANNTTLLRRTTookkeenn.

       --ee11    Ambiguities/errors shown in low detail (default).

       --ee22    Ambiguities/errors shown in more detail.

       --ee33    Ambiguities/errors shown in excruciating detail.

       --ffee file
              Rename eerrrr..cc to file.

       --ffhh file
              Rename ssttddppccccttss..hh header (turns on --gghh) to file.

       --ffll file
              Rename lexical output, ppaarrsseerr..ddllgg, to file.

       --ffmm file
              Rename  file with lexical mode definitions, mmooddee..hh,
              to file.

       --ffrr file
              Rename file which remaps globally visible  symbols,
              rreemmaapp..hh, to file.

       --fftt file
              Rename ttookkeennss..hh to file.

       --ggaa    Generate ANSI-compatible code (default case).  This
              has not been rigorously tested to be  ANSI  XJ11  C
              compliant,  but  it is close.  The normal output of
              _a_n_t_l_r is currently compilable under both K&R,  ANSI
              C,  and C++--this option does nothing because _a_n_t_l_r
              generates a bunch of #ifdef's to do the right thing
              depending on the language.

       --ggcc    Indicates  that  _a_n_t_l_r  should  generate no C code,
              i.e., only perform analysis on the grammar.

       --ggdd    C code is inserted in each of the  _a_n_t_l_r  generated
              parsing  functions to provide for user-defined han-
              dling of a detailed parse trace.  The inserted code
              consists  of  calls  to the user-supplied macros or
              functions called  zzzzTTRRAACCEEIINN  and  zzzzTTRRAACCEEOOUUTT.   The
              only  argument  is  a  _c_h_a_r _* pointing to a C-style
              string which is the grammar rule recognized by  the
              current  parsing  function.   If  no  definition is



ANTLR                       April 1994                          2





ANTLR(1)                PCCTS Manual Pages               ANTLR(1)


              given for the trace functions, upon rule entry  and
              exit,  a  message will be printed indicating that a
              particular rule as been entered or exited.
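
              A minimal sketch of user-supplied trace hooks
              follows.  Only the names zzTRACEIN and zzTRACEOUT
              and their single char * argument come from the
              description above; the helper functions, the
              indentation scheme, and the two stand-in rule
              functions are assumptions for illustration, and
              where such definitions must be placed in a real
              grammar is not covered here.

```c
#include <stdio.h>

static int trace_depth = 0;     /* current rule nesting, for indentation */

/* User-supplied definitions.  Each receives a char * naming the rule;
 * everything else in this file is an assumption. */
#define zzTRACEIN(rule)  trace_enter(rule)
#define zzTRACEOUT(rule) trace_exit(rule)

static void trace_enter(char *rule)
{
    fprintf(stderr, "%*senter %s\n", 2 * trace_depth, "", rule);
    trace_depth++;
}

static void trace_exit(char *rule)
{
    trace_depth--;
    fprintf(stderr, "%*sexit  %s\n", 2 * trace_depth, "", rule);
}

/* Hand-written stand-ins for two generated rule functions. */
static void term(void)
{
    zzTRACEIN("term");
    /* ... token matching would happen here ... */
    zzTRACEOUT("term");
}

void expr(void)
{
    zzTRACEIN("expr");
    term();
    zzTRACEOUT("expr");
}

/* Nesting depth; 0 means every entry had a matching exit. */
int trace_balance(void) { return trace_depth; }
```

              Calling expr() in this sketch writes nested
              enter/exit lines for expr and term to stderr.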

       --ggee    Generate an error class for each non-terminal.

       --gghh    Generate ssttddppccccttss..hh for  non-ANTLR-generated  files
              to  include.  This file contains all defines needed
              to describe the type of parser generated  by  _a_n_t_l_r
              (e.g. how much lookahead is used and whether or not
              trees are  constructed)  and  contains  the  hheeaaddeerr
              action specified by the user.

       --ggkk    Generate parsers that delay lookahead fetches until
              needed.   Without  this  option,  _a_n_t_l_r   generates
              parsers  which  always  have  _k tokens of lookahead
              available.  This option is  incompatible  with  --pprr
              and  renders  references  to  LLAA((_i))  invalid as one
              never knows when the _i_t_h token of lookahead will be
              fetched.

       --ggll    Generate  line  info  about  grammar  actions  in C
              parser of the form ## _l_i_n_e ""_f_i_l_e"" which makes  error
              messages from the C/C++ compiler make more sense as
              they will point  into  the  grammar  file  not  the
              resulting  C  file.   Debugging  is easier as well,
              because you will step through  the  grammar  not  C
              file.
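
              The effect of such line information can be seen
              with the standard C #line directive, sketched below.
              The file name expr.g and line number 42 are
              hypothetical, and the exact spelling of the
              directive that antlr emits may differ from the
              standard form used here.

```c
#include <string.h>

/* Pretend the code after the directive is an action copied from
 * line 42 of a grammar file named expr.g (both values hypothetical).
 * From here on the compiler reports positions relative to expr.g. */
#line 42 "expr.g"
int action_line(void) { return __LINE__; }
int action_file_is(const char *name) { return strcmp(__FILE__, name) == 0; }
```

              A compiler error or debugger breakpoint inside
              action_line would be attributed to expr.g rather
              than to the C file that contains it.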

       --ggpp _p_r_e_f_i_x
              Prefix all functions generated from rules with _p_r_e_-
              _f_i_x.  This is now obsolete.  Use the #parser "name"
              _a_n_t_l_r directive.

       --ggss    Do  not  generate  sets for token expression lists;
              instead  generate  a   ||||-separated   sequence   of
              LLAA((11))====_t_o_k_e_n___n_u_m_b_e_r.   The  default  is to generate
              sets.
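
              The two styles of membership test can be compared
              with the sketch below.  It is not antlr output: the
              token numbers, the bit-set layout, and the function
              names are assumptions, and the lookahead value is
              passed in as a parameter instead of being read
              through LA(1).

```c
/* Hypothetical token numbers, for illustration only. */
enum { PLUS = 3, MINUS = 4, STAR = 9, TOKEN_COUNT = 16 };

/* Set form: one bit per token number, packed into bytes. */
static const unsigned char op_set[TOKEN_COUNT / 8] = {
    (1u << PLUS) | (1u << MINUS),       /* tokens 0..7  */
    (1u << (STAR - 8))                  /* tokens 8..15 */
};

/* Membership by table lookup: constant cost however large the set. */
int in_set(int la1)
{
    if (la1 < 0 || la1 >= TOKEN_COUNT) return 0;
    return (op_set[la1 / 8] >> (la1 % 8)) & 1;
}

/* -gs style: a ||-separated sequence of comparisons. */
int in_sequence(int la1)
{
    return la1 == PLUS || la1 == MINUS || la1 == STAR;
}
```

              Both functions accept the same tokens; the
              comparison sequence is easier to read in generated
              code, while the set form stays compact as the number
              of alternatives grows.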

       --ggtt    Generate code for Abstract-Syntax Trees.

       --ggxx    Do not create  the  lexical  analyzer  files  (dlg-
              related).   This  option  should  be given when the
              user wishes to provide a  customized  lexical  ana-
              lyzer.   It  may  also  be  used in _m_a_k_e scripts to
              cause only the parser to be rebuilt when  a  change
              not  affecting the lexical structure is made to the
              input grammars.

       --kk _n   Set k of LL(k) to _n; i.e. set tokens of  look-ahead
              (default==1).

       --oo dir Directory    where    output    files   should   go
              (default=".").  This is very nice for  keeping  the



ANTLR                       April 1994                          3





ANTLR(1)                PCCTS Manual Pages               ANTLR(1)


              source directory clear of ANTLR and DLG spawn.

       --pp     The  complete  grammar,  collected  from  all input
              grammar files and  stripped  of  all  comments  and
              embedded  actions,  is  listed  to ssttddoouutt.  This is
              intended to aid in viewing the entire grammar as  a
              whole  and  to  eliminate  the need to keep actions
              concisely stated so that the grammar is  easier  to
              read.   Hence,  it is preferable to embed even com-
              plex actions directly in the grammar,  rather  than
              to  call  them as subroutines, since the subroutine
              call overhead will be saved.

       --ppaa    This option is the same as --pp except that the  out-
              put  is  annotated  with  the first sets determined
              from grammar analysis.

       --pprr    Obsolete -- used to turn on use  of  predicates  in
              parsing  decisions  in release 1.06.  Now, in 1.10,
              the specification of a predicate  implies  that  it
              should be used.  When a syntactic ambiguity is dis-
              covered, _a_n_t_l_r searches for predicates that can  be
              used to disambiguate the decision.  Predicates have
              dual roles as semantic validation  and  disambigua-
              tion predicates.

       --pprrcc oonn
              Turn  on  the computation and hoisting of predicate
              context.

       --pprrcc ooffff
              Turn off the computation and hoisting of  predicate
              context.   This  option  makes 1.10 behave like the
              1.06 release with option --pprr on.  Context  computa-
              tion is off by default.

       --rrll _n  Limit  the  maximum  number  of  tree nodes used by
              grammar analysis  to  _n.   Occasionally,  _a_n_t_l_r  is
              unable  to analyze a grammar submitted by the user.
              This rare situation can only occur when the grammar
              is  large  and  the  amount of lookahead is greater
              than one.  A nonlinear analysis algorithm  is  used
              by  PCCTS to handle the general case of LL(k) pars-
              ing.  The average complexity of analysis,  however,
              is  near  linear  due to some fancy footwork in the
              implementation which reduces the number of calls to
              the full LL(k) algorithm.  An error message will be
              displayed, if this limit is  reached,  which  indi-
              cates  the  grammar  construct  being analyzed when
              _a_n_t_l_r hit a  non-linearity.   Use  this  option  if
              _a_n_t_l_r  seems to go out to lunch and your disk start
              thrashing; try _n=10000 to start.  Once the  offend-
              ing  construct  has  been identified, try to remove
              the ambiguity that _a_n_t_l_r  was  trying  to  overcome



ANTLR                       April 1994                          4





ANTLR(1)                PCCTS Manual Pages               ANTLR(1)


              with large lookahead analysis.  The introduction of
              (...)? backtracking blocks eliminates some of these
              problems --  _a_n_t_l_r  does  not  analyze alternatives
              that begin with (...)? (it  simply  backtracks,  if
              necessary, at run time).

       --ww11    Set  low  warning  level.   Do not warn if semantic
              predicates and/or  (...)?  blocks  are  assumed  to
              cover ambiguous alternatives.

       --ww22    Ambiguous  parsing decisions yield warnings even if
              semantic predicates  or  (...)?  blocks  are  used.
              Warn  if  predicate  context  computed and semantic
              predicates  incompletely  disambiguate  alternative
              productions.

       --      Read  grammar  from  standard  input  and  generate
              ssttddiinn..cc as the parser file.

SSPPEECCIIAALL CCOONNSSIIDDEERRAATTIIOONNSS
       _A_n_t_l_r works...  we think.  There is no implicit  guarantee
       of  anything.   We reserve no lleeggaall rights to the software
       known as the Purdue Compiler Construction Tool Set (PCCTS)
       --  PCCTS  is in the public domain.  An individual or com-
       pany may do whatever  they  wish  with  source  code  dis-
       tributed  with  PCCTS  or  the  code  generated  by PCCTS,
       including the incorporation of PCCTS, or its output,  into
       commercial  software.  We encourage users to develop soft-
       ware with PCCTS.  However, we do ask that credit is  given
       to  us for developing PCCTS.  By "credit", we mean that if
       you incorporate our source code into one of your  programs
       (commercial  product, research project, or otherwise) that
       you acknowledge this fact somewhere in the  documentation,
       research report, etc...  If you like PCCTS and have devel-
       oped a nice tool with the output, please mention that  you
       developed it using PCCTS.  As long as these guidelines are
       followed, we expect to continue enhancing this system  and
       expect  to  make  other  tools  available as they are com-
       pleted.

FFIILLEESS
       *.c    output C parser

       *.C    output C++ parser when C++ mode is used

       ppaarrsseerr..ddllgg
              output _d_l_g lexical analyzer

       eerrrr..cc  token string array, error sets  and  error  support
              routines

       rreemmaapp..hh
              file  that  redefines  all  globally visible parser
              symbols.  The use of the #parser directive  creates



ANTLR                       April 1994                          5





ANTLR(1)                PCCTS Manual Pages               ANTLR(1)


              this file

       ssttddppccccttss..hh
              list  of  definitions needed by C files, not gener-
              ated by PCCTS, that reference PCCTS objects.   This
              is not generated by default.

       ttookkeennss..hh
              output _#_d_e_f_i_n_e_s for tokens used and function proto-
              types for functions generated for rules

SSEEEE AALLSSOO
       dlg(1), pccts(1)












































ANTLR                       April 1994                          6


