There are three parts: lexical analyser, parser, and code generation
      
      lexcial analyser
      
      lexical analyser changes stream of input text into tokens.
      regular expression is used to specify "word" in the language.
      the pattern of regex defines set of words in the language.
      skill in defining words is:
        a.  given a regex write down the set of words
        b.  know how to write a specification, given set of words,
      write its regex
      note:  writing a regex is considered to be both rigorous and
      creative. One has to strive for a balance between simple and complex.
      implementation of lex
      a regex can be translated into an automaton.  (similar to a finite
      state machine)
      a set of automaton is composed from individual automaton, it becomes
      Non-deterministic Finite Automata, NFA.
      to implement NFA, we need to transform it into DFA (deterministic finite
      automata, which has many more states than NFA, but it is realizable.)
      the main parser uses lexical analyser by calling "next token" function.
      
      parser
      
      the syntax of a language is specified by a grammar.  
      Context Free Grammar and a type of grammar (which is simple) that is used
      to specify a computer language.
      there are two methods to parse a sentence:
        1. recursive descence parser
        2. LL(1)
      
      skill in defining syntax:
        a.  given a grammar, write set of sentences (this can be
      infinite)
        b.  given a language, write its CFG. (this is similar to write
      a spec. in engineering job)
      practical grammar
      in order to implement a parser, the grammar must be of some specific form.
      it has to avoid "left recursion".
      
      recursive descence parser
      
      it is the most simple way to write a parser for small grammar.
      you write it by hands. (other methods are automated method)
      every non-terminal symbol becomes a function
      the body of function is "call" to other non-terminal symbol and "match" to
      terminal symbol.
      in this way, the whole parser becomes a nested recursive call, started
      from starting symbol.
      to implement "optional" part, one needs a lookahead to see the next symbol
      before making a decision.  
      the "match" function compares the current token with the non-terminal
      symbol in the grammar and return yes/no.  If it is successful (return
      yes) it will advance to the next token (call next token).
      
      LL(1)
      
      it is an algorithm that use grammar and a push-down stack to parse a
      sentence.
      the grammar must be in correct form (which we will know "after" we
      construct the parser).
      in order to write an LL(1) parser, one need to compute
        a.  first set
        b.  follow set
      first and follow set can be compute efficiently.
      with these two sets, we can draw a parsing table.
      use the parsing table, with push-down stack, we can parse a sentence.
      to check whether the grammar is LL(1) compatible, after we create the
      parsing table, we check each entry in the table, it must has only one
      rule.
      we can rewrite the grammar so that it is  LL(1) compatable most of
      the time.
      
      code generation
      
      while we parse, we construct an auxilliary data structure called "parse
      tree".
      the parse tree represents the input source.
      an interpreter can be written so that given a parse tree as its input, its
      can "run" (execute) that program.
      this interpreter "traverses" the parse tree and perform the function
      according to the "operator" in the parse tree (such as +).
      to create a code generator, we change the interpreter one step further.
      instead of "run" the parse tree, the code generator outputs the assembly
      language string that is equivalent to "run".
      for example a parse tree of A+2 is (+ A 2) 
      the code generator outputs the following sequence of assembly instruction
      for 3-address format as follows:
      assume A is in R1
      2 is in R2
      the result is in R3
      
         load R1, A
         mov R2, #2
         add R3, R1, R2
      
      other parsing methods
      
      two methods that we learn are the top down method.
      it is limited to the grammar that is compatible.
      some syntax can not be handle by LL(1).
      there are many more methods that are "bottom up" methods.
      they are more powerful and at the same time more complex.
      
      end