TatSu Parser Development
########################

To generate a parser for the |OSC| language using TatSu_, we need to start off
by writing a language specification using a variation of EBNF_.

Here is our starting point:

..  literalinclude::    ../scadparser/ebnf/scad01.ebnf
    :linenos:
    :caption: scad.ebnf

Here, we define a name for the language *grammar*, and set up the generated
parser to allow (and skip) C++ style comments. The only rule in this starting
file defines an integer number as a sequence of digits, optionally followed by
an exponent in scientific notation. The pattern shown here is a standard one,
written as a |PY| *regular expression*.
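
As a rough illustration of the kind of pattern involved, the sketch below tries
a comparable regular expression directly with the |PY| ``re`` module. The
pattern and the names ``INTEGER`` and ``is_integer`` are assumptions made for
this example; the actual rule lives in the grammar file shown above and may
differ.

..  code-block:: python

    import re

    # Hypothetical pattern: one or more digits, then an optional exponent
    # such as e5, E+10 or e-3. Not copied from the grammar file.
    INTEGER = re.compile(r"\d+([eE][+-]?\d+)?")


    def is_integer(text):
        """Return True when the whole string matches the sketch pattern."""
        return INTEGER.fullmatch(text) is not None


    # Try a few candidate strings against the pattern.
    for sample in ["42", "1e5", "3E-2", "4.5", "e7"]:
        print(sample, is_integer(sample))

Experimenting with a pattern this way, before placing it in a grammar rule, is
a quick way to check which strings it accepts.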

..  note::

    In order to assist in producing this documentation, I am numbering the
    versions of the EBNF file we produce and use in tests. In normal
    application development, we simply add lines to a single **scad.ebnf**
    file and ditch the numbers.

Here is a simple test showing how we use this grammar and generate a parser for
our extremely simple language:

..  literalinclude::    ../sandbox/step01.py
    :linenos:
    :caption:   sandbox/step01.py

..  note::

    I created a **sandbox** directory in this project just for simple tests.

Let's run this and see what we get:

..  command-output:: python sandbox/step01.py
    :cwd: ../

The output of this run is a simple string, which is the "code" we passed to the
parser. Looks like it worked.

PyTest Testing
**************

With a working parser, we really should test it more extensively. For that, we
will use PyTest_.

Here is a test that uses a feature of PyTest_ that **parametrizes** (yes it is
misspelled) a test. Also, I created a **scad.ebnf** file which will be our real
|OSC| language specification.

..  literalinclude::    ../tests/test_integers.py
    :linenos:
    :caption: tests/test_integers.py

In this test, the **t** and **e** parameters mean "string to test" and
"expected result", respectively. PyTest_ uses the list of string pairs to run
the test multiple times, once for each pair. This lets you create several
tests in a simple way. The **start** parameter tells the parser which rule in
the specification grammar to use to begin processing.
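
The parametrization mechanism can be sketched on its own. To keep the example
self-contained, the hypothetical ``parse`` function below stands in for the
generated parser, and the ``CASES`` list is invented for illustration; only
``pytest.mark.parametrize`` is the real PyTest_ feature.

..  code-block:: python

    import re

    import pytest

    # Stand-in for the generated parser so the sketch runs on its own.
    _RULES = {"integer": re.compile(r"\d+")}


    def parse(text, start):
        """Return the matched text, or raise ValueError if the rule fails."""
        match = _RULES[start].fullmatch(text)
        if match is None:
            raise ValueError(f"{text!r} does not match rule {start!r}")
        return match.group(0)


    # Each pair is (string to test, expected result).
    CASES = [("0", "0"), ("42", "42"), ("007", "007")]


    @pytest.mark.parametrize("t, e", CASES)
    def test_integer(t, e):
        # pytest runs this function once for every pair in CASES.
        assert parse(t, start="integer") == e

Adding a new case is then a matter of appending one more pair to the list.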

This test also relies on a setup file that creates a parser for each test:

..  literalinclude::    ../tests/conftest.py
    :linenos:
    :caption: tests/conftest.py

The parser is passed to the test function as the first parameter.
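
A minimal sketch of how a fixture hands an object to a test follows. The
``StandInParser`` class and ``build_parser`` helper are hypothetical
placeholders for the generated parser; the point is only that PyTest_ matches
the test's parameter name to the fixture's name.

..  code-block:: python

    import re

    import pytest


    class StandInParser:
        """Hypothetical replacement for the generated parser."""

        def parse(self, text, start):
            rules = {"integer": r"\d+"}
            match = re.fullmatch(rules[start], text)
            if match is None:
                raise ValueError(f"{text!r} does not match rule {start!r}")
            return match.group(0)


    def build_parser():
        # A real conftest.py would build the parser from the grammar here.
        return StandInParser()


    @pytest.fixture
    def parser():
        # With the default scope, pytest calls this once per test
        # that asks for "parser".
        return build_parser()


    def test_single_integer(parser):
        # The argument name "parser" selects the fixture defined above.
        assert parser.parse("42", start="integer") == "42"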


Let's see if this works:

.. command-output::    pytest tests/test_integers.py
    :cwd:   ../

Now we have a much better testing setup, which we will use as we proceed.

Adding Real Numbers
*******************

Now that we can process integers, let's add a rule that accepts real numbers as
well. This one is similar to the *integer* rule, but it allows a decimal point:

..  literalinclude::    ../scadparser/ebnf/scad02.ebnf
    :linenos:
    :caption:   scadparser/ebnf/scad02.ebnf
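
To get a feel for how a decimal point changes the pattern, here is a small
sketch using the |PY| ``re`` module. The pattern and the names ``REAL`` and
``is_real`` are assumptions made for this example and are not copied from the
grammar file; in particular, whether forms such as ``.5`` should be accepted
is a decision the real rule has to make.

..  code-block:: python

    import re

    # Hypothetical pattern: digits, a decimal point, optional further
    # digits, then an optional exponent.
    REAL = re.compile(r"\d+\.\d*([eE][+-]?\d+)?")


    def is_real(text):
        """Return True when the whole string matches the sketch pattern."""
        return REAL.fullmatch(text) is not None


    # Try a few candidate strings against the pattern.
    for sample in ["3.14", "2.", "1.5e-3", "42", ".5"]:
        print(sample, is_real(sample))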

Instead of writing another standalone test script, we will just add a new test:

..  literalinclude::    ../tests/test_reals.py
    :linenos:
    :caption: tests/test_reals.py

Now we can run this test as well:

..  command-output::    pytest tests/test_reals.py
    :cwd:   ../

Looks like we are making progress.

..  note::

    In normal development, running pytest will run all tests found in the
    **tests** directory. I am limiting the test run to just a single test for
    this documentation.

Identifiers
***********

We will be naming things in our design work, and |OSC| has some rules on names.

..  literalinclude::    ../scadparser/ebnf/scad.ebnf
    :lines: 23-26

This new rule allows names that start with a dollar sign, but those names are reserved for internal work by |OSC|. Some strange names are allowed as well, as we will see in our next test code:

..  literalinclude::    ../tests/test_identifiers.py
    :linenos:
    :caption: tests/test_identifiers.py
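
For comparison, a common identifier pattern with an optional leading dollar
sign can be sketched as below. The pattern and the names ``IDENTIFIER`` and
``is_identifier`` are assumptions made for this example; the real rule in
**scad.ebnf** may accept more (or fewer) names than this one does.

..  code-block:: python

    import re

    # Hypothetical pattern: optional "$", then a letter or underscore,
    # then any mix of letters, digits and underscores.
    IDENTIFIER = re.compile(r"\$?[A-Za-z_][A-Za-z0-9_]*")


    def is_identifier(text):
        """Return True when the whole string matches the sketch pattern."""
        return IDENTIFIER.fullmatch(text) is not None


    # Try a few candidate names against the pattern.
    for sample in ["width", "$fn", "_tmp1", "a-b", "$"]:
        print(sample, is_identifier(sample))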

I am going to omit the test run (it works!).


Now that we have a start on our parser, let's move on to something more interesting: *expressions*!