
TatSu Parser Development

To generate a parser for the OpenSCAD language using TatSu, we need to start off by writing a language specification using a variation of EBNF.

Here is our starting point:

scad.ebnf
@@grammar :: scad
@@comments :: /\/\*.*\*\//
@@eol_comments :: /\/\/.*/

integer
    =
    /[-]?\d+([eE]-?\d+)?/
    ;

Here, we define a name for the language grammar and set up the generated parser to allow (and skip) C++ style comments. The only rule in this starting file defines an integer as an optional minus sign followed by a sequence of digits, optionally followed by an exponent in scientific notation. The pattern is an ordinary Python regular expression.
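To see what the integer pattern accepts before involving TatSu at all, we can try it with Python's re module. This is only a sketch: the is_integer helper is a made-up name for this illustration, and fullmatch is used here so that the whole string must match.

```python
import re

# Same regular expression as the integer rule in the grammar above.
INTEGER = re.compile(r"[-]?\d+([eE]-?\d+)?")


def is_integer(text: str) -> bool:
    """Return True when the whole string matches the integer pattern."""
    return INTEGER.fullmatch(text) is not None


if __name__ == "__main__":
    # A few accepted forms, then a few the pattern rejects.
    for sample in ["123", "-123", "123e02", "123e-2", "+5", "12.5", "123e+2"]:
        print(f"{sample!r:10} {is_integer(sample)}")
```

Note that the pattern has no provision for a leading plus sign, either on the number or on the exponent, so "+5" and "123e+2" are rejected.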

Note

To assist in producing this documentation, I am numbering the versions of the EBNF file we produce and use in tests. In normal application development, we simply add lines to a single scad.ebnf file and ditch the numbers.

Here is a simple test showing how we use this grammar and generate a parser for our extremely simple language:

sandbox/step01.py
import tatsu
from pprint import pprint


def test():
    # Read the grammar text, closing the file when done
    with open('scadparser/ebnf/scad01.ebnf') as f:
        g = f.read()
    parser = tatsu.compile(g)    # build a parser from the grammar
    ast = parser.parse('123')    # parse a sample input
    pprint(ast)


if __name__ == '__main__':
    test()

Note

I created a sandbox directory in this project just for simple tests.

Let’s run this and see what we get:

$ python sandbox/step01.py
'123'

The output of this run is a simple string, which is the “code” we passed to the parser. Looks like it worked.

PyTest Testing

With a working parser, we really should test it more extensively. For that, we will use PyTest.

Here is a test that uses a feature of PyTest that parametrizes (yes it is misspelled) a test. Also, I created a scad.ebnf file which will be our real OpenSCAD language specification.

tests/test_integers.py
import pytest

@pytest.mark.parametrize('t,e', [
    ('123', "123"),
    ('-123', "-123"),
    ('123e02', "123e02"),
    ('123e-2', "123e-2")
])
def test_integers(scadparser, t, e):
    ast = scadparser.parse(t, start="integer")
    assert str(ast) == e

In this test, the t and e parameters mean “string to test” and “expected result”, respectively. PyTest uses the list of string pairs to run the test multiple times, once for each pair. This lets you create several tests in a simple way. The start parameter tells the parser which rule in the grammar to use to begin processing.
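As a rough mental model only (this is not how PyTest is implemented), parametrizing behaves like calling the test body once per pair and reporting each call separately. The sketch below needs neither PyTest nor TatSu: parse_integer is a hypothetical stand-in for scadparser.parse(t, start="integer"), and check_integer and run_cases are names made up for this illustration.

```python
import re

# Hypothetical stand-in for scadparser.parse(text, start="integer"):
# returns the matched text, or raises ValueError when the input does not match.
INTEGER = re.compile(r"[-]?\d+([eE]-?\d+)?")


def parse_integer(text):
    match = INTEGER.fullmatch(text)
    if match is None:
        raise ValueError(f"not an integer: {text!r}")
    return match.group()


# The same (input, expected) pairs used in tests/test_integers.py
CASES = [
    ('123', "123"),
    ('-123', "-123"),
    ('123e02', "123e02"),
    ('123e-2', "123e-2"),
]


def check_integer(t, e):
    """The body of the test, run once per pair."""
    assert str(parse_integer(t)) == e


def run_cases(cases):
    """Call the check once per pair and record which pairs passed."""
    results = {}
    for t, e in cases:
        try:
            check_integer(t, e)
            results[t] = True
        except (AssertionError, ValueError):
            results[t] = False
    return results


if __name__ == "__main__":
    for t, passed in run_cases(CASES).items():
        print(t, "PASSED" if passed else "FAILED")
```

The practical benefit of the real decorator is that a failure in one pair does not stop the others from running, and each pair gets its own line in the report.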

The integer test also relies on a setup file, tests/conftest.py, that creates a parser for each test:

tests/conftest.py
import pytest
import sys
import tatsu

GRAMMAR = "scadparser/ebnf/scad.ebnf"


@pytest.fixture(scope="function")
def scadparser():
    # Read the grammar, stopping the run if the file cannot be opened
    try:
        f = open(GRAMMAR)
    except IOError:
        print("Grammar file cannot be opened:", GRAMMAR)
        sys.exit(1)
    else:
        with f:
            g = f.read()
    # Compile a fresh parser for each test function
    model = tatsu.compile(g)
    yield model

PyTest matches fixtures by name, so the compiled parser is passed to any test function that lists scadparser among its parameters.

Let’s see if this works:

$ pytest tests/test_integers.py
============================= test session starts ==============================
platform darwin -- Python 3.9.6, pytest-6.2.4, py-1.10.0, pluggy-0.13.1 -- /Users/rblack/_dev/ScadParser/.direnv/python-3.9.6/bin/python3.9
cachedir: .pytest_cache
rootdir: /Users/rblack/_dev/ScadParser, configfile: pytest.ini
plugins: cov-2.12.1
collecting ... collected 4 items

tests/test_integers.py::test_number[123-123] PASSED                      [ 25%]
tests/test_integers.py::test_number[-123--123] PASSED                    [ 50%]
tests/test_integers.py::test_number[123e02-123e02] PASSED                [ 75%]
tests/test_integers.py::test_number[123e-2-123e-2] PASSED                [100%]Coverage.py warning: No data was collected. (no-data-collected)
/Users/rblack/_dev/ScadParser/.direnv/python-3.9.6/lib/python3.9/site-packages/pytest_cov/plugin.py:271: PytestWarning: Failed to generate report: No data to report.

  self.cov_controller.finish()
WARNING: Failed to generate report: No data to report.



---------- coverage: platform darwin, python 3.9.6-final-0 -----------


============================== 4 passed in 0.37s ===============================

Now we have a much better testing setup, which we will use as we proceed.

Adding Real Numbers

Now that we can process integers, let’s add a rule that accepts real numbers as well. This one is similar to the integer rule, but it requires a decimal point followed by at least one digit:

scadparser/ebnf/scad02.ebnf
@@grammar :: scad
@@comments :: /\/\*.*\*\//
@@eol_comments :: /\/\/.*/

integer
    =
    /[-]?\d+([eE]-?\d+)?/
    ;

real
    =
    /[-]?[0-9]*[\.][0-9]+([eE]-?[0-9]+)?/
    ;
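The two patterns overlap very little, which we can check with plain Python before running the parser. In this sketch, classify is a made-up helper and fullmatch forces the whole string to match, so it shows only what the regular expressions accept, not how TatSu chooses between rules.

```python
import re

# The two patterns from the grammar above.
INTEGER = re.compile(r"[-]?\d+([eE]-?\d+)?")
REAL = re.compile(r"[-]?[0-9]*[\.][0-9]+([eE]-?[0-9]+)?")


def classify(text: str) -> str:
    """Report which of the two patterns matches the whole string."""
    if REAL.fullmatch(text):
        return "real"
    if INTEGER.fullmatch(text):
        return "integer"
    return "neither"


if __name__ == "__main__":
    for sample in ["123.45", "-123.0", ".5", "123.45e02", "123", "1e5", "5."]:
        print(f"{sample!r:12} {classify(sample)}")
```

Two details stand out: the digits before the decimal point are optional, so ".5" is a real, while "5." matches neither pattern because at least one digit must follow the point.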

Instead of writing another standalone script, we will just add a new test file:

tests/test_reals.py
import pytest

@pytest.mark.parametrize('t,e', [
    ('123.45', "123.45"),
    ('-123.0', "-123.0"),
    ('123.45e02', "123.45e02"),
    ('123.0e-2', "123.0e-2")
])
def test_reals(scadparser, t, e):
    ast = scadparser.parse(t, start="real")
    assert str(ast) == e

Now we can run this test as well:

$ pytest tests/test_reals.py
============================= test session starts ==============================
platform darwin -- Python 3.9.6, pytest-6.2.4, py-1.10.0, pluggy-0.13.1 -- /Users/rblack/_dev/ScadParser/.direnv/python-3.9.6/bin/python3.9
cachedir: .pytest_cache
rootdir: /Users/rblack/_dev/ScadParser, configfile: pytest.ini
plugins: cov-2.12.1
collecting ... collected 4 items

tests/test_reals.py::test_number[123.45-123.45] PASSED                   [ 25%]
tests/test_reals.py::test_number[-123.0--123.0] PASSED                   [ 50%]
tests/test_reals.py::test_number[123.45e02-123.45e02] PASSED             [ 75%]
tests/test_reals.py::test_number[123.0e-2-123.0e-2] PASSED               [100%]Coverage.py warning: No data was collected. (no-data-collected)
/Users/rblack/_dev/ScadParser/.direnv/python-3.9.6/lib/python3.9/site-packages/pytest_cov/plugin.py:271: PytestWarning: Failed to generate report: No data to report.

  self.cov_controller.finish()
WARNING: Failed to generate report: No data to report.



---------- coverage: platform darwin, python 3.9.6-final-0 -----------


============================== 4 passed in 0.37s ===============================

Looks like we are making progress.

Note

In normal development, running pytest will run all tests found in the tests directory. I am limiting the test run to just a single test file for this documentation.

Identifiers

We will be naming things in our design work, and OpenSCAD has some rules on names.

identifier
    =
    /[$]?[a-zA-Z_]+[0-9]*/
    ;

This new rule allows names that start with a dollar sign, but those names are reserved for internal work by OpenSCAD. Some strange names are allowed as well, as we will see in our next test code:

tests/test_identifiers.py
import pytest

@pytest.mark.parametrize('t,e', [
    ('a', "a"),
    ('_0', "_0"),
    ('max_span', "max_span"),
])
def test_identifiers(scadparser, t, e):
    ast = scadparser.parse(t, start="identifier")
    assert str(ast) == e

I am going to omit the test run (it works!).
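The identifier pattern can also be probed directly with Python's re module. This is a sketch with a made-up helper, matched_prefix, which uses re.match to report how much of a name the pattern consumes from the start of the string.

```python
import re

# Same regular expression as the identifier rule above.
IDENTIFIER = re.compile(r"[$]?[a-zA-Z_]+[0-9]*")


def matched_prefix(text: str):
    """Return the leading part of text matched by the pattern, or None."""
    match = IDENTIFIER.match(text)
    return match.group() if match else None


if __name__ == "__main__":
    for sample in ["a", "_0", "max_span", "$fn", "a1b", "0abc"]:
        print(f"{sample!r:12} {matched_prefix(sample)!r}")
```

Because the pattern only allows digits at the end of a name, "a1b" is matched only as far as "a1", and a name that starts with a digit is not matched at all. If names with digits in the middle need to be accepted, the rule would have to be loosened; a pattern such as /[$]?[a-zA-Z_][a-zA-Z0-9_]*/ is one hypothetical option, not something taken from the OpenSCAD sources.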

Now that we have a start on our parser, let’s move on to something more interesting: expressions!