Parsodus

A language agnostic parser generator

Introduction
Requirements
Building
Getting started
More examples
Tested with
Authors

Introduction

Parsodus is a language agnostic parser generator. This means that it takes a description of a grammar in a BNF-like notation and outputs source files for a parser (in any language for which a backend has been built, currently only c++), using a specified parsing algorithm (currently 4 kinds of LR parser have been implemented). This parser can then be augmented with rule handling to build an abstract structure representing the data, to do computations immediately, or anything else you can imagine. Its principle is very similar to that of well-known tools such as yacc or bison, with the difference that Parsodus has a simpler input format and does not depend on language-specific actions being specified in the configuration file. It uses a programming language independent description of the grammar, along with optional naming for the rules, so that the same parser specification can be reused across different programming languages.

This project came into existence as an application exercise in a course on languages and Turing machines at the University of Antwerp, and can be considered a continuation of Lexesis (a lexical analyser generator).

Requirements

git
CMake 3.2.2+
Boost variant header library (needed for mstch)
Doxygen (optional, needed for building documentation)

For those still on Ubuntu Trusty, the default CMake version is 2.8.12, which is older than the required 3.2.2. A PPA is available with a more up-to-date version.

Run

sudo apt-get update && sudo apt-get -y install software-properties-common; \
sudo add-apt-repository -y ppa:george-edison55/cmake-3.x; \
sudo apt-get update && sudo apt-get install -y cmake

to get this newer version.

Boost variant can be installed on an Ubuntu machine by running

sudo apt-get update && sudo apt-get -y install libboost-dev

Used dependencies

The following dependencies are downloaded automatically with git during the build:

mstch
optparse
g3log
Lexesis

Building

Open a terminal in the source tree and run the following commands:

mkdir build
cd build
cmake ..
make
make install

This will place the Parsodus executable in the build/bin folder, with some extra data needed by Parsodus in build/share. You can now simply run ./bin/Parsodus with the arguments you like (see below and in the man pages for an overview).

If you want to build the documentation as well, simply run

make doc

The output should be located in build/doc, with the main html page in build/doc/html/index.html.

Running tests

To run the unit tests, simply run make test.
To run the tests for the generated parser, build the examples and run python3 ./run_tests.py in the project root.

Building examples

To build the examples, first run make install, then run cmake . && make examples. You will now find the examples built in the example subdirectory of the build folder.

Getting started

Now that Parsodus is successfully built and your terminal is in the build folder, it's time to generate the parser based on your input file.

The input file

Input files for Parsodus have a .pds extension and follow a few simple rules. Variables in the grammar match the regular expression <[a-zA-Z_][a-zA-Z0-9_]*>, and terminals use the same scheme, except with double quotes instead of angle brackets.

Furthermore, Parsodus uses a couple of key-value associations, including

parser: the parsing algorithm to use
terminals: a whitespace separated list of terminals
lexesis (optional): a reference to a Lexesis specification file. If given, terminals will be read from the Lexesis file, and should as such not be specified separately in this file.
precedence (optional): a list of entries, each consisting of left, right, or nonassoc followed by a whitespace separated list of terminals; entries higher up have a higher precedence
start: a variable to use as the start symbol
grammar: a list of rules (see below)

A grammar rule is a variable followed by ::= followed by a |-separated list of rule tails ended with a semicolon. A rule tail is a list of variables and terminals followed by an optional rule name of the form [name].

parser: lalr(1)
terminals:
    "A"
start: <s>
grammar:
    <s> ::= "A" [single]
          | "A" "A" [double]
          ;

This example builds an LALR(1) parser with two replacement rules, both starting from the start symbol <s>, named single and double respectively.

Conventionally, terminals are all caps, while variables are lowercase.

Comments are from a # to the end of the line.

Using the parser

Of course, how you use the generated parser highly depends on which backend you used to generate it. For the default c++ backend however, the easiest way of getting to know the parser is probably having a look at the class definition in the generated header file, usually named <Parsername>.h. In general, there should be some way to run the parser, along with user defined actions, and get back the generated structure or abstract syntax tree.

More examples

More examples can be found in the examples subdirectory. Go ahead and have a look at them, and feel free to play around and experiment with them.

Authors

Thomas Avé
Robin Jadoul
Kobe Wullaert