Complete README?

Robin Jadoul 2016-05-29 19:50:44 +02:00
parent 3b69ec581d
commit 2d1c077303
1 changed files with 14 additions and 0 deletions

@@ -89,6 +89,20 @@ When we run `A` through the generated lexer, it will return that it's a `CAPITAL
### Regular expressions
Most POSIX regular expression features have been implemented, with some notable exceptions:
* There is no way to match the beginning or ending of a line (`^` or `$`)
* Repetition using `{` and `}` is not (yet) supported
Note that escape characters do not exist inside character classes. A `-` that is part of the class must therefore be specified as the very first or very last element of the class (it cannot be used as the endpoint of a range), and a `]` must be specified as the first element. A `^`, however, should only be used as the first element when it is meant as the inversion modifier for the character class.
When needed (for example at the beginning of a rule, where whitespace is stripped while the input rules are read), a space can be specified as `\s` for convenience. Alternatively, `[ ]`, a character class containing only a space, can be used as well.
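The character class rules above can be made concrete with a small sketch. This is not the generator's actual parser: `parseCharClass` is a hypothetical helper written only to illustrate the conventions. It also assumes that a `]` placed directly after an inverting `^` still counts as the first element, which the text above does not state.

```cpp
#include <bitset>
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical helper, for illustration only: expands a character class
// such as "[a-z]" into the set of characters it matches.
std::bitset<256> parseCharClass(const std::string& pattern) {
    if (pattern.empty() || pattern[0] != '[')
        throw std::invalid_argument("expected '['");

    std::size_t i = 1;
    bool invert = false;
    // A leading '^' is always the inversion modifier, never a literal.
    if (i < pattern.size() && pattern[i] == '^') {
        invert = true;
        ++i;
    }

    std::bitset<256> set;
    bool first = true;
    // A ']' in first position is a literal; any later ']' closes the class.
    while (i < pattern.size() && (first || pattern[i] != ']')) {
        unsigned char lo = pattern[i];
        // "x-y" is a range only if '-' is not the last element of the class.
        // Endpoints are assumed never to be '-' themselves.
        if (i + 2 < pattern.size() && pattern[i + 1] == '-' && pattern[i + 2] != ']') {
            unsigned char hi = pattern[i + 2];
            if (hi < lo) throw std::invalid_argument("reversed range");
            for (unsigned c = lo; c <= hi; ++c) set.set(c);
            i += 3;
        } else {
            // Single literal, including a '-' in first or last position.
            set.set(lo);
            ++i;
        }
        first = false;
    }
    if (i >= pattern.size())
        throw std::invalid_argument("missing closing ']'");

    return invert ? ~set : set;
}
```

With this sketch, `[-a]` and `[a-]` both contain a literal `-`, `[]a]` contains a literal `]`, and `[^0-9]` matches everything except digits.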
### Using the lexer
Of course, how you use the generated lexer depends heavily on the backend you used to generate it. For the default C++ backend, however, the easiest way of getting to know the lexer is probably to have a look at the generated header file, usually named *<Lexername>.h*.
In general, the generated code should define the tokens in some way, and it should offer some way to generate a list of tokens (or to get each token separately).
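To give a feel for that general shape, here is a hand-written stand-in, not generated output. Every name in it (`TokenType`, `Token`, `DemoLexer`, `nextToken`, `done`, `tokenize`) is an assumption made for illustration; the header produced by the C++ backend may look quite different, so treat this only as a sketch of "token definitions plus a way to pull tokens one by one or as a list".

```cpp
#include <cctype>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical token definitions; a generated header would derive these
// from the rules file instead.
enum class TokenType { DIGIT, OTHER };

struct Token {
    TokenType type;
    std::string content;
};

// Hypothetical stand-in for a generated lexer class.
class DemoLexer {
public:
    explicit DemoLexer(std::string input) : input_(std::move(input)) {}

    // True once all input has been consumed.
    bool done() const { return pos_ >= input_.size(); }

    // Returns the next single-character token; call only while !done().
    Token nextToken() {
        char c = input_[pos_++];
        TokenType type = std::isdigit(static_cast<unsigned char>(c))
                             ? TokenType::DIGIT
                             : TokenType::OTHER;
        return Token{type, std::string(1, c)};
    }

private:
    std::string input_;
    std::size_t pos_ = 0;
};

// Collects every token into a list, the "all at once" style of use.
std::vector<Token> tokenize(const std::string& text) {
    DemoLexer lexer(text);
    std::vector<Token> tokens;
    while (!lexer.done()) tokens.push_back(lexer.nextToken());
    return tokens;
}
```

Calling `nextToken()` in a loop suits streaming use, while a helper like `tokenize` is convenient when the whole input is already in memory.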
## More examples
More examples can be found in the *examples* subdirectory; go ahead and have a look at them.
Feel free to play around and experiment with them.