Recently I've had a few ideas that could improve pydsl but unfortunately I have no time to implement them...

Alphabets

I think that alphabets deserve their own class and should become part of pydsl. A lexer reads the user input (assuming an encoding) and generates a list of tokens. Its definition includes a list of valid tokens (symbols), which can be recognized with a Finite State Machine.
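To make this concrete, here is a minimal sketch of what such an Alphabet class could look like. This is not pydsl's actual API: the Alphabet and Token names are mine, and the longest-match loop over regular expressions simply plays the role of the Finite State Machine.

    import re
    from collections import namedtuple

    Token = namedtuple("Token", ["type", "value"])

    class Alphabet:
        """Toy alphabet definition: a list of named symbols."""
        def __init__(self, symbols):
            # symbols: list of (name, regex) pairs describing the valid tokens
            self.symbols = [(name, re.compile(pattern)) for name, pattern in symbols]

        def lex(self, text):
            """Yield tokens using the longest match; fail on unknown input."""
            pos = 0
            while pos < len(text):
                best = None
                for name, regex in self.symbols:
                    m = regex.match(text, pos)
                    if m and (best is None or len(m.group()) > len(best[1])):
                        best = (name, m.group())
                if best is None:
                    raise ValueError("invalid symbol at position %d" % pos)
                yield Token(best[0], best[1])
                pos += len(best[1])

    digits = Alphabet([("INT", r"[0-9]+"), ("PLUS", r"\+")])
    print(list(digits.lex("1+2")))
    # [Token(type='INT', value='1'), Token(type='PLUS', value='+'), Token(type='INT', value='2')]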

That list is in fact an alphabet definition. The same process is repeated by programs/libraries at every layer:
  • The hard disk contains a list of 1s and 0s.
  • From the (0,1) alphabet, libraries create a list of characters according to an encoding (e.g. ASCII).
  • That list of characters is the input of the lexer. The lexer then generates a list of tokens, which feeds the parser (see the toy example after this list).
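The whole pipeline from bytes to tokens fits in a few lines. This is just an illustration, with a throwaway regex lexer standing in for a real one:

    import re

    raw = bytes([0x31, 0x2B, 0x32])           # storage layer: the bytes for "1+2"
    chars = raw.decode("ascii")               # (0,1) alphabet -> characters, via an encoding
    tokens = re.findall(r"[0-9]+|\+", chars)  # character alphabet -> token list for the parser
    print(tokens)                             # ['1', '+', '2']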
That application of the same principle in different places makes the alphabet a good candidate for an abstraction. Here is the list of planned functions in pydsl (a rough sketch of their behaviour follows the list):
  • lexer(ad, input): Generates a tokenlist from a string (a special case of translate)
  • translate(ad, tokenlist) -> tokenlist: Generic translator
  • checker(ad, input): Tests whether a given string or tokenlist belongs to the given definition
  • guess(input, [gd]): Returns a list of alphabets that are compatible with the input. It also accepts grammar definitions
  • distance(ad, input1, input2): Returns the distance between two inputs according to an alphabet
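As a rough sketch of how these could behave, here are toy versions in which an alphabet definition (ad) is just a set of valid symbols. The bodies, and the Hamming metric used for distance, are my assumptions, not the planned pydsl implementation:

    def checker(ad, input):
        # ad: a set of valid symbols (toy alphabet definition)
        return all(symbol in ad for symbol in input)

    def guess(input, definitions):
        # return every definition compatible with the input
        return [ad for ad in definitions if checker(ad, input)]

    def distance(ad, input1, input2):
        # one possible metric: Hamming distance over the alphabet's symbols
        if not (checker(ad, input1) and checker(ad, input2)):
            raise ValueError("input does not belong to the alphabet")
        if len(input1) != len(input2):
            raise ValueError("this toy metric needs equal-length inputs")
        return sum(a != b for a, b in zip(input1, input2))

    binary = {"0", "1"}
    print(checker(binary, "0110"))              # True
    print(guess("0110", [binary, set("abc")]))  # only the binary alphabet matches
    print(distance(binary, "0110", "0011"))     # 2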

In the next post I'll talk about the other idea: first-order logic in grammar definitions.