Let’s look at the next contact in our email list:


This one just has an email address. So the email name portion is optional. In other words, a valid full email address has an optional encapsulated email name followed by spaces followed by an encapsulated email address.

We can represent this in Treetop as follows:

grammar EmailList
  rule full_email_address
    optional_email_name [ ]* '<' email_address '>'
  end
  rule optional_email_name
    '"' email_name '"' / ''
  end
  rule email_name
    # ...
  end
  rule email_address
    # ...
  end
end

In a Treetop grammar, / separates two options. The grammar tries to match the first option. If that fails, it goes on to the second option. The second option in our optional_email_name rule is just two single quotes, meaning the empty string. (You can have more than two options if you want.)
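To make the "try the first option, then the second" behaviour concrete, here is a minimal plain-Ruby sketch. It is not how Treetop is implemented, and the names QUOTED_NAME, EMPTY and ordered_choice are made up for this illustration, as is the someone@example.com address.

```ruby
# Plain-Ruby sketch of ordered choice (an illustration, not Treetop's code).
# Each option is a lambda that returns the text it matched at the start of
# the input, or nil when it does not match.
QUOTED_NAME = ->(input) { input[/\A"[^"]*"/] }
EMPTY       = ->(_input) { '' } # like '' in the grammar: always matches

# Try the options from left to right and return the first successful match.
def ordered_choice(input, options)
  options.each do |option|
    matched = option.call(input)
    return matched unless matched.nil?
  end
  nil
end

OPTIONAL_EMAIL_NAME = [QUOTED_NAME, EMPTY].freeze

# An address with a quoted name, then a hypothetical one without a name.
puts ordered_choice('"Jena L. Dovie" <jdovie_qs@agora.bungi.com>', OPTIONAL_EMAIL_NAME).inspect
puts ordered_choice('<someone@example.com>', OPTIONAL_EMAIL_NAME).inspect
```

Order matters in this sketch: if EMPTY were listed first it would always win and the quoted name would never be tried, which is why the empty string goes last.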

The key thing to remember, when you are writing grammars with options, is that there needs to be some text or pattern that clearly identifies which option is the correct one. Just like the real-time translator at the UN, the parser needs to grab some word or character that identifies the correct option for this text. In our case, the double-quote clearly identifies the first option. If there is no double-quote, we know we should be using the second option.

Notice that the Syntax Node structure has changed. That means the way our email list program reads the parsed text has to change too. How would you change the following line?

puts "You said the email name #{result.email_name.text_value} and the email address #{result.email_address.text_value}."

If you guessed result.optional_email_name.email_name.text_value, good choice! However, we have a problem. Will there be a Syntax Node called email_name when there is no double-quoted email name? In fact there won’t. You can try it out and see.

Things now get messy, so the Ruby way is to bury the messiness back in the class. It would be nice if we could modify the optional_email_name class so that it returned an empty string when there is no email name.

Guess what! Treetop allows this! We can add Ruby code right into the grammar, and it will be included in our resulting tree of Syntax Node instances. Modify the grammar as follows (noting the round brackets, which make the Ruby block apply to the whole choice rather than only to its last option):

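rule optional_email_name
  ('"' email_name '"' / '' ) {
    def email_name_text_value
      # The empty-string option is a terminal node with no email_name child.
      self.terminal? ? '' : email_name.text_value
    end
  }
end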

And modify the email list program as follows:

puts "You said the email name #{result.optional_email_name.email_name_text_value} and the email address #{result.email_address.text_value}."
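If you want to see the idea without running Treetop, here is a rough plain-Ruby sketch. FakeNode and the OptionalEmailName module are hypothetical stand-ins invented for this example, not Treetop classes. They only mimic the two shapes the optional_email_name node can take.

```ruby
# Hypothetical stand-in for a Syntax Node: it has some text and, optionally,
# a child node called email_name. This is not a Treetop class.
FakeNode = Struct.new(:text_value, :email_name) do
  # In this sketch, a node without an email_name child counts as terminal.
  def terminal?
    email_name.nil?
  end
end

# The method we bury in the node, so the caller never has to check which
# option matched.
module OptionalEmailName
  def email_name_text_value
    terminal? ? '' : email_name.text_value
  end
end

# Shape 1: the quoted-name option matched, so there is an email_name child.
with_name = FakeNode.new('"Jena L. Dovie"', FakeNode.new('Jena L. Dovie'))
with_name.extend(OptionalEmailName)

# Shape 2: the empty-string option matched, so there is no child at all.
without_name = FakeNode.new('')
without_name.extend(OptionalEmailName)

[with_name, without_name].each do |node|
  puts "You said the email name #{node.email_name_text_value}."
end
```

The calling code is the same for both nodes. The check for a missing name lives in one place, inside the node, instead of being repeated wherever the name is read.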

Give it a spin with our first two full addresses:

"Jena L. Dovie" <jdovie_qs@agora.bungi.com>

Works? Good! What about the two addresses together on one line? Let’s do that in our next tutorial…
