Schemers - Parsers

Published December 16th, 2016

Ready for the next lesson? We'll be writing our parser now, the Read in the REPL. We'll be using the nom library for parsing. It's what's known as a parser combinator library. By building small parsers that do one thing like parse a number or one that parses a string we can build them up into larger ones, for instance one that could parse a string or a number. It uses macros to build parsers. We'll start small and parse a procedure something like:

(+ 3 4 5)

Into a data structure we can use. By no means will our parser be complete but it will give us an introduction to building one and using it to place data into a data structure. We'll also likely being changing these data structures as needed but it's a start!

We'll be covering some of the following things in our code:

Alright then, let's get started by first making our code from before more idiomatic Rust!

Making our code a little more Rusty

If you remember our code from Exercise 1 it looked like this:

extern crate rustyline;
fn main() {
    let mut done = false;
    let mut reader = rustyline::Editor::<()>::new();
    while !done {
        match reader.readline(">> ") {
            Ok(line) =>
                if line.trim() == "(exit)" {
                    done = true;
                } else {
                    println!("{}",line)
                },
            Err(e) => {
                use rustyline::error::ReadlineError::*;
                match e {
                    Eof | Interrupted => done = true,
                    _ => println!("Couldn't readline. Error was: {}", e),
                }
            }
        }
    }
}

This has one thing that really isn't Rust like and it's the use of done and the while loop. Since we know this is a loop we want to go forever unless certain conditions are met we should structure it that way rather then having it do a boolean comparison each time to figure it out!

We'll be using two new reserved keywords to accomplish this, loop and break. loop does just that, loop the code inside it forever. For example:

loop {
  println!("I'm a loop looping forever");
}

Will keep printing that forever till either your terminal crashes or you forcefully stop the program from executing. break causes whatever type of loop it's in to stop executing and go to the next instruction in your code. With that in mind let's rewrite what we have so far:

extern crate rustyline;
fn main() {
    let mut reader = rustyline::Editor::<()>::new();
    loop {
        match reader.readline(">> ") {
            Ok(line) =>
                if line.trim() == "(exit)" {
                    break;
                } else {
                    println!("{}",line)
                },
            Err(e) => {
                use rustyline::error::ReadlineError::*;
                match e {
                    Eof | Interrupted => break,
                    _ => println!("Couldn't readline. Error was: {}", e),
                }
            }
        }
    }
}

This code does the exact same thing as before except it uses loop where the while statement was, it gets rid of the variable done, and replaces the statements where it was changed to say true with break statements instead. Why would we do this? Why is this more Rust like?

For one, the use of loop explicitly tells us this will continue on for some time and keep executing everything inside. Uses of break tell us that here and only here are the conditions to cause this loop to exit no matter what. Rust code tends to explicit and clearly stating your intention in your code is cleaner and easier to understand. This also removes an unnecessary mutable variable from the scope of the program. Limiting the amount of variables that are mutable is a good thing. You'll find that often you won't need to have mutable data to solve your problems. While this was a small amount of code I could have changed the true to false by accident and the code would still compile and it becomes a runtime bug, something not even Rust can capture. Using break makes it harder to change it's functionality without the compiler complaining or you explicitly changing it on purpose.

I'm going to do one more thing to make the code a bit more clear:

match e {
    // Close the program on a Ctrl-C or Ctrl-D
    Eof | Interrupted => break,
    _ => println!("Couldn't readline. Error was: {}", e),
}

The // is how we do comments in Rust. Anything after them on the line are ignored. I've added this one here to make it a bit more clear as to what could trigger an Eof or Interrupted error.

Getting nom setup

We need another library! That means we need to open up Cargo.toml again. It should look sort of like this right now:

[package]
name = "schemers"
version = "0.1.0"
description = "Scheme interpreter written in Rust!"
authors = ["Michael Gattozzi <mgattozzi@gmail.com>"]
readme = "README.md"
license = "MIT OR Apache-2.0"
homepage = "https://github.com/mgattozzi/schemers"
repository = "https://github.com/mgattozzi/schemers"

[dependencies]
rustyline = "1.0.0"

Add this line under rustyline:

nom = "~2.0.0"

This is saying, "Use the nom library v2.0.0 but if there is any version between v2.0.0 and v2.1.0 (not including v2.1.0) use that". This means we get some bug fixes and we won't use a new minor version that might break our code a little. Cool! Now let's import it by putting this at the top of our main.rs file:

#[macro_use]
extern crate nom;

This is saying import the nom crate and it tells the compiler we'll also be using the macros that are exported from it. If you don't place #[macro_use] there then we can't use them and then writing a parser with nom would be possible but incredibly hard. Keep your main.rs file open still. We need to do one last thing.

Mods mods mods get your mods here

Code usually is easier to reason about if we split it up in to separate files. In this case we'll be building our parser in a separate file called parser.rs. However, we need to tell the compiler we'll be using this in our project. This is where modules come in! By defining a module in our main file we can then import that module with use statements anywhere else in our program. Neat huh? First create an empty file under src called parser.rs. Now in main.rs add this line somewhere near the top:

mod parser;

This lets Rust know to look for the parser.rs file when compiling. It now will treat anything inside the file as part of the parser module. I generally will put my module definitions after my external imports and before use statements in a file but it doesn't matter what the order is since it really just comes down to personal preference. With all of that done the header of your main.rs file should look like this:

extern crate rustyline;
#[macro_use]
extern crate nom;

mod parser;

If you want to make sure it works compile it using cargo build in fact if you want to see it fail delete parser.rs and build the project. The compiler will spit out a warning saying it was unable to find the file. If everything is building then we're ready to actually start building our parser!

Macros and Parsing

Remember when I briefly touched on macros for println!()? Well we're about to use a lot of macros and going a little more in depth about how they work will be beneficial. We won't be writing macros which can be kind of hard if you're unfamiliar with all of the syntax. We'll just be covering how they work at a higher level!

What is a macro exactly? It's code that generates code at compile time. Weird right? You can think of it as a set of rules that given input expands out to the proper form in code. Let's look at a simple example, the try!() macro. try!() works by taking a Result type as input. If it's not an error it unwraps the data and if it is then it returns it early. This means your function will also implement Result and the error type used when using try!(). We'll use opening and reading from a file as an example and we'll look at the macro and expanded version.

Let's assume we have a file with some text in it, we'll have it be "Hello!" for this example. Here's a simple program that uses a function read_file that is given a &str, a reference to a String or a non growable String is a way to think about it, and opens up the file and reads it into a String and returns that String.

use std::path::Path;
use std::fs::File;
use std::io::{Read, Error};

fn main() {
    let output = read_file("example");
    match output {
        Ok(x) => println!("The file contained: {}", x),
        Err(e) => println!("Failed to read file. Error was: {}", e),
    }
}

fn read_file(file: &str) -> Result<String, Error> {
    let path = Path::new(file);
    let mut handle = try!(File::open(path));
    let mut buffer = String::new();
    try!(handle.read_to_string(&mut buffer));
    Ok(buffer)
}

You'll notice I used try!() twice here. Think of these as operations that could fail. In this case they're both I/O errors that could occur. Maybe the file got deleted while reading, isn't there, or there was some kind of corruption. The Error type from std::io will contain the appropriate error that occurred if it does. That means it won't execute anything else if it fails. Why though? Couldn't we just do this with regular Rust code? Yes we can! Here is what it looks like all expanded out:

use std::path::Path;
use std::fs::File;
use std::io::{Read, Error};

fn main() {
    let output = read_file("example");
    match output {
        Ok(x) => println!("{}", x),
        Err(e) => println!("Failed to read file. Error was: {}", e),
    }
}

fn read_file(file: &str) -> Result<String, Error> {
    let mut buffer = String::new();
    let path = Path::new(file);
    let mut handle = match File::open(path) {
        Ok(a) => a,
        Err(e) => return Err(e),
    };

    match handle.read_to_string(&mut buffer) {
        Ok(a) => a,
        Err(e) => return Err(e),
    };

    Ok(buffer)
}

Both now do a match on the function. In each of them we use return to tell Rust to stop execution and return this value. Often you shouldn't need it since the last expression without a semicolon is the return value. However, in this case we do need it. If we had to do this expansion for every single Result type though it would be a pain. That's why we have try!().

You might have heard about the new ? syntax that can be used in place of try!(). It looks like this if being used:

fn read_file(file: &str) -> Result<String, Error> {
    let mut buffer = String::new();
    let path = Path::new(file);
    let mut handle = File::open(path)?;
    handle.read_to_string(&mut buffer)?;

    Ok(buffer)
}

I personally like it, others do not, really just use whichever one works for you or is more readable since it does the same thing. It was added syntax though because this pattern was so common in Rust programs.

Macros allow flexible code generation based off sets of rules defined in them meaning we can have it work differently depending on the context and inputs! Macros are quite powerful and if you're willing to learn them you can do some real neat things. We won't be writing our own, for the near future at least. Instead let's use some to start parsing!

Rustacean's First Parser

Let's write a simple parser that takes input and returns a String type. In nom everything is read in as raw bytes, represented as an array of bytes or &[u8]. Let's actually start writing some code.

First you need to put this at the top of parser.rs so that it can use functions from nom that aren't macros:

use nom::*;

Now we can write our first parser:

named!(string<&[u8], String>,
    do_parse!(
        word: ws!(alphanumeric) >>
        (String::from_utf8(word.to_vec()).unwrap())
    )
);

You might be confused now. That's totally fine. We're going through each part step by step. First off the named!() macro. Our first input is saying what our parsing function will be called. We'll call it string and the next part between the <> is the type returned. The &[u8] is saying to nom, "Look, this will be used and combined with others and so it might actually have more data to parse. Store it as part of the return value!" The other part is the type of data returned from being parsed so in this case we're returning a String.

If that seems a bit confusing, that's okay. It took me a bit to understand that much myself. It might be easier for you to understand when we make a test case to make sure this parser is working. Next up we're using the do_parse!() macro. If you're familiar with Haskell it acts loosely like do notation. If you don't know what that is don't worry. We can use this macro to store the values of parsed results for use later or just completely eat up a certain portion of input and move on. In this case take a look at this line here:

word: ws!(alphanumeric) >>

This is saying store the result of the parsing that is caused by everything between : and >> and store it into word. We can call this variable whatever we want but for what's going on here this makes the most sense. Now what's this ws!() macro? Well it just eats up white space and returns the result of what's inside. alphanumeric is an inbuilt nom parser that will return the characters A-Z, a-z, and 0-9. Because of the ws!() it will stop once it hits a whitespace character or a non alphanumeric character, whichever comes first.

The >> is telling the do_parse!() macro, "Hey this is the end of this parser go to the next line." The stuff inside of the () on this line

(String::from_utf8(word.to_vec()).unwrap())

is what gets returned by the function the macro will generate! In this case we're saying take whatever was stored in word turn it from a &[u8] to a vector of bytes, Vec<u8> (Vec uses slices inside of it so the promotion from [] to Vec is guaranteed to work, no Option or Result needed!), and then turn that into a String and unwrap it even if it might fail. Now we'll make this a little bit more robust later but in this case since we've been transforming data between compatible types it's not likely to fail. I wouldn't normally recommend that you do this. It's just easier to do to work on an example. Cool so this should take data in and turn it into a String. Let's make some tests to make sure it works!

We're going to utilize a really cool aspect of Rust. You can write in file unit tests as well as integration tests or documentation tests and they'll run when you use the cargo test command. It's amazing to be able to do that. We're going to write a test to make sure our string parser works. It should take in any input, get rid of the white space and turn the first word into a string.

Put this at the bottom of your file:

#[test]
fn string_parser() {
    let comp_string = String::from("hi");

    match string(b"hi") {
        IResult::Done(_,s) => assert_eq!(comp_string, s),
        _ => panic!("Failed to parse string")
    }

    match string(b"   hi    ") {
        IResult::Done(_,s) => assert_eq!(comp_string, s),
        _ => panic!("Failed to parse string")
    }

    match string(b"hi      ") {
        IResult::Done(_,s) => assert_eq!(comp_string, s),
        _ => panic!("Failed to parse string")
    }

    match string(b"        hi") {
        IResult::Done(_,s) => assert_eq!(comp_string, s),
        _ => panic!("Failed to parse string")
    }
}

Let's walk through it so you know what's going on. First off look at the #[test] annotation. What it's telling the Rust compiler is,"Hey, look this function below me is a test and so you should ignore it if you're not running tests." However if it is running tests it compiles it and runs it. If nothing causes the function to terminate unexpectedly then it passes otherwise it fails. It will also fail if any assertions do. We have three types of assertions:

With them you can check that the behavior of your program is working. Let's look at the four test cases here. They're all testing that the input is equal to a String with the value of "hi". There are four different variations of "hi" being input to our string function. You might notice the b before the quotation marks. This tells Rust that this is a byte string not an &str. All of the parsers are expecting input as bytes. Now we check that it's parsed successfully. nom has an IResult enum and that's how we know if something parsed successfully or not. We only want it to pass if it finished parsing. So we match it on the Done variant of IResult. Since it has two types we only want the part parsed as a String. We use the _ symbol to discard the other return type and then assert that the string returned from the parser is indeed just "hi". On any other type of IResult we've failed and so we tell the compiler to panic and fail that test.

Our four cases vary with no white space, white space on both sides of the word, on just the right side and just the left side. Let's see if our parser works like we expected. Run cargo test. You should see something like this:

michael@eva-01 ~/Code/Tutorials/schemers (git)-[parser] % cargo test
   Compiling schemers v0.1.0 (file:///home/michael/Code/Tutorials/schemers)
    Finished debug [unoptimized + debuginfo] target(s) in 1.33 secs
     Running target/debug/deps/schemers-9f33802606b3ac13

running 1 test
test parser::string_parser ... ok

test result: ok. 1 passed; 0 failed; 0 ignored; 0 measured

Sweet! Our string parser is working and it ignores whitespace! Now we need to build parsers for procedures we can call. There are a few different types in scheme, Primitives, Special Forms, and then a User procedure. What we're doing here is setting up parsers for them. The functions and how they act are part of the Evaluator. Now we could parse things like "if" or "+" as a string but that's not as easy to reason about and harder to make sure we match all possibilities of a possible special form for instance further on down the line. So let's create a way to represent them. We'll make an enum for procedures which we'll call Op and it will have values that will store other enums. Seems weird? Let me show you what I mean:

#[derive(Debug, PartialEq, Eq)]
enum Op {
    Primitive(Prim),
    SpecialForm(SForm),
    User(String),
}

#[derive(Debug, PartialEq, Eq)]
enum Prim {
    Add,
    Sub,
    Mul,
    Div,
}

#[derive(Debug, PartialEq, Eq)]
enum SForm {
    If,
    Begin,
    Define,
    Lambda,
    Let,
}

We've defined the enum Op and it has three possible values:

You might have noticed this at the top of each of these:

#[derive(Debug, PartialEq, Eq)]

These are traits that can be generically derived. Rather than me having to write out how to test for equality the Rust Compiler can figure it out for me by implementing PartialEq and Eq. We'll be going more into traits when we implement our own error handling but if you're familiar with Object Oriented Programming they're like interfaces. You can read more about them here if you're interested.

Remember Option and Result? Those were enums! This means we can use match on an Op enum and have it act differently depending on what needs to be done later on! Now each of those contains some internal type. For Primitive there's an enum Prim that contains a list of types to represent the different Scheme primitives, in this case +, -, * and /. This isn't all of the primitives in Scheme but it's small enough of a list to get the point across of how it works. The same thing with SpecialForm. It contains an enum SForm that could be any of the ones below it. In this case it's just a capitalized version of how you would type it out in Scheme. For the User though we don't really know what they would have it be. We can't really type check it so we'll just make it a String for now. Right now we're focusing on parsing and not how the evaluation works. We'll figure out how to deal with these values later! Right now we just need a way to represent them. Let's build three different parsers for each type in Op and have them return a corresponding Op value. Here's what you should add to your code:

named!(specialform<&[u8], Op>,
    alt!(
        map!(tag!("if"),     |_| Op::SpecialForm(SForm::If))     |
        map!(tag!("begin"),  |_| Op::SpecialForm(SForm::Begin))  |
        map!(tag!("define"), |_| Op::SpecialForm(SForm::Define)) |
        map!(tag!("lambda"), |_| Op::SpecialForm(SForm::Lambda)) |
        map!(tag!("let"),    |_| Op::SpecialForm(SForm::Let))
    )
);

named!(primitive<&[u8], Op>,
    alt!(
        map!(tag!("*"), |_| Op::Primitive(Prim::Mul)) |
        map!(tag!("+"), |_| Op::Primitive(Prim::Add)) |
        map!(tag!("-"), |_| Op::Primitive(Prim::Sub)) |
        map!(tag!("/"), |_| Op::Primitive(Prim::Div))
    )
);

named!(user<&[u8], Op>,
    do_parse!(
        user_proc: string >>
        (Op::User(user_proc))
    )
);

What's this map!() stuff? What's a tag!()? What's this |_| and |? Also alt!()? Really the only thing that looks like what we built before is the user parser! Let's break these new macros and syntax down. Do note the syntax inside these macros is specific to nom and not Rust if that wasn't clear.

Let's start with the user parser. It's like string except we're actually using the string parser here to store a String into user_proc! We then create an Op of the User type with user_proc as it's internal value! If you look user returns an Op here. What's really cool is we used a parser we wrote in another parser we wrote! This is what it means to use a parser combinator library. We're not done yet though.

Let's look at primitive. It's also returning an Op but briefly looking over the code we can see it's returning a Primitive rather than a User. Well what's the alt!() doing? It's matching on alternate possibilities! A primitive could be a * or a + or a - or a / and so it matches on whichever one it is. tag!() is just matching on some string put in it. In this case it's single characters. What's this map!() though? Well it's simply saying, if matched by the first argument do the function on the right. We're passing it a lambda with no inputs denoted by |_| and transforming the value into something else. In this case we're just returning the value on the right of the lambda with no transformations or anything. You might have noticed the | this is how alt!() knows how to separate the alternate parser choices. Knowing that let's look at specialform. It's acting exactly the same as primitive but it's instead matching on special form keywords and returning the value corresponding to those keywords.

Make sense? Well let's write some test cases so you can see them in action and maybe it'll make more sense if you're a bit confused.

#[test]
fn user_op_parser() {
    match user(b"userop") {
        IResult::Done(_,s) => assert_eq!(Op::User(String::from("userop")), s),
        _ => panic!("Failed to parse userop")
    }
}

#[test]
fn specialform_op_parser() {
    match specialform(b"if") {
        IResult::Done(_,a) => assert_eq!(Op::SpecialForm(SForm::If), a),
        _ => panic!("Failed to parse special form")
    }
    match specialform(b"begin") {
        IResult::Done(_,a) => assert_eq!(Op::SpecialForm(SForm::Begin), a),
        _ => panic!("Failed to parse special form")
    }
    match specialform(b"define") {
        IResult::Done(_,a) => assert_eq!(Op::SpecialForm(SForm::Define), a),
        _ => panic!("Failed to parse special form")
    }
    match specialform(b"lambda") {
        IResult::Done(_,a) => assert_eq!(Op::SpecialForm(SForm::Lambda), a),
        _ => panic!("Failed to parse special form")
    }
    match specialform(b"let") {
        IResult::Done(_,a) => assert_eq!(Op::SpecialForm(SForm::Let), a),
        _ => panic!("Failed to parse special form")
    }
}

#[test]
fn primitive_op_parser() {
    match primitive(b"+") {
        IResult::Done(_,a) => assert_eq!(Op::Primitive(Prim::Add), a),
        _ => panic!("Failed to parse primitive")
    }
    match primitive(b"*") {
        IResult::Done(_,a) => assert_eq!(Op::Primitive(Prim::Mul), a),
        _ => panic!("Failed to parse primitive")
    }
    match primitive(b"-") {
        IResult::Done(_,a) => assert_eq!(Op::Primitive(Prim::Sub), a),
        _ => panic!("Failed to parse primitive")
    }
    match primitive(b"/") {
        IResult::Done(_,a) => assert_eq!(Op::Primitive(Prim::Div), a),
        _ => panic!("Failed to parse primitive")
    }
}

Given specific inputs we're expecting certain outcomes like before. If you run the code you should get something like this:

michael@eva-01 ~/Code/Tutorials/schemers (git)-[parser] % cargo test
   Compiling schemers v0.1.0 (file:///home/michael/Code/Tutorials/schemers)
    Finished debug [unoptimized + debuginfo] target(s) in 5.4 secs
     Running target/debug/deps/schemers-9f33802606b3ac13

running 4 tests
test parser::string_parser ... ok
test parser::user_op_parser ... ok
test parser::specialform_op_parser ... ok
test parser::primitive_op_parser ... ok

test result: ok. 4 passed; 0 failed; 0 ignored; 0 measured

Looks like our mapping and use of alt!() worked out really well. Our code now parses those words as expected. Now what if we actually wanted to just parse for all types of operations simultaneously? Let's write a parser that combines all of them!

named!(op<&[u8], Op>,
    alt!(
        ws!(specialform) |
        ws!(primitive)   |
        ws!(user)
    )
);

This parser tries the three we just made to return an Op and ignores whitespace since someone might do something like this which is valid syntax:

(    +
  1 2 3
)

Let's create a test case for them.

#[test]
fn op_parser() {
    match op(b"  +   ") {
        IResult::Done(_,a) => assert_eq!(Op::Primitive(Prim::Add), a),
        _ => panic!("Failed to parse primitive")
    }
    match op(b"   let   ") {
        IResult::Done(_,a) => assert_eq!(Op::SpecialForm(SForm::Let), a),
        _ => panic!("Failed to parse special form")
    }
    match op(b"   myprocedure   ") {
        IResult::Done(_,a) => assert_eq!(Op::User(String::from("myprocedure")), a),
        _ => panic!("Failed to parse user op")
    }
}

Notice that we're only using the op parser here to test some of the things we just did with individual parsers and we added whitespace just to make sure the ws!() invocation was working. Let's test it out!

michael@eva-01 ~/Code/Tutorials/schemers (git)-[parser] % cargo test
   Compiling schemers v0.1.0 (file:///home/michael/Code/Tutorials/schemers)
    Finished debug [unoptimized + debuginfo] target(s) in 5.70 secs
     Running target/debug/deps/schemers-9f33802606b3ac13

running 5 tests
test parser::op_parser ... ok
test parser::primitive_op_parser ... ok
test parser::specialform_op_parser ... ok
test parser::string_parser ... ok
test parser::user_op_parser ... ok

test result: ok. 5 passed; 0 failed; 0 ignored; 0 measured

It works! I'm hoping you're beginning to see the power of being able to build larger parsers from smaller ones. Eventually we'll just want to call one parser on all input to have it build out our Abstract Syntax Tree, commonly referred to as AST, so that we can then evaluate it.

We're almost done for this portion of the tutorial let's actually get a parser built for a procedure! We're going to use a data structure that's not completely how we're going to represent things in the future. For now it's just to show you the results of everything we just built all combined together!

First off we'll need a struct. It's a way to have multiple fields wrapped up into one and they can have different types. Let's take a look at what we'll be using here:

#[derive(Debug, PartialEq, Eq)]
struct Procedure {
    op: Op,
    args: Vec<String>
}

We're defining a struct and calling it a Procedure. It has two fields op and args. If we make a Procedure we need op to be of type Op and args to be a vector of the String type.

Now let's build our parser for it:

named!(procedure<&[u8], Procedure>,
    do_parse!(
        tag!("(")   >>
        op_type: op >>
        arguments: ws!(many0!(string)) >>
        tag!(")")   >>
        (Procedure { op: op_type, args: arguments  })
    )
);

Our procedure parser returns a Procedure struct. Here's how it works. First it eats the opening (. Notice how we didn't store the value at all here in this do_parse!()? Now we use our op parser and store the value in op_type. What about this line starting with arguments? Well first we want to get rid of any whitespace but what's this many0!()? It's saying we're going to use this parser on the inside as many times as possible but sometimes it might not return anything which is the 0 part. If we use many1!() we are saying match the parser on the inside once and possibly more then that. Since something like

(my-no-argument-procedure)

is possible in Scheme we need to account for the fact there might not always be arguments in a procedure call. Now we eat up the last thing in a procedure which is a ) then we create a Procedure struct and return it! Let's test it out to see if it works!

#[test]
fn procedure_test() {

    let procedure_num = Procedure {
        op: Op::Primitive(Prim::Add),
        args: vec!["1".to_string(), "2".to_string(), "3".to_string(), "4".to_string()]
    };

    let procedure_user = Procedure {
        op: Op::User(String::from("myprocedure")),
        args: Vec::new()
    };

    match procedure(b"(+ 1 2 3 4)") {
        IResult::Done(_,a) => assert_eq!(procedure_num, a),
        _ => panic!("Failed to parse primitive")
    }

    match procedure(b"(myprocedure)") {
        IResult::Done(_,a) => assert_eq!(procedure_user, a),
        _ => panic!("Failed to parse primitive")
    }
}

First we'll create our comparison objects to be used against the parser value and then test both of them. One has arguments the other does not and so we should be able to tell if the many0!() call worked as expected. Let's run those tests again!

michael@eva-01 ~/Code/Tutorials/schemers (git)-[parser] % cargo test
   Compiling schemers v0.1.0 (file:///home/michael/Code/Tutorials/schemers)
    Finished debug [unoptimized + debuginfo] target(s) in 1.80 secs
     Running target/debug/deps/schemers-9f33802606b3ac13

running 6 tests
test parser::procedure_test ... ok
test parser::specialform_op_parser ... ok
test parser::primitive_op_parser ... ok
test parser::op_parser ... ok
test parser::user_op_parser ... ok
test parser::string_parser ... ok

test result: ok. 6 passed; 0 failed; 0 ignored; 0 measured

It worked! We've built the beginning of our parser that actually can take a procedure and turn it into a Rust data type that we can then use to manipulate and evaluate later on. Pretty cool right?

Exercises

  1. We didn't implement all of the special form or primitive key words. Add the lists below to the proper enum and write test cases for them:

Primitive:

Special Form:

These aren't all of the rest of them but it should be enough for you to get a grasp of enums and writing test cases.

  1. Procedure has a field args that's just a vector of strings. This isn't really what we would want in a parser. There can be multiple types following it like numbers or other procedures. Create an enum Value that can handle String, Op, Procedure or integers (make it hold an i64 type). Modify Procedure's args field to hold a Vec<Value> instead and modify the parser to do that.

Conclusion

We covered a lot here. It might take you some time to digest that and it's fine. We covered Rust's struct and enum types, wrote test cases, built the beginnings of a parser and covered other smaller concepts along the way. I recommend hand typing these out rather then copying and pasting or just skimming over so you can get a feel for it. Also do the exercises. Two might be a bit more difficult but try it. You'd be surprised what you can accomplish. It's the only way you'll be able to learn a new language.

If you have questions you can either open up an issue on the repo itself or you can ping me on Twitter or by email. I would also suggest the #rust-beginners irc channel as well. If you found a flaw in the article you can open up a pull request on my website repo with the changes. If you do place a note that you made corrections if you want the credit.

All of this article's code is available at this commit. You can find the previous article here.