Rust Lang in a nutshell: 2# Enums, pattern matching and Options

This is the second part of the Rust mini-series. If you haven’t seen the first part yet, you are encouraged to take a look at it.

Before we begin, let me point out that the aim of the series is to give a gentle introduction to the Rust programming language, supported by some practical examples. Please note that this is not by any means a systematic introduction to Rust, but rather a coffee-break story for those who have heard a little about Rust and want to get a better understanding of the features of the language.

These articles may also be of interest to learners who are struggling with Rust, mainly because they offer a different perspective on selected topics and bring up details that may have escaped your attention while reading more comprehensive sources.

In this part we will discuss enums, pattern matching and, what is very characteristic of Rust, Options.

Enums

Rust enums, as in other programming languages, are data types consisting of a set of named variants. What makes Rust enums truly powerful is that each variant can have data associated with it. The syntax for defining associated data is analogous to a struct definition: you can use tuple notation or named fields. Fields can be of any type.

use std::net::IpAddr;

enum State {
    Disconnected,
    Connected(IpAddr),
    Authorized { service: String, auth: u128 },
    Pending,
}

You can think of enums as unions, but unions that actually know which variant they hold at the moment (tagged unions). Rust enums are an example of an algebraic data type or, to be more specific, a sum type: the set of all possible enum values is the sum of the possible values of its variants.
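To make the "sum" concrete, here is a small sketch that counts every value of an enum. The Either enum and the all_values function are hypothetical names invented for this illustration; they are not part of the State example.

```rust
// Hypothetical enum used only for counting; not part of the article's State example.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Either {
    Flag(bool), // a bool has 2 possible values
    Byte(u8),   // a u8 has 256 possible values
}

/// Enumerates every value the enum can take.
fn all_values() -> Vec<Either> {
    let mut values = Vec::new();
    for b in [false, true] {
        values.push(Either::Flag(b));
    }
    for n in 0..=u8::MAX {
        values.push(Either::Byte(n));
    }
    values
}

fn main() {
    // The counts of the variants add up: 2 + 256 = 258.
    println!("{}", all_values().len());
}
```

A struct with the same two fields would be a product type instead, with 2 * 256 possible values.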

Let’s use enums. To refer to a variant we write the enum name, then :: and the variant name:

use std::str::FromStr; // needed for IpAddr::from_str

let mut s = State::Connected(IpAddr::from_str("127.0.0.1").unwrap());
...
s = State::Authorized {
    service: "/test".to_string(),
    auth: 260272236571964567899907446074317682114,
};
…
s = State::Disconnected;

To see which variant an enum holds we can’t simply use comparison. In general, we can’t write:

if s == State::Disconnected { //in general this won’t work
    println!("Disconnected");
}

This is because enums do not implement the PartialEq trait by default, and PartialEq is required for comparisons with == (we will write about traits later).

Note1: in some cases, comparisons like the one above may be useful (although they are certainly not idiomatic). To allow them we have to implement PartialEq, which fortunately can be done for us automatically with the #[derive] attribute:

#[derive(PartialEq)]
enum State {Disconnected}
..
//now we can use
if s == State::Disconnected {
    println!("Disconnected");
}

The #[derive] attribute is used to automatically implement traits for a data type. Many commonly used standard traits (Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord, Hash, Default) can be derived in this way. Error is not one of them: it has to be implemented by hand or derived with a third-party crate. You can also define your own derive macro that generates implementations of your own traits.
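As a rough sketch, this is what deriving several traits at once looks like on the full State enum from the beginning of the article. The particular choice of derived traits is an assumption made for this illustration.

```rust
use std::net::IpAddr;

// Debug enables {:?} printing, Clone enables .clone(), PartialEq enables == and !=.
#[derive(Debug, Clone, PartialEq)]
enum State {
    Disconnected,
    Connected(IpAddr),
    Authorized { service: String, auth: u128 },
    Pending,
}

fn main() {
    let a = State::Connected("127.0.0.1".parse().unwrap());
    let b = a.clone();

    // The derived PartialEq compares the variant first and then the associated data.
    println!("{}", a == b);
    println!("{}", a == State::Disconnected);

    let c = State::Authorized { service: "/test".to_string(), auth: 1 };
    println!("{:?} {:?}", c, State::Pending);
}
```

A derive only works when every field type implements the derived trait as well, which is the case here for IpAddr, String and u128.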

To see which variant an enum holds in an idiomatic Rust way, we will use pattern matching.

Pattern Matching

Match is an extremely useful language feature. It uses pattern matching to choose from alternative code branches and allows for programming in a clear way, without using multiple if/else clauses.

Using match we can check enum variants and their associated data:

match s {
    State::Disconnected => println!("Disconnected"),
    State::Connected(x) => println!("Connected to {}", x),
    State::Authorized{service: s, auth: a}  => println!("Authorized to {} with {}", s, a),
    _ => println!("unused!")
}

Pattern matching is exhaustive, which means all possible values have to be handled. When you are interested only in selected values you can use _ to denote ‘all other options’.
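Exhaustiveness pays off most when the _ arm is left out. The sketch below lists every variant of State explicitly, so adding a new variant later makes the compiler reject the match until it is handled. The describe function is a hypothetical helper written for this illustration.

```rust
use std::net::IpAddr;

enum State {
    Disconnected,
    Connected(IpAddr),
    Authorized { service: String, auth: u128 },
    Pending,
}

/// Hypothetical helper: turns a state into a human-readable description.
fn describe(s: &State) -> String {
    match s {
        State::Disconnected => "disconnected".to_string(),
        State::Connected(addr) => format!("connected to {}", addr),
        State::Authorized { service, auth } => {
            format!("authorized to {} with {}", service, auth)
        }
        State::Pending => "pending".to_string(),
        // No `_` arm on purpose: removing any arm above is a compile error.
    }
}

fn main() {
    let states = [
        State::Disconnected,
        State::Connected("127.0.0.1".parse().unwrap()),
        State::Authorized { service: "/test".to_string(), auth: 42 },
        State::Pending,
    ];
    for s in &states {
        println!("{}", describe(s));
    }
}
```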

But match is not only used to check on enums. You can use match with any data type. Moreover, pattern matching has some exciting features. For instance, we can:

enumerate alternative values separated with |
use inclusive ranges, denoted with ..=
bind matched values with @ and use them later in the matched code branch
add extra conditions (guards) with if

Take a look at the following examples:

match x {
    1 => println!("1"),
    2|3 => println!("2,3"),
    4 ..= 7 => println!("4,5,6,7"),
    e @ 11 ..= 15 => println!("element from range 11 ... 15 is {}", e),
    20 ..= 100 if x % 2 == 0 => println!("even element from range 20...100"),
    _ => println!("all other")
}

Note1: a binding introduced in a pattern (e above) shadows any existing binding with the same name for the duration of that match arm.
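The shadowing can be seen in a small sketch. The probe function and the value 100 are made up for this illustration.

```rust
/// Hypothetical helper showing which `e` each arm sees.
fn probe(x: u32) -> u32 {
    let e = 100; // outer binding
    match x {
        // Here `e` is the matched value; it shadows the outer `e` inside this arm only.
        e @ 11..=15 => e,
        // No new binding in this pattern, so `e` is still the outer one.
        _ => e,
    }
}

fn main() {
    println!("{}", probe(12)); // value taken from the pattern binding
    println!("{}", probe(3));  // value taken from the outer binding
}
```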

Note2: Ranges are only allowed on numeric and char values, so we can write also:

match y {
    'A' ..= 'Z' => println!("Upper case letter"),
    'a' ..= 'z' => println!("Lower case letter"),
    _ => println!("sth else")
}

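Note3: overlapping arms are not a compile error. Arms are tried from top to bottom and the first one that matches wins, so below the value 2 is always handled by the first arm. Depending on the compiler version and lint settings you may get a warning when two range patterns share an endpoint (for example 1..=2 and 2..=3), but it is only a warning:

match x {
    1 | 2 => println!("first option!"),
    2..=3 => println!("second option!"), // reached only for 3; 2 is taken by the previous arm
    _ => println!("all other")
}

Patterns are not checked for overlap in general, and the first match is used: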

let x = 4;
match x {
	4 => println!("number 4"),
	1..=100 if x % 2 == 0 => println!("even element from range 1...100"),
	_ => println!("all other")
}

The code above results in number 4, but if we change the order of the matches then:

match x {
	1..=100 if x % 2 == 0 => println!("even element from range 1...100"),
	4 => println!("number 4"),
	_ => println!("all other")
}

we will get even element from range 1...100

Pattern matching can also be used for unwrapping structure fields (like in the previous enum example) and what is more, we can even skip unused fields in various ways using:

_ for unused variable
.. if we want to skip all unused variables

All the arms below are valid. Every one of them matches any Color, so only the first is ever executed; we can fire a different one by changing their order.

struct Color{r: u8, b: u8, g:u8}
…
match c {
        Color{b, g, ..} => println!("Color components: g, b: {} {}", g,b),
        Color{g, ..} => println!("Color component: g: {}", g),
        Color{r, ..} => println!("Color component: r: {}", r),
        Color{r: rr, g:_, b: bb} => println!("Color components: r, b {},{}", rr,bb),
        Color{r,g,b} => println!("Color r:{} g:{} b:{}", r,g,b)
}

Note that the matches "overlap" in the sense that many of them can match at the same time. The compiler does not forbid this; at most it warns about arms that can never be reached. As already mentioned, overlap checks (which produce warnings, not errors) apply only to numeric and char ranges.

Match is an expression and as such can be used to return a value:

let c = Color{r: 255,g:0,b:0 };
let y: u8 = match c {
    Color{r, g, b} if r > 0 => (0.299*(r as f32)+0.587*(g as f32)+0.114*(b as f32)) as u8,
    _ => 0u8,
};
println!("gray scale value: {}", y);

Which gives: gray scale value: 76. In this example the gray scale level is calculated if the r component is nonzero; otherwise the match returns 0u8 (the u8 suffix fixes the type of a numeric literal).

Note 4: The above example looks like field destructuring, but if this is your main purpose you should rather use let syntax:

let c = Color{r: 255,g:0,b:0 };
…
//now destructuring:
let Color{r, g, b} = c;
let y = (0.299*(r as f32)+0.587*(g as f32)+0.114*(b as f32)) as u8;

println!("gray scale value: {}", y);

Note 5: Rust is explicit about types, so in cases like the one above numeric conversions have to be written out, which is done with the keyword ‘as’.
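Since ‘as’ never fails, it is worth knowing what it does with values that don’t fit the target type. The following is a minimal sketch; the helper names truncate and saturate are made up for this illustration.

```rust
use std::convert::TryFrom;

/// Integer-to-integer `as` keeps only the low bits, so the value wraps around.
fn truncate(n: i32) -> u8 {
    n as u8
}

/// Float-to-integer `as` drops the fraction and saturates at the target's bounds.
fn saturate(x: f32) -> u8 {
    x as u8
}

fn main() {
    println!("{}", truncate(300));     // 300 - 256 = 44
    println!("{}", saturate(76.245));  // 76, the fraction is dropped
    println!("{}", saturate(300.7));   // 255, clamped to u8::MAX
    println!("{}", saturate(-5.0));    // 0, clamped to u8::MIN

    // When silent wrapping is not acceptable, TryFrom reports the problem instead.
    println!("{}", u8::try_from(300_i32).is_err()); // true
}
```

For widening conversions that cannot lose information, such as u8 to f32, f32::from(r) is an alternative to r as f32.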

If we have a tuple struct, then matching looks like this:

struct Color(u8,u8,u8);

let c = Color(255,0,0);
let r: u8 = match c {
    Color(x,..) if x > 0 => x,
    _ => 0u8,
};
println!("color r component: {}", r);

Option

Rust uses Option type to take care of values that may not exist in your program. Option is an enum type and is defined in the standard library as follows:

pub enum Option<T> {
    None,
    Some(T),
}

Generics syntax is used in this definition, where T represents any type. Let’s say we have Option<u8> and use it to represent u8 values which are sometimes unavailable. If a value is not present, we use the None variant. If a value is known, let’s say it equals 5, we use the Some(5) variant.

Note1: Option<T> is a generic type. It is instantiated separately for every type we actually use it with, and those instantiations are distinct types. For example, Option<u8> is not the same type as Option<Color>.

Why does Rust introduce Options after all? The main reason is to ensure that all variables always have meaningful values and therefore you can safely refer to them. Although Option introduces an additional layer of abstraction, it forces the language type system to work for us in dealing with possibly non-existent values, saving us from accidental null pointer dereference.

Other languages allow null or uninitialized values in such cases. But then, when you look at the code, nothing helps you tell whether a variable is null or has a value at a given point. This dilemma led to the distinction between pointers and references in languages like C++. Pointers can be null, so you have to program defensively, checking whether the pointed-to value exists. References have to be initialized and can’t be assigned null, but in return you get a guarantee that a reference is not null (at least in a well-defined program).

As you may notice, wrapping values in Option is again very characteristic of Rust: when you need a certain, possibly risky feature (null values can be a real curse for your code), you must be explicit about it.

Let’s finally use Options. You can directly assign option variables with None or Some(value) when you want to wrap value in Option.

struct Customer {
   name: String,
   company: Option<String>
}

fn main() {
    let mut text: Option<String> = Some("Hello, world!".to_string());
    text = None;

    let c = Customer{name: "John".to_string(), company: None};
}

Note2: You may wonder why we do not write Option::None and Option::Some, as we would for any other enum. The reason is that Rust automatically imports many commonly used symbols through the std::prelude module, and one of these imports is std::option::Option::{self, Some, None}.

You can use Option as a function or method return type to indicate the absence of a returned value:

fn divide(n: i32, d: i32) -> Option<i32> {
    if d == 0 {None} else {Some(n/d)}
}

Even in this simple example Option serves as a simple error handling mechanism: when the function fails, it returns None. This is similar to the method well known from good old C, where we return 0 or -1 in the case of an error.
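The standard library follows the same convention. As a sketch, divide can be written on top of i32::checked_div, which returns None for a zero divisor and also for the one overflowing case (i32::MIN / -1) that would make the hand-written n/d panic.

```rust
/// Same contract as the article's divide, delegating to the standard library.
fn divide(n: i32, d: i32) -> Option<i32> {
    n.checked_div(d)
}

fn main() {
    println!("{:?}", divide(10, 2));        // Some(5)
    println!("{:?}", divide(10, 0));        // None: division by zero
    println!("{:?}", divide(i32::MIN, -1)); // None: the result does not fit in i32
}
```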

With Options we can also deal with optional function parameters, although then all calls will require some additional boilerplate code:

fn optional_arg_fn(data: &Option<String>){}
..
optional_arg_fn(&None);
optional_arg_fn(&Some("Hello, world!".to_string()));

Note3: There are some difficulties associated with this approach, mainly:

value has to be wrapped in Option type
you have to provide an argument even if it is None

You can address both of these problems in a different way.

Option wrapping tends to stay explicit, so you will occasionally see Some( _some_value_ ) passed as a function argument in your code. The pain can be relieved a little with Default and Into<Option<T>> implementations for the type you use as an optional parameter (the standard library already provides From<T> for Option<T>, so a plain value converts to Some(value)). Unfortunately, there is also a cost here that can outweigh the benefits: function signatures have to go generic, like this:

fn optional_arg_fn<T: Into< Option<String> > >(data: T){}
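A minimal sketch of how such a signature behaves at the call site is shown below. The greet function and its messages are hypothetical; only the Into<Option<String>> bound comes from the signature above.

```rust
/// Hypothetical function with one optional parameter expressed via Into<Option<String>>.
fn greet<T: Into<Option<String>>>(name: T) -> String {
    // Convert whatever the caller passed into a plain Option<String>.
    let name: Option<String> = name.into();
    match name {
        Some(n) => format!("Hello, {}!", n),
        None => "Hello, stranger!".to_string(),
    }
}

fn main() {
    println!("{}", greet("Ann".to_string()));       // plain value, wrapped automatically
    println!("{}", greet(Some("Bob".to_string()))); // explicit Some still works
    println!("{}", greet(None));                    // the argument still has to be passed
}
```

The first problem from the list (explicit wrapping) goes away, but the second one does not: None still has to be written out at every call.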

But the real-world solution here is not to use optional parameters at all!

To deal with multiple optional parameters you will find the builder pattern very useful. This pattern lets you build objects with convenient methods that incrementally set just the parameters you need at the moment. See, no need for optional parameters!

struct Foo{} // type we want to build
struct FooBuilder{} //builder

let f : Foo = FooBuilder::new()
             .with_baz(...)
             .with_bar(...)
             .from_foo(...)
             .build();
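For a more concrete picture, here is one possible minimal builder. The fields bar and baz, their types and their default values are assumptions invented for this sketch; only the names Foo, FooBuilder, with_bar, with_baz and build are taken from the outline above.

```rust
#[derive(Debug, PartialEq)]
struct Foo {
    bar: u32,
    baz: String,
}

// The builder keeps every parameter optional until build() is called.
#[derive(Default)]
struct FooBuilder {
    bar: Option<u32>,
    baz: Option<String>,
}

impl FooBuilder {
    fn new() -> Self {
        Self::default()
    }

    // Each setter takes the builder by value and returns it, which enables chaining.
    fn with_bar(mut self, bar: u32) -> Self {
        self.bar = Some(bar);
        self
    }

    fn with_baz(mut self, baz: &str) -> Self {
        self.baz = Some(baz.to_string());
        self
    }

    // Parameters that were never set fall back to (assumed) defaults.
    fn build(self) -> Foo {
        Foo {
            bar: self.bar.unwrap_or(0),
            baz: self.baz.unwrap_or_else(|| "default".to_string()),
        }
    }
}

fn main() {
    let f: Foo = FooBuilder::new().with_bar(7).build();
    println!("{:?}", f);
}
```

Option is still used, but only inside the builder; callers never have to write Some or None.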

The builder pattern requires some effort to implement, so it is not the answer for functions with a single or a small number of optional parameters. In such cases, if you really don’t want to expose Option in your API, you can simply hide optionality by providing additional public functions:

//private function
fn foo_optional_data(required: u32, data: Option<String>){}

//public api functions
pub fn foo(required: u32){
    foo_optional_data(required, None);
}

pub fn foo_with_data(required: u32, data: String){
    foo_optional_data(required, Some(data));
}

Now we have an idea how to declare Option usage and how to wrap data into Options, but how do we check Options for wrapped data? How do we unwrap it, and is it always necessary?

To check on the wrapped value we can use pattern matching:

match divide(10,2) {
    None => println!("Can't divide!"),
    Some(d) => println!("result is: {}", d),
}

When we need better flow control, we can use an if let Some(...) expression:

if let Some(d) = divide(10,2) {
    println!("result is: {}", d);
}

Note 4: We can also unwrap Option explicitly to take a value, but it will panic and end your program immediately in case of None value.

let text: Option<String> = Some("Hello, world!".to_string());
println!("value is: {}", text.unwrap());

You absolutely don't want to do it unless you have an unrecoverable error. And even then, think twice, because what seems to you an unrecoverable error can look like a practical problem to be handled by your API user. Option unwrapping may be useful, though, in small pet projects.
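When a sensible fallback exists, the standard library offers non-panicking relatives of unwrap. The sketch below reuses the divide function from earlier; the fallback values are arbitrary choices for this illustration.

```rust
fn divide(n: i32, d: i32) -> Option<i32> {
    if d == 0 { None } else { Some(n / d) }
}

fn main() {
    // unwrap_or: use the given fallback when there is no value.
    println!("{}", divide(10, 0).unwrap_or(-1));

    // unwrap_or_default: fall back to the type's Default value (0 for i32).
    println!("{}", divide(10, 0).unwrap_or_default());

    // expect: still panics on None, but with a message explaining the assumption.
    println!("{}", divide(10, 2).expect("divisor is a non-zero constant"));
}
```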

Options often don’t require unwrapping at all. The standard library provides handy methods (combinators) to work with Options. The most important of them are map and and_then.

Method map allows us to apply function f : T → U to Option<T> and return Option<U>. It is implemented as:

pub fn map<U, F: FnOnce(T) -> U>(self, f: F) -> Option<U> {
    match self {
        Some(x) => Some(f(x)),
        None => None,
    }
}

For now, please skip the intricacies of method signature which is mostly based on generic syntax.

You can think of map as a convenient way to apply a function to the value wrapped in an Option when that function always produces a result (it cannot fail, so its result type is not an Option). When you need to use a function that may produce no value (in other words, one whose result type is an Option), use the and_then combinator.

The and_then method takes a function f: T → Option<U> and applies it to an Option<T>. Its implementation looks like this:

pub fn and_then<U, F: FnOnce(T) -> Option<U>>(self, f: F) -> Option<U> {
    match self {
        Some(x) => f(x),
        None => None,
    }
}

It’s worth noticing that chained calls of multiple and_then and map give None if None is produced at any step.
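Here is a small sketch of that short-circuiting behaviour. The pipeline function is hypothetical; it chains char::to_digit from the standard library with the divide function defined earlier.

```rust
fn divide(n: i32, d: i32) -> Option<i32> {
    if d == 0 { None } else { Some(n / d) }
}

/// Hypothetical pipeline: digit character -> number -> 100 / number -> text.
fn pipeline(c: char) -> Option<String> {
    c.to_digit(10)                    // None if `c` is not a decimal digit
        .map(|d| d as i32)            // cannot fail, so map is enough
        .and_then(|d| divide(100, d)) // may fail (digit 0), so and_then
        .map(|q| format!("{}", q))    // skipped entirely if any earlier step gave None
}

fn main() {
    println!("{:?}", pipeline('4')); // every step succeeds
    println!("{:?}", pipeline('0')); // divide returns None
    println!("{:?}", pipeline('x')); // to_digit returns None
}
```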

Combinators such as and_then and map, along with many others (defined in core::option, library/core/src/option.rs in the Rust sources), allow you to build sophisticated pipelines without explicitly unwrapping Option values in your code.

//simple pipeline built from map combinators
fn compute_or_none(n: Option<i32>) -> Option<String> {
    n.map(|x| x*x)
     .map(|x| x+1)
     .map(|x| format!("{:x}", x) )
}
…

let divisor = 2;
if let Some(d) = compute_or_none(divide(100,divisor)) {
    println!("result is: {}", d);
} else {
    println!("result is: None");
}

The pipeline compute_or_none with the use of a sequence of map transformations calculates (x*x+1) and then represents the outcome as a hexadecimal number. All that without even thinking of Option value.

Note that when the divisor is set to 2 (as above) the code execution gives: result is: 9c5, but when we set it to 0 we will get: result is: None.

Note 5: The syntax |x| x+1 defines a closure, a simple unnamed function which can also use other variables available in the current scope. Of course, combinators can also take named functions, in which case we would have:

fn square(x: i32) -> i32 { x*x }
fn square_or_none(n: Option<i32>) -> Option<String> {
    n.map(square) //use of named function
     .map(|x| x+1)
     .map(|x| format!("{:x}", x) )
}

In the case of transformations that can have no result, as it was said earlier, we should use and_then in our pipelines. Let’s look at the example:

fn compute_or_none(n: Option<i32>) -> Option<String> {
    n.map(|x| x as f64)
    .and_then(|x| {
        match x {
            x if x >= 0f64 => Some(x.sqrt()),
            _ => None
        }
    }).map(|x| format!("{}", x) )
}
…
let divisor = 10;
if let Some(d) = compute_or_none(divide(100,divisor)) {
    println!("result is: {}", d);
} else {
    println!("result is: None");
}

With the divisor set to 10, as above, we get result is: 3.1622776601683795. But if we set the divisor to 0 (the result of divide will be None) or use a negative divisor (the result of and_then will be None), the outcome of the whole compute_or_none pipeline will be None and the code execution will give: result is: None.

Pipelines in complex applications, which these somewhat artificial examples only illustrate, can be hundreds of lines long and, importantly, use no explicit Option unwrapping.

That’s all for this part. Stay tuned, the next article coming soon!

