19/02/2021

Thinking In Rust Types

Rust is a programming language that requires you to change how you think.

To learn a new programming language, you need to learn its flavour of syntax. For many, once you know the standard OOP paradigms of classes and inheritance, you can transfer between languages with ease.

Rust requires learning an entirely new paradigm. You might have met some of this paradigm in functional programming. But you won't have met it all.

This is primarily because of lifetimes, but more on that later.


This blog series is going to try and teach you how to think in Rust, from the ground up.

Let's start by asserting that Rust is a type-driven programming language. In many other programming languages, you may be driven by classes and object hierarchies. In Rust, you are going to be driven by types and composition.

To start with that, we need to understand which types we can compose. To begin, let us look at our fundamental primitives.

fn main() {
    // Unsigned and signed numbers of varying bit width
    let unsigned_eight_bit_number: u8;
    let signed_eight_bit_number: i8;
    let unsigned_sixteen_bit_number: u16;
    let signed_sixteen_bit_number: i16;
    let unsigned_thirty_two_bit_number: u32;
    let signed_thirty_two_bit_number: i32;
    let unsigned_sixty_four_bit_number: u64;
    let signed_sixty_four_bit_number: i64;
    let unsigned_hundred_twenty_eight_bit_number: u128;
    let signed_hundred_twenty_eight_bit_number: i128;
    // Unsigned and signed number with architecture (32bit, 64bit) width
    let unsigned_architecture_sized_number: usize;
    let signed_architecture_sized_number: isize;
    // Floating point single and double precision numbers
    let single_precision_floating_point_number: f32;
    let double_precision_floating_point_number: f64;
    // Boolean type (u8 constrained to be 0 or 1)
    let boolean: bool;
    // A unicode scalar value (NOT JUST A U8)
    let unicode_scalar_value: char;
}
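
The declarations above never receive a value. As a minimal sketch, here are some of the same primitives given values, along with their sizes in bytes. The helper `primitive_sizes` is a name made up for this example.

```rust
use std::mem::size_of;

/// Hypothetical helper: the size in bytes of u8, i32, f64, bool, and char
fn primitive_sizes() -> [usize; 5] {
    [
        size_of::<u8>(),
        size_of::<i32>(),
        size_of::<f64>(),
        size_of::<bool>(),
        size_of::<char>(),
    ]
}

fn main() {
    // Each primitive can be given a value when it is declared
    let small: u8 = 255;
    let negative: i8 = -128;
    let index: usize = 42;
    let offset: isize = -42;
    let ratio: f64 = 0.5;
    let flag: bool = true;
    // A char holds any unicode scalar value, not just ASCII
    let letter: char = 'é';
    println!("{} {} {} {}", small, negative, index, offset);
    println!("{} {} {}", ratio, flag, letter);
    // A char is four bytes wide, which is why it is not just a u8
    println!("{:?}", primitive_sizes());
}
```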

Rust then gives us a simple form of type composition, the untagged product type: a tuple. Ignoring the type jargon here, this means that a tuple of several types stores a value of each of those types, allowing you to compose a sequence of different types into a single type.

fn main() {
    // A tuple of three different types
    let char_int_and_bool: (char, i32, bool);
    // A tuple can even have one type (the trailing comma is required,
    // as `(bool)` is just `bool` in parentheses)
    let tuple_of_bool: (bool,);
    // Or even none! This is normally called "unit" in Rust
    let empty_tuple: ();
}
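
To make this concrete, here is a small sketch that builds tuples and reads them back, both by position and by destructuring. The function `swap` is a hypothetical helper for illustration only.

```rust
/// Hypothetical helper: swaps the two members of a pair by position
fn swap(pair: (char, i32)) -> (i32, char) {
    (pair.1, pair.0)
}

fn main() {
    let char_int_and_bool: (char, i32, bool) = ('a', 1, true);
    // Members are accessed by their position
    let first: char = char_int_and_bool.0;
    // Or the whole tuple can be destructured in one go
    let (_character, integer, boolean) = char_int_and_bool;
    // A one element tuple needs a trailing comma
    let tuple_of_bool: (bool,) = (true,);
    // The empty tuple has exactly one value, also written `()`
    let empty_tuple: () = ();
    println!("{} {} {}", first, integer, boolean);
    println!("{:?} {:?}", tuple_of_bool, empty_tuple);
    println!("{:?}", swap(('b', 2)));
}
```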

Finally, we have another kind of primitive sequence: arrays. These are sequences of a single type, with the number of elements fixed as part of the type.

fn main() {
    // A sequence of 8 integers
    let sequence_eight_ints: [i32; 8];
    // A sequence of no booleans
    let sequence_no_bools: [bool; 0];
}
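
As a short sketch of arrays in use, the example below fills them with values and reads them back. The function `sum_eight` is a made-up helper, and it only accepts arrays of exactly eight integers because the length is part of the type.

```rust
/// Hypothetical helper: sums an array of exactly eight integers
fn sum_eight(values: [i32; 8]) -> i32 {
    values.iter().sum()
}

fn main() {
    // Every element is listed out
    let sequence_eight_ints: [i32; 8] = [1, 2, 3, 4, 5, 6, 7, 8];
    // Or one value is repeated: eight zeroes
    let eight_zeroes = [0i32; 8];
    // An array of no booleans has only one possible value
    let sequence_no_bools: [bool; 0] = [];
    // Elements are accessed by index, starting from zero
    println!("{}", sequence_eight_ints[0]);
    println!("{} {}", sequence_eight_ints.len(), sequence_no_bools.len());
    println!("{} {}", sum_eight(sequence_eight_ints), sum_eight(eight_zeroes));
}
```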

We've already seen one way to compose types in Rust, but we have another. Tuples provide us an untagged product type: the type and its members do not have names, and the members are just accessed in sequence (tuple.0, tuple.1, tuple.2 etc). We can instead define a struct, allowing us to define a named, tagged product type.

/// A product type of chars, ints, and bools
struct MyCustomType {
    character: char,
    integer: i32,
    boolean: bool
}

/// A product type with no constituent types (akin to a unit type tuple)
struct MyCustomTypeWithNoData { }
/// A better way to define the above
struct MyCustomUnitType;
/// As a structure can be a unit, it can also be tuple-like: a new named
/// type whose fields are positional rather than named
struct MyCustomTuple(char, i32, bool);

fn main() {
    // A named structure of three different types
    let my_custom_data: MyCustomType;
    // These two have no data inside them
    let my_empty_data: MyCustomTypeWithNoData;
    let my_empty_data_two: MyCustomUnitType;
    // This has three, but the fields are `0`, `1`, and `2` again
    let my_custom_tuple: MyCustomTuple;
}
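
Here is a minimal sketch of creating values of these types and reading their fields. The definitions are repeated so the example stands alone, and `describe` is a hypothetical helper.

```rust
struct MyCustomType {
    character: char,
    integer: i32,
    boolean: bool,
}

struct MyCustomUnitType;

struct MyCustomTuple(char, i32, bool);

/// Hypothetical helper: formats the named fields of a `MyCustomType`
fn describe(data: &MyCustomType) -> String {
    format!("{} {} {}", data.character, data.integer, data.boolean)
}

fn main() {
    // Named fields are given by name when building the value
    let my_custom_data = MyCustomType {
        character: 'a',
        integer: 1,
        boolean: true,
    };
    // A unit struct is created by just naming it
    let _my_empty_data = MyCustomUnitType;
    // A tuple struct is built and accessed by position
    let my_custom_tuple = MyCustomTuple('b', 2, false);
    println!("{}", describe(&my_custom_data));
    println!(
        "{} {} {}",
        my_custom_tuple.0, my_custom_tuple.1, my_custom_tuple.2
    );
}
```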

We're not done with composing types yet, however. We have only covered product types; we also need to talk about sum types. While a product type allows us to compose multiple different types such that the data must hold all of their values, a sum type allows us to compose different types such that the data must hold exactly one of them. Each variant within a tagged sum type is written like a tagged product type definition: with named fields, with positional fields, or as a unit.

/// Instances of this type can store either a character, integer, or unit
enum MySumType {
    Character {
        value: char
    },
    Integer(i32),
    Unit
}

/// As with tuples and structs, an enum can also contain no data
enum Never { }

fn main() {
    // Contains either character, integer, or unit data.
    let my_sum_data: MySumType;
    // Can never be created, as an enum instance must hold exactly one
    // variant, and this enum has none
    let never_creatable: Never;
}
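
To see the tag at work, here is a small sketch that creates each kind of value and then uses `match` to find out which one is stored. The enum is repeated so the example stands alone, and `describe` is a hypothetical helper.

```rust
enum MySumType {
    Character { value: char },
    Integer(i32),
    Unit,
}

/// Hypothetical helper: reports which variant is stored, and its data
fn describe(data: &MySumType) -> String {
    // The compiler rejects this match if any variant is left out
    match data {
        MySumType::Character { value } => format!("character {}", value),
        MySumType::Integer(number) => format!("integer {}", number),
        MySumType::Unit => String::from("unit"),
    }
}

fn main() {
    let values = [
        MySumType::Character { value: 'a' },
        MySumType::Integer(1),
        MySumType::Unit,
    ];
    for value in values.iter() {
        println!("{}", describe(value));
    }
}
```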

This kind of sum type is named an enum, as it is used to replicate the enumeration feature present in many other languages. In other languages, these will usually just be collections of constant numbers with helpful names. In Rust, they present a higher level of type system guarantee.

/// An enum defined with only unit values replicates other languages' enums
enum MyEnumeration {
    First,
    Second,
    Third
}

fn main() {
    // This can store either `First`, `Second`, or `Third`
    let my_enum: MyEnumeration;
}
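
As a rough sketch of that guarantee: a unit-only enum can still be turned into a number, but a value of the type can only ever be one of the listed names. The function `position` and the derived traits are additions for this example.

```rust
#[derive(Debug, Clone, Copy)]
enum MyEnumeration {
    First,
    Second,
    Third,
}

/// Hypothetical helper: the one-based position of each name
fn position(value: MyEnumeration) -> u32 {
    // No catch-all arm is needed, as no other value can exist
    match value {
        MyEnumeration::First => 1,
        MyEnumeration::Second => 2,
        MyEnumeration::Third => 3,
    }
}

fn main() {
    let my_enum = MyEnumeration::Second;
    println!("{:?} is at position {}", my_enum, position(my_enum));
    // Unit-only enums can be cast to their number, counting from zero
    println!("{}", MyEnumeration::First as i32);
    println!("{}", MyEnumeration::Third as i32);
}
```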

It is worth noting that with Rust's tagged sum types, we are forced to provide internally named tagged product type representations, so that the compiler can use these to guarantee the internal structure and to know at all times what kind of value is stored in the type.

Just as it is possible to use a tuple for an untagged product type, we can define a union as an untagged sum type. However, as there are no tags, reading any field of the union is an unsafe operation, and thus this will be a story for another day when we talk about all the things you can do in unsafe Rust.

Sometimes, you will want to defer the decision of which type to compose over until later, allowing you to write one type definition that can compose over multiple different types. This is done using generics.

/// Composes an integer, and another type `T`
struct IntegerAndOther<T> {
    integer: i32,
    other: T
}

/// Wraps an inner type in an outer single element named tuple
struct Wrapper<T>(T);

fn main() {
    // Composes an integer and another integer
    let integer_and_integer: IntegerAndOther<i32>;
    // Composes an integer and a boolean
    let integer_and_boolean: IntegerAndOther<bool>;
    // Puts a tuple of an integer and a boolean in a wrapper
    let wrapped_int_bool: Wrapper<(i32, bool)>;
}
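
Here is a minimal sketch of filling in `T` with real values. The definitions are repeated so the example stands alone, and `unwrap_wrapper` is a hypothetical helper that works for any `T`.

```rust
struct IntegerAndOther<T> {
    integer: i32,
    other: T,
}

struct Wrapper<T>(T);

/// Hypothetical helper: takes the inner value back out of a `Wrapper`
fn unwrap_wrapper<T>(wrapper: Wrapper<T>) -> T {
    wrapper.0
}

fn main() {
    // `T` can be written out in full
    let integer_and_boolean: IntegerAndOther<bool> = IntegerAndOther {
        integer: 1,
        other: true,
    };
    // Or left for the compiler to infer, here as `i32`
    let integer_and_integer = IntegerAndOther { integer: 1, other: 2 };
    // Here `T` is inferred as the tuple `(i32, bool)`
    let wrapped_int_bool = Wrapper((3, false));
    let (number, flag) = unwrap_wrapper(wrapped_int_bool);
    println!("{} {}", integer_and_boolean.integer, integer_and_boolean.other);
    println!("{} {}", integer_and_integer.integer, integer_and_integer.other);
    println!("{} {}", number, flag);
}
```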

We may wish to use these generic type definitions inside the inner product type definitions within our sum types.

/// Holds either an integer, or a value of an unknown `T`
enum IntegerOrOther<T> {
    Integer(i32),
    Other(T)
}

fn main() {
    // Must contain either an integer, or a boolean
    let int_or_bool: IntegerOrOther<bool>;
    // Must contain an integer tagged `Integer`, or one tagged `Other`
    let int_or_other_int: IntegerOrOther<i32>;
}
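
The second case above is worth a sketch: even when both variants hold an `i32`, the tag still tells them apart. The helper `describe` is hypothetical, and it asks for `T: Debug` only so that it can print the inner value.

```rust
use std::fmt::Debug;

enum IntegerOrOther<T> {
    Integer(i32),
    Other(T),
}

/// Hypothetical helper: reports the tag and the value stored under it
fn describe<T: Debug>(value: &IntegerOrOther<T>) -> String {
    match value {
        IntegerOrOther::Integer(number) => format!("Integer({})", number),
        IntegerOrOther::Other(other) => format!("Other({:?})", other),
    }
}

fn main() {
    let int_or_bool: IntegerOrOther<bool> = IntegerOrOther::Other(true);
    // Both hold the same number, but under different tags
    let tagged_integer: IntegerOrOther<i32> = IntegerOrOther::Integer(5);
    let tagged_other: IntegerOrOther<i32> = IntegerOrOther::Other(5);
    println!("{}", describe(&int_or_bool));
    println!("{}", describe(&tagged_integer));
    println!("{}", describe(&tagged_other));
}
```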

This is how Rust's two most commonly used error handling types are defined.

enum Option<T> {
    Some(T),
    None
}
enum Result<T, E> {
    Ok(T),
    Err(E)
}
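
As a small sketch of these two in use, the example below relies on the versions that ship with the standard library rather than redefining them. The functions `first_char` and `checked_divide` are made up for illustration.

```rust
/// Hypothetical helper: the first character of some text, if it has one
fn first_char(text: &str) -> Option<char> {
    text.chars().next()
}

/// Hypothetical helper: division that reports an error instead of panicking
fn checked_divide(numerator: i32, denominator: i32) -> Result<i32, String> {
    if denominator == 0 {
        Err(String::from("division by zero"))
    } else {
        Ok(numerator / denominator)
    }
}

fn main() {
    // The caller has to look at the tag before it can reach the value
    match first_char("rust") {
        Some(character) => println!("starts with {}", character),
        None => println!("the text was empty"),
    }
    match checked_divide(6, 0) {
        Ok(value) => println!("result: {}", value),
        Err(message) => println!("error: {}", message),
    }
}
```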

Hopefully, this gives you a taste of the power of Rust's type system. In the next part, we're going to start introducing lifetimes, and talk about the primitives we missed, [] and str.