Traits and Generics

Rust supports polymorphism, the ability of code to work with values of many different types, through two related features: traits and generics.

use std::io::Write;

fn say_hello(out: &mut dyn Write) -> std::io::Result<()> {
  out.write_all(b"hello world\n")?;
  out.flush()
}

The type of out is &mut dyn Write, meaning “a mutable reference to any value that implements the Write trait.”

Using Traits

Examples:

  • A value that implements std::io::Write can write out bytes.

  • A value that implements std::iter::Iterator can produce a sequence of values.

  • A value that implements std::clone::Clone can make clones of itself in memory.

  • A value that implements std::fmt::Debug can be printed using println!() with the {:?} format specifier.

The trait itself must be in scope; otherwise, all of its methods are hidden. Some traits, such as Clone and Iterator, are in scope by default because they are part of the standard prelude.

use std::io::Write;

let mut buf: Vec<u8> = vec![];
buf.write_all(b"hello")?;  // ok

In the example above, the call to write_all is resolved statically, so it has very little overhead. Calls through &mut dyn Write, by contrast, incur the cost of dynamic dispatch, also known as a virtual method call; the dyn keyword in the type signals this.
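The two kinds of call can be put side by side. This is only a sketch: the function names say_hello_dyn and fill_buffer are made up for illustration, and a Vec<u8> stands in for a real output stream.

```rust
use std::io::Write;

/// Dynamic dispatch: `out` is a trait object, so each method call
/// goes through the vtable of whatever writer was passed in.
fn say_hello_dyn(out: &mut dyn Write) -> std::io::Result<()> {
    out.write_all(b"hello world\n")?;
    out.flush()
}

/// Hypothetical helper that exercises both kinds of call.
fn fill_buffer() -> std::io::Result<Vec<u8>> {
    let mut buf: Vec<u8> = vec![];
    // Static dispatch: the compiler knows `buf` is a Vec<u8>,
    // so it can call <Vec<u8> as Write>::write_all directly.
    buf.write_all(b"hello")?;
    // The same buffer, now used through a trait object.
    say_hello_dyn(&mut buf)?;
    Ok(buf)
}
```

Both calls append to the same buffer; only the way the method is looked up differs.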

There are two ways of using traits to write polymorphic code in Rust: trait objects and generics.

Trait Objects

Rust doesn’t permit variables of type dyn Write:

use std::io::Write;

let mut buf: Vec<u8> = vec![];
let writer: dyn Write = buf;  // error: `Write` does not have a constant size
let writer: &mut dyn Write = &mut buf;  // ok

A variable’s size has to be known at compile time, but dyn Write could refer to a value of any type that implements Write, so it has no fixed size.

A reference to a trait type, like writer, is called a trait object. Like any other reference, a trait object points to some value, it has a lifetime, and it can be either mut or shared.

What makes a trait object different is that Rust usually doesn’t know the type of the referent at compile time.

Memory Representation

A trait object in memory is a fat pointer that includes two components: a pointer to the actual value and a pointer to a table that represents the value's type. Because of this, a trait object occupies two machine words. The trait object (indicated by &dyn, which is a fat pointer) differs from a regular reference, which is just a bare pointer. In other words, &dyn and & are distinct types.

In C++, the vtable pointer, or vptr, is stored as part of the struct. Rust uses fat pointers instead. This way, a struct can implement dozens of traits without containing dozens of vptrs.
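The two-word layout can be observed with std::mem::size_of. The helper name reference_sizes is hypothetical, and the sizes reflect how current Rust compilers lay out references rather than a promise made by these notes.

```rust
use std::io::Write;
use std::mem::size_of;

/// Returns (size of a plain reference, size of a trait-object reference),
/// both measured in bytes.
fn reference_sizes() -> (usize, usize) {
    (
        size_of::<&mut Vec<u8>>(),   // thin pointer: one machine word
        size_of::<&mut dyn Write>(), // fat pointer: data pointer + vtable pointer
    )
}
```

On a 64-bit target this would be expected to give 8 and 16.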

Auto Conversion

Rust automatically converts ordinary references into trait objects when needed.

use std::fs::File;

let mut local_file = File::create("hello.txt")?;
say_hello(&mut local_file)?;

The type of &mut local_file is &mut File, and the type of the argument to say_hello is &mut dyn Write. Since a File is a kind of writer, Rust allows this, automatically converting the plain reference to a trait object.

Likewise, Rust will happily convert a Box<File> to a Box<dyn Write>, a value that owns a writer in the heap:

let w: Box<dyn Write> = Box::new(local_file);

Box<dyn Write>, like &mut dyn Write, is a fat pointer: it contains the address of the writer itself and the address of the vtable. The same goes for other pointer types, like Rc<dyn Write>.
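A small sketch of the Box conversion, using in-memory writers (Vec<u8> and std::io::sink()) instead of a File so that nothing touches the filesystem. The function names make_writers and greet_all are assumptions made for this example.

```rust
use std::io::Write;

/// Builds a list of heap-allocated writers of different concrete types.
fn make_writers() -> Vec<Box<dyn Write>> {
    let mut writers: Vec<Box<dyn Write>> = Vec::new();
    writers.push(Box::new(Vec::<u8>::new())); // Box<Vec<u8>> converts to Box<dyn Write>
    writers.push(Box::new(std::io::sink()));  // Box<Sink> converts to Box<dyn Write>
    writers
}

/// Writes the same greeting to every writer and reports how many were used.
fn greet_all(writers: &mut [Box<dyn Write>]) -> std::io::Result<usize> {
    for w in writers.iter_mut() {
        w.write_all(b"hello world\n")?;
        w.flush()?;
    }
    Ok(writers.len())
}
```

Each element of the Vec is a fat pointer, so the list can mix writer types freely.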

Generic Functions and Type Parameters

fn say_hello<W: Write>(out: &mut W) -> std::io::Result<()> {
    out.write_all(b"hello world\n")?;
    out.flush()
}

When you pass &mut local_file to the generic say_hello() function, you’re calling say_hello::<File>(). Rust infers the type W from the type of the argument and generates machine code for a version of say_hello specialized to that type, in which each method call targets the corresponding File method. This process is known as monomorphization, and the compiler handles it all automatically.
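To illustrate inference versus spelling the type parameter out, here is a sketch that calls the generic say_hello with in-memory writers. The demo function is hypothetical, and Vec<u8> and std::io::sink() replace File to keep the example self-contained.

```rust
use std::io::Write;

fn say_hello<W: Write>(out: &mut W) -> std::io::Result<()> {
    out.write_all(b"hello world\n")?;
    out.flush()
}

/// Hypothetical driver showing three calls to the generic function.
fn demo() -> std::io::Result<Vec<u8>> {
    let mut buf: Vec<u8> = Vec::new();
    say_hello(&mut buf)?;             // W inferred as Vec<u8>
    say_hello::<Vec<u8>>(&mut buf)?;  // the same instantiation, written explicitly
    say_hello(&mut std::io::sink())?; // a second instantiation, W = Sink
    Ok(buf)
}
```

The compiler produces one copy of say_hello for Vec<u8> and another for Sink.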


When a type parameter needs several abilities, join the bounds with +:

use std::hash::Hash;
use std::fmt::Debug;

fn top_ten<T: Debug + Hash + Eq>(values: &Vec<T>) { ... }

A generic function can have both lifetime parameters and type parameters. Lifetime parameters come first.

/// Return a reference to the point in `candidates` that's
/// closest to the `target` point.
fn nearest<'t, 'c, P>(target: &'t P, candidates: &'c [P]) -> &'c P
    where P: MeasureDistance
{
    ...
}
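One possible way to fill in such a function is sketched below. Everything beyond the signature is an assumption: the notes do not define MeasureDistance, so the distance method, the Point type, and the choice to return an Option (so that an empty slice is handled) are all invented for illustration.

```rust
/// Hypothetical trait; the notes name `MeasureDistance` without defining it.
trait MeasureDistance {
    fn distance(&self, other: &Self) -> f64;
}

#[derive(Debug, PartialEq)]
struct Point {
    x: f64,
    y: f64,
}

impl MeasureDistance for Point {
    /// Euclidean distance between two points.
    fn distance(&self, other: &Self) -> f64 {
        ((self.x - other.x).powi(2) + (self.y - other.y).powi(2)).sqrt()
    }
}

/// Returns the candidate closest to `target`, or `None` for an empty slice.
/// The result borrows from `candidates` ('c), not from `target` ('t).
fn nearest<'t, 'c, P>(target: &'t P, candidates: &'c [P]) -> Option<&'c P>
where
    P: MeasureDistance,
{
    let mut best: Option<(&'c P, f64)> = None;
    for c in candidates {
        let d = target.distance(c);
        match best {
            // Keep the current best when it is at least as close.
            Some((_, best_d)) if best_d <= d => {}
            _ => best = Some((c, d)),
        }
    }
    best.map(|(p, _)| p)
}
```

Because the return type mentions only 'c, the caller may drop target while still holding the result.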

In addition to types and lifetimes, generic functions can take constant parameters as well.

fn dot_product<const N: usize>(a: [f64; N], b: [f64; N]) -> f64 {
    let mut sum = 0.;
    for i in 0..N {
        sum += a[i] * b[i];
    }
    sum
}

Trait Objects or Generic Code?

Sometimes you need to manage a group of objects of different types that all implement the same trait. Generic code is a poor fit here, because a single type parameter cannot express "objects of different types". Consider the example from the book:

trait Vegetable {
    ...
}

struct Salad<V: Vegetable> {
    veggies: Vec<V>
}

The Vec can hold only objects of a single type V, such as IcebergLettuce, so veggies cannot contain vegetables of different types.

In contrast, you can use trait objects:

struct Salad {
    veggies: Vec<Box<dyn Vegetable>>
}
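A fuller sketch of the trait-object version follows. The name method, the Tomato type, and the helpers ingredients and mixed_salad are assumptions; the notes leave the body of Vegetable elided.

```rust
/// Hypothetical method set; the original trait body is not shown.
trait Vegetable {
    fn name(&self) -> String;
}

struct IcebergLettuce;
struct Tomato;

impl Vegetable for IcebergLettuce {
    fn name(&self) -> String {
        "iceberg lettuce".to_string()
    }
}

impl Vegetable for Tomato {
    fn name(&self) -> String {
        "tomato".to_string()
    }
}

struct Salad {
    veggies: Vec<Box<dyn Vegetable>>,
}

impl Salad {
    /// Lists the ingredients; each call to `name` is dispatched dynamically.
    fn ingredients(&self) -> Vec<String> {
        self.veggies.iter().map(|v| v.name()).collect()
    }
}

/// Builds a salad that mixes two different vegetable types.
fn mixed_salad() -> Salad {
    Salad {
        veggies: vec![Box::new(IcebergLettuce), Box::new(Tomato)],
    }
}
```

With the generic Salad<V>, the mixed_salad function would not type-check, since V can stand for only one type.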

Using trait objects can also reduce the total amount of compiled code.


Outside of situations involving salad or low-resource environments, generics have three important advantages over trait objects, resulting in generics being the more common choice.

  • Speed: Generic function signatures contain no dyn keyword. The types are known at compile time, so no trait objects, and thus no dynamic dispatch, are involved.

  • Not every trait can support trait objects.

  • It is easy to bound a generic type parameter with several traits at once. Types like &mut (dyn Debug + Hash + Eq) aren’t supported in Rust.

Defining and Implementing Traits

trait Visible {
  fn draw(&self, canvas: &mut Canvas);
}

impl Broom {
    /// Helper function used by Broom::draw() below.
    ...
}

impl Visible for Broom {
    fn draw(&self, canvas: &mut Canvas) {
        ...
    }
}

Traits can also include default implementations for their methods.
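As a minimal sketch of a default method, consider the hypothetical Greeter trait below (the trait and both implementing types are invented for this example). One type accepts the default; the other overrides it.

```rust
trait Greeter {
    /// Required method: every implementor must supply a name.
    fn name(&self) -> String;

    /// Default method: implementors inherit this unless they override it.
    fn greet(&self) -> String {
        format!("Hello, {}!", self.name())
    }
}

struct English;
struct Pirate;

impl Greeter for English {
    fn name(&self) -> String {
        "world".to_string()
    }
    // `greet` is not defined here, so the default is used.
}

impl Greeter for Pirate {
    fn name(&self) -> String {
        "matey".to_string()
    }
    // Overrides the default implementation.
    fn greet(&self) -> String {
        format!("Ahoy, {}!", self.name())
    }
}
```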

You can use a generic impl block to add an extension trait to a whole family of types at once.
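A generic impl block of this kind might look like the sketch below. The Shout trait is hypothetical; the point is only that a single impl covers every type that implements Display.

```rust
use std::fmt::Display;

/// Hypothetical extension trait that adds one method.
trait Shout {
    fn shout(&self) -> String;
}

/// Generic impl: every type that implements `Display`, sized or not,
/// gets the `shout` method.
impl<T: Display + ?Sized> Shout for T {
    fn shout(&self) -> String {
        format!("{}!", self.to_string().to_uppercase())
    }
}
```

With Shout in scope, string slices, integers, and any other Display type all gain the new method.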

Self in Traits

pub trait Spliceable {
    fn splice(&self, other: &Self) -> Self;
}

Using Self as the parameter and return type here means that in x.splice(&y), y must have the same type as x, and the result has that type too.

impl Spliceable for CherryTree {
    fn splice(&self, other: &Self) -> Self {
        ...
    }
}

Inside this impl block, Self is an alias for CherryTree.

A trait whose methods use Self in argument or return position, as splice does, is incompatible with trait objects.

// error: the trait `Spliceable` cannot be made into an object
// This calls back to the statement that not every trait can support trait objects.
fn splice_anything(left: &dyn Spliceable, right: &dyn Spliceable) {
    let combo = left.splice(right);
    // ...
}

Rust rejects this code because it has no way to type-check the call left.splice(right). The whole point of trait objects is that the type isn’t known until run time. Rust has no way to know at compile time if left and right will be the same type, as required.

But we can design a trait-object-friendly trait:

pub trait MegaSpliceable {
    fn splice(&self, other: &dyn MegaSpliceable) -> Box<dyn MegaSpliceable>;
}

Subtraits

In the declaration below, Creature is a subtrait of Visible, and Visible is Creature’s supertrait. Subtraits extend the functionality of their supertraits. Every type that implements Creature must also implement the Visible trait.

/// Someone in the game world, either the player or some other
/// pixie, gargoyle, squirrel, ogre, etc.
trait Creature: Visible {
    fn position(&self) -> (i32, i32);
    fn facing(&self) -> Direction;
    ...
}

impl Visible for Broom {
    ...
}

impl Creature for Broom {
    ...
}

The subtrait syntax can also be expressed as follows:

trait Creature where Self: Visible {
    ...
}
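The practical effect of a supertrait bound is that generic code bounded only by the subtrait may call the supertrait's methods. The sketch below uses simplified stand-ins: draw returns a String instead of painting on a Canvas, and the describe function is invented for illustration.

```rust
/// Simplified stand-in for the `Visible` trait in the notes.
trait Visible {
    fn draw(&self) -> String;
}

/// Every `Creature` must also be `Visible`.
trait Creature: Visible {
    fn position(&self) -> (i32, i32);
}

struct Broom {
    x: i32,
    y: i32,
}

impl Visible for Broom {
    fn draw(&self) -> String {
        "|".to_string()
    }
}

impl Creature for Broom {
    fn position(&self) -> (i32, i32) {
        (self.x, self.y)
    }
}

/// The bound `C: Creature` is enough to call `draw`,
/// because `Creature` implies `Visible`.
fn describe<C: Creature>(c: &C) -> String {
    let (x, y) = c.position();
    format!("{} at ({}, {})", c.draw(), x, y)
}
```

Removing the impl Visible for Broom block would make impl Creature for Broom fail to compile.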

Type-Associated Functions

Traits can include type-associated functions, Rust’s analog to static methods.

trait StringSet {
    /// Return a new empty set.
    fn new() -> Self;

    /// Return a set that contains all the strings in `strings`.
    fn from_slice(strings: &[&str]) -> Self;

    /// Find out if this set contains a particular `value`.
    fn contains(&self, string: &str) -> bool;

    /// Add a string to this set.
    fn add(&mut self, string: &str);
}

from_slice and new don't take a self argument. These functions can be called using :: syntax, just like any other type-associated function.
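Here is a sketch of how the :: syntax is used, both on a concrete type and inside generic code. SortedStringSet and unknown_words are illustrative names, and backing the set with a BTreeSet is an assumption of this example.

```rust
use std::collections::BTreeSet;

trait StringSet {
    fn new() -> Self;
    fn from_slice(strings: &[&str]) -> Self;
    fn contains(&self, string: &str) -> bool;
    fn add(&mut self, string: &str);
}

/// Hypothetical implementor backed by a BTreeSet.
struct SortedStringSet(BTreeSet<String>);

impl StringSet for SortedStringSet {
    fn new() -> Self {
        SortedStringSet(BTreeSet::new())
    }
    fn from_slice(strings: &[&str]) -> Self {
        let mut set = Self::new();
        for s in strings {
            set.add(s);
        }
        set
    }
    fn contains(&self, string: &str) -> bool {
        self.0.contains(string)
    }
    fn add(&mut self, string: &str) {
        self.0.insert(string.to_string());
    }
}

/// Returns the words in `document` that are missing from `wordlist`.
/// Generic code calls the type-associated function as `S::new()`.
fn unknown_words<S: StringSet>(document: &[&str], wordlist: &S) -> S {
    let mut unknowns = S::new();
    for word in document {
        if !wordlist.contains(word) {
            unknowns.add(word);
        }
    }
    unknowns
}
```

On a concrete type the call reads SortedStringSet::from_slice(&["the", "cat"]).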

Fully Qualified Method Calls

Fully qualified method calls state your intention precisely by specifying the exact method you are calling.

"hello".to_string()
str::to_string("hello")
ToString::to_string("hello")
<str as ToString>::to_string("hello") // This is the fully qualified method call.

The . operator does not say exactly which to_string method is being called; Rust has a method lookup algorithm that figures this out.

Application:

  • When two methods from different traits have the same name:

    outlaw.draw();  // error: draw on screen or draw pistol?

    Visible::draw(&outlaw);  // ok: draw on screen
    HasPistol::draw(&outlaw);  // ok: corral

  • When the type of the self argument can’t be inferred:

    let zero = 0;  // type unspecified; could be `i8`, `u8`, ...
    zero.abs();  // error: can't call method `abs`
                 // on ambiguous numeric type
    i64::abs(zero);  // ok

  • When using the function itself as a function value:

    let words: Vec<String> =
        line.split_whitespace()  // iterator produces &str values
            .map(ToString::to_string)  // ok
            .collect();

  • When calling trait methods in macros.
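The name-clash case can be sketched in a compilable form as follows. The trait bodies are simplified assumptions (each draw just returns a string), and both_draws and four_spellings are hypothetical helper names.

```rust
trait Visible {
    fn draw(&self) -> &'static str;
}

trait HasPistol {
    fn draw(&self) -> &'static str;
}

struct Outlaw;

impl Visible for Outlaw {
    fn draw(&self) -> &'static str {
        "drawn on screen"
    }
}

impl HasPistol for Outlaw {
    fn draw(&self) -> &'static str {
        "pistol drawn"
    }
}

/// `outlaw.draw()` would be rejected as ambiguous here,
/// so each call names the trait it means.
fn both_draws(outlaw: &Outlaw) -> (&'static str, &'static str) {
    (
        Visible::draw(outlaw),               // qualified by trait
        <Outlaw as HasPistol>::draw(outlaw), // fully qualified
    )
}

/// The four spellings of `to_string` from the notes, side by side.
fn four_spellings() -> [String; 4] {
    [
        "hello".to_string(),
        str::to_string("hello"),
        ToString::to_string("hello"),
        <str as ToString>::to_string("hello"),
    ]
}
```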

Traits That Define Relationships Between Types

The standard Iterator trait of Rust:

pub trait Iterator {
  type Item;
  
  fn next(&mut self) -> Option<Self::Item>;
}

  • type Item; is an associated type. Each type that implements Iterator must specify what type of item it produces.

To implement Iterator for a type:

// (code from the std::env standard library module)
impl Iterator for Args {
  type Item = String;
  
  fn next(&mut self) -> Option<String> {
    ...
  }
  ...
}

std::env::Args is the type of iterator returned by the standard library function std::env::args(). It produces String values.
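The same pattern applies to your own types. Below is a minimal sketch with a hypothetical Countdown iterator (not part of the standard library) that produces u32 values.

```rust
/// Hypothetical iterator that counts down from `n` to 1.
struct Countdown {
    n: u32,
}

impl Iterator for Countdown {
    /// The associated type: this iterator produces `u32` values.
    type Item = u32;

    fn next(&mut self) -> Option<u32> {
        if self.n == 0 {
            None // the sequence is finished
        } else {
            let current = self.n;
            self.n -= 1;
            Some(current)
        }
    }
}
```

Implementing next alone is enough to unlock adapters and consumers such as collect and sum, which Iterator provides as default methods.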


Generic code can use associated types:

fn collect_into_vector<I: Iterator>(iter: I) -> Vec<I::Item> {
  let mut results = Vec::new();
  for value in iter {
    results.push(value);
  }
  results
}

Rust can infer the type of value inside the function, but we still need to spell out the return type, Vec<I::Item>. A bound can also pin down the associated type, as in Iterator<Item=String>, and the same syntax works in trait object types:

fn dump<I>(iter: I) where I: Iterator<Item=String> {
  ...
}

// or

fn dump(iter: &mut dyn Iterator<Item=String>) {
  for (index, s) in iter.enumerate() {
    println!("{}: {:?}", index, s);
  }
}

Generic Traits (or How Operator Overloading Works)

/// The std::ops::Mul trait, which allows types to support the `*` operator
pub trait Mul<RHS=Self> {
  /// The resulting type after applying the `*` operator
  type Output;
  
  /// The method that defines the behavior of the `*` operator
  fn mul(self, rhs: RHS) -> Self::Output;
}

  • The Self type is available inside any trait definition; here it serves as the default for the type parameter RHS.

  • Traits are typically used to enable polymorphism, but a trait can itself be generic, so that one trait definition yields a family of related traits.

Mul is a generic trait, and each choice of RHS yields a distinct trait. For instance, Mul<String> and Mul<u64> are different traits. Therefore, a single type (say, WindowSize) can implement both Mul<f64> and Mul<i32>, and many more.

The syntax RHS=Self means that RHS defaults to Self.
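The WindowSize case mentioned above could be sketched like this. The fields of WindowSize and the scaling behavior are assumptions; the notes mention only the type's name.

```rust
use std::ops::Mul;

/// Hypothetical type; only its name appears in the notes.
#[derive(Debug, Clone, Copy, PartialEq)]
struct WindowSize {
    width: f64,
    height: f64,
}

/// `WindowSize * f64` scales both dimensions.
impl Mul<f64> for WindowSize {
    type Output = WindowSize;
    fn mul(self, rhs: f64) -> WindowSize {
        WindowSize {
            width: self.width * rhs,
            height: self.height * rhs,
        }
    }
}

/// `WindowSize * i32` is a separate trait implementation
/// that reuses the f64 version.
impl Mul<i32> for WindowSize {
    type Output = WindowSize;
    fn mul(self, rhs: i32) -> WindowSize {
        self * f64::from(rhs)
    }
}
```

The compiler picks the implementation from the type of the right-hand operand.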

Return impl Trait

We can simplify the return type of a function by specifying the trait or traits that the return value implements:

fn cyclical_zip(v: Vec<u8>, u: Vec<u8>) -> impl Iterator<Item=u8> {
    v.into_iter().chain(u.into_iter()).cycle()
}

However, you cannot use impl Trait to implement the factory pattern directly:

trait Shape {
    fn new() -> Self;
    fn area(&self) -> f64;
}

fn make_shape(shape: &str) -> impl Shape {
    match shape {
        "circle" => Circle::new(),
        "triangle" => Triangle::new(),  // error: incompatible types
        "rectangle" => Rectangle::new(),
    }
}

In the first example, the function returns one concrete iterator type, which the compiler knows even though the signature hides it. impl Trait is a form of static dispatch, so the compiler must be able to pin down a single return type. In the second example, the return type could be Circle, Triangle, or Rectangle, so it is not uniquely determined and the code is rejected.
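When the concrete type really must be chosen at run time, a trait object is the usual alternative. The sketch below is one way to do it, with several assumptions: the new() function is dropped from Shape so the trait can be used as a trait object, the shapes have made-up fields and default sizes, and unknown names yield None.

```rust
/// Variant of `Shape` without `fn new() -> Self`, so that
/// `dyn Shape` is allowed.
trait Shape {
    fn area(&self) -> f64;
}

struct Circle {
    radius: f64,
}

struct Rectangle {
    width: f64,
    height: f64,
}

impl Shape for Circle {
    fn area(&self) -> f64 {
        std::f64::consts::PI * self.radius * self.radius
    }
}

impl Shape for Rectangle {
    fn area(&self) -> f64 {
        self.width * self.height
    }
}

/// Factory returning a boxed trait object; every arm has the same
/// type, `Option<Box<dyn Shape>>`, so the match type-checks.
fn make_shape(shape: &str) -> Option<Box<dyn Shape>> {
    match shape {
        "circle" => Some(Box::new(Circle { radius: 1.0 })),
        "rectangle" => Some(Box::new(Rectangle { width: 2.0, height: 3.0 })),
        _ => None, // unknown shape name
    }
}
```

The cost is a heap allocation and dynamic dispatch on each call to area.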

Additionally, impl Trait was originally allowed only in free functions (i.e., functions not associated with a trait) and in methods associated with specific types, not in trait methods. Since Rust 1.75, trait methods may also return impl Trait, with each implementation free to choose its own concrete return type.

Associated Consts

Like structs and enums, traits can have associated constants, declared with the same syntax:

trait Greet {
  const GREETING: &'static str = "Hello";
  fn greet(&self) -> String;
}

Associated constants let generic code use such values. The following function assumes a trait Float that declares an associated constant ONE:

fn add_one<T: Float + Add<Output=T>>(value: T) -> T {
  value + T::ONE
}
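For completeness, a Float trait that would make add_one compile might look like the sketch below. The notes do not show its definition, so the trait body and the two impls are assumptions.

```rust
use std::ops::Add;

/// Hypothetical trait that declares constants without giving them values;
/// each implementor supplies its own.
trait Float {
    const ZERO: Self;
    const ONE: Self;
}

impl Float for f32 {
    const ZERO: f32 = 0.0;
    const ONE: f32 = 1.0;
}

impl Float for f64 {
    const ZERO: f64 = 0.0;
    const ONE: f64 = 1.0;
}

/// Works for any type that has a `ONE` constant and supports `+`.
fn add_one<T: Float + Add<Output = T>>(value: T) -> T {
    value + T::ONE
}
```

T::ONE is resolved at compile time for each instantiation, which is why the same trick is unavailable through a trait object.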

Associated constants can’t be used with trait objects, since the compiler relies on type information about the implementation to pick the right value at compile time.
