Fundamental Types

The rest of this chapter covers Rust’s types from the bottom up, starting with simple numeric types like integers and floating-point values, then moving on to types that hold more data: boxes, tuples, arrays, and strings.

Fixed-Width Numeric Types

Integer Types

u8: Rust uses the u8 type for byte values. For example, reading data from a binary file or socket yields a stream of u8 values.

char: Characters in Rust are 32 bits long.

Integer Literals:

  • Integer literals in Rust can take a suffix indicating their type: 42u8 is a u8 value, and 1729isize is an isize.

  • If a literal has no suffix, Rust infers its type from context. If multiple types could work, Rust defaults to i32 if that is among the possibilities.

  • The prefixes 0x, 0o, and 0b designate hexadecimal, octal, and binary literals.
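
The literal forms listed above can be sketched as follows. This is only an illustration: the helper names literal_examples and default_int_size are ours, not from the source, and the underscore separators are optional.

```rust
// Returns a few values written with different literal notations.
fn literal_examples() -> (u8, isize, u32, u8, u8) {
    let a = 42u8;          // suffix selects u8
    let b = 1729isize;     // suffix selects isize
    let c = 0xcafe_u32;    // hexadecimal; underscores are only visual separators
    let d = 0o17_u8;       // octal, equal to 15
    let e = 0b0010_1010u8; // binary, equal to 42
    (a, b, c, d, e)
}

// Size in bytes of an unsuffixed, otherwise unconstrained integer literal.
fn default_int_size() -> usize {
    let x = 7; // nothing else pins the type down, so Rust falls back to i32
    std::mem::size_of_val(&x)
}

fn main() {
    println!("{:?}", literal_examples());
    println!("{}", default_int_size());
}
```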

Byte Literals:

let c1: u8 = b'A';    // byte literals have type u8, not char
let c2: u8 = b'\\';   // a backslash must be escaped
let c3: u8 = b'\x1b'; // hexadecimal escape: the ASCII escape character (27)

Type Casts: the as operator converts between integer types based on the raw bits:

// Conversions that are out of range for the destination
// produce values that are equivalent to the original modulo 2^N, 
// where N is the width of the destination in bits. This
// is sometimes called "truncation."
assert_eq!( 1000_i16 as u8, 232_u8);
assert_eq!(65535_u32 as i16, -1_i16);
assert_eq!( -1_i8 as u8, 255_u8);
assert_eq!( 255_u8 as i8, -1_i8);

Overflow: integer overflow has defined behavior in Rust (in C and C++, signed integer overflow is undefined behavior):

  • In a debug build, Rust panics.

  • In a release build, the operation wraps around: it produces the value equivalent to the mathematically correct result modulo the range of the value.

Different Mechanisms for Handling Overflows:

// Checked
assert_eq!(100_u8.checked_add(200), None);

// Do the addition; panic if it overflows.
// (x and y are placeholder values for illustration.)
let (x, y) = (10_u8, 20_u8);
let sum = x.checked_add(y).unwrap();

// Wrapping
// We get 250000 modulo 2^16.
assert_eq!(500_u16.wrapping_mul(500), 53392);

// Saturating
assert_eq!((-32760_i16).saturating_sub(10), -32768);

// Overflowing
// Return a tuple (result, overflowed)
assert_eq!(255_u8.overflowing_add(2), (1, true));
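
As a rough sketch of how these methods might be used in practice, the following sums a slice of bytes with two different overflow policies. The function names checked_sum and saturating_sum are hypothetical, chosen for this example.

```rust
// Sums a slice of u8 values, returning None if the total does not fit in a u8.
fn checked_sum(values: &[u8]) -> Option<u8> {
    let mut total: u8 = 0;
    for &v in values {
        total = total.checked_add(v)?; // return None as soon as an addition overflows
    }
    Some(total)
}

// Same sum, but clamps the result at u8::MAX instead of failing.
fn saturating_sum(values: &[u8]) -> u8 {
    values.iter().fold(0u8, |acc, &v| acc.saturating_add(v))
}

fn main() {
    println!("{:?}", checked_sum(&[100, 100]));
    println!("{:?}", checked_sum(&[100, 200]));
    println!("{}", saturating_sum(&[100, 200]));
}
```

Which policy fits depends on the caller: the checked version makes overflow visible as None, while the saturating version silently clamps.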

Floating-Point Types

Type Inference: If Rust finds that either floating-point type could fit a variable without type specified, it chooses f64 by default.

Floating Constants: The types f32 and f64 have associated constants for the IEEE-required special values like INFINITY, NEG_INFINITY (negative infinity), NAN (the not-a-number value), and MIN and MAX (the smallest and largest finite values):

assert!((-1. / f32::INFINITY).is_sign_negative()); 
assert_eq!(-f32::MIN, f32::MAX);

The std::f32::consts and std::f64::consts modules provide various commonly used mathematical constants like E, PI, and the square root of two.

Conversions: Unlike C and C++, Rust performs almost no numeric conversions implicitly. But you can always write out explicit conversions using the as operator: i as f64, or x as i32.
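
A minimal sketch of explicit conversions with as, assuming a hypothetical helper named mean. Without the casts, the division would not compile, because Rust will not mix i32, usize, and f64 implicitly.

```rust
// Averages integer samples as a floating-point value.
// An empty slice yields NaN, since 0.0 / 0.0 is NaN.
fn mean(samples: &[i32]) -> f64 {
    let sum: i32 = samples.iter().sum();
    // Both operands are converted explicitly before dividing.
    sum as f64 / samples.len() as f64
}

fn main() {
    println!("{}", mean(&[1, 2, 3]));
    // Float-to-integer casts round toward zero.
    println!("{}", 2.9_f64 as i32);
    println!("{}", (-2.9_f64) as i32);
}
```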

The bool Type

Conversion to Integer: Rust’s as operator can convert bool values to integer types:

assert_eq!(false as i32, 0); 
assert_eq!(true as i32, 1);

Characters

Rust’s character type char represents a single Unicode character, as a 32-bit value. You can write any Unicode character as '\u{HHHHHH}', where HHHHHH is a hexadecimal number up to six digits long.

Conversion:

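  • The as operator converts a char to any integer type. For target types smaller than 32 bits, the upper bits of the character’s value are truncated.

  • In the other direction, u8 is the only type the as operator will convert to char. Every integer type other than u8 includes values that are not permitted Unicode code points. For a u32, std::char::from_u32 performs the conversion and returns an Option<char>.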
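
The conversion rules above can be illustrated with a short sketch. The helper names low_byte and to_char are assumptions made for this example; std::char::from_u32 is the standard library function.

```rust
// Keeps only the low eight bits of the character's code point.
fn low_byte(c: char) -> u8 {
    c as u8
}

// Converts a code point to a char; None if it is not a valid Unicode scalar value.
fn to_char(code: u32) -> Option<char> {
    std::char::from_u32(code)
}

fn main() {
    println!("{}", '*' as u32);              // char to integer
    println!("{:#x}", low_byte('\u{ca0}'));  // U+0CA0 loses its upper bits
    println!("{}", 65u8 as char);            // u8 to char is always valid
    println!("{:?}", to_char(0xD800));       // a surrogate is not a valid char
}
```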

Tuples

Indices: Tuples allow only constants as indices, like t.4. You can’t write t.i or t[i] to get the ith element.
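
A small sketch of constant tuple indices and destructuring, using a hypothetical function div_rem that returns two values at once:

```rust
// Returns (quotient, remainder) as a tuple. Panics if b is zero.
fn div_rem(a: u32, b: u32) -> (u32, u32) {
    (a / b, a % b)
}

fn main() {
    let t = div_rem(17, 5);
    // Elements are accessed with constant indices only.
    println!("quotient {} remainder {}", t.0, t.1);
    // Destructuring binds each element to its own name.
    let (q, r) = t;
    println!("{} {}", q, r);
}
```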

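Zero-tuple: The type () has exactly one value, also written (). A function with no declared return type returns ():

fn swap<T>(x: &mut T, y: &mut T);
// is shorthand for
fn swap<T>(x: &mut T, y: &mut T) -> ();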

Pointer Types

Rust is designed to help keep allocations to a minimum.

The value ((0, 0), (1440, 900)) is stored as four adjacent integers. If you store it in a local variable, you’ve got a local variable four integers wide. Nothing is allocated in the heap.

Three pointer types: references, boxes, and unsafe pointers.

References

The expression &x borrows a reference to x. Given a reference r, the expression *r refers to the value r points to. Like a C pointer, a reference does not automatically free any resources when it goes out of scope.


Unlike C:

  • Rust references are never null.

  • Rust tracks the ownership and lifetimes of values, so mistakes like dangling pointers, double frees, and pointer invalidation are ruled out at compile time.


Two Flavors of References:

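  • &T: an immutable, shared reference, like const T* in C. You can have many shared references to a value at a time, but they are read-only.

  • &mut T: a mutable, exclusive reference. You can read and modify the value it points to, but as long as the reference exists, you may not have any other references of any kind to that value.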
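
The two flavors can be sketched as follows. The function names double_of and increment are hypothetical and exist only for this illustration.

```rust
// Takes a shared reference: may read the value, may not modify it.
fn double_of(x: &i32) -> i32 {
    *x * 2
}

// Takes a mutable reference: has exclusive access and may modify the value.
fn increment(x: &mut i32) {
    *x += 1;
}

fn main() {
    let mut n = 10;
    let r1 = &n;
    let r2 = &n; // several shared references may coexist
    println!("{} {}", double_of(r1), *r2);
    // r1 and r2 are not used after this point, so a mutable borrow is allowed.
    increment(&mut n);
    println!("{}", n);
}
```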

Boxes

The simplest way to allocate a value in the heap is to use Box::new.

let t = (12, "eggs");
let b = Box::new(t); // allocate a tuple in the heap

When b goes out of scope, the memory is freed immediately, unless b has been moved (by returning it, for example).
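
A minimal sketch of the move case, assuming a hypothetical function make_boxed that returns its box to the caller:

```rust
// Allocates a tuple on the heap and hands ownership to the caller.
fn make_boxed() -> Box<(i32, &'static str)> {
    let b = Box::new((12, "eggs"));
    b // moved out, so the allocation is not freed here
}

fn main() {
    let b = make_boxed();
    // Fields are reachable through the box without explicit dereferencing.
    println!("{} {}", b.0, b.1);
} // b goes out of scope here and the heap memory is freed
```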

Raw Pointers

Rust also has the raw pointer types *mut T and *const T. Rust does not track what they point to, so you may dereference them only within an unsafe block.
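
For illustration only, here is a sketch of a raw pointer created from a reference. The function name bump_via_raw is hypothetical; creating the pointer is safe, while dereferencing it requires unsafe.

```rust
// Increments the value through a raw pointer and returns the new value.
fn bump_via_raw(x: &mut i32) -> i32 {
    let p: *mut i32 = x; // a reference coerces to a raw pointer in safe code
    // Sound here because p comes from a live, exclusive reference.
    unsafe {
        *p += 1;
        *p
    }
}

fn main() {
    let mut v = 41;
    println!("{}", bump_via_raw(&mut v));
}
```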

Arrays, Vectors, and Slices

  • The type [T; N] represents an array of N values, each of type T. An array’s size is a constant determined at compile time and cannot be changed.

  • The type Vec<T>, called a vector of Ts, is a dynamically allocated, growable sequence of values of type T. A vector’s elements live on the heap, so you can resize vectors at will.

  • The types &[T] and &mut [T], called a shared slice of Ts and mutable slice of Ts, are references to a series of elements that are a part of some other value, like an array or vector. A mutable slice &mut [T] lets you read and modify elements, but can’t be shared; a shared slice &[T] lets you share access among several readers, but doesn’t let you modify elements.

Arrays

let lazy_caterer: [u32; 6] = [1, 2, 4, 7, 11, 16]; 
let taxonomy = ["Animalia", "Arthropoda", "Insecta"];
let mut sieve = [true; 10000];
let buf = [0u8; 1024]; // 1,024 bytes, all zero

The useful methods you’d like to see on arrays—iterating over elements, searching, sorting, filling, filtering, and so on—are all provided as methods on slices, not arrays. But Rust implicitly converts a reference to an array to a slice when searching for methods, so you can call any slice method on an array directly.

let mut chaos = [3, 5, 4, 1, 2]; 
chaos.sort();

Vectors

let mut primes = vec![2, 3, 5, 7];
let v: Vec<i32> = (0..5).collect();

// You’ll often need to supply the type 
// when using collect (as we’ve done here 
// by specifying Vec<i32>), because it can 
// build many different sorts of collections, 
// not just vectors.

A Vec<T> consists of three values:

  • A pointer to the heap-allocated buffer for the elements, which is created and owned by the Vec<T>

  • The number of elements that buffer has the capacity to store

  • The number it actually contains now (in other words, its length).


If you know the number of elements a vector will need in advance, you can call Vec::with_capacity instead of Vec::new to create a vector with a buffer large enough to hold them all. This avoids repeated reallocation as the vector grows.


The pop method removes the last element and returns it as an Option<T>: Some(value) if the vector had elements, None if it was empty.
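
A short sketch of Vec::with_capacity and pop, assuming a hypothetical helper squares that knows its final length up front:

```rust
// Builds a vector of the first n squares, reserving the buffer once.
fn squares(n: usize) -> Vec<usize> {
    let mut v = Vec::with_capacity(n);
    for i in 0..n {
        v.push(i * i); // stays within the reserved capacity
    }
    v
}

fn main() {
    let mut v = squares(4);
    println!("len = {}, capacity >= {}", v.len(), 4);
    // pop returns Some(last) until the vector is empty, then None.
    while let Some(x) = v.pop() {
        println!("{}", x);
    }
    println!("{:?}", v.pop());
}
```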

Slices

Slices are always passed by reference. A reference to a slice is a fat pointer: a two-word value comprising a pointer to the slice’s first element, and the number of elements in the slice.

let v: Vec<f64> = vec![0.0, 0.707, 1.0, 0.707]; 
let a: [f64; 4] = [0.0, -0.707, -1.0, -0.707];
let sv: &[f64] = &v; 
let sa: &[f64] = &a;

Because a slice reference can point into either a vector or an array, a function that takes a slice reference works on both.
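
As a sketch of that idea, the hypothetical function sum below accepts a vector, an array, or a sub-range of either:

```rust
// Adds up the elements of any slice of f64 values.
fn sum(values: &[f64]) -> f64 {
    values.iter().sum()
}

fn main() {
    let v: Vec<f64> = vec![0.5, 0.25, 1.0];
    let a: [f64; 2] = [1.5, 2.5];
    println!("{}", sum(&v));       // &Vec<f64> converts to &[f64]
    println!("{}", sum(&a));       // &[f64; 2] converts to &[f64]
    println!("{}", sum(&v[1..3])); // a slice of part of the vector
}
```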

[!important]

A reference to a vector is intrinsically different from a reference to a slice, though Rust can implicitly convert a reference to a vector to a reference to a slice.

fn main() {
    println!("The size of a vector is {}", std::mem::size_of::<Vec<i32>>());
    println!("The size of a reference to a vector is {}", std::mem::size_of::<&Vec<i32>>());
    println!("The size of a reference to a slice is {}", std::mem::size_of::<&[i32]>());
}

On a typical 64-bit target, this prints:

The size of a vector is 24
The size of a reference to a vector is 8
The size of a reference to a slice is 16

String Types

In C++, there are two types of strings: const char * and std::string. Rust has a comparable split between &str and String.

String Literals

println!("In the room the women come and go,
      Singing of Mount Abora");
println!("It was a bright, cold day in April, and \
      there were four of us—\
      more or less.");

will yield:

In the room the women come and go,
      Singing of Mount Abora
It was a bright, cold day in April, and there were four of us—more or less.

No escape sequences are recognized in raw strings:

let default_win_install_path = r"C:\Program Files\Gorillas"; 
let pattern = Regex::new(r"\d+(\.\d+)*");

To include a double quote in a raw string, add pound signs to the start and end of the literal:

println!(r###"
      This raw string started with 'r###"'.
      Therefore it does not end until we reach a quote mark ('"')
      followed immediately by three pound signs ('###'):
"###);

Byte Strings

A string literal with the b prefix is a byte string. It is neither a String nor a &str, because its contents are not required to be UTF-8. Such a string is a slice of u8 values (that is, bytes) rather than Unicode text:

let method = b"GET";
assert_eq!(method, &[b'G', b'E', b'T']);

Raw byte strings start with br".

Strings in Memory

Rust strings are sequences of Unicode characters, stored using UTF-8. Each ASCII character in a string is stored in one byte. Other characters take up multiple bytes.

A String has a resizable buffer holding UTF-8 text while a &str (pronounced “stir” or “string slice”) is a reference to a run of UTF-8 text owned by someone else: it “borrows” the text.

A String or &str’s .len() method returns its length. The length is measured in bytes, not characters.

It is impossible to modify a &str. For creating new strings at run time, use String.

The type &mut str does exist, but it is not very useful, since almost any operation on UTF-8 can change its overall byte length, and a slice cannot reallocate its referent. In practice, the in-place operations you are most likely to use on &mut str are make_ascii_uppercase and make_ascii_lowercase, which modify the text in place and affect only single-byte characters, by definition.
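
The points above about byte length and &mut str can be sketched as follows. The helper names byte_and_char_len and shout are hypothetical.

```rust
// Returns (length in bytes, length in characters) for a string slice.
fn byte_and_char_len(s: &str) -> (usize, usize) {
    (s.len(), s.chars().count())
}

// Uppercases ASCII letters in place; the byte length cannot change.
fn shout(s: &mut str) {
    s.make_ascii_uppercase();
}

fn main() {
    // U+00E9 takes two bytes in UTF-8, so bytes and characters differ.
    println!("{:?}", byte_and_char_len("caf\u{e9}"));

    let mut owned = String::from("stir");
    shout(&mut owned); // &mut String converts to &mut str
    println!("{}", owned);
}
```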

Create Strings

  • The .to_string() method converts a &str to a String.

  • The format!() macro works just like println!(), except that it returns a new String instead of writing text to standard output.

  • Arrays, slices, and vectors of strings have two methods, .concat() and .join(sep), that form a new String from many strings.

    let bits = vec!["veni", "vidi", "vici"]; 
    assert_eq!(bits.concat(), "venividivici"); 
    assert_eq!(bits.join(", "), "veni, vidi, vici");
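
The first two bullets might look like this in practice. The function name greeting is an assumption made for this sketch.

```rust
// Builds a new String with format! instead of printing it.
fn greeting(name: &str) -> String {
    format!("Hello, {}!", name)
}

fn main() {
    let owned: String = "eggs".to_string(); // &str to String
    println!("{}", owned);
    println!("{}", greeting("Rust"));
}
```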

Other String-Like Types

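Rust guarantees that String and &str hold valid UTF-8, but some data a program must handle is not valid Unicode. Rust’s solution is to offer a few string-like types for these situations: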

  • Stick to String and &str for Unicode text.

  • When working with filenames, use std::path::PathBuf and &Path instead.

  • When working with binary data that isn’t UTF-8 encoded at all, use Vec<u8> and &[u8].

  • When working with environment variable names and command-line arguments in the native form presented by the operating system, use OsString and &OsStr.

  • When interoperating with C libraries that use null-terminated strings, use std::ffi::CString and &CStr.
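
As one example from the list above, here is a sketch using the path types. The function names config_path and extension_of, and the file name config.toml, are hypothetical.

```rust
use std::ffi::OsStr;
use std::path::{Path, PathBuf};

// Appends a file name to a directory, producing an owned path.
fn config_path(dir: &Path) -> PathBuf {
    dir.join("config.toml")
}

// Returns the extension as an OS string slice, if the path has one.
fn extension_of(path: &Path) -> Option<&OsStr> {
    path.extension()
}

fn main() {
    let p = config_path(Path::new("/etc"));
    println!("{}", p.display());
    println!("{:?}", extension_of(&p));
}
```

Path and PathBuf relate to each other much as &str and String do: one borrows, the other owns.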

Type Aliases

type Bytes = Vec<u8>;
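
An alias is just another name for the same type, so values flow freely between the two names. A minimal sketch, assuming a hypothetical function decode:

```rust
type Bytes = Vec<u8>;

// Interprets the bytes as UTF-8, replacing invalid sequences with U+FFFD.
fn decode(data: &Bytes) -> String {
    String::from_utf8_lossy(data).into_owned()
}

fn main() {
    let data: Bytes = vec![104, 105];
    let same: Vec<u8> = data.clone(); // no conversion needed: same type
    println!("{} {}", decode(&data), same.len());
}
```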
