Fundamental Types
The rest of this chapter covers Rust’s types from the bottom up, starting with simple numeric types like integers and floating-point values, then moving on to types that hold more data: boxes, tuples, arrays, and strings.
Fixed-Width Numeric Types
Integer Types
u8: Rust uses the u8 type for byte values. For example, reading data from a binary file or socket yields a stream of u8 values.
char: Characters in Rust are 32 bits long.
Integer Literals:
Integer literals in Rust can take a suffix indicating their type: 42u8 is a u8 value, and 1729isize is an isize.
If a literal has no suffix, Rust infers its type from how the value is used. In the end, if multiple types could work, Rust defaults to i32 if that is among the possibilities.
The prefixes 0x, 0o, and 0b designate hexadecimal, octal, and binary literals.
Byte Literals:
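As a minimal sketch of what a byte literal is: b'X' denotes the ASCII code of X as a u8 value, and accepts the same escapes as a character literal. The function name byte_literals is only illustrative.

```rust
// Returns a few byte literals; the function name is hypothetical.
fn byte_literals() -> [u8; 4] {
    // Each b'...' literal has type u8 and holds the ASCII code of the character.
    [b'A', b'\n', b'\\', b'\x1b']
}

fn main() {
    let bytes = byte_literals();
    // b'A' and 65u8 are the same value.
    assert_eq!(bytes[0], 65u8);
    println!("{:?}", bytes);
}
```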
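Type Casts: the as operator converts between integer types based on the raw bits. Converting to a narrower type keeps only the low-order bits, converting between signed and unsigned types of the same width reinterprets the same bits, and converting to a wider type zero-extends unsigned values and sign-extends signed ones.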
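The following sketch illustrates how integer casts with as behave at the bit level. The function name cast_examples and the chosen values are assumptions made for illustration.

```rust
// Demonstrates integer-to-integer casts; names and values are illustrative.
fn cast_examples() -> (u8, u16, i32, i32) {
    let truncated = 1000_i32 as u8; // keeps the low 8 bits: 1000 mod 256
    let reinterpreted = -1_i16 as u16; // same 16 bits, read as unsigned
    let sign_extended = -1_i8 as i32; // signed source: sign bit is copied
    let zero_extended = 255_u8 as i32; // unsigned source: padded with zeros
    (truncated, reinterpreted, sign_extended, zero_extended)
}

fn main() {
    assert_eq!(cast_examples(), (232, 65535, -1, 255));
}
```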
Overflow: integer overflow has defined behavior in Rust (in C and C++, signed integer overflow is undefined behavior).
In a debug build, Rust panics when integer arithmetic overflows.
In a release build, the operation wraps around: it produces the value equivalent to the mathematically correct result modulo the range of the value.
Different Mechanisms for Handling Overflows:
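When the default behavior is not what you want, the integer types provide method families that state the intended overflow handling explicitly: checked, wrapping, saturating, and overflowing operations. The sketch below applies all four to the same addition; the function name overflow_demo is hypothetical.

```rust
// Applies the four explicit overflow-handling styles to 250 + 10 on a u8.
fn overflow_demo() -> (Option<u8>, u8, u8, (u8, bool)) {
    let x: u8 = 250;
    (
        x.checked_add(10),     // None when the result does not fit
        x.wrapping_add(10),    // result modulo 256
        x.saturating_add(10),  // clamped to the nearest representable value
        x.overflowing_add(10), // wrapped result plus an overflow flag
    )
}

fn main() {
    assert_eq!(overflow_demo(), (None, 4, 255, (4, true)));
    // An addition that fits returns Some(result) from the checked form.
    assert_eq!(100_u8.checked_add(20), Some(120));
}
```

These methods behave the same way in debug and release builds, which makes them a reasonable choice when wrapping or clamping is part of the intended logic.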
Floating-Point Types
Type Inference: If Rust finds that either floating-point type could fit a variable without type specified, it chooses f64 by default.
Floating Constants: The types f32 and f64 have associated constants for the IEEE-required special values like INFINITY, NEG_INFINITY (negative infinity), NAN (the not-a-number value), and MIN and MAX (the smallest and largest finite values).
The std::f32::consts and std::f64::consts modules provide various commonly used mathematical constants like E, PI, and the square root of two.
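A short sketch of the special values and the mathematical constants in use. The helper names circle_area and float_specials are made up for this example.

```rust
use std::f64::consts::{PI, SQRT_2};

// Area of a circle, using the PI constant from std::f64::consts.
fn circle_area(radius: f64) -> f64 {
    PI * radius * radius
}

// Checks how the special values relate to the finite range.
fn float_specials() -> (bool, bool, bool) {
    (
        f64::INFINITY > f64::MAX,     // infinity lies above every finite value
        f64::NEG_INFINITY < f64::MIN, // MIN is the most negative finite value
        f64::NAN.is_nan(),            // NaN is detected with is_nan
    )
}

fn main() {
    assert_eq!(float_specials(), (true, true, true));
    // Floating-point results are compared with a tolerance, not exactly.
    assert!((SQRT_2 * SQRT_2 - 2.0).abs() < 1e-12);
    println!("area of the unit circle: {}", circle_area(1.0));
}
```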
Conversions: Unlike C and C++, Rust performs almost no numeric conversions implicitly. But you can always write out explicit conversions using the as operator: i as f64, or x as i32.
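To illustrate explicit numeric conversions, here is a small sketch. The functions mean and truncate are hypothetical helpers, not part of any library.

```rust
// Computes the arithmetic mean of a slice of integers.
// Mixing i32, usize, and f64 in one expression does not compile,
// so each operand is converted explicitly with `as`.
// Note: an empty slice yields NaN (0.0 / 0.0).
fn mean(values: &[i32]) -> f64 {
    let sum: i32 = values.iter().sum();
    sum as f64 / values.len() as f64
}

// Converting a float to an integer drops the fractional part.
fn truncate(x: f64) -> i32 {
    x as i32
}

fn main() {
    assert_eq!(mean(&[1, 2, 3, 4]), 2.5);
    assert_eq!(truncate(2.9), 2);
    assert_eq!(truncate(-2.9), -2);
}
```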
The bool Type
Conversion to Integer: Rust’s as operator can convert bool values to integer types: false becomes 0 and true becomes 1.
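A minimal sketch of the bool-to-integer conversion in practice, counting the true values in a slice. The function name count_true is an assumption for illustration.

```rust
// Counts how many flags are set by converting each bool to 0 or 1.
fn count_true(flags: &[bool]) -> u32 {
    flags.iter().map(|&flag| flag as u32).sum()
}

fn main() {
    assert_eq!(false as i32, 0);
    assert_eq!(true as i32, 1);
    assert_eq!(count_true(&[true, false, true]), 2);
}
```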
Characters
Rust’s character type char represents a single Unicode character, as a 32-bit value. You can write any Unicode character as '\u{HHHHHH}', where HHHHHH is a hexadecimal number up to six digits long.
Conversion:
The as operator converts a char to any integer type. For target types smaller than 32 bits, the upper bits of the character’s value are truncated.
Going the other way, u8 is the only type the as operator will convert to char. Every integer type other than u8 includes values that are not permitted Unicode code points.
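The sketch below shows these conversions side by side. For wider integers, the standard library function std::char::from_u32 returns an Option<char> instead of casting; the wrapper name char_conversions is hypothetical.

```rust
// Collects a few char/integer conversions; the function name is illustrative.
fn char_conversions() -> (u32, u8, char, Option<char>, Option<char>) {
    let star = '*';
    let kannada = '\u{0CA0}'; // a character outside the u8 range
    (
        star as u32,                 // the full code point
        kannada as u8,               // upper bits truncated: 0x0CA0 -> 0xA0
        97_u8 as char,               // u8 -> char is always valid
        std::char::from_u32(0x2764), // a valid code point
        std::char::from_u32(0xD800), // a surrogate, not a valid char
    )
}

fn main() {
    let (code, low_byte, letter, heart, surrogate) = char_conversions();
    assert_eq!(code, 42);
    assert_eq!(low_byte, 0xA0);
    assert_eq!(letter, 'a');
    assert_eq!(heart, Some('\u{2764}'));
    assert_eq!(surrogate, None);
}
```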
Tuples
Indices: Tuples allow only constants as indices, like t.4. You can’t write t.i or t[i] to get the ith element.
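Zero-tuple: the type (), also called the unit type, has exactly one value, also written (). Rust uses it where there is no meaningful value to carry, such as the return type of a function that returns nothing.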
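A small sketch of both ideas: accessing tuple elements by constant index, and the zero-tuple as the value of a function with no explicit return type. The names swap and log_only are invented for this example.

```rust
// Swaps the two elements of a pair using constant indices.
fn swap(pair: (i32, char)) -> (char, i32) {
    (pair.1, pair.0)
}

// A function with no declared return type returns the zero-tuple ().
fn log_only(message: &str) {
    println!("{}", message);
}

fn main() {
    let t = (7, 'x');
    assert_eq!(t.0, 7);
    assert_eq!(swap(t), ('x', 7));

    let unit: () = log_only("done");
    assert_eq!(unit, ());
}
```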
Pointer Types
Rust is designed to help keep allocations to a minimum.
The value ((0, 0), (1440, 900)) is stored as four adjacent integers. If you store it in a local variable, you’ve got a local variable four integers wide. Nothing is allocated in the heap.
Rust has three pointer types: references, boxes, and unsafe (raw) pointers.
References
The expression &x borrows a reference to x. Given a reference r, the expression *r refers to the value r points to. Like a C pointer, a reference does not automatically free any resources when it goes out of scope.
Unlike C:
Rust references are never null.
Rust tracks the ownership and lifetimes of values, so mistakes like dangling pointers, double frees, and pointer invalidation are ruled out at compile time.
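Two Flavors of References:
&T: an immutable, shared reference, as with const T* in C. You can have many shared references to a value at a time, but they are read-only.
&mut T: a mutable, exclusive reference. As long as the reference exists, you may not have any other references of any kind to that value.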
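The two flavors can be sketched as follows. The helper functions sum_refs and increment are hypothetical.

```rust
// Reads through two shared references.
fn sum_refs(a: &i32, b: &i32) -> i32 {
    *a + *b
}

// Modifies the referent through an exclusive reference.
fn increment(counter: &mut i32) {
    *counter += 1;
}

fn main() {
    let mut x = 10;

    // Any number of shared references may coexist.
    let r1 = &x;
    let r2 = &x;
    assert_eq!(sum_refs(r1, r2), 20);

    // The shared references are no longer used, so a mutable borrow is allowed.
    increment(&mut x);
    assert_eq!(x, 11);
}
```

Using r1 or r2 again after the call to increment would be rejected by the compiler, since the shared borrows would then overlap the mutable one.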
Boxes
The simplest way to allocate a value in the heap is to use Box::new.
When a box b goes out of scope, the memory is freed immediately, unless b has been moved, by returning it, for example.
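A minimal sketch of both cases, one box that is moved out to the caller and one that is dropped at the end of its scope. The function name boxed_pair and the values are illustrative.

```rust
// Allocates a tuple on the heap and moves the box out to the caller,
// so the allocation outlives this function.
fn boxed_pair() -> Box<(i32, &'static str)> {
    let b = Box::new((12, "eggs"));
    b
}

fn main() {
    let pair = boxed_pair();
    assert_eq!(*pair, (12, "eggs"));

    {
        let temp = Box::new(String::from("temp"));
        assert_eq!(temp.len(), 4);
    } // temp goes out of scope here and its heap memory is freed
}
```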
Raw Pointers
Rust also has the raw pointer types *mut T and *const T.
Arrays, Vectors, and Slices
The type [T; N] represents an array of N values, each of type T. An array’s size is a constant determined at compile time and cannot be changed.
The type Vec<T>, called a vector of Ts, is a dynamically allocated, growable sequence of values of type T. A vector’s elements live on the heap, so you can resize vectors at will.
The types &[T] and &mut [T], called a shared slice of Ts and mutable slice of Ts, are references to a series of elements that are a part of some other value, like an array or vector. A mutable slice &mut [T] lets you read and modify elements, but can’t be shared; a shared slice &[T] lets you share access among several readers, but doesn’t let you modify elements.
Arrays
The useful methods you’d like to see on arrays—iterating over elements, searching, sorting, filling, filtering, and so on—are all provided as methods on slices, not arrays. But Rust implicitly converts a reference to an array to a slice when searching for methods, so you can call any slice method on an array directly.
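For instance, sort, len, and contains are slice methods, yet they can be called on an array directly. The sketch below assumes a hypothetical helper named sorted.

```rust
// Takes an array by value, sorts it with the slice method `sort`,
// and returns the sorted copy.
fn sorted(mut values: [i32; 5]) -> [i32; 5] {
    values.sort();
    values
}

fn main() {
    let chaos = [3, 5, 4, 1, 2];

    // Slice methods are available on the array through implicit conversion.
    assert_eq!(chaos.len(), 5);
    assert!(chaos.contains(&4));

    assert_eq!(sorted(chaos), [1, 2, 3, 4, 5]);
    // The array was copied into the function, so the original is unchanged.
    assert_eq!(chaos, [3, 5, 4, 1, 2]);
}
```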
Vectors
A Vec<T> consists of three values:
A pointer to the heap-allocated buffer for the elements, which is created and owned by the Vec<T>
The number of elements that buffer has the capacity to store
The number it actually contains now (in other words, its length)
If you know in advance how many elements a vector will need, you can call Vec::with_capacity instead of Vec::new to create a vector with a buffer large enough to hold them all. This avoids the overhead of repeated reallocation as the vector grows.
The pop method removes the last element and returns it as an Option<T>: Some(value) if the vector had elements, or None if it was empty.
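The length, capacity, and pop behavior described above can be sketched as follows. The function build_squares is a made-up example.

```rust
// Builds a vector of the first n squares, reserving the buffer up front
// so that the pushes do not need to reallocate.
fn build_squares(n: usize) -> Vec<usize> {
    let mut v = Vec::with_capacity(n);
    for i in 0..n {
        v.push(i * i);
    }
    v
}

fn main() {
    let mut v = build_squares(4);
    assert_eq!(v.len(), 4);
    // with_capacity guarantees at least the requested capacity.
    assert!(v.capacity() >= 4);

    assert_eq!(v.pop(), Some(9));
    assert_eq!(v.len(), 3);

    let mut empty: Vec<i32> = Vec::new();
    assert_eq!(empty.pop(), None);
}
```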
Slices
Slices are always passed by reference. A reference to a slice is a fat pointer: a two-word value comprising a pointer to the slice’s first element, and the number of elements in the slice.
Because a slice reference can point into either a vector or an array, a function that takes a slice reference can operate on both.
Important: a reference to a vector (&Vec<T>) is a different type from a reference to a slice (&[T]), though Rust can implicitly convert a reference to a vector to a reference to a slice.
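A short sketch of one function serving a vector, an array, and a sub-range of a vector. The function name total is hypothetical.

```rust
// Accepts any contiguous run of f64 values, whatever owns them.
fn total(values: &[f64]) -> f64 {
    values.iter().sum()
}

fn main() {
    let v: Vec<f64> = vec![1.0, 2.5];
    let a: [f64; 3] = [0.5, 1.5, 2.0];

    // &Vec<f64> and &[f64; 3] are both converted to &[f64] at the call site.
    assert_eq!(total(&v), 3.5);
    assert_eq!(total(&a), 4.0);

    // Indexing with a range yields a slice of part of the vector.
    assert_eq!(total(&v[1..]), 2.5);
}
```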
String Types
In C++, there are two types of strings: const char * and std::string.
String Literals
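A string literal is enclosed in double quotes and may span multiple lines, in which case the line break is part of the string.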
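A raw string is prefixed with r. No escape sequences are recognized in raw strings, so backslashes are included verbatim.
To include a double quote in a raw string, mark the start and end of the string with pound signs, as in r#"..."#.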
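As a sketch of raw strings, with and without pound signs. The function names and the sample path are invented for illustration.

```rust
// In a raw string, backslashes are ordinary characters.
fn raw_path() -> &'static str {
    r"C:\temp\new"
}

// Pound signs around the quotes let the string contain double quotes.
fn raw_with_quotes() -> &'static str {
    r#"a "quoted" word"#
}

fn main() {
    // The same text written as an ordinary literal needs escapes.
    assert_eq!(raw_path(), "C:\\temp\\new");
    assert_eq!(raw_with_quotes(), "a \"quoted\" word");
    println!("{}", raw_with_quotes());
}
```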
Byte Strings
A string literal with the b prefix is a byte string. It is neither a String nor a &str, since it is not required to hold UTF-8. Such a string is a slice of u8 values (that is, bytes) rather than Unicode text.
Raw byte strings start with br".
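A minimal sketch of byte strings and raw byte strings. The function names are hypothetical; strictly, the literal b"GET" has type &[u8; 3], which converts to &[u8] where a slice is expected.

```rust
// A byte string literal used where a byte slice is expected.
fn method() -> &'static [u8] {
    b"GET"
}

// A raw byte string: the backslash is an ordinary byte, not an escape.
fn raw_bytes() -> &'static [u8] {
    br"\n"
}

fn main() {
    assert_eq!(method().to_vec(), vec![b'G', b'E', b'T']);
    assert_eq!(raw_bytes().len(), 2);
    assert_eq!(raw_bytes()[0], b'\\');
}
```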
Strings in Memory
Rust strings are sequences of Unicode characters, stored using UTF-8. Each ASCII character in a string is stored in one byte. Other characters take up multiple bytes.
A String has a resizable buffer holding UTF-8 text, while a &str (pronounced “stir” or “string slice”) is a reference to a run of UTF-8 text owned by someone else: it “borrows” the text.
A String or &str’s .len() method returns its length. The length is measured in bytes, not characters.
It is impossible to modify a &str. For creating new strings at run time, use String.
The type &mut str does exist, but it is not very useful, since almost any operation on UTF-8 can change its overall byte length, and a slice cannot reallocate its referent. In practice, the in-place operations you are most likely to use on a &mut str are make_ascii_uppercase and make_ascii_lowercase, which modify the text in place and affect only single-byte characters, by definition.
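The byte-versus-character distinction and the in-place ASCII operations can be sketched like this. The helpers lengths and shout are assumptions made for the example.

```rust
// Returns the length in bytes and the number of characters.
fn lengths(s: &str) -> (usize, usize) {
    (s.len(), s.chars().count())
}

// Uppercases the ASCII letters in place through a &mut str.
fn shout(mut text: String) -> String {
    let slice: &mut str = text.as_mut_str();
    slice.make_ascii_uppercase();
    text
}

fn main() {
    // "café": the accented letter takes two bytes in UTF-8.
    assert_eq!(lengths("caf\u{e9}"), (5, 4));

    // Only the single-byte ASCII letters change; the byte length stays the same.
    assert_eq!(shout(String::from("caf\u{e9}")), "CAF\u{e9}");
}
```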
Create Strings
.to_string() converts a &str to a String.
The format!() macro works just like println!(), except that it returns a new String.
Arrays, slices, and vectors of strings have two methods, .concat() and .join(sep), that form a new String from many strings.
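The three approaches above, sketched together. The functions greeting and glue are hypothetical helpers.

```rust
// Builds a new String with format!.
fn greeting(name: &str) -> String {
    format!("Hello, {}!", name)
}

// Returns the parts concatenated, and the parts joined with a separator.
fn glue(parts: &[&str], sep: &str) -> (String, String) {
    (parts.concat(), parts.join(sep))
}

fn main() {
    let owned: String = "too many pets".to_string();
    assert_eq!(owned.len(), 13);

    assert_eq!(greeting("Rust"), "Hello, Rust!");

    let (concatenated, joined) = glue(&["veni", "vidi", "vici"], ", ");
    assert_eq!(concatenated, "venividivici");
    assert_eq!(joined, "veni, vidi, vici");
}
```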
Other String-Like Types
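String and &str are guaranteed to hold valid UTF-8, but programs sometimes have to handle text that is not. Rust’s solution is to offer a few string-like types for these situations: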
Stick to String and &str for Unicode text.
When working with filenames, use std::path::PathBuf and &Path instead.
When working with binary data that isn’t UTF-8 encoded at all, use Vec<u8> and &[u8].
When working with environment variable names and command-line arguments in the native form presented by the operating system, use OsString and &OsStr.
When interoperating with C libraries that use null-terminated strings, use std::ffi::CString and &CStr.
Type Aliases
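The type keyword declares a new name for an existing type, much like typedef in C++. A minimal sketch, where the alias Bytes and the function checksum are chosen only for illustration:

```rust
// Bytes is just another name for Vec<u8>; it is not a distinct type.
type Bytes = Vec<u8>;

// Sums the bytes as u32 values; a hypothetical helper for this example.
fn checksum(data: &Bytes) -> u32 {
    data.iter().map(|&b| b as u32).sum()
}

fn main() {
    let data: Bytes = vec![1, 2, 3];
    assert_eq!(checksum(&data), 6);

    // A plain Vec<u8> is accepted too, since the alias names the same type.
    let plain: Vec<u8> = vec![250, 10];
    assert_eq!(checksum(&plain), 260);
}
```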