Understand Rust: Iteration

You have probably seen some of these for loops after using Rust for a while. If you are new to Rust, chances are they look a bit confusing.

for x in xs { }
for x in &xs { }
for x in &mut xs { }
for x in xs.into_iter() { }
for x in xs.iter() { }
for x in xs.iter_mut() { }

Why the hell does Rust have so many for loop variations?

In safe Rust, there are three common ways you can access a value: via an owner, a shared reference, or an exclusive reference. The same goes for collection items. That is, you can iterate over the items with different types of access (in a homogeneous manner), hence the different variations of for. Which variation to use depends on what type of access you need.
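
To make the three kinds of access concrete, here is a minimal sketch using a `Vec<String>`. The function name `access_demo` and the sample values are made up for illustration; only the three loop forms come from the list above.

```rust
/// Walks the same vector with shared, exclusive and owned access.
/// Returns what each loop observed or produced.
pub fn access_demo() -> (usize, Vec<String>, String) {
    let mut xs = vec![String::from("a"), String::from("bc")];

    // shared access: `x` is a `&String`, `xs` is only borrowed
    let mut total_len = 0;
    for x in &xs {
        total_len += x.len();
    }

    // exclusive access: `x` is a `&mut String`, items can be changed in place
    for x in &mut xs {
        x.push('!');
    }
    let snapshot = xs.clone();

    // owned access: `x` is a `String`, `xs` is moved into the loop
    let mut joined = String::new();
    for x in xs {
        joined.push_str(&x);
    }
    // `xs` can no longer be used from here on

    (total_len, snapshot, joined)
}
```

Using `xs` again after the last loop would be rejected by the compiler, since that loop took ownership of the vector.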

In fact, a for loop in Rust is syntactic sugar over iterators, which would otherwise be verbose to write by hand. To truly understand for loops in Rust, we need to take a peek at what is going on under the sugarcoating.

Sugar-free iteration

But before that, let us meet the leading roles central to the Rust iteration story.

// trait for iterator behavior
pub trait Iterator {
    // item type to be yielded
    type Item;

    // iteration logic
    fn next(&mut self) -> Option<Self::Item>;
}

// trait for `Iterator` creation
pub trait IntoIterator {
    type Item;
    type IntoIter: Iterator<Item = Self::Item>;

    // logic for creating an `Iterator`
    fn into_iter(self) -> Self::IntoIter;
}

At a certain stage during compilation, Rust code is lowered into the high-level intermediate representation (HIR), in which for loops are already expanded. You can try this at home with a nightly toolchain: write some for loops and see the IR for yourself.

cargo +nightly rustc -- -Z unpretty=hir

// ------
// source

for x in xs { }
for x in &xs { }
for x in &mut xs { }
for x in xs.into_iter() { }
for x in xs.iter() { }
for x in xs.iter_mut() { }

// ---------------------------
// intermediate representation

// rough approximation of IR of the `for` loops above
{
    // Do you see the pattern?
    match <_ as IntoIterator>::into_iter(xs) {
    match <_ as IntoIterator>::into_iter(&xs) {
    match <_ as IntoIterator>::into_iter(&mut xs) {

    // Note that `IntoIterator::into_iter()` can also be called on `Iterator`s.
    match <_ as IntoIterator>::into_iter(xs.into_iter()) {
    match <_ as IntoIterator>::into_iter(xs.iter()) {
    match <_ as IntoIterator>::into_iter(xs.iter_mut()) {

        // `iter` is an `Iterator`
        // loop until `next()` returns `None`
        mut iter => loop {
            match <_ as Iterator>::next(&mut iter) {
                None => break,
                Some(x) => { }
            }
        }
    }
}

Under the hood, an Iterator is created once with IntoIterator::into_iter(). The loop then calls next() on it repeatedly until it returns None, which indicates the end of the iteration.
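
The same expansion can be written by hand in ordinary Rust. The sketch below sums a slice twice, once with `for` and once with an explicit `into_iter()` plus `next()` loop. The function names are hypothetical, and the hand-written version only approximates what the compiler generates.

```rust
/// The sugared version.
pub fn sum_with_for(xs: &[i32]) -> i32 {
    let mut sum = 0;
    for x in xs {
        sum += *x;
    }
    sum
}

/// Roughly what the loop above turns into.
pub fn sum_desugared(xs: &[i32]) -> i32 {
    let mut sum = 0;
    // create the iterator once
    let mut iter = IntoIterator::into_iter(xs);
    // then pull items until `next()` returns `None`
    loop {
        match Iterator::next(&mut iter) {
            None => break,
            Some(x) => sum += *x,
        }
    }
    sum
}
```

Both functions should return the same result for any input slice, including an empty one.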

Iterator triad

As you can tell from the expansion above, a for loop is not magic that just works for collections. It requires the collection type to implement certain functions. Now we can make some educated guesses about the minimum requirements for all those for variations to compile.

// say `xs` is a `Collection<T>`

// For all those `for` variations to work,
// `Collection<T>`, `&Collection<T>` and `&mut Collection<T>`
// should all implement `IntoIterator`.
match <_ as IntoIterator>::into_iter(xs) { }
match <_ as IntoIterator>::into_iter(&xs) { }
match <_ as IntoIterator>::into_iter(&mut xs) { }

// Plus, `Collection<T>` should implement all three of
// `into_iter()`, `iter()` and `iter_mut()`.
// They all return an `Iterator`, yielding different kinds of access.
match <_ as IntoIterator>::into_iter(xs.into_iter()) { }
match <_ as IntoIterator>::into_iter(xs.iter()) { }
match <_ as IntoIterator>::into_iter(xs.iter_mut()) { }

As for which specific Iterator types these functions should create, the std::collections types have established the following conventions.

+----------------------------------+----------+----------+
| Created By                       | Iterator | Yielding |
+----------------------------------+----------+----------+
| IntoIterator::into_iter(xs)      | IntoIter | T        |
| IntoIterator::into_iter(&xs)     | Iter     | &T       |
| IntoIterator::into_iter(&mut xs) | IterMut  | &mut T   |
| into_iter(self)                  | IntoIter | T        |
| iter(&self)                      | Iter     | &T       |
| iter_mut(&mut self)              | IterMut  | &mut T   |
+----------------------------------+----------+----------+

Accordingly, an iterable type should come with three Iterator types: IntoIter, Iter, and IterMut, yielding T, &T, and &mut T respectively. You can take a look at std::collections for concrete examples of how the iterators are used and implemented.
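
`Vec<T>` is a handy place to see the triad by name. In the sketch below the iterator types are spelled out in annotations. Note that `Vec` gets `iter()` and `iter_mut()` from slices, so two of its iterators live in `std::slice`. The function name `triad_demo` is made up for this example.

```rust
use std::slice::{Iter, IterMut};
use std::vec::IntoIter;

/// Names each iterator of the triad explicitly for a `Vec<i32>`.
pub fn triad_demo() -> (i32, Vec<i32>, Vec<i32>) {
    let mut xs = vec![1, 2, 3];

    // `iter(&self)` gives an `Iter` that yields `&T`
    let shared: Iter<'_, i32> = xs.iter();
    let sum: i32 = shared.sum();

    // `iter_mut(&mut self)` gives an `IterMut` that yields `&mut T`
    let exclusive: IterMut<'_, i32> = xs.iter_mut();
    for x in exclusive {
        *x *= 10;
    }
    let scaled = xs.clone();

    // `into_iter(self)` gives an `IntoIter` that yields `T`
    let owned: IntoIter<i32> = xs.into_iter();
    let reversed: Vec<i32> = owned.rev().collect();

    (sum, scaled, reversed)
}
```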

When you are creating your own iterable types, nothing mandates that you implement the same set of functions and iterators. But you SHOULD follow the conventions, as those are what other Rustaceans expect from an iterable type.

Having the Iterator triad provides separation of concerns: we do not need to worry about iteration details when designing the collection type itself, at the price of a few extra names. It is also a good trade-off considering the variations of iterators needed, each of which may need to track different state. The std::collections::VecDeque iterators are a great example of that.

Do it yourself

Let us try to write a Collection<T> with Iterator dummies such that it satisfies the requirements mentioned above, as an example of a minimal iterable type that does nothing but works with all those for variations.

use std::marker::PhantomData;

// let us pretend it has items of type `T`
pub struct Collection<T> {
    // ghost state to legitimize `T`
    _item: PhantomData<T>,
}

// iterator types

pub struct IntoIter<T> {
    // By convention, `IntoIter` should consume the collection.
    // As a result, the collection is no longer accessible once
    // `into_iter()` has been called on it.
    inner: Collection<T>,
}

pub struct Iter<'a, T> {
    _iter: PhantomData<&'a T>,
}

pub struct IterMut<'a, T> {
    // stands in for exclusive access, hence `&'a mut T`
    _iter_mut: PhantomData<&'a mut T>,
}

impl<T> Collection<T> {
    // enable `for x in xs.iter()`
    pub fn iter(&self) -> Iter<'_, T> {
        Iter {
            _iter: PhantomData,
        }
    }

    // enable `for x in xs.iter_mut()`
    pub fn iter_mut(&mut self) -> IterMut<'_, T> {
        IterMut {
            _iter_mut: PhantomData,
        }
    }
}

// enable `for x in xs`
impl<T> IntoIterator for Collection<T> {
    type Item = T;
    type IntoIter = IntoIter<T>;

    // enable `for x in xs.into_iter()`
    fn into_iter(self) -> Self::IntoIter {
        IntoIter {
            inner: self,
        }
    }
}

// enable `for x in &xs`
impl<'a, T> IntoIterator for &'a Collection<T> {
    type Item = &'a T;
    type IntoIter = Iter<'a, T>;

    fn into_iter(self) -> Self::IntoIter {
        self.iter()
    }
}

// enable `for x in &mut xs`
impl<'a, T> IntoIterator for &'a mut Collection<T> {
    type Item = &'a mut T;
    type IntoIter = IterMut<'a, T>;

    fn into_iter(self) -> Self::IntoIter {
        self.iter_mut()
    }
}

// it really is a dummy

impl<T> Iterator for IntoIter<T> {
    type Item = T;

    fn next(&mut self) -> Option<Self::Item> {
        None
    }
}

impl<'a, T> Iterator for Iter<'a, T> {
    type Item = &'a T;

    fn next(&mut self) -> Option<Self::Item> {
        None
    }
}

impl<'a, T> Iterator for IterMut<'a, T> {
    type Item = &'a mut T;

    fn next(&mut self) -> Option<Self::Item> {
        None
    }
}

Try playing with the snippet. If you take out some of the implementations, certain for variations over Collection<T> will fail to compile. Hopefully, for loops in Rust will start to make sense.
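
The dummies above never yield anything. For comparison, here is a sketch of a collection that does yield its items. It assumes the storage is a plain `Vec<T>` and simply reuses the standard iterators instead of defining its own. `Stack<T>` and `stack_demo` are hypothetical names.

```rust
use std::{slice, vec};

/// Hypothetical collection that stores its items in a `Vec<T>`.
pub struct Stack<T> {
    items: Vec<T>,
}

impl<T> Stack<T> {
    pub fn new(items: Vec<T>) -> Self {
        Stack { items }
    }

    // enable `for x in xs.iter()`
    pub fn iter(&self) -> slice::Iter<'_, T> {
        self.items.iter()
    }

    // enable `for x in xs.iter_mut()`
    pub fn iter_mut(&mut self) -> slice::IterMut<'_, T> {
        self.items.iter_mut()
    }
}

// enable `for x in xs` and `for x in xs.into_iter()`
impl<T> IntoIterator for Stack<T> {
    type Item = T;
    type IntoIter = vec::IntoIter<T>;

    fn into_iter(self) -> Self::IntoIter {
        self.items.into_iter()
    }
}

// enable `for x in &xs`
impl<'a, T> IntoIterator for &'a Stack<T> {
    type Item = &'a T;
    type IntoIter = slice::Iter<'a, T>;

    fn into_iter(self) -> Self::IntoIter {
        self.iter()
    }
}

// enable `for x in &mut xs`
impl<'a, T> IntoIterator for &'a mut Stack<T> {
    type Item = &'a mut T;
    type IntoIter = slice::IterMut<'a, T>;

    fn into_iter(self) -> Self::IntoIter {
        self.iter_mut()
    }
}

/// Exercises the three access kinds on a small `Stack<i32>`.
pub fn stack_demo() -> (i32, Vec<i32>) {
    let mut xs = Stack::new(vec![1, 2, 3]);

    let mut sum = 0;
    for x in &xs {
        sum += *x;
    }
    for x in &mut xs {
        *x += 1;
    }
    let mut drained = Vec::new();
    for x in xs {
        drained.push(x);
    }
    (sum, drained)
}
```

Delegating to the iterators of the inner `Vec` keeps the example short. A collection with its own storage layout would define its own `IntoIter`, `Iter` and `IterMut` types instead, as the std::collections types do.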

Bonus

Try a quick experiment: implement an iterable type that is for-compatible. You may notice that even if you never implement IntoIterator for the Iterator types themselves (IntoIter, Iter, and IterMut), the following loops still work.

for x in xs.into_iter() { }
for x in xs.iter() { }
for x in xs.iter_mut() { }

// are translated into

match <_ as IntoIterator>::into_iter(xs.into_iter()) { ... }
match <_ as IntoIterator>::into_iter(xs.iter()) { ... }
match <_ as IntoIterator>::into_iter(xs.iter_mut()) { ... }

How?

There is only one plausible explanation: if you have not done it, someone else must have done it for you. This is one of the blanket implementations that the standard library provides to save us from writing the obvious.

// All types implementing `Iterator` will automatically
// have a default `IntoIterator` implementation.
impl<I: Iterator> IntoIterator for I {
    type Item = I::Item;
    type IntoIter = I;

    #[inline]
    fn into_iter(self) -> I {
        self
    }
}
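
A small sketch of the blanket implementation at work: `Countdown` below only implements `Iterator`, yet it can be used directly in a `for` loop and passed to a function that asks for `IntoIterator`. All names here (`Countdown`, `total`, `countdown_demo`) are made up for illustration.

```rust
/// Hypothetical iterator that counts down from `n` to 1.
pub struct Countdown {
    n: u32,
}

// Only `Iterator` is implemented, `IntoIterator` is not.
impl Iterator for Countdown {
    type Item = u32;

    fn next(&mut self) -> Option<u32> {
        if self.n == 0 {
            None
        } else {
            let current = self.n;
            self.n -= 1;
            Some(current)
        }
    }
}

/// Accepts anything that can be turned into an iterator of `u32`.
pub fn total<I: IntoIterator<Item = u32>>(items: I) -> u32 {
    let mut sum = 0;
    for x in items {
        sum += x;
    }
    sum
}

pub fn countdown_demo() -> (Vec<u32>, u32, u32) {
    // The blanket impl makes `Countdown` usable in a `for` loop as is.
    let countdown = Countdown { n: 3 };
    let mut seen = Vec::new();
    for x in countdown {
        seen.push(x);
    }

    // The same bound accepts an iterator and a collection alike.
    let from_iterator = total(Countdown { n: 3 });
    let from_collection = total(vec![1, 2, 3]);

    (seen, from_iterator, from_collection)
}
```

This is also why taking `impl IntoIterator` as a parameter is usually more flexible than taking `impl Iterator`: callers can pass either a collection or an iterator.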
