You probably have seen some of these for
loops after using Rust for a while.
If you are new to Rust, chances are they can be a bit confusing.
for x in xs { }
for x in &xs { }
for x in &mut xs { }
for x in xs.into_iter() { }
for x in xs.iter() { }
for x in xs.iter_mut() { }
Why the hell does Rust have so many for
loop variations?
In safe Rust, there are three common ways you can access a value: via an owner, a shared reference, or an exclusive reference.
The same goes for collection items.
That is, you can iterate over the items with different types of access (in a homogeneous manner), hence the different variations of for
.
Which variation to use depends on what type of access you need.
In fact, a for
loop in Rust is syntactic sugar for using iterators which is otherwise verbose to write.
To truly understand for
loops in Rust, we need to take a peek into what is going on under the sugarcoat.
Sugar-free iteration#
But before that, let us meet the leading roles central to the Rust iteration story.
// trait for iterator behavior
pub trait Iterator {
// item type to be yielded
type Item;
// iteration logic
fn next(&mut self) -> Option<Self::Item>;
}
// trait for `Iterator` creation
pub trait IntoIterator {
type Item;
type IntoIter: Iterator<Item = Self::Item>;
// logic for creating an `Iterator`
fn into_iter(self) -> Self::IntoIter;
}
At a certain stage during compilation, Rust code will be translated into some intermediate representation, in which for
loops are expanded.
You can try this command at home.
Write some for
loops and see the IR for yourself.
cargo +nightly rustc -- -Z unpretty=hir
// ------
// source
for x in xs { }
for x in &xs { }
for x in &mut xs { }
for x in xs.into_iter() { }
for x in xs.iter() { }
for x in xs.iter_mut() { }
// ---------------------------
// intermediate representation
// rough approximation of IR of the `for` loops above
{
// Do you see the pattern?
match <_ as IntoIterator>::into_iter(xs) {
match <_ as IntoIterator>::into_iter(&xs) {
match <_ as IntoIterator>::into_iter(&mut xs) {
// Note that `IntoIterator::into_iter()` can also be called on `Iterator`s.
match <_ as IntoIterator>::into_iter(xs.into_iter()) {
match <_ as IntoIterator>::into_iter(xs.iter()) {
match <_ as IntoIterator>::into_iter(xs.iter_mut()) {
// `iter` is an `Iterator`
// loop until `next()` returns `None`
mut iter => loop {
match <_ as Iterator>::next(&mut iter) {
None => break,
Some(x) => { }
}
}
}
}
Under the hood, an Iterator
is created with IntoIterator::into_iter()
, which is used to iterate over the collection, by calling next()
repeatedly until it returns None
, indicating the end of the iteration.
Iterator triad#
As you can tell from the expansion above, for
loop is not magic that just works for collections.
They require the collection type to have certain functions implemented.
Now we can try to make some educated guesses about the minimum requirements for all those for
variations to compile.
// say `xs` is a `Collection<T>`
// For all those `for` variations to work,
// `Collection<T>`, `&Collection<T>` and `&mut Collection<T>`
// should all implement `IntoIterator`.
match <_ as IntoIterator>::into_iter(xs) { }
match <_ as IntoIterator>::into_iter(&xs) { }
match <_ as IntoIterator>::into_iter(&mut xs) { }
// Plus, `Collection<T>` should implement all three of
// `into_iter()`, `iter()` and `iter_mut()`.
// They all return an `Iterator`, yielding different kinds of access.
match <_ as IntoIterator>::into_iter(xs.into_iter()) { }
match <_ as IntoIterator>::into_iter(xs.iter()) { }
match <_ as IntoIterator>::into_iter(xs.iter_mut()) { }
As for what specific Iterator
types to be created by these functions, the std::collections
types have established the following conventions.
+----------------------------------+----------+----------+
| Created By | Iterator | Yielding |
+----------------------------------+----------+----------+
| IntoIterator::into_iter(xs) | IntoIter | T |
| IntoIterator::into_iter(&xs) | Iter | &T |
| IntoIterator::into_iter(&mut xs) | IterMut | &mut T |
| into_iter(self) | IntoIter | T |
| iter(&self) | Iter | &T |
| iter_mut(&mut self) | IterMut | &mut T |
+----------------------------------+----------+----------+
Accordingly, any iterable type should come with three Iterator
types: IntoIter
, Iter
, and IterMut
, each yielding the access of T
, &T
, and &mut T
respectively.
You can take a look at std::collections
for concrete examples of how the iterators are used and implemented.
When you are creating your own iterable types, nothing will mandate you to implement the same set of functions and iterators. But you SHOULD follow the conventions, as those those are what other Rustaceans expect from an iterable type.
Having the Iterator
triad provides separation of concerns such that we do not need to worry about the iteration details when designing the type, at the price of having a few extra names.
This is also a good trade-off considering the variations of iterators needed, each of which may require different states.
The std::collections::VecDeque
iterators can be a great example of that.
Do it yourself#
Let us try to write a Collection<T>
with Iterator
dummies such that it satisfies the requirements mentioned above, as an example of a minimal iterable type that does nothing but works with all those for
variations.
// let us pretend it has items of type `T`
pub struct Collection<T> {
// ghost state to legitimize `T`
_item: PhantomData<T>,
}
// iterator types
pub struct IntoIter<T> {
// By convention, `IntoIter` should consume the collection.
// As a result, it will not be accessible after iteration.
inner: Collection<T>,
}
pub struct Iter<'a, T> {
_iter: PhantomData<&'a T>,
}
pub struct IterMut<'a, T> {
_iter_mut: PhantomData<&'a T>,
}
impl<T> Collection<T> {
// enable `for x in xs.iter()`
pub fn iter(&self) -> Iter<'_, T> {
Iter {
_iter: PhantomData,
}
}
// enable `for x in xs.iter_mut()`
pub fn iter_mut(&mut self) -> IterMut<'_, T> {
IterMut {
_iter_mut: PhantomData,
}
}
}
// enable `for x in xs`
impl<T> IntoIterator for Collection<T> {
type Item = T;
type IntoIter = IntoIter<T>;
// enable `for x in xs.into_iter()`
fn into_iter(self) -> Self::IntoIter {
IntoIter {
inner: self,
}
}
}
// enable `for x in &xs`
impl<'a, T> IntoIterator for &'a Collection<T> {
type Item = &'a T;
type IntoIter = Iter<'a, T>;
fn into_iter(self) -> Self::IntoIter {
self.iter()
}
}
// enable `for x in &mut xs`
impl<'a, T> IntoIterator for &'a mut Collection<T> {
type Item = &'a mut T;
type IntoIter = IterMut<'a, T>;
fn into_iter(self) -> Self::IntoIter {
self.iter_mut()
}
}
// it really is a dummy
impl<T> Iterator for IntoIter<T> {
type Item = T;
fn next(&mut self) -> Option<Self::Item> {
None
}
}
impl<'a, T> Iterator for Iter<'a, T> {
type Item = &'a T;
fn next(&mut self) -> Option<Self::Item> {
None
}
}
impl<'a, T> Iterator for IterMut<'a, T> {
type Item = &'a mut T;
fn next(&mut self) -> Option<Self::Item> {
None
}
}
Try playing with the snippet.
By taking out some of the implementations, certain for
variations over Collection<T>
will complain during compilation.
Hopefully, for
loops in Rust will start to make sense.
Bonus#
Do a quick experiment on implementing an iterable type that is for
-compatible.
You may notice that even if you do not implement IntoIterator
for those Iterator
types, they will still work.
for x in x.into_iter() { }
for x in x.iter() { }
for x in x.iter_mut() { }
// are translated into
match <_ as IntoIterator>::into_iter(xs.into_iter()) { ... }
match <_ as IntoIterator>::into_iter(xs.iter()) { ... }
match <_ as IntoIterator>::into_iter(xs.iter_mut()) { ... }
How?
There is only one plausible explanation - if you have not, someone else must have done it for you. This is one of the blanket implementations which Rust does to save us from writing the obvious.
// All types implementing `Iterator` will automatically
// have a default `IntoIterator` implementation.
impl<I: Iterator> IntoIterator for I {
type Item = I::Item;
type IntoIter = I;
#[inline]
fn into_iter(self) -> I {
self
}
}