Skip to content

Add array_chunks #1023

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 8 commits into
base: master
Choose a base branch
from
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
134 changes: 134 additions & 0 deletions src/arrays.rs
Original file line number Diff line number Diff line change
@@ -0,0 +1,134 @@
use alloc::vec::Vec;

use crate::next_array::ArrayBuilder;

macro_rules! const_assert_positive {
($N: ty) => {
trait StaticAssert<const N: usize> {
const ASSERT: bool;
}

impl<const N: usize> StaticAssert<N> for () {
const ASSERT: bool = {
assert!(N > 0);
true
};
}

assert!(<() as StaticAssert<N>>::ASSERT);
};
}

/// An iterator that groups the items in arrays of const generic size `N`.
///
/// See [`.next_array()`](crate::Itertools::next_array) for details.
#[derive(Debug, Clone)]
pub struct Arrays<I: Iterator, const N: usize> {
iter: I,
partial: Vec<I::Item>,
}
Comment on lines +26 to +29
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It strikes me as a code smell that N doesn't appear in the definition here. Shouldn't partial be an ArrayBuilder<T, N>?

Copy link
Contributor Author

@ronnodas ronnodas Mar 5, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Making it ArrayBuilder<T, N> a priori makes sense, but I had trouble with the bounds for the impl Clone in that case. Note that MaybeUninit<T>: Clone requires T: Copy.

Maybe it could also be array::IntoIter<T, N> or a hypothetical ArrayBuilder::IntoIter<T, N> (this is close to what the unstable Iterator::ArrayChunks does) but I think the unsafe code for that would need to be more careful.

The state you need to keep track of doesn't really depend on N (except partial.len() <= N but this is sort of incidental). However Arrays needs to have N as a parameter for Arrays::Item to be well-defined.

Suggestions?


impl<I: Iterator, const N: usize> Arrays<I, N> {
pub(crate) fn new(iter: I) -> Self {
const_assert_positive!(N);

// TODO should we use iter.fuse() instead? Otherwise remainder may behave strangely
Self {
iter,
partial: Vec::new(),
}
}

/// Returns an iterator that yields all the items that have
/// not been included in any of the arrays. Use this to access the
/// leftover elements if the total number of elements yielded by
/// the original iterator is not a multiple of `N`.
///
/// If `self` is not exhausted (i.e. `next()` has not returned `None`)
/// then the iterator returned by `remainder()` will also include
/// the elements that *would* have been included in the arrays
/// produced by `next()`.
///
/// ```
/// use itertools::Itertools;
///
/// let mut it = (1..9).arrays();
/// assert_eq!(Some([1, 2, 3]), it.next());
/// assert_eq!(Some([4, 5, 6]), it.next());
/// assert_eq!(None, it.next());
/// itertools::assert_equal(it.remainder(), [7,8]);
///
/// let mut it = (1..9).arrays();
/// assert_eq!(Some([1, 2, 3]), it.next());
/// itertools::assert_equal(it.remainder(), 4..9);
/// ```
pub fn remainder(self) -> impl Iterator<Item = I::Item> {
self.partial.into_iter().chain(self.iter)
}
}

impl<I: Iterator, const N: usize> Iterator for Arrays<I, N> {
type Item = [I::Item; N];

fn next(&mut self) -> Option<Self::Item> {
if !self.partial.is_empty() {
return None;

Check warning on line 75 in src/arrays.rs

View check run for this annotation

Codecov / codecov/patch

src/arrays.rs#L75

Added line #L75 was not covered by tests
}
let mut builder = ArrayBuilder::new();
for _ in 0..N {
if let Some(item) = self.iter.next() {
builder.push(item);
} else {
break;
}
}
if let Some(array) = builder.take() {
Some(array)
} else {
self.partial = builder.into_vec();
None
}
}

fn size_hint(&self) -> (usize, Option<usize>) {
if N == 0 {
(usize::MAX, None)

Check warning on line 95 in src/arrays.rs

View check run for this annotation

Codecov / codecov/patch

src/arrays.rs#L93-L95

Added lines #L93 - L95 were not covered by tests
} else {
let (lo, hi) = self.iter.size_hint();
(lo / N, hi.map(|hi| hi / N))

Check warning on line 98 in src/arrays.rs

View check run for this annotation

Codecov / codecov/patch

src/arrays.rs#L97-L98

Added lines #L97 - L98 were not covered by tests
}
}

Check warning on line 100 in src/arrays.rs

View check run for this annotation

Codecov / codecov/patch

src/arrays.rs#L100

Added line #L100 was not covered by tests
}

impl<I: ExactSizeIterator, const N: usize> ExactSizeIterator for Arrays<I, N> {}

#[cfg(test)]
mod tests {
use crate::Itertools;

fn exact_size_helper(it: impl Iterator) {
let (lo, hi) = it.size_hint();
let count = it.count();
assert_eq!(lo, count);
assert_eq!(hi, Some(count));
}

#[test]
fn exact_size_not_divisible() {
let it = (0..10).array_chunks::<3>();
exact_size_helper(it);
}

#[test]
fn exact_size_after_next() {
let mut it = (0..10).array_chunks::<3>();
_ = it.next();
exact_size_helper(it);
}

#[test]
fn exact_size_divisible() {
let it = (0..10).array_chunks::<5>();
exact_size_helper(it);
}
}
53 changes: 53 additions & 0 deletions src/lib.rs
Original file line number Diff line number Diff line change
Expand Up @@ -97,6 +97,8 @@ pub mod structs {
TakeWhileRef, TupleCombinations, Update, WhileSome,
};
#[cfg(feature = "use_alloc")]
pub use crate::arrays::Arrays;
#[cfg(feature = "use_alloc")]
pub use crate::combinations::{ArrayCombinations, Combinations};
#[cfg(feature = "use_alloc")]
pub use crate::combinations_with_replacement::CombinationsWithReplacement;
Expand Down Expand Up @@ -171,6 +173,8 @@ pub use crate::unziptuple::{multiunzip, MultiUnzip};
pub use crate::with_position::Position;
pub use crate::ziptuple::multizip;
mod adaptors;
#[cfg(feature = "use_alloc")]
mod arrays;
mod either_or_both;
pub use crate::either_or_both::EitherOrBoth;
#[doc(hidden)]
Expand Down Expand Up @@ -741,6 +745,55 @@ pub trait Itertools: Iterator {
groupbylazy::new_chunks(self, size)
}

/// Return an iterator that groups the items in arrays of const generic size `N`.
///
/// Use the method `.remainder()` to access leftover items in case
/// the number of items yielded by the original iterator is not a multiple of `N`.
///
/// `N == 0` is a compile-time (but post-monomorphization) error.
///
/// See also the method [`.next_array()`](Itertools::next_array).
///
/// ```rust
/// use itertools::Itertools;
/// let mut v = Vec::new();
/// for [a, b] in (1..5).arrays() {
/// v.push([a, b]);
/// }
/// assert_eq!(v, vec![[1, 2], [3, 4]]);
///
/// let mut it = (1..9).arrays();
/// assert_eq!(Some([1, 2, 3]), it.next());
/// assert_eq!(Some([4, 5, 6]), it.next());
/// assert_eq!(None, it.next());
/// itertools::assert_equal(it.remainder(), [7,8]);
///
/// // this requires a type hint
/// let it = (1..7).arrays::<3>();
/// itertools::assert_equal(it, vec![[1, 2, 3], [4, 5, 6]]);
///
/// // you can also specify the complete type
/// use itertools::Arrays;
/// use std::ops::Range;
///
/// let it: Arrays<Range<u32>, 3> = (1..7).arrays();
/// itertools::assert_equal(it, vec![[1, 2, 3], [4, 5, 6]]);
/// ```
///
/// ```compile_fail
/// use itertools::Itertools;
///
/// let mut it = (1..5).arrays::<0>();
/// assert_eq!(Some([]), it.next());
/// ```
#[cfg(feature = "use_alloc")]
fn arrays<const N: usize>(self) -> Arrays<Self, N>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMO array_chunks is the better name, since it differentiates it from array_windows.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To add context, I was trying to avoid triggering the unstable-name-collisions lint, because of rust-lang/rust#100450. Should I change it back to array_chunks?

where
Self: Sized,
{
Arrays::new(self)
}

/// Return an iterator over all contiguous windows producing tuples of
/// a specific size (up to 12).
///
Expand Down
21 changes: 20 additions & 1 deletion src/next_array.rs
Original file line number Diff line number Diff line change
@@ -1,7 +1,9 @@
#[cfg(feature = "use_alloc")]
use alloc::vec::Vec;
use core::mem::{self, MaybeUninit};

/// An array of at most `N` elements.
struct ArrayBuilder<T, const N: usize> {
pub(crate) struct ArrayBuilder<T, const N: usize> {
/// The (possibly uninitialized) elements of the `ArrayBuilder`.
///
/// # Safety
Expand Down Expand Up @@ -86,6 +88,23 @@ impl<T, const N: usize> ArrayBuilder<T, N> {
None
}
}

#[cfg(feature = "use_alloc")]
pub(crate) fn into_vec(mut self) -> Vec<T> {
let len = self.len;
// SAFETY: Decreasing the value of `self.len` cannot violate the
// safety invariant on `self.arr`.
self.len = 0;
(0..len)
.map(|i| {
// SAFETY: Since `self.len` is 0, `self.arr` may safely contain
// uninitialized elements.
let item = mem::replace(&mut self.arr[i], MaybeUninit::uninit());
// SAFETY: we know that item is valid since i < len
unsafe { item.assume_init() }
})
.collect()
}
}

impl<T, const N: usize> AsMut<[T]> for ArrayBuilder<T, N> {
Expand Down