More memory-efficient parallel encoding #38

Closed
@RunDevelopment

Description

Right now, parallel encoding is implemented by collecting the encoded fragments into one large `Vec<Vec<u8>>`. This means that the entire encoded surface is held in memory at once, which is unnecessary.
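
For reference, the current approach amounts to something like this (a sketch with hypothetical names; `raw.to_vec()` stands in for the real per-fragment encoder):

```rust
use rayon::prelude::*;

/// Sketch of the current approach: the entire encoded surface is
/// collected into one `Vec<Vec<u8>>` before anything is written out.
fn encode_all(fragments: &[&[u8]]) -> Vec<Vec<u8>> {
    fragments
        .par_iter()
        .map(|raw| raw.to_vec()) // stand-in for the real encoder
        .collect()
}
```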

Let n be the number of CPU cores; then we only need to store n encoded fragments in memory at a time. The basic idea is to encode fragments in parallel, but commit them to disk in order, like this:

```
4 CPU cores

Core 1: <- encode frag 1, commit -> <- encode frag 5, commit -> ...
Core 2: <- encode frag 2, (wait) commit -> <- encode frag 6, commit -> ...
Core 3: <- encode frag 3, (    wait   ) commit -> <- encode frag 7, commit -> ...
Core 4: <- encode frag 4, (       wait       ) commit -> <- encode frag 8, commit -> ...
time -->
```

Commits go to a single writer, so each worker locks a mutex and waits for its turn to commit.
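
A minimal sketch of this scheme in Rust, assuming a rayon parallel iterator and using a `Mutex` plus `Condvar` to serialize commits in index order (all names are hypothetical; `raw.to_vec()` again stands in for the real encoder):

```rust
use std::io::Write;
use std::sync::{Condvar, Mutex};

use rayon::prelude::*;

/// Encode fragments in parallel, but commit them to `writer` strictly
/// in order. At most one encoded fragment per worker thread is alive
/// at any time.
fn encode_in_order<W: Write + Send>(fragments: &[&[u8]], writer: W) {
    // Shared state: the single writer plus the index of the fragment
    // that is allowed to commit next.
    let state = Mutex::new((0usize, writer));
    let my_turn = Condvar::new();

    fragments.par_iter().enumerate().for_each(|(i, raw)| {
        // Encoding runs fully in parallel.
        let encoded: Vec<u8> = raw.to_vec(); // stand-in for the real encoder

        // Commit phase: block until all earlier fragments have committed.
        let mut guard = state.lock().unwrap();
        while guard.0 != i {
            guard = my_turn.wait(guard).unwrap();
        }
        guard.1.write_all(&encoded).expect("commit failed");
        guard.0 += 1;
        drop(guard);
        my_turn.notify_all();
        // `encoded` goes out of scope here, freeing its memory.
    });
}
```

The `expect` is only for brevity; real code would propagate the `io::Result` out of the pool. Note that waiting on the condvar blocks a rayon worker thread, which is exactly the caveat described next.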

This also means that if `frag_time < commit_time * (n - 1)` (where `frag_time` is the time it takes to encode one fragment and `commit_time` is the time it takes to write one fragment to disk), workers will finish encoding before the writer has drained the earlier commits, and we will unnecessarily block rayon's thread pool. For example, with n = 4 cores, a worker blocks whenever encoding a fragment takes less than 3 times as long as committing one. This only affects other work sharing the thread pool, though; it doesn't make parallel encoding itself any slower.

Note that this approach is not faster than the current one; both are equally fast (ignoring synchronization overhead). The difference is that this approach uses less memory, which will make it possible to encode certain large images in parallel that the current approach cannot (it crashes with OOM).
