More memory-efficient parallel encoding #38

Closed
@RunDevelopment

Description

Right now, parallel encoding is implemented by collecting the encoded fragments into one large `Vec<Vec<u8>>`. This means that the entire encoded surface is held in memory at once, which is unnecessary.
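
For reference, the current approach amounts to something like this (a sketch with hypothetical names; `raw.to_vec()` stands in for the real per-fragment encoder):

```rust
use rayon::prelude::*;

/// Sketch of the current approach: the entire encoded surface is
/// collected into one `Vec<Vec<u8>>` before anything is written out.
fn encode_all(fragments: &[&[u8]]) -> Vec<Vec<u8>> {
    fragments
        .par_iter()
        .map(|raw| raw.to_vec()) // stand-in for the real encoder
        .collect()
}
```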

Let n be the number of CPU cores; then we only need to store n encoded fragments in memory at a time. The basic idea is to encode fragments in parallel, but commit them to disk in order, like this:

```
4 CPU cores

Core 1: <- encode frag 1, commit -> <- encode frag 5, commit -> ...
Core 2: <- encode frag 2, (wait) commit -> <- encode frag 6, commit -> ...
Core 3: <- encode frag 3, (    wait   ) commit -> <- encode frag 7, commit -> ...
Core 4: <- encode frag 4, (       wait       ) commit -> <- encode frag 8, commit -> ...
time -->
```

Commits go to a single writer, so each worker locks a mutex and waits for its turn to commit.
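
A minimal sketch of this scheme in Rust, assuming a rayon parallel iterator and using a `Mutex` plus `Condvar` to serialize commits in index order (all names are hypothetical; `raw.to_vec()` again stands in for the real encoder):

```rust
use std::io::Write;
use std::sync::{Condvar, Mutex};

use rayon::prelude::*;

/// Encode fragments in parallel, but commit them to `writer` strictly
/// in order. At most one encoded fragment per worker thread is alive
/// at any time.
fn encode_in_order<W: Write + Send>(fragments: &[&[u8]], writer: W) {
    // Shared state: the single writer plus the index of the fragment
    // that is allowed to commit next.
    let state = Mutex::new((0usize, writer));
    let my_turn = Condvar::new();

    fragments.par_iter().enumerate().for_each(|(i, raw)| {
        // Encoding runs fully in parallel.
        let encoded: Vec<u8> = raw.to_vec(); // stand-in for the real encoder

        // Commit phase: block until all earlier fragments have committed.
        let mut guard = state.lock().unwrap();
        while guard.0 != i {
            guard = my_turn.wait(guard).unwrap();
        }
        guard.1.write_all(&encoded).expect("commit failed");
        guard.0 += 1;
        drop(guard);
        my_turn.notify_all();
        // `encoded` goes out of scope here, freeing its memory.
    });
}
```

The `expect` is only for brevity; real code would propagate the `io::Result` out of the pool. Note that waiting on the condvar blocks a rayon worker thread, which is exactly the caveat described next.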

This also means that if `frag_time < commit_time * (n - 1)` (where `frag_time` is the time it takes to encode one fragment and `commit_time` is the time it takes to write one fragment to disk), workers will finish encoding before the writer has drained the earlier commits, and we will unnecessarily block rayon's thread pool. For example, with n = 4 cores, a worker blocks whenever encoding a fragment takes less than 3 times as long as committing one. This only affects other work sharing the thread pool, though; it doesn't make parallel encoding itself any slower.

Note that this approach is not faster than the current one; both are equally fast (ignoring synchronization overhead). The difference is that this approach uses less memory, which will make it possible to encode certain large images in parallel that the current approach cannot (it crashes with OOM).
