Chained hash pipelining in array hashing #58252

adienes · 2025-04-28T16:36:26Z

the proposed switch in #57509 from `3h - hash_finalizer(x)` to `hash_finalizer(3h - x)` should increase the hash quality of chained hashes, as the expanded expression goes from something like `sum((-3)^k * hash(x) for k in ...)` to a non-simplifiable composition
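To make the "expanded expression" point concrete, here is a minimal sketch of the two chaining forms. It assumes a toy finalizer: `toy_finalizer`, `chain_old`, `chain_new`, and `closed_form_old` are hypothetical names for illustration, not the functions in Base or in #57509. For the old recurrence exactly as written above, the chain unrolls into a weighted sum of per-element finalizer outputs (here with powers of 3 as weights, modulo 2^64). The new recurrence has no such closed form, because every step passes the running state through the finalizer.

```julia
# Toy stand-in for the real finalizer (an assumption, NOT the one used in Base):
# a multiply followed by an xorshift, chosen only because it is cheap and nonlinear.
function toy_finalizer(x::UInt64)
    y = x * 0x9e3779b97f4a7c15   # UInt64 arithmetic wraps modulo 2^64
    return y ⊻ (y >> 32)
end

# Old form: h = 3h - f(x). The finalizer never sees the running state h.
function chain_old(xs, h::UInt64)
    for x in xs
        h = 3h - toy_finalizer(x)
    end
    return h
end

# New form: h = f(3h - x). The running state goes through the finalizer each step.
function chain_new(xs, h::UInt64)
    for x in xs
        h = toy_finalizer(3h - x)
    end
    return h
end

# Closed form of the old chain: 3^n * h0 - sum over k of 3^(n-k) * f(x_k), mod 2^64.
function closed_form_old(xs, h::UInt64)
    n = length(xs)
    acc = UInt64(3)^n * h
    for (k, x) in enumerate(xs)
        acc -= UInt64(3)^(n - k) * toy_finalizer(x)
    end
    return acc
end

xs = UInt64[0x01, 0x02, 0x03, 0x04, 0x05]
h0 = 0x0123456789abcdef

# The old chain really is just a weighted sum of independent per-element terms.
@assert chain_old(xs, h0) == closed_form_old(xs, h0)
```

The same structure that makes the old form weaker is what made it fast: each `toy_finalizer(x)` term is independent of `h`, so nothing forces the terms to be computed in sequence.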

this does have the unfortunate side effect of making long chains of hashes a bit slower: there is more data dependency, so the CPU cannot start on the next element's hash before combining the previous one (I think; I'm not particularly an expert on this low-level stuff). as far as I know this only really impacts `AbstractArray`

so, I've implemented a proposal that does some unrolling / pipelining manually to recover AbstractArray hashing performance. in fact, it's quite a lot faster now for most lengths. I tuned the thresholds (8 accumulators, certain length breakpoints) by hand on my own machine.
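As a rough illustration of the idea (not the code in this PR), the sketch below keeps several independent accumulators ("lanes") and only combines them at the end. Everything in it is an assumption made for the example: `toy_finalizer`, `chain_serial`, `chain_pipelined`, `init_lanes`, and `step_lanes` are hypothetical names, the finalizer is a toy, the per-lane seeding is arbitrary, and the `2 * N` length breakpoint is a placeholder rather than one of the hand-tuned thresholds. Only the default of 8 lanes is taken from the description above.

```julia
# Toy stand-in for the real finalizer (an assumption, NOT the one used in Base).
function toy_finalizer(x::UInt64)
    y = x * 0x9e3779b97f4a7c15   # UInt64 arithmetic wraps modulo 2^64
    return y ⊻ (y >> 32)
end

# Serial chain: each step has to wait for the previous value of h.
function chain_serial(xs::Vector{UInt64}, h::UInt64)
    for x in xs
        h = toy_finalizer(3h - x)
    end
    return h
end

# Give every lane a distinct starting value (hypothetical seeding scheme).
init_lanes(seed::UInt64, ::Val{N}) where {N} = ntuple(j -> seed + UInt64(j), Val(N))

# Advance all N lanes by one element each. Lane j depends only on its own
# previous value, so the N finalizer calls do not depend on one another.
@inline step_lanes(lanes::NTuple{N,UInt64}, xs, base) where {N} =
    ntuple(j -> toy_finalizer(3 * lanes[j] - xs[base + j]), Val(N))

function chain_pipelined(xs::Vector{UInt64}, seed::UInt64, ::Val{N}) where {N}
    n = length(xs)
    # Hypothetical breakpoint: short inputs use the plain serial chain.
    n < 2 * N && return chain_serial(xs, seed)

    lanes = init_lanes(seed, Val(N))
    i = 1
    while i + N - 1 <= n              # consume full blocks of N elements
        lanes = step_lanes(lanes, xs, i - 1)
        i += N
    end

    h = seed
    for j in 1:N                      # fold the lanes into a single value
        h = toy_finalizer(3h - lanes[j])
    end
    while i <= n                      # leftover elements that did not fill a block
        h = toy_finalizer(3h - xs[i])
        i += 1
    end
    return h
end

# Default to 8 lanes, matching the accumulator count mentioned above.
chain_pipelined(xs::Vector{UInt64}, seed::UInt64) = chain_pipelined(xs, seed, Val(8))

xs = UInt64.(1:37)
h0 = 0x0123456789abcdef
chain_pipelined(xs, h0)
```

In this sketch the pipelined result differs from `chain_serial` on the same input once the input is long enough, so it is a different hash function rather than a faster way to compute the same one; a scheme like this only works if every array of a given length goes through the same path.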

oscardssmith · 2025-04-28T20:00:48Z

show performance benchmarks and then lgtm.

adienes · 2025-04-28T20:28:22Z

see #57509 (comment): the `vec` column contains the timings, and the data for this PR is under `:commit == "pipeline"`. sorry, I should have been clearer in that comment

graphically: (benchmark plot not preserved)

note that this cannot merge before #57509, which is also waiting on #58053

adienes · 2025-05-11T17:50:27Z

CI failure unrelated

adienes · 2025-05-15T16:12:33Z

should this method be used for big Tuples as well?

oscardssmith · 2025-05-15T17:39:31Z

how does the Tuple perf look? if it only helps for 100 or more, I wouldn't bother. if it's useful in the 10-100 range, I think we should

adienes · 2025-05-16T21:36:42Z

eh, I don't think it helps. I guess tuples should mostly be hashed at compile time anyway

adienes added 3 commits April 27, 2025 15:08

pipelining for chained hash

28cd8fb

better impl

d6c90bd

move to multidimensional

8525bcc

adienes added the performance (Must go faster) and hashing labels Apr 28, 2025

adienes mentioned this pull request Apr 28, 2025

use rapidhash #57509

Merged

whitespace

1fb79bf

adienes added 4 commits May 10, 2025 10:06

Merge branch 'master' into chained_hash_pipelining

2de255c

I'm the CEO of hating 32-bit systems

ef6a236

Merge branch 'master' into chained_hash_pipelining

7930179

fix

4f001b9

adienes added the status: waiting for PR reviewer label May 17, 2025
