Description
I am using a machine with 512GB of RAM and am running out of memory when loading a 1BN node / 2BN edge graph. Each node has 3 properties and each edge has 4 properties. Is this an expected amount of memory consumption for a graph of this scale? Is there anything I can do to minimize the amount of memory consumed during the load?
Activity
jeffreylovitz commented on Jul 8, 2021
Hi @AndrewHannigan,
There's no simple rule for expected memory consumption; factors like the number of labels and relationship types can change it significantly. I would not be surprised, however, if your graph consumes at least a few hundred GB once loaded.
By default, the bulk loader will only buffer 1GB of changes at a time, which is negligible compared to the amount used here. It also maintains a dictionary of node IDs for resolving edge endpoints, which can get rather large.
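As a rough illustration of why that dictionary matters at this scale, here is a back-of-envelope sketch (not a measurement of the loader's internals; the ID length and per-entry overhead figures are assumptions) of what a Python dict holding one billion string node IDs could plausibly cost:

```python
# Back-of-envelope estimate of a node-ID dictionary with 1BN entries.
# The sample ID and the per-entry dict overhead below are assumptions,
# not values taken from the bulk loader itself.
import sys

node_count = 1_000_000_000          # 1BN nodes, as in the issue
sample_id = "node_123456789"        # hypothetical 14-character node ID

string_bytes = sys.getsizeof(sample_id)   # size of one small str object (~63 bytes)
dict_overhead = 80                        # assumed average per-entry hash-table overhead

estimated_gb = node_count * (string_bytes + dict_overhead) / 1e9
print(f"~{estimated_gb:.0f} GB just for the node ID dictionary")
```

Even with conservative assumptions, this lands in the low hundreds of GB, which is why the loader's own footprint can be significant independently of the graph stored in Redis.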
To determine whether this is a bulk loader issue or just too large a graph for your machine, I would try running the process again while monitoring memory usage with `htop` or similar. If the majority of memory is used by `redis-server`, then there is nothing to do but try to modify the inputs or provision a larger machine. If the bulk loader process is consuming a large quantity of memory, there may be code changes we can develop to ameliorate the cost of loading.
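For example, a small sampling script (a sketch, assuming `psutil` is installed and that the loader runs as a separate Python process on the same host) can log per-process resident memory alongside the load, which makes it easy to see whether `redis-server` or the loader is the one growing:

```python
# Sample the resident set size (RSS) of redis-server and the loader process
# every few seconds while the bulk load runs. The process names below are
# assumptions; adjust them to match what `htop` shows on your machine.
import time
import psutil

def sample_rss(names=("redis-server", "python"), interval=10):
    """Print RSS in GB for every process whose name is in `names`."""
    while True:
        for proc in psutil.process_iter(["name", "memory_info"]):
            if proc.info["name"] in names:
                rss_gb = proc.info["memory_info"].rss / 1e9
                print(f"{proc.info['name']} (pid {proc.pid}): {rss_gb:.1f} GB RSS")
        time.sleep(interval)

if __name__ == "__main__":
    sample_rss()
```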