Querying a graph that's larger than available RAM

Graph can’t be held in memory

I’m curious what happens once a graph exceeds the available RAM. Persistence is guaranteed through snapshots and the WAL, but at some point we will likely hit a limit on how much of the graph can be held in memory.

Is Memgraph aware of how much of the graph is currently in memory? If so, does it have strategies to offload less-used paths from memory? And if that’s the case, how would it guarantee that a query returns a complete result (and has not missed offloaded segments of the graph)?

If file-storage-based queries are part of the strategy, how much of a performance hit is to be expected?

Best
Eduard

Hi @eduard!

First, welcome to Memgraph’s forum. Thank you for the question, it’s an amazing one :smiley:

Memgraph stores the entire graph in RAM. In other words, there has to be enough memory to hold the whole dataset. That was an early strategic decision because we didn’t want to sacrifice performance, and at this point there is no other option. Memgraph is durable thanks to snapshots and WALs, and there are memory limits in place: once the limit is reached, Memgraph stops accepting writes. A side note, but an essential one: graph algorithms usually traverse the entire dataset multiple times, so you have to bring the data into memory either way (Memgraph is heavily optimized for that case).
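
Since the whole dataset has to fit in RAM, a rough capacity check before loading can help. Below is a minimal Python sketch of that arithmetic. The per-vertex and per-edge byte sizes and the 30% headroom are placeholder assumptions, not figures published by Memgraph; `GraphSize` and `fits_in_memory` are hypothetical names used only for illustration. Measure the real footprint on a sample of your own data before relying on the result.

```python
from dataclasses import dataclass


@dataclass
class GraphSize:
    """Back-of-envelope description of a graph's in-memory footprint."""
    vertices: int
    edges: int
    bytes_per_vertex: int  # ASSUMPTION: placeholder, measure on your own data
    bytes_per_edge: int    # ASSUMPTION: placeholder, measure on your own data

    def estimated_bytes(self) -> int:
        # Total footprint = vertices * per-vertex cost + edges * per-edge cost.
        return (self.vertices * self.bytes_per_vertex
                + self.edges * self.bytes_per_edge)


def fits_in_memory(graph: GraphSize, available_bytes: int,
                   headroom: float = 0.3) -> bool:
    """Return True if the estimate fits while keeping `headroom` of RAM free.

    The headroom (ASSUMPTION: 30% by default) is reserved for query
    execution and other overhead that the estimate does not cover.
    """
    usable = available_bytes * (1.0 - headroom)
    return graph.estimated_bytes() <= usable


if __name__ == "__main__":
    graph = GraphSize(vertices=10_000_000, edges=50_000_000,
                      bytes_per_vertex=200, bytes_per_edge=150)
    gib = 1024 ** 3
    print("estimated GiB:", round(graph.estimated_bytes() / gib, 2))
    print("fits in 16 GiB:", fits_in_memory(graph, 16 * gib))
    print("fits in 8 GiB:", fits_in_memory(graph, 8 * gib))
```

If the estimate lands close to the configured memory limit, expect writes to be rejected as the dataset grows, so it is worth planning for extra margin.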

There are drawbacks to the above approach. In the great majority of cases, you can find enough RAM, but it’s not the cheapest option.

We plan to improve Memgraph’s horizontal scaling capabilities and to support disk as primary storage (storing the whole graph on a less expensive medium). Both are quite early on our roadmap. Horizontal scaling is a bit more critical at the moment since Memgraph is a platform for dealing with streaming graph data. Again, we are very aware of the “cold storage implications”, and that’s also high on the priority list.

All that being said, could you tell us more about your use case, requirements, and any other relevant info? Memgraph is still early in its journey, and I’m eager to help! Cheers!