Gemini: A Computation-Centric Distributed Graph Processing System
OSDI, 2016
Summary
This paper introduces Gemini, a distributed graph processing system that builds scalability (distributed) on top of efficiency (shared-memory). The paper first analyzes existing shared-memory and distributed graph systems, finds that computation remains the bottleneck, and argues that the design should pay close attention to (local) computation overhead. Experiments on an 8-node cluster show that Gemini scales comparably to existing graph processing systems while running much faster.
Details
- Dual update propagation model (see the first sketch after this list)
  - Sparse (push) mode: the master vertex sends its update to its mirrors, which then update their local neighbors.
  - Dense (pull) mode: each mirror aggregates updates from its local neighbors and sends the combined result to the master, lowering the computation overhead on the master node.
- Partitioning: chunk-based. Partitioning the vertex set into contiguous chunks preserves locality.
- Edges: CSR/CSC formats for outgoing/incoming edges. In sparse mode, CSR is augmented with an existence bitmap (Bitmap Assisted CSR); in dense mode, Doubly Compressed Sparse Column is used.
  - Goal: reduce memory accesses when processing edges.
- Locality-aware chunking (see the second sketch below).
  - With power-law degree distributions, chunking by vertex count alone exhibits poor load balance, so chunks are balanced on a hybrid weight of owned vertices and edges.
- HPC optimizations: NUMA-aware sub-partitioning, fine-grained work-stealing, and overlapping computation with communication.
- Experiments
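The mode switch at the heart of the dual update propagation model is easiest to see in code. Below is a minimal single-node sketch of one BFS-style frontier step that pushes along outgoing edges when the frontier is sparse and pulls along incoming edges when it is dense; the `Graph` struct, the 1/20 density threshold, and all names here are illustrative assumptions, not Gemini's actual API.

```cpp
#include <cstdint>
#include <vector>

// Toy CSR/CSC graph; a stand-in for Gemini's distributed representation.
struct Graph {
    int n;                                   // number of vertices
    std::vector<int> out_offsets, out_nbrs;  // CSR: outgoing edges (push)
    std::vector<int> in_offsets, in_nbrs;    // CSC: incoming edges (pull)
};

// One BFS frontier expansion, switching between sparse (push) and
// dense (pull) mode based on how many edges leave the active frontier.
std::vector<bool> advance(const Graph& g, const std::vector<bool>& frontier,
                          std::vector<int>& parent) {
    std::vector<bool> next(g.n, false);
    int64_t active_edges = 0;
    for (int v = 0; v < g.n; ++v)
        if (frontier[v]) active_edges += g.out_offsets[v + 1] - g.out_offsets[v];
    int64_t total_edges = g.out_offsets[g.n];

    if (active_edges * 20 < total_edges) {
        // Sparse mode: each active vertex pushes its update to out-neighbors.
        for (int u = 0; u < g.n; ++u) {
            if (!frontier[u]) continue;
            for (int i = g.out_offsets[u]; i < g.out_offsets[u + 1]; ++i) {
                int v = g.out_nbrs[i];
                if (parent[v] < 0) { parent[v] = u; next[v] = true; }
            }
        }
    } else {
        // Dense mode: each unvisited vertex pulls from in-neighbors and can
        // stop at the first active one, which cuts work on dense frontiers.
        for (int v = 0; v < g.n; ++v) {
            if (parent[v] >= 0) continue;
            for (int i = g.in_offsets[v]; i < g.in_offsets[v + 1]; ++i) {
                if (frontier[g.in_nbrs[i]]) {
                    parent[v] = g.in_nbrs[i];
                    next[v] = true;
                    break;
                }
            }
        }
    }
    return next;
}
```

In the distributed setting, the push side corresponds to masters signaling their mirrors, and the pull side to mirrors aggregating locally before messaging the master.

Locality-aware chunking can likewise be sketched in a few lines: cut the contiguous vertex ID range into p chunks whose hybrid weight alpha * |V_i| + |E_i| is roughly balanced. The paper derives a concrete value for the vertex weight alpha; it is left as a parameter in this hypothetical helper.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical helper: split the contiguous ID range [0, n) into p chunks
// so that each chunk's hybrid weight alpha * |V_i| + |E_i| is roughly equal.
// Contiguity keeps neighboring IDs co-located (locality); the edge term
// compensates for power-law degree skew.
std::vector<int> chunk_boundaries(const std::vector<int>& degree,
                                  int p, double alpha) {
    int n = (int)degree.size();
    std::vector<double> prefix(n + 1, 0.0);  // prefix[v] = weight of [0, v)
    for (int v = 0; v < n; ++v)
        prefix[v + 1] = prefix[v] + alpha + degree[v];

    std::vector<int> bounds(p + 1, 0);
    bounds[p] = n;
    for (int i = 1; i < p; ++i) {
        // First vertex index whose prefix weight reaches the i-th share.
        double target = prefix[n] * i / p;
        bounds[i] = (int)(std::lower_bound(prefix.begin(), prefix.end(),
                                           target) - prefix.begin());
    }
    return bounds;  // chunk i owns vertices [bounds[i], bounds[i+1])
}
```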
Strength
- Greatly reduces computation time while preserving scalability.
- Locality preserved via chunk-based partitioning.
Weakness
- What is the actual effect of locality-preserving chunking? It is not analyzed in detail in the paper.