Topology-preserving compression faces a trade-off between accuracy and throughput. EXaCTz offers a way to preserve the contour tree and extremum graph without sacrificing speed.
In scientific computing, lossy compression has long been a necessity: data volumes now reach terabytes, and without aggressive compression a pipeline simply does not fit within its storage or network budget. The problem surfaces later, when the compressed data is actually used for analysis. Even under strict pointwise error bounds, topology can be violated: the contour tree and extremum graph become distorted, so downstream conclusions may be wrong. Existing topology-preserving compression approaches address this only partially, and they create a new bottleneck: their throughput sits at MB/s, while modern general-purpose compressors reach GB/s.
EXaCTz solves this problem by reframing it. Instead of explicitly constructing the topology (which is costly and scales poorly), the algorithm imposes a system of constraints on the values of the scalar field. It guarantees preservation of the extremum graph and contour tree through three classes of invariants: consistency of critical points, global order of saddle points, and correctness of merge/split events. The constraints are enforced through iterative edits that move values monotonically and strictly within the specified error bounds. This approach maps well to GPUs and distributed systems because it avoids global dependencies such as integral-path tracing.
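The flavor of "quantize first, then iteratively repair within the error bound" can be sketched on a toy 1-D field. This is an illustrative simplification, not the EXaCTz algorithm: the function name, the reduction to 1-D, and the repair rule (snapping offending points back to their original values, the most conservative edit the bound allows) are all assumptions for the sake of the example.

```python
import numpy as np

def compress_preserving_extrema(f, err):
    """Toy sketch (not EXaCTz): quantize a 1-D scalar field within a
    pointwise error bound `err`, then iteratively repair every grid edge
    whose value order flipped, since a flipped edge creates or destroys
    a local extremum."""
    g = np.round(f / (2 * err)) * (2 * err)   # coarse lossy quantization
    g = np.clip(g, f - err, f + err)          # keep every value inside the bound
    for _ in range(len(f)):
        # Edges where the sign of the neighbor difference disagrees with
        # the original field, i.e. where local topology was damaged.
        bad = np.flatnonzero(np.sign(np.diff(f)) != np.sign(np.diff(g)))
        if bad.size == 0:
            break
        idx = np.unique(np.concatenate([bad, bad + 1]))
        # Conservative in-bound edit: restore the offending points exactly.
        # Each pass permanently fixes at least one point, so the loop
        # terminates in at most len(f) passes.
        g[idx] = f[idx]
    return g
```

Most of the field keeps its cheap quantized values; only the neighborhoods of extrema, where quantization flattened tiny differences, are repaired. The real algorithm makes finer monotone adjustments instead of full restoration, which preserves more of the compression gain.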
The key engineering shift is abandoning contour-tree reconstruction. Instead, a connection between the extremum graph and the merge tree lets topology be controlled through local properties, which reduces complexity and eliminates the main source of latency. For distributed execution, path tracing across node boundaries also becomes unnecessary: the global order of critical points is a sufficient condition for correctness. This cuts inter-process communication and improves scalability.
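The sufficient condition above can be illustrated with a toy 1-D check. Again the names and the reduction to 1-D are assumptions, not the paper's API: the idea is only that verifying "same critical points, same global order of their values" needs sorting and comparison, not any path tracing.

```python
import numpy as np

def critical_indices(f):
    """Interior indices of 1-D local extrema; a toy stand-in for the
    critical points of a full scalar field."""
    left, mid, right = f[:-2], f[1:-1], f[2:]
    mask = ((mid > left) & (mid > right)) | ((mid < left) & (mid < right))
    return np.flatnonzero(mask) + 1

def order_preserved(f, g):
    """Check, in the spirit of the article's sufficient condition, that the
    decompressed field g agrees with f on WHICH points are critical and on
    the GLOBAL ORDER of their values, so merge/split events occur in the
    same sequence. Only a sort is needed, no integral-path tracing."""
    cf, cg = critical_indices(f), critical_indices(g)
    if not np.array_equal(cf, cg):          # same critical point set
        return False
    return np.array_equal(np.argsort(f[cf], kind="stable"),
                          np.argsort(g[cg], kind="stable"))
```

In a distributed setting this shape of check is attractive: each rank can extract and sort its local critical values independently, and establishing the global order is a single merge/reduction rather than a traversal that crosses block boundaries.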
The results show that this compromise works. On a single GPU, EXaCTz reaches a throughput of up to 4.52 GB/s, orders of magnitude faster than previous methods (up to 3285× over prior GPU implementations and 213× over CPU ones). In a distributed configuration the algorithm scales to 128 GPUs with 55.6% parallel efficiency, versus 6.4% for naive parallelization; processing 512 GB takes under 48 seconds, with an aggregate throughput of 32.69 GB/s. The number of iterations is also theoretically bounded: the upper bound is set by the path length in the vulnerability graph, which makes the algorithm's behavior predictable.
For industry, this is a pragmatic shift toward constraint-driven architectures: instead of heavy global computations, the system relies on local checks and convergence guarantees. The approach applies beyond scientific data. Any pipeline where structural integrity must survive lossy compression (feature extraction or simulation pipelines, for example) can benefit from a similar strategy. The main trade-off is the complexity of the correction logic and the need to contain cascading effects of edits, but EXaCTz demonstrates that this can be formalized and bounded.
Information source
arXiv is the largest open preprint repository (operating since 1991 under the auspices of Cornell University), where researchers quickly post working versions of papers. The materials are publicly accessible but do not undergo full peer review, so results should be treated as preliminary and, where possible, checked against updated versions or peer-reviewed journals. arxiv.org