We are going to investigate how to effectively parallelize Worst-Case Optimal Join (WCOJ) algorithms on multicore machines. We will try to optimize join performance by exploring full multi-variable partitioning, eager indexing (CoCo), and evaluating performance bottlenecks such as load skew and redundant computation.
https://supergary10.github.io/cmu-15618-final-project/
Worst-Case Optimal Join (WCOJ) algorithms evaluate multiway joins by iterating over join variables one at a time and performing multi-relation intersections. These algorithms are particularly effective for cyclic queries (like triangle queries), where traditional binary joins underperform.
Despite their theoretical efficiency, parallel WCOJ implementations remain difficult in practice. Most systems only parallelize the top-level variable. Although simple, this approach suffers from:
Our project explores these challenges empirically by starting from a minimal baseline and optimizing incrementally.
There are several reasons this project is challenging from a parallel systems perspective:
It is likely that there is no a possible optimal solution that can solve all the challenges. We will aim to learn how fine-grained control over workload distribution and index layout can improve parallel join performance.
We will use the following resources to complete our project:
We will use GHC machines for development and testing. As it has a i7-9700 CPU with 8 cores and 16GB memory, it is suitable for our project. If this is not enough, we will use the PSC cluster.
We are familiar with C++, and we have experienced a lot of OpenMP in Assignment 3. CUDA utilized SIMD, which is not suitable for our project. The performance of MPI is not as good as OpenMP from the results of Assignment 3 and 4.
Date | Task |
---|---|
4.2 | Finish the basic sequential WCOJ algorithm |
4.9 | Implement the simple parallel WCOJ algorithm |
4.10 | Performance evaluation and optimization analysis |
4.14 | Implement the advanced parallel WCOJ algorithm |
4.15 | Performance evaluation and optimization analysis |
4.26 | Finish all the optimizations and evaluation |
4.28 | Final report writing |
4.29 | Poster session |
Wu, J., & Suciu, D. (2025). HoneyComb: A Parallel Worst-Case Optimal Join on Multicores. arXiv preprint arXiv:2502.06715.
Ngo, H. Q., Porat, E., RĂ©, C., & Rudra, A. (2018). Worst-case optimal join algorithms. Journal of the ACM (JACM), 65(3), 1-40.
SNAP Datasets: https://snap.stanford.edu/data/