MSE Research Project Database

Clustering, covering, and computational geometry in the MPC model

Project Leader: Anthony Wirth
Staff: Junhao Gan
Primary Contact: Anthony Wirth (
Disciplines: Computing and Information Systems

The Massively Parallel Communication (MPC) model is a popular parallel computation paradigm in recent years. It has been implemented by several distributed computing frameworks such as MapReduce and Pregrel. In this project, we will explore a number of interesting problems in the MPC model.

Recent developments by Wirth and colleagues have shown how to implement the greedy approach for Set Cover practically on external memory, with highly efficient performance, and that this technique is asymptotically optimal for multi-pass streams. With other colleagues, Wirth showed how to instrument the pivoting technique for Correlation Clustering in such streams. In this project we will explore how to expand this to the MPC model.

Recently, Gan and clleagues have achieved some exicting results on several problems of density based clustering problems by computational geometry techniques. In this project, we will study how to develop new computational geometry techniques in the MPC model to solve problems in the data mining area.