Symbolic Computation Group
David R. Cheriton School of Computer Science
|
|
|
Monday, March 3, 2025, at the University of Waterloo Connections between k-coloring and k-means Enver Aman, New Jersey Abstract: Given a finite set P of points in Euclidean space R^d and a parameter k > 0, the goal of the k-means problem is to identify k "center points" in R^d so that the sum of the square distance between each point in P and its closest center point is as small as possible. The k-means problem is part of a general class of clustering problems, which are widely studied in data science and machine learning. It is well-known that k-means is NP-hard via reductions from structured 3-SAT instances [Dasgupta-Freund 2018]. Furthermore, a recent algorithm [Björkland et al. 2022] solves k-means in time O(2^n). A natural question to ask is whether k-means has an exact algorithm in faster than 2^n time. (Say, k-means in 1.99^n time for each k?) We first present a simpler proof of NP-hardness of the k-means problem by a reduction from graph k-coloring. Similarly, we prove NP-hardness of 2-means by a reduction to a structured max-cut problem. Then, we present a faster-than-2^n algorithm for 2-means by reduction to a weighted triangle problem (inspired by a max-cut reduction to weighted triangle by [Williams 2005]). The runtime of this algorithm, with the current matrix multiplication bounds, runs in 1.72^n time. This is joint work with Karthik C. S. and Sharath Punna.
|
Last modified on Saturday, 21 February 2026, at 20:58 hours.