We present a fast collision culling algorithm for inter- and intra-object collision detection among complex models using graphics hardware. Our algorithm uses visibility queries on the GPU to eliminate geometric primitives that are not in close proximity and computes a potentially colliding set (PCS) of primitives. The algorithm requires no precomputation and proceeds in three stages: object-level PCS computation, sub-object-level PCS computation, and exact collision detection. We extend our PCS computation to detect intra-object collisions (self-collisions) within complex models. Furthermore, we describe a novel visibility-based classification scheme that reduces both the size of the potentially colliding sets of objects and primitives and the number of visibility queries, which further improves performance and culling efficiency. We have implemented our algorithm on a PC with an NVIDIA GeForce FX 6800 Ultra graphics card and applied it to three complex simulations, each consisting of objects with tens of thousands of triangles. In practice, we are able to compute all self-collisions for cloth simulation, up to image-space precision, at interactive rates.
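To make the idea of visibility-based culling concrete, the following is a minimal CPU-side sketch of an object-level PCS computation. It is not the implementation described here: the box representation, the single orthographic view, the two-pass ordering, and the names `rasterize`, `fully_visible_pass`, and `potentially_colliding_set` are all assumptions chosen for illustration. A per-pixel depth array stands in for the GPU depth buffer, and the `any(...)` check stands in for a hardware visibility query. An object is culled only if nothing rendered before it lies in front of it in both a forward and a reverse pass.

```python
import math

def rasterize(obj, width):
    """Hypothetical rasterizer: obj is a box (x0, x1, z_near, z_far) with
    integer pixel bounds; returns {pixel: (near_depth, far_depth)}."""
    x0, x1, z_near, z_far = obj
    return {px: (z_near, z_far) for px in range(max(0, x0), min(width, x1))}

def fully_visible_pass(objects, order, width):
    """Render objects in the given order and report which ones are fully
    visible with respect to everything rendered before them."""
    depth = [math.inf] * width  # nearest depth seen so far, per pixel
    fully_visible = set()
    for i in order:
        frags = rasterize(objects[i], width)
        # Stand-in for a visibility query: does any fragment of this object
        # lie at or behind previously rendered geometry?
        occluded = any(far >= depth[px] for px, (_, far) in frags.items())
        if not occluded:
            fully_visible.add(i)
        # Now render the object itself into the depth buffer.
        for px, (near, _) in frags.items():
            depth[px] = min(depth[px], near)
    return fully_visible

def potentially_colliding_set(objects, width=16):
    """Keep every object that is not fully visible in both passes."""
    n = len(objects)
    forward = fully_visible_pass(objects, range(n), width)
    reverse = fully_visible_pass(objects, range(n - 1, -1, -1), width)
    return [i for i in range(n) if not (i in forward and i in reverse)]

# Boxes 0 and 1 overlap in both screen space and depth; box 2 is far away.
boxes = [(0, 4, 1.0, 2.0), (2, 6, 1.5, 3.0), (10, 12, 1.0, 2.0)]
pcs = potentially_colliding_set(boxes)
```

In this sketch the result is conservative: an object that is merely hidden behind another along the chosen view stays in the set, so it never discards a true collision but may retain non-colliding objects for a later, exact test.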