The symmetric information bottleneck (SIB), an extension of the more familiar information bottleneck, is a dimensionality-reduction technique that simultaneously compresses two random variables to preserve information between their compressed versions. We introduce the generalized symmetric information bottleneck (GSIB), which explores different functional forms of the cost of such simultaneous reduction. We then explore the data set size requirements of such simultaneous compression. We do this by deriving bounds and root-mean-squared estimates of statistical fluctuations of the involved loss functions. We show that in typical situations, the simultaneous GSIB compression requires qualitatively less data to achieve the same errors compared to compressing variables one at a time. We suggest that this is an example of a more general principle that simultaneous compression is more data efficient than independent compression of each of the input variables.

You do not currently have access to this content.