States in the hypercube interior get pulled into spurious basins of attraction. PGD is shown in green and Multiplicative Weights in orange. The network is initialized at a distance from the center of the simplex (see equation 6.14) and allowed to converge. The plotted accuracy is that of the factorization implied by the converged state. Triangles indicate initializations displaced slightly toward any of the other simplex vertices, which accounts for most directions in the space. These initial states are quickly pulled into a spurious basin of attraction.