In this section, we analyze the impact of the encoder/decoder architecture on the generation quality of the considered models. The generation-quality experiment of section 5 is repeated on the MNIST and fMNIST data sets, with the architecture and hyperparameters adapted from Dupont (2018). From Table 5 and Figure 9, we see that the overall FID scores and generation quality improve; however, the relative scores among the models do not change significantly.
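Recall that the FID compares Gaussians fitted to the Inception features of the real and generated images. As a point of reference, below is a minimal numpy/scipy sketch of the underlying Fréchet distance; the Inception feature extraction is omitted, and the helper names are ours:

```python
import numpy as np
from scipy import linalg

def frechet_distance(mu1, sigma1, mu2, sigma2):
    """Frechet distance between the Gaussians N(mu1, sigma1) and N(mu2, sigma2):
    ||mu1 - mu2||^2 + Tr(sigma1 + sigma2 - 2 (sigma1 sigma2)^{1/2})."""
    diff = mu1 - mu2
    covmean, _ = linalg.sqrtm(sigma1 @ sigma2, disp=False)  # matrix square root
    if np.iscomplexobj(covmean):
        covmean = covmean.real  # drop tiny imaginary parts from numerical error
    return diff @ diff + np.trace(sigma1 + sigma2 - 2.0 * covmean)

def fid_from_features(feats_real, feats_gen):
    """feats_*: (N, d) arrays of Inception activations of real/generated images."""
    mu_r, sig_r = feats_real.mean(0), np.cov(feats_real, rowvar=False)
    mu_g, sig_g = feats_gen.mean(0), np.cov(feats_gen, rowvar=False)
    return frechet_distance(mu_r, sig_r, mu_g, sig_g)
```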
Table 5:

FID Scores Computed on 8000 Randomly Generated Images When Trained with the Architecture and Hyperparameters of Dupont (2018).

          St-RKM        VAE           β-VAE         FactorVAE     InfoGAN
MNIST     24.63 (0.22)  36.11 (1.01)  42.81 (2.01)  35.48 (0.07)  45.74 (2.93)
fMNIST    61.44 (1.02)  73.47 (0.73)  75.21 (1.11)  69.73 (1.54)  84.11 (2.58)

Notes: Lower is better; standard deviations are given in parentheses. Architecture and hyperparameters adapted from Dupont (2018).

Table 6:

Diagonalization Scores (see Figure 3).

Models                         dSprites     3DShapes     3D cars
St-RKM-sl (σ = 10⁻³, U)        0.17 (0.05)  0.23 (0.03)  0.21 (0.04)
St-RKM (σ = 10⁻³, U)           0.26 (0.05)  0.30 (0.10)  0.31 (0.09)
St-RKM (σ = 10⁻³, random U)    0.61 (0.02)  0.72 (0.01)  0.69 (0.03)

Notes: Denote $M = \frac{1}{|C|} \sum_{i \in C} U^\top \psi(y_i)\, \psi(y_i)^\top U$, with $y_i = P_U \phi_\theta(x_i)$ (cf. equation 3.6). Then we compute the score as $\lVert M - \operatorname{diag}(M) \rVert_F / \lVert M \rVert_F$, where $\operatorname{diag} : \mathbb{R}^{m \times m} \to \mathbb{R}^{m \times m}$ sets the off-diagonal elements of a matrix to zero. The scores are computed for each model over 10 random seeds and reported as mean (standard deviation). Lower scores indicate better diagonalization.
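For concreteness, a minimal numpy sketch of this score, assuming the projected features $U^\top \psi(y_i)$ for $i \in C$ are stacked row-wise in an array H (the function name is ours):

```python
import numpy as np

def diagonalization_score(H):
    """H: (n, m) array whose i-th row is U^T psi(y_i).

    Returns ||M - diag(M)||_F / ||M||_F for the second-moment matrix
    M = (1/|C|) sum_i U^T psi(y_i) psi(y_i)^T U; lower values mean
    M is closer to diagonal.
    """
    M = (H.T @ H) / H.shape[0]       # empirical second-moment matrix
    off = M - np.diag(np.diag(M))    # zero out the diagonal
    return np.linalg.norm(off, "fro") / np.linalg.norm(M, "fro")
```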

Figure 8:

Samples of a randomly generated batch of images used to compute the FID and SWD scores (see Figure 4).
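SWD here is the sliced Wasserstein distance. A minimal Monte Carlo sketch, assuming two equal-sized sample sets and an illustrative number of random projections (not the exact evaluation protocol used here):

```python
import numpy as np

def sliced_wasserstein(x, y, n_proj=512, seed=None):
    """Monte Carlo estimate of the sliced Wasserstein-2 distance.

    x, y: (n, d) arrays (e.g., flattened image descriptors), same n.
    Each random unit direction reduces the problem to 1D, where the
    optimal coupling is obtained by sorting the projections.
    """
    rng = np.random.default_rng(seed)
    dirs = rng.standard_normal((x.shape[1], n_proj))
    dirs /= np.linalg.norm(dirs, axis=0, keepdims=True)  # unit directions
    px = np.sort(x @ dirs, axis=0)  # sorted 1D projections of x
    py = np.sort(y @ dirs, axis=0)  # sorted 1D projections of y
    return np.sqrt(np.mean((px - py) ** 2))
```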

Figure 9:

Samples of randomly generated images used to compute the FID scores. See Table 5.

Figure 10:

(a) Loss evolution (log plot) during the training of equation A.2 over 1000 epochs with ε = 10⁻⁵, once with the Cayley ADAM optimizer (green curve) and once without (blue curve). (b) Traversals along the principal components when the model is trained with a fixed U, that is, with the objective given by equation A.2 and ε = 10⁻⁵. No feature is clearly isolated along any of the principal components, further indicating that optimizing over U is key to better disentanglement.
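The Cayley ADAM optimizer keeps U on the Stiefel manifold throughout training via a Cayley retraction. For intuition, here is a minimal sketch of a plain (momentum-free) Cayley gradient step in the style of Wen and Yin's feasible method; it illustrates only the retraction, not the full optimizer used here, and the function name is ours:

```python
import numpy as np

def cayley_step(U, G, lr=1e-2):
    """One Cayley-retraction descent step on the Stiefel manifold.

    U: (d, m) matrix with orthonormal columns (U^T U = I).
    G: Euclidean gradient of the loss at U.
    With the skew-symmetric W = G U^T - U G^T, the update
    U <- (I + lr/2 W)^{-1} (I - lr/2 W) U moves downhill while
    keeping U^T U = I exactly (Cayley transforms are orthogonal).
    """
    W = G @ U.T - U @ G.T                 # skew-symmetric generator
    I = np.eye(U.shape[0])
    return np.linalg.solve(I + 0.5 * lr * W, (I - 0.5 * lr * W) @ U)
```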
