Table 3: Model Architectures.

Data Set: MNIST / fMNIST / SVHN / 3DShapes / dSprites / 3D Cars
Architecture:
ϕθ(·) = Conv[c]×4×4; Conv[c×2]×4×4; Conv[c×4]×k̂×k̂; FC 256; FC 50 (Linear)
ψζ(·) = FC 256; FC [c×4]×k̂×k̂; Conv[c×2]×4×4; Conv[c]×4×4; Conv[c] (Sigmoid)

Notes: All convolutions and transposed convolutions use stride 2 and padding 1. Unless stated otherwise, layers have parametric ReLU (α = 0.2) activations, except the output layers of the preimage maps, which use sigmoid activations (since input data are normalized to [0, 1]). The Adam and Cayley ADAM optimizers have learning rates 2×10⁻⁴ and 10⁻⁴, respectively. The preimage map/decoder network is always the transpose of the feature map/encoder network. c = 48 for 3D Cars and c = 64 for all other data sets. Further, k̂ = 3 with stride 1 for MNIST, fMNIST, SVHN, and 3DShapes, and k̂ = 4 for the others. SVHN and 3DShapes inputs are resized to 28×28.
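For concreteness, below is a minimal PyTorch sketch of the encoder/decoder pair described above, instantiated for the MNIST setting (1-channel 28×28 inputs, c = 64, k̂ = 3 with stride 1, 50-dimensional linear output). Kernel sizes, strides, and the parametric ReLU slope follow the table and notes; the padding of the k̂×k̂ convolution, the kernel size of the final sigmoid convolution, and all class and variable names are assumptions for illustration, not taken from the paper.

import torch
import torch.nn as nn

c, k_hat, in_ch, latent_dim = 64, 3, 1, 50  # MNIST setting per the notes

def prelu():
    # Parametric ReLU with initial slope 0.2, as stated in the notes.
    return nn.PReLU(init=0.2)

# Feature map / encoder phi_theta: 28x28 -> 14x14 -> 7x7 -> 7x7 -> FC 256 -> FC 50 (linear)
encoder = nn.Sequential(
    nn.Conv2d(in_ch, c, kernel_size=4, stride=2, padding=1), prelu(),          # Conv[c] 4x4
    nn.Conv2d(c, 2 * c, kernel_size=4, stride=2, padding=1), prelu(),          # Conv[c*2] 4x4
    nn.Conv2d(2 * c, 4 * c, kernel_size=k_hat, stride=1, padding=1), prelu(),  # Conv[c*4] k x k (padding assumed)
    nn.Flatten(),
    nn.Linear(4 * c * 7 * 7, 256), prelu(),                                    # FC 256
    nn.Linear(256, latent_dim),                                                # FC 50, linear output
)

# Preimage map / decoder psi_zeta: the transposed architecture, ending in a sigmoid.
class Decoder(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc = nn.Sequential(
            nn.Linear(latent_dim, 256), prelu(),                 # FC 256
            nn.Linear(256, 4 * c * 7 * 7), prelu(),              # FC [c*4] x spatial grid
        )
        self.deconv = nn.Sequential(
            nn.ConvTranspose2d(4 * c, 2 * c, kernel_size=4, stride=2, padding=1), prelu(),  # 7 -> 14
            nn.ConvTranspose2d(2 * c, c, kernel_size=4, stride=2, padding=1), prelu(),      # 14 -> 28
            nn.Conv2d(c, in_ch, kernel_size=3, padding=1),       # final conv to image channels (kernel size assumed)
            nn.Sigmoid(),                                        # inputs are normalized to [0, 1]
        )

    def forward(self, z):
        h = self.fc(z).view(-1, 4 * c, 7, 7)
        return self.deconv(h)

decoder = Decoder()
z = encoder(torch.randn(8, in_ch, 28, 28))   # latent codes of shape (8, 50)
x_hat = decoder(z)                           # reconstructions of shape (8, 1, 28, 28)

For the 3D Cars setting, the same sketch applies with c = 48, k̂ = 4, and the spatial sizes adjusted to that data set's input resolution.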
