| Task | Parameter | Search Space | Selected Value |
|---|---|---|---|
| Open Domain (Local) | Learning Rate | 2e-6, 5e-6, 2e-5, 5e-5 | 2e-5 |
| | Batch size | 32 (Max. Mem) | 32 |
| | Patience | 1, 3, 5 | 3 |
| | Optimizer | SGD, Adam, AdamW | AdamW |
| | Hidden Units | 128, 512 | 512 |
| | Non-linearity | − | ReLU |
| Open Domain (Global) | Learning Rate | 2e-6, 5e-6, 2e-5, 5e-5 | 2e-6 |
| | Batch size | − | Full instance |
| Stance Pred. (Local) | Learning Rate | 2e-6, 5e-6, 2e-5, 5e-5 | 5e-5 |
| | Patience | 1, 3, 5 | 3 |
| | Batch size | 16 (Max. Mem) | 16 |
| | Optimizer | SGD, Adam, AdamW | AdamW |
| Stance Pred. (Global) | Learning Rate | 2e-6, 5e-6, 2e-5, 5e-5 | 2e-6 |
| | Batch size | − | Full instance |
| Arg. Mining (Local) | Learning Rate | 1e-4, 5e-4, 5e-3, 1e-3, 5e-2, 1e-2 | 5e-2 |
| | Patience | 5, 10, 20 | 20 |
| | Batch size | 16, 32, 64, 128 | 64 |
| | Dropout | 0.01, 0.05, 0.1 | 0.05 |
| | Optimizer | SGD, Adam, AdamW | SGD |
| | Hidden Units | 128, 512 | 128 |
| | Non-linearity | − | ReLU |
| Arg. Mining (Global) | Learning Rate | 1e-4, 5e-4, 5e-3, 1e-3, 5e-2, 1e-2 | 1e-4 |
| | Patience | 5, 10, 20 | 10 |
| | Batch size | − | Full instance |
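
To make the size of one of these search spaces concrete, the following is a minimal sketch that enumerates the Arg. Mining (Local) rows of the table as a grid. The table does not say how the space was explored, so exhaustive grid enumeration is an assumption here, and the names `SEARCH_SPACE`, `SELECTED`, and `iter_configs` are hypothetical and chosen only for illustration.

```python
from itertools import product

# Search space for the Arg. Mining (Local) model, copied from the table.
SEARCH_SPACE = {
    "learning_rate": [1e-4, 5e-4, 5e-3, 1e-3, 5e-2, 1e-2],
    "patience": [5, 10, 20],
    "batch_size": [16, 32, 64, 128],
    "dropout": [0.01, 0.05, 0.1],
    "optimizer": ["SGD", "Adam", "AdamW"],
    "hidden_units": [128, 512],
}

# Selected values for the same model, copied from the table.
SELECTED = {
    "learning_rate": 5e-2,
    "patience": 20,
    "batch_size": 64,
    "dropout": 0.05,
    "optimizer": "SGD",
    "hidden_units": 128,
}


def iter_configs(space):
    """Yield every combination in the space as a dict.

    Exhaustive enumeration is an assumption of this sketch; the table
    only lists the candidate values, not the search procedure.
    """
    names = list(space)
    for values in product(*(space[name] for name in names)):
        yield dict(zip(names, values))


configs = list(iter_configs(SEARCH_SPACE))
print(len(configs))         # number of candidate configurations
print(SELECTED in configs)  # the selected setting is one point of the grid
```

Under this assumption the grid has 6 × 3 × 4 × 3 × 3 × 2 = 1296 candidate configurations, which shows why a random or staged search over the same values could be a reasonable alternative.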