I am training a CNN that I built myself on ImageNet-1k. The training configuration is as follows:
- Optimizer: AdamW
- Learning rate: 4e-5
- Weight decay: 0.1
- Data augmentation: random crop + horizontal flip
- Batch size: 256
- GPU: 4x 2080Ti
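
For reference, here is the same configuration written out as a small Python sketch, together with two numbers derived from it. The `TrainConfig` class and its field names are only illustrative and are not taken from my training script. The sketch assumes that the batch size of 256 is the total across all four GPUs rather than per GPU, and it uses the size of the standard ImageNet-1k training split (1,281,167 images).

```python
import math
from dataclasses import dataclass

# Number of images in the ImageNet-1k (ILSVRC-2012) training split.
IMAGENET_1K_TRAIN_IMAGES = 1_281_167


@dataclass(frozen=True)
class TrainConfig:
    """Illustrative container for the settings listed above (hypothetical names)."""
    optimizer: str = "AdamW"
    learning_rate: float = 4e-5
    weight_decay: float = 0.1
    augmentations: tuple = ("random_crop", "horizontal_flip")
    batch_size: int = 256  # ASSUMPTION: global batch size, summed over all GPUs
    num_gpus: int = 4      # 4x 2080Ti

    @property
    def per_gpu_batch_size(self) -> int:
        # Samples each GPU processes per step if the global batch is split evenly.
        return self.batch_size // self.num_gpus

    @property
    def steps_per_epoch(self) -> int:
        # Optimizer steps needed to see every training image once
        # (the last, partial batch is counted as a step).
        return math.ceil(IMAGENET_1K_TRAIN_IMAGES / self.batch_size)


config = TrainConfig()
print("per-GPU batch size:", config.per_gpu_batch_size)
print("steps per epoch:", config.steps_per_epoch)
```

If the batch size of 256 were per GPU instead, the effective batch size would be 1024 and the number of steps per epoch would drop accordingly.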
The training and validation accuracy are shown in the figure below:
[Figure: training and validation accuracy]
Has anyone else seen the validation accuracy oscillate like this? What is the solution?
I tried setting drop_path=0.2, but the problem persists. I'm a novice, so I don't know what else to try. If anyone can explain the cause of this behavior or suggest a fix, I would be very grateful.