ResNet strikes back: What about training set? #903

@alexander-soare Unfortunately I don't believe that is the case. The general outline of the featured recipes would work well for many arch + dataset combinations, but adjustments would be needed for an optimal outcome on a different dataset, just as they would for a different architecture. We didn't test that explicitly here, but I have observed it in the past, and I've had conversations with others who have.

Thinking of concrete examples I have in my head: for the 'How to train your ViT' paper, the pretraining recipe for ImageNet-21k used lower augmentation than the optimum for ImageNet-1k from scratch. The observation there was that increased augreg could roughly make up for an order of magnitude…
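
To make the augreg trade-off concrete, here is a minimal sketch of how one might dial augmentation strength up or down with timm's `create_transform`: heavier augmentation when training on the smaller ImageNet-1k from scratch, lighter when pretraining on the larger ImageNet-21k. The specific RandAugment magnitudes and erase probabilities below are illustrative assumptions, not the published recipe values.

```python
from timm.data import create_transform

# Heavier augreg, e.g. for ImageNet-1k from scratch (values are assumptions)
heavy_aug = create_transform(
    input_size=224,
    is_training=True,
    auto_augment='rand-m9-mstd0.5',  # stronger RandAugment magnitude
    re_prob=0.25,                    # random erasing enabled
)

# Lighter augreg, e.g. for ImageNet-21k pretraining (values are assumptions)
light_aug = create_transform(
    input_size=224,
    is_training=True,
    auto_augment='rand-m5-mstd0.5',  # weaker RandAugment magnitude
    re_prob=0.0,                     # random erasing disabled
)
```

The resulting transforms can be passed directly as the `transform` argument of a torchvision `ImageFolder` (or any dataset that accepts a PIL-image transform).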
