Add finetuning #58

Closed
wants to merge 4 commits into from

Conversation

@sanealytics sanealytics commented Jan 26, 2017

... INCORPORATING SUGGESTIONS ...
Finetune for alexnet and resnet. It does not work for vgg yet (no pretrained weights).
This will pick up the number of classes from the training directory and finetune the classification layer.

This does a 'hard' finetune by freezing the feature layers. Another option is a 'soft' finetune, where small changes to the higher layers are allowed; this PR only does the former.

Used hints from https://discuss.pytorch.org/t/how-to-extract-features-of-an-image-from-a-trained-model/119/3 and @apaszke
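
A minimal sketch of the two options, assuming a torchvision resnet18; num_classes and the optimizer settings here are illustrative, not this PR's actual code:

import torch
import torch.nn as nn
import torchvision.models as models

num_classes = 10  # illustrative; the PR reads this from the training directory

model = models.resnet18(pretrained=True)

# 'hard' finetune: freeze every pretrained parameter, train only a new classifier
for p in model.parameters():
    p.requires_grad = False
model.fc = nn.Linear(model.fc.in_features, num_classes)  # new layer trains normally

optimizer = torch.optim.SGD(model.fc.parameters(), lr=0.01, momentum=0.9)

# 'soft' finetune (not in this PR): skip the freezing loop above and instead
# give the pretrained layers a smaller learning rate via parameter groups
# optimizer = torch.optim.SGD([
#     {'params': model.fc.parameters(), 'lr': 0.01},
#     {'params': [p for n, p in model.named_parameters() if not n.startswith('fc.')], 'lr': 0.001},
# ], momentum=0.9)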

imagenet/main.py Outdated
def forward(self, x):
    f = self.features(x)
    if self.modelName == 'alexnet' :
        f = f.view(f.size(0), 256 * 6 * 6)


imagenet/main.py Outdated
@@ -48,18 +49,66 @@
                    help='evaluate model on validation set')
parser.add_argument('--pretrained', dest='pretrained', action='store_true',
                    help='use pre-trained model')
parser.add_argument('--finetune', dest='finetune', action='store_true',


imagenet/main.py Outdated

if arch.startswith('alexnet') :
    self.features = original_model.features
    self.classifier = nn.Sequential(



panovr commented Feb 27, 2017

@sanealytics Now that vgg has pretrained weights, can you add fine-tune support for vgg?

@macaodha

Very helpful script, although I noticed one problem for ResNets.

It works fine for resnet18 and 34 because their final feature maps are 512 channels deep. However, for resnet50, 101, and 152 this should be increased to 2048. The offending line is:
nn.Linear(512, num_classes)
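
One way to avoid hard-coding that width is to read it off the pretrained model's own fc layer; a rough sketch assuming a torchvision ResNet (num_classes is a placeholder, not this PR's code):

import torch.nn as nn
import torchvision.models as models

num_classes = 10  # placeholder

original_model = models.resnet50(pretrained=True)
num_feats = original_model.fc.in_features  # 512 for resnet18/34, 2048 for resnet50/101/152

features = nn.Sequential(*list(original_model.children())[:-1])
classifier = nn.Sequential(nn.Linear(num_feats, num_classes))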

Author

@sanealytics sanealytics left a comment


From my other username...

@sanealytics
Author

Requesting @fmassa to review changes.


rajanandthakur commented May 21, 2017

@sanealytics It would be good to have a -predict option to make predictions on test datasets. Currently I only see an option for evaluating on the validation dataset. An option to predict on a test dataset and write the output to a CSV file containing the test images and the probability for each class would be great functionality.


panovr commented May 28, 2017

When finetuning with ResNet 50, it seems that

self.features = nn.Sequential(*list(original_model.children())[:-1])
self.classifier = nn.Sequential(nn.Linear(512, num_classes))

should be

self.features = nn.Sequential(*list(original_model.children())[:-1])
self.classifier = nn.Sequential(nn.Linear(2048, num_classes))


panovr commented May 29, 2017

When finetuning with the inception_v3 model, there is an error:

python main.py -a inception_v3 -b 16 --lr 0.01 --pretrained data
=> using pre-trained model 'inception_v3'
Traceback (most recent call last):
  File "main.py", line 352, in <module>
    main()
  File "main.py", line 194, in main
    train(train_loader, model, criterion, optimizer, epoch)
  File "main.py", line 231, in train
    output = model(input_var)
  File "/usr/local/lib/python2.7/dist-packages/torch/nn/modules/module.py", line 206, in __call__
    result = self.forward(*input, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/torch/nn/parallel/data_parallel.py", line 59, in forward
    return self.module(*inputs[0], **kwargs[0])
  File "/usr/local/lib/python2.7/dist-packages/torch/nn/modules/module.py", line 206, in __call__
    result = self.forward(*input, **kwargs)
  File "main.py", line 104, in forward
    f = self.features(x)
  File "/usr/local/lib/python2.7/dist-packages/torch/nn/modules/module.py", line 206, in __call__
    result = self.forward(*input, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/torch/nn/modules/container.py", line 64, in forward
    input = module(input)
  File "/usr/local/lib/python2.7/dist-packages/torch/nn/modules/module.py", line 206, in __call__
    result = self.forward(*input, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/torchvision/models/inception.py", line 311, in forward
    x = self.fc(x)
  File "/usr/local/lib/python2.7/dist-packages/torch/nn/modules/module.py", line 206, in __call__
    result = self.forward(*input, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/torch/nn/modules/linear.py", line 54, in forward
    return self._backend.Linear()(input, self.weight, self.bias)
  File "/usr/local/lib/python2.7/dist-packages/torch/nn/functions/linear.py", line 10, in forward
    output.addmm_(0, 1, input, weight.t())
RuntimeError: size mismatch at /b/wheel/pytorch-src/torch/lib/THC/generic/THCTensorMathBlas.cu:243

# Everything except the last linear layer
self.features = nn.Sequential(*list(original_model.children())[:-1])
self.classifier = nn.Sequential(
    nn.Linear(512, num_classes)
)
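
Chaining inception_v3's children in an nn.Sequential does not reproduce its forward pass (it mixes modules with functional ops and an auxiliary head, and expects 299x299 inputs), so the wrapper approach tends to break. A common alternative, sketched here as an assumption and not part of this PR, is to swap the final layers in place:

import torch.nn as nn
import torchvision.models as models

num_classes = 10  # placeholder

model = models.inception_v3(pretrained=True)
model.fc = nn.Linear(model.fc.in_features, num_classes)
model.AuxLogits.fc = nn.Linear(model.AuxLogits.fc.in_features, num_classes)

# caveats: inputs must be 299x299, and in training mode forward() returns
# both the main output and the auxiliary output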



panovr commented May 30, 2017

@saurabhRTR Could you test the inception_v3 finetuning? I still can't finetune the inception_v3 model.

Author

sanealytics commented May 30, 2017

@panovr will do, expect an update this week


panovr commented Jun 3, 2017

@sanealytics Thanks! Could you also add finetuning for densenet?


mratsim commented Jun 3, 2017

@panovr @sanealytics:

Here is my fine-tuning code for DenseNet-121. It must be parameterized for all DenseNets. Also, it only gave me NaN on the dataset I tried it on, so I think it needs testing on MNIST / CIFAR-10.

import torch.nn as nn
import torch.nn.functional as F
import torchvision.models as models
from torch.nn.init import kaiming_normal

class DenseNet121(nn.Module):
    def __init__(self, num_classes):
        super(DenseNet121, self).__init__()

        original_model = models.densenet121(pretrained=True)

        # Everything except the last linear layer
        self.features = nn.Sequential(*list(original_model.children())[:-1])

        # Get number of features of last layer
        num_feats = original_model.classifier.in_features

        # Plug our classifier
        self.classifier = nn.Sequential(
            nn.Linear(num_feats, num_classes)
        )

        # Init of last layer
        for m in self.classifier:
            kaiming_normal(m.weight)

        # Freeze weights
        # for p in self.features.parameters():
        #     p.requires_grad = False

    def forward(self, x):
        f = self.features(x)
        out = F.relu(f, inplace=True)
        out = F.avg_pool2d(out, kernel_size=7).view(f.size(0), -1)
        out = self.classifier(out)
        return out
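
One possible way to use it with frozen features (illustrative; num_classes and the optimizer settings are assumptions, not part of the snippet above):

import torch

model = DenseNet121(num_classes=10)

# 'hard' finetune, mirroring the commented-out block above
for p in model.features.parameters():
    p.requires_grad = False

# only hand the still-trainable parameters to the optimizer
optimizer = torch.optim.SGD([p for p in model.parameters() if p.requires_grad],
                            lr=0.01, momentum=0.9)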


panovr commented Jun 5, 2017

@mratsim Thanks for the code, and I will test it on my custom dataset tomorrow.


ahkarami commented Jun 8, 2017

@mratsim Thanks for the code. I have tested it and it works well. However, I can't finetune the 'SqueezeNet 1.0' model; would you please help me? My code for fine-tuning squeezenet1_0 is as follows:

model_conv = torchvision.models.squeezenet1_0(pretrained=True)
mod = list(model_conv.classifier.children())
mod.pop()
mod.append(torch.nn.Linear(1000, 7))
new_classifier = torch.nn.Sequential(*mod)
print( list(list(new_classifier.children())[1].parameters()) )
model_conv.classifier = new_classifier
for p in model_conv.features.parameters():
    p.requires_grad = False
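
One issue here is that SqueezeNet's classifier ends in a 1x1 convolution followed by pooling, not an nn.Linear, so appending nn.Linear(1000, 7) does not line up with what forward() produces. A sketch of the usual workaround (an assumption on my part, not tested here): rebuild the classifier with a new convolution and update model.num_classes, which the model's forward uses to reshape its output:

import torch.nn as nn
import torchvision.models as models

num_classes = 7  # as in the snippet above

model = models.squeezenet1_0(pretrained=True)
model.classifier = nn.Sequential(
    nn.Dropout(p=0.5),
    nn.Conv2d(512, num_classes, kernel_size=1),
    nn.ReLU(inplace=True),
    nn.AvgPool2d(13)
)
model.num_classes = num_classes

for p in model.features.parameters():
    p.requires_grad = False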


panovr commented Jun 9, 2017

@mratsim
I used your code to finetune densenet121 on my own dataset and I can confirm it works. Thanks for sharing the code!
By the way, what is the meaning of "It must be parameterized for all DenseNets"?

@micklexqg

@fmassa, thanks for the reference. I have one question: how do I enable finetuning in imagenet/main.py?
Should I set pretrained to true, like the code below?
parser.add_argument('--pretrained', default='true', dest='pretrained', action='store_true',
                    help='use pre-trained model')


saurabhRTR commented Sep 28, 2017

Just pass --pretrained on the command line.
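
For example, mirroring the command earlier in this thread (the architecture and data path are placeholders):

python main.py -a resnet18 --pretrained data

With this PR applied, --finetune can be passed as well to freeze the feature layers.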

@micklexqg

@saurabhRTR, thank you.


panovr commented Jan 8, 2018

Could you add code for fine-tuning squeezenet1_0 and squeezenet1_1?

dribnet added a commit to dribnet/examples that referenced this pull request Nov 17, 2018
The finetune option is used to freeze feature layers on a pretrained network

refactored and updated code from pytorch#58
@sanealytics
Author

Follow updates on #446
