Description
I am having a difficult time understanding transformers. Everything is becoming clearer bit by bit, but one thing that has me scratching my head is the difference between src_mask and src_key_padding_mask, which are passed as arguments to the forward function in both the encoder layer and the decoder layer.
https://pytorch.org/docs/master/_modules/torch/nn/modules/transformer.html#Transformer
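
To make the question concrete, here is a minimal sketch of a call that passes both arguments to nn.TransformerEncoderLayer. It assumes the default (sequence, batch, embedding) input layout and boolean masks where True means "do not attend". The sizes, the causal pattern chosen for src_mask, and the padding pattern are illustrative assumptions and are not taken from the linked source.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Illustrative sizes (assumptions, not from the linked source).
seq_len, batch_size, d_model = 5, 2, 16

# dropout=0.0 keeps the sketch deterministic.
layer = nn.TransformerEncoderLayer(
    d_model=d_model, nhead=4, dim_feedforward=32, dropout=0.0
)

# Default layout is (S, N, E): sequence length, batch size, embedding size.
src = torch.randn(seq_len, batch_size, d_model)

# src_mask has shape (S, S) and is shared by every sequence in the batch.
# Entry [i, j] == True means query position i may NOT attend to key position j.
# Here: a causal pattern that blocks attention to later positions.
src_mask = torch.triu(
    torch.ones(seq_len, seq_len, dtype=torch.bool), diagonal=1
)

# src_key_padding_mask has shape (N, S) and is specific to each sequence.
# Entry [n, j] == True means position j of sequence n is padding and is
# ignored as a key by every query of that sequence.
src_key_padding_mask = torch.tensor(
    [
        [False, False, False, False, False],  # sequence 0: no padding
        [False, False, False, True, True],    # sequence 1: last two are padding
    ]
)

out = layer(
    src,
    src_mask=src_mask,
    src_key_padding_mask=src_key_padding_mask,
)
print(out.shape)
```

In this sketch the two masks differ in shape and scope: src_mask is one (S, S) pattern applied to all sequences, while src_key_padding_mask is an (N, S) pattern that varies per sequence.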