Missing weights for example code
I installed cramming and I'm using the latest transformers. The example code runs, but many weights cannot be loaded. Is this normal?
Some weights of the model checkpoint at ./models/pbelcak_FastBERT-1x11-long were not used when initializing ScriptableLMForPreTraining: ['encoder.layers.7.ffn.linear_in.bias', 'encoder.layers.12.ffn.linear_out.weight', 'encoder.layers.13.ffn.linear_out.weight', 'encoder.layers.6.ffn.linear_out.weight', 'encoder.layers.3.ffn.linear_in.weight', 'encoder.layers.12.ffn.linear_in.weight', 'encoder.layers.14.ffn.linear_in.bias', 'encoder.layers.2.ffn.linear_out.weight', 'encoder.layers.9.ffn.linear_in.bias', 'encoder.layers.0.ffn.linear_in.bias', 'encoder.layers.4.ffn.linear_out.weight', 'encoder.layers.6.ffn.linear_in.weight', 'encoder.layers.4.ffn.linear_in.weight', 'encoder.layers.11.ffn.linear_in.weight', 'encoder.layers.14.ffn.linear_in.weight', 'encoder.layers.2.ffn.linear_in.bias', 'encoder.layers.5.ffn.linear_out.weight', 'encoder.layers.10.ffn.linear_in.bias', 'encoder.layers.3.ffn.linear_out.weight', 'encoder.layers.7.ffn.linear_in.weight', 'encoder.layers.8.ffn.linear_out.weight', 'encoder.layers.9.ffn.linear_out.weight', 'encoder.layers.15.ffn.linear_in.bias', 'encoder.layers.13.ffn.linear_in.weight', 'encoder.layers.0.ffn.linear_in.weight', 'encoder.layers.10.ffn.linear_out.weight', 'encoder.layers.5.ffn.linear_in.weight', 'encoder.layers.6.ffn.linear_in.bias', 'encoder.layers.4.ffn.linear_in.bias', 'encoder.layers.15.ffn.linear_out.weight', 'encoder.layers.10.ffn.linear_in.weight', 'encoder.layers.13.ffn.linear_in.bias', 'encoder.layers.5.ffn.linear_in.bias', 'encoder.layers.2.ffn.linear_in.weight', 'encoder.layers.11.ffn.linear_in.bias', 'encoder.layers.1.ffn.linear_in.bias', 'encoder.layers.12.ffn.linear_in.bias', 'encoder.layers.8.ffn.linear_in.bias', 'encoder.layers.8.ffn.linear_in.weight', 'encoder.layers.1.ffn.linear_in.weight', 'encoder.layers.1.ffn.linear_out.weight', 'encoder.layers.3.ffn.linear_in.bias', 'encoder.layers.9.ffn.linear_in.weight', 'encoder.layers.0.ffn.linear_out.weight', 'encoder.layers.14.ffn.linear_out.weight', 
'encoder.layers.15.ffn.linear_in.weight', 'encoder.layers.11.ffn.linear_out.weight', 'encoder.layers.7.ffn.linear_out.weight']
- This IS expected if you are initializing ScriptableLMForPreTraining from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing ScriptableLMForPreTraining from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
Some weights of ScriptableLMForPreTraining were not initialized from the model checkpoint at ./models/pbelcak_FastBERT-1x11-long and are newly initialized: ['encoder.layers.14.ffn.dense_in.weight', 'encoder.layers.15.ffn.dense_out.weight', 'encoder.layers.15.ffn.dense_in.weight', 'encoder.layers.13.ffn.dense_in.weight', 'encoder.layers.11.ffn.dense_in.weight', 'encoder.layers.5.ffn.dense_out.weight', 'encoder.layers.12.ffn.dense_out.weight', 'encoder.layers.9.ffn.dense_out.weight', 'encoder.layers.1.ffn.dense_out.weight', 'encoder.layers.5.ffn.dense_in.weight', 'encoder.layers.8.ffn.dense_out.weight', 'encoder.layers.0.ffn.dense_out.weight', 'encoder.layers.8.ffn.dense_in.weight', 'encoder.layers.6.ffn.dense_in.weight', 'encoder.layers.4.ffn.dense_in.weight', 'encoder.layers.10.ffn.dense_out.weight', 'encoder.layers.4.ffn.dense_out.weight', 'encoder.layers.2.ffn.dense_out.weight', 'encoder.layers.11.ffn.dense_out.weight', 'encoder.layers.14.ffn.dense_out.weight', 'encoder.layers.0.ffn.dense_in.weight', 'encoder.layers.3.ffn.dense_out.weight', 'encoder.layers.13.ffn.dense_out.weight', 'encoder.layers.3.ffn.dense_in.weight', 'encoder.layers.1.ffn.dense_in.weight', 'encoder.layers.6.ffn.dense_out.weight', 'encoder.layers.10.ffn.dense_in.weight', 'encoder.layers.12.ffn.dense_in.weight', 'encoder.layers.2.ffn.dense_in.weight', 'encoder.layers.9.ffn.dense_in.weight', 'encoder.layers.7.ffn.dense_out.weight', 'encoder.layers.7.ffn.dense_in.weight']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
Hello,
Your warnings say that the checkpoint contains weights for the FFF module (training/cramming/architectures/fff.py), while the model is trying to load weights for the FFNComponent module (training/cramming/crammed_bert.py). I just tried running the README example on a fresh instance and could not reproduce your warnings.
You're most likely using cramming installed from the original cramming repository rather than from the training directory of this project.
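Conceptually, the loader compares the checkpoint's parameter names against the names the instantiated model declares: anything found only in the checkpoint is reported as "not used when initializing", and anything found only in the model is reported as "newly initialized". A minimal sketch of that comparison (the key names here are illustrative, not the full set):

```python
# Illustrative only: how a loader partitions checkpoint keys vs. model keys.
checkpoint_keys = {
    "encoder.layers.0.ffn.linear_in.weight",  # FFF-style name (in the checkpoint)
    "encoder.layers.0.attn.weight",
}
model_keys = {
    "encoder.layers.0.ffn.dense_in.weight",   # FFNComponent-style name (in the model)
    "encoder.layers.0.attn.weight",
}

unused = checkpoint_keys - model_keys       # reported as "not used when initializing"
newly_init = model_keys - checkpoint_keys   # reported as "newly initialized"
print(sorted(unused))
print(sorted(newly_init))
```

With the correct cramming package installed, the model's key names match the checkpoint's and both sets are empty.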
To recap, these are the steps:
- pip uninstall cramming to remove the previous version of cramming installed in your environment -- or just start with a fresh environment.
- cd training
- pip install .
- Create minimal_example.py
- Paste:
import cramming  # required so the custom cramming architectures are available to transformers
from transformers import AutoModelForMaskedLM, AutoTokenizer
tokenizer = AutoTokenizer.from_pretrained("pbelcak/FastBERT-1x11-long")
model = AutoModelForMaskedLM.from_pretrained("pbelcak/FastBERT-1x11-long")
text = "Replace me by any text you'd like."
encoded_input = tokenizer(text, return_tensors='pt')
output = model(**encoded_input)
- Run python minimal_example.py.
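If you're unsure which cramming installation Python actually picks up, a quick standard-library check can tell you (the helper name here is just for illustration):

```python
import importlib.util

def locate(package: str) -> str:
    """Return the file a package resolves to, or a note if it's absent."""
    spec = importlib.util.find_spec(package)
    if spec is None or spec.origin is None:
        return f"{package} is not installed"
    return spec.origin

# The printed path should point at this project's training directory install,
# not at a leftover copy from the original cramming repository.
print(locate("cramming"))
```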
There is no training folder in your repo. Am I looking in the wrong place?
Hi batrlatom,
This is the folder in the repo:
Yes, you are right. I did install the original cramming. Sorry that I missed it in the README, and thank you very much for the reply.