How to use lukasweber/WG_BERT with Transformers:

```python
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("token-classification", model="lukasweber/WG_BERT")

# Load model directly
from transformers import AutoTokenizer, AutoModelForTokenClassification

tokenizer = AutoTokenizer.from_pretrained("lukasweber/WG_BERT")
model = AutoModelForTokenClassification.from_pretrained("lukasweber/WG_BERT")
```
WG-BERT (Warranty and Goodwill) is a pretrained, encoder-based model for analyzing automotive entities in automotive-related texts. It is trained by continually pretraining the BERT language model on a corpus of automotive workshop feedback texts, using the masked language modeling (MLM) objective. WG-BERT is then fine-tuned for automotive entity recognition, a subtask of Named Entity Recognition (NER), to extract components and their complaints from automotive texts. The dataset for continual pretraining consists of 1.8 million workshop feedback texts containing ~4 million sentences. The dataset for fine-tuning consists of ~5,500 sentences with gold annotations by automotive domain experts. The base architecture is BERT-base-uncased.
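A token-classification model emits one label per token, so extracting components and complaints requires grouping those labels into spans. The sketch below shows one common way to do this under the BIO tagging scheme. It is an illustration, not the authors' post-processing: the label names `COMPONENT` and `COMPLAINT`, the helper `decode_bio`, and the example sentence are assumptions made here, and the actual label set should be read from the model's configuration.

```python
from typing import List, Tuple


def decode_bio(tokens: List[str], tags: List[str]) -> List[Tuple[str, str]]:
    """Group BIO-tagged tokens into (entity_type, text) spans.

    Hypothetical helper: the tag names used with it below are assumed,
    not taken from the WG-BERT label set.
    """
    if len(tokens) != len(tags):
        raise ValueError("tokens and tags must have the same length")

    spans: List[Tuple[str, str]] = []
    current_type = None
    current_tokens: List[str] = []

    for token, tag in zip(tokens, tags):
        # "B-COMPONENT" -> ("B", "COMPONENT"); "O" -> ("O", "")
        prefix, _, label = tag.partition("-")
        if prefix == "B" or (prefix == "I" and label != current_type):
            # A B- tag (or an I- tag that does not continue the open span)
            # closes any open span and starts a new one.
            if current_type is not None:
                spans.append((current_type, " ".join(current_tokens)))
            current_type, current_tokens = label, [token]
        elif prefix == "I":
            # Continuation of the open span.
            current_tokens.append(token)
        else:
            # An O tag closes any open span.
            if current_type is not None:
                spans.append((current_type, " ".join(current_tokens)))
            current_type, current_tokens = None, []

    # Flush a span that runs to the end of the sentence.
    if current_type is not None:
        spans.append((current_type, " ".join(current_tokens)))
    return spans


# Made-up example sentence with assumed labels.
example_tokens = ["the", "brake", "pedal", "is", "squeaking", "loudly"]
example_tags = ["O", "B-COMPONENT", "I-COMPONENT", "O", "B-COMPLAINT", "O"]
example_spans = decode_bio(example_tokens, example_tags)
print(example_spans)
```

In practice, the Transformers token-classification pipeline can perform a similar grouping itself through its `aggregation_strategy` argument, which also merges word pieces back into whole words.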
Please contact Lukas Weber (lukas-weber[at]hotmail[dot]de / lukas.l.weber[at]mercedes-benz[dot]com) with any WG-BERT-related issues or questions.