mBERT_WOR
Model Description
mBERT_WOR is a Telugu sentiment classification model built on Google’s BERT-base-multilingual-cased (mBERT) architecture. The base model consists of 12 Transformer encoder layers with approximately 110 million parameters and is pretrained on Wikipedia text from 104 languages, including Telugu, using Masked Language Modeling (MLM) and Next Sentence Prediction (NSP) objectives.
The suffix WOR stands for Without Rationale supervision. This model represents a standard fine-tuned baseline, trained solely on sentiment labels without incorporating human-annotated rationales or explanation-based supervision.
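Below is a minimal inference sketch, assuming the checkpoint is published as a sequence-classification model under the repo id DSL-13-SRMAP/mBERT_WOR and loaded with the Hugging Face transformers library; the label names and their ordering come from the model's own config and may differ from what is shown here.

```python
# Minimal inference sketch; assumes the checkpoint is exported as a
# sequence-classification model under the repo id shown on this card.
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

model_id = "DSL-13-SRMAP/mBERT_WOR"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)
model.eval()

text = "ఈ సినిమా చాలా బాగుంది"  # "This movie is very good."
inputs = tokenizer(text, return_tensors="pt", truncation=True)

with torch.no_grad():
    logits = model(**inputs).logits

probs = torch.softmax(logits, dim=-1).squeeze(0)
pred_id = int(probs.argmax())
# Label names come from the model config and may be generic (LABEL_0, ...).
print(model.config.id2label[pred_id], float(probs[pred_id]))
```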
Pretraining Details
- Pretraining corpus: Multilingual Wikipedia (104 languages)
- Training objectives:
  - Masked Language Modeling (MLM); see the fill-mask sketch below
  - Next Sentence Prediction (NSP)
- Language coverage: Telugu is included, but the model is not exclusively trained on Telugu data
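The MLM objective can be illustrated with a fill-mask call on the base checkpoint (not on mBERT_WOR itself, whose MLM head is replaced by a classification head during fine-tuning); the example sentence is purely illustrative.

```python
# Illustration of the MLM pretraining objective on the base mBERT checkpoint
# (not on mBERT_WOR, whose MLM head is replaced by a classification head).
from transformers import pipeline

fill_mask = pipeline("fill-mask", model="google-bert/bert-base-multilingual-cased")

# The tokenizer and vocabulary are shared across all 104 pretraining languages,
# so the same call works for Telugu input as well.
for prediction in fill_mask("The capital of France is [MASK]."):
    print(prediction["token_str"], round(prediction["score"], 3))
```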
Training Data
- Fine-tuning dataset: Telugu-Dataset
- Task: Sentiment classification
- Supervision type: Label-only (no rationale supervision); a fine-tuning sketch follows below
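The sketch below outlines label-only fine-tuning of the base mBERT checkpoint with the Hugging Face Trainer. The CSV files, the "text"/"label" column names, the number of labels, and the hyperparameters are illustrative assumptions, not the exact recipe used to produce mBERT_WOR.

```python
# Label-only fine-tuning sketch. File names, column names, label count, and
# hyperparameters are illustrative assumptions, not the original recipe.
from datasets import load_dataset
from transformers import (
    AutoModelForSequenceClassification,
    AutoTokenizer,
    Trainer,
    TrainingArguments,
)

base_model = "google-bert/bert-base-multilingual-cased"
tokenizer = AutoTokenizer.from_pretrained(base_model)
model = AutoModelForSequenceClassification.from_pretrained(base_model, num_labels=3)

# Hypothetical CSV files with "text" and "label" columns standing in for the
# Telugu sentiment dataset referenced on this card.
dataset = load_dataset("csv", data_files={"train": "train.csv", "validation": "dev.csv"})

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=128)

dataset = dataset.map(tokenize, batched=True)

args = TrainingArguments(
    output_dir="mbert_wor",
    learning_rate=2e-5,
    per_device_train_batch_size=16,
    num_train_epochs=3,
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=dataset["train"],
    eval_dataset=dataset["validation"],
    tokenizer=tokenizer,  # enables dynamic padding via the default data collator
)
trainer.train()
```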
Intended Use
This model is intended for:
- Telugu sentiment classification
- Cross-lingual and multilingual NLP experiments
- Baseline comparisons in explainability and rationale-supervision studies
- Research in low-resource Telugu NLP settings
The model leverages shared multilingual representations, enabling effective cross-lingual transfer even when Telugu-specific labeled data is limited.
Performance Characteristics
Although mBERT is not explicitly optimized for Telugu, it has demonstrated stable and competitive performance in sentiment classification tasks due to its multilingual generalization capability.
Strengths
- Strong cross-lingual transfer learning
- Reliable and reproducible baseline
- Widely adopted in academic research
Limitations
- Not optimized for Telugu morphology or syntax
- May underperform compared to Telugu-specialized models such as IndicBERT or L3Cube-Telugu-BERT
- Limited ability to capture fine-grained, region-specific linguistic nuances
Use as a Baseline
Despite these limitations, mBERT_WOR remains a strong and widely accepted baseline, particularly for:
- Comparing models trained with vs. without rationale supervision
- Multilingual sentiment analysis pipelines
- Benchmarking in low-resource Telugu NLP research (see the evaluation sketch below)
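A minimal evaluation sketch for such baseline comparisons is given below; the test file, its "text"/"label" columns, and the assumption that the gold labels use the same integer ids as the model's label mapping are hypothetical placeholders.

```python
# Baseline evaluation sketch: score mBERT_WOR on a labeled Telugu test set so
# that rationale-supervised variants can be compared on the same metrics.
# Assumes a hypothetical test.csv whose integer labels match model.config.id2label.
import torch
from datasets import load_dataset
from sklearn.metrics import accuracy_score, f1_score
from transformers import AutoModelForSequenceClassification, AutoTokenizer

model_id = "DSL-13-SRMAP/mBERT_WOR"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)
model.eval()

test_set = load_dataset("csv", data_files={"test": "test.csv"})["test"]

predictions = []
for example in test_set:
    inputs = tokenizer(example["text"], return_tensors="pt", truncation=True)
    with torch.no_grad():
        logits = model(**inputs).logits
    predictions.append(int(logits.argmax(dim=-1)))

labels = test_set["label"]
print("Accuracy:", accuracy_score(labels, predictions))
print("Macro-F1:", f1_score(labels, predictions, average="macro"))
```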
Model tree for DSL-13-SRMAP/mBERT_WOR
- Base model: google-bert/bert-base-multilingual-cased