mBERT_WOR
Model Description
mBERT_WOR is a Telugu sentiment classification model built on Google’s BERT-base-multilingual-cased (mBERT) architecture. The base model consists of 12 Transformer encoder layers with approximately 110 million parameters and is pretrained on Wikipedia text from 104 languages, including Telugu, using Masked Language Modeling (MLM) and Next Sentence Prediction (NSP) objectives.
The suffix WOR stands for Without Rationale supervision. This model represents a standard fine-tuned baseline, trained solely on sentiment labels without incorporating human-annotated rationales or explanation-based supervision.
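Below is a minimal inference sketch, assuming the checkpoint is published as a sequence-classification model under the repo id DSL-13-SRMAP/mBERT_WOR and loaded with the Hugging Face transformers library; the label names and their ordering come from the model's own config and may differ from what is shown here.

```python
# Minimal inference sketch; assumes the checkpoint is exported as a
# sequence-classification model under the repo id shown on this card.
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

model_id = "DSL-13-SRMAP/mBERT_WOR"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)
model.eval()

text = "ఈ సినిమా చాలా బాగుంది"  # "This movie is very good."
inputs = tokenizer(text, return_tensors="pt", truncation=True)

with torch.no_grad():
    logits = model(**inputs).logits

probs = torch.softmax(logits, dim=-1).squeeze(0)
pred_id = int(probs.argmax())
# Label names come from the model config and may be generic (LABEL_0, ...).
print(model.config.id2label[pred_id], float(probs[pred_id]))
```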
Pretraining Details
- Pretraining corpus: Multilingual Wikipedia (104 languages)
- Training objectives:
  - Masked Language Modeling (MLM); see the fill-mask sketch below
  - Next Sentence Prediction (NSP)
- Language coverage: Telugu is included, but the model is not exclusively trained on Telugu data
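The MLM objective can be illustrated with a fill-mask call on the base checkpoint (not on mBERT_WOR itself, whose MLM head is replaced by a classification head during fine-tuning); the example sentence is purely illustrative.

```python
# Illustration of the MLM pretraining objective on the base mBERT checkpoint
# (not on mBERT_WOR, whose MLM head is replaced by a classification head).
from transformers import pipeline

fill_mask = pipeline("fill-mask", model="google-bert/bert-base-multilingual-cased")

# The tokenizer and vocabulary are shared across all 104 pretraining languages,
# so the same call works for Telugu input as well.
for prediction in fill_mask("The capital of France is [MASK]."):
    print(prediction["token_str"], round(prediction["score"], 3))
```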
Training Data
- Fine-tuning dataset: Telugu-Dataset
- Task: Sentiment classification
- Supervision type: Label-only (no rationale supervision); a fine-tuning sketch follows below
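The sketch below outlines label-only fine-tuning of the base mBERT checkpoint with the Hugging Face Trainer. The CSV files, the "text"/"label" column names, the number of labels, and the hyperparameters are illustrative assumptions, not the exact recipe used to produce mBERT_WOR.

```python
# Label-only fine-tuning sketch. File names, column names, label count, and
# hyperparameters are illustrative assumptions, not the original recipe.
from datasets import load_dataset
from transformers import (
    AutoModelForSequenceClassification,
    AutoTokenizer,
    Trainer,
    TrainingArguments,
)

base_model = "google-bert/bert-base-multilingual-cased"
tokenizer = AutoTokenizer.from_pretrained(base_model)
model = AutoModelForSequenceClassification.from_pretrained(base_model, num_labels=3)

# Hypothetical CSV files with "text" and "label" columns standing in for the
# Telugu sentiment dataset referenced on this card.
dataset = load_dataset("csv", data_files={"train": "train.csv", "validation": "dev.csv"})

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=128)

dataset = dataset.map(tokenize, batched=True)

args = TrainingArguments(
    output_dir="mbert_wor",
    learning_rate=2e-5,
    per_device_train_batch_size=16,
    num_train_epochs=3,
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=dataset["train"],
    eval_dataset=dataset["validation"],
    tokenizer=tokenizer,  # enables dynamic padding via the default data collator
)
trainer.train()
```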
Intended Use
This model is intended for:
- Telugu sentiment classification
- Cross-lingual and multilingual NLP experiments
- Baseline comparisons in explainability and rationale-supervision studies
- Research in low-resource Telugu NLP settings
The model leverages shared multilingual representations, enabling effective cross-lingual transfer even when Telugu-specific labeled data is limited.
Performance Characteristics
Although mBERT is not explicitly optimized for Telugu, it has demonstrated stable and competitive performance in sentiment classification tasks due to its multilingual generalization capability.
Strengths
- Strong cross-lingual transfer learning
- Reliable and reproducible baseline
- Widely adopted in academic research
Limitations
- Not optimized for Telugu morphology or syntax
- May underperform compared to Telugu-specialized models such as IndicBERT or L3Cube-Telugu-BERT
- Limited ability to capture fine-grained, region-specific linguistic nuances
Use as a Baseline
Despite these limitations, mBERT_WOR remains a strong and widely accepted baseline, particularly for:
- Comparing models trained with vs. without rationale supervision
- Multilingual sentiment analysis pipelines
- Benchmarking in low-resource Telugu NLP research (see the evaluation sketch below)
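A minimal evaluation sketch for such baseline comparisons is given below; the test file, its "text"/"label" columns, and the assumption that the gold labels use the same integer ids as the model's label mapping are hypothetical placeholders.

```python
# Baseline evaluation sketch: score mBERT_WOR on a labeled Telugu test set so
# that rationale-supervised variants can be compared on the same metrics.
# Assumes a hypothetical test.csv whose integer labels match model.config.id2label.
import torch
from datasets import load_dataset
from sklearn.metrics import accuracy_score, f1_score
from transformers import AutoModelForSequenceClassification, AutoTokenizer

model_id = "DSL-13-SRMAP/mBERT_WOR"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)
model.eval()

test_set = load_dataset("csv", data_files={"test": "test.csv"})["test"]

predictions = []
for example in test_set:
    inputs = tokenizer(example["text"], return_tensors="pt", truncation=True)
    with torch.no_grad():
        logits = model(**inputs).logits
    predictions.append(int(logits.argmax(dim=-1)))

labels = test_set["label"]
print("Accuracy:", accuracy_score(labels, predictions))
print("Macro-F1:", f1_score(labels, predictions, average="macro"))
```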
Model tree for DSL-13-SRMAP/mBERT_WOR
- Base model: google-bert/bert-base-multilingual-cased