mBERT_WOR

Model Description

mBERT_WOR is a Telugu sentiment classification model built on Google’s BERT-base-multilingual-cased (mBERT) architecture. The base model consists of 12 Transformer encoder layers with approximately 178 million parameters and is pretrained on Wikipedia text from 104 languages, including Telugu, using the Masked Language Modeling (MLM) and Next Sentence Prediction (NSP) objectives.

The suffix WOR stands for Without Rationale supervision. This model represents a standard fine-tuned baseline, trained solely on sentiment labels without incorporating human-annotated rationales or explanation-based supervision.
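A minimal inference sketch is shown below. It assumes the checkpoint is loadable with the Hugging Face transformers library under the repository id DSL-13-SRMAP/mBERT_WOR, that it includes a sequence classification head, and that sentiment label names were stored in the model config; the example sentence is purely illustrative.

```python
# Usage sketch: load the fine-tuned checkpoint and classify a Telugu sentence.
# Assumes the checkpoint is hosted as "DSL-13-SRMAP/mBERT_WOR"; the printed label
# depends on what label names were saved in the model config.
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

model_id = "DSL-13-SRMAP/mBERT_WOR"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)
model.eval()

text = "ఈ సినిమా చాలా బాగుంది"  # "This movie is very good"
inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=128)

with torch.no_grad():
    logits = model(**inputs).logits

pred_id = int(logits.argmax(dim=-1))
print(model.config.id2label[pred_id])
```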


Pretraining Details

  • Pretraining corpus: Multilingual Wikipedia (104 languages)
  • Training objectives:
    • Masked Language Modeling (MLM), illustrated in the sketch after this list
    • Next Sentence Prediction (NSP)
  • Language coverage: Telugu is included, but the model is not exclusively trained on Telugu data
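As an illustration of the MLM objective and of mBERT's Telugu coverage, the base checkpoint can be queried through a fill-mask pipeline. This is a sketch using the public bert-base-multilingual-cased checkpoint, not this fine-tuned classifier, and the example sentence is chosen arbitrarily.

```python
# Sketch of the Masked Language Modeling (MLM) pretraining objective on Telugu text,
# using the public base checkpoint (not the fine-tuned sentiment model).
from transformers import pipeline

fill_mask = pipeline("fill-mask", model="bert-base-multilingual-cased")

# "This movie is very [MASK]." -- the model proposes candidate tokens for the blank.
for candidate in fill_mask("ఈ సినిమా చాలా [MASK].", top_k=3):
    print(candidate["token_str"], round(candidate["score"], 3))
```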

Training Data

  • Fine-tuning dataset: Telugu-Dataset
  • Task: Sentiment classification
  • Supervision type: Label-only (no rationale supervision); see the fine-tuning sketch after this list
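The label-only setup corresponds to a standard sequence-classification fine-tuning loop in which only the sentiment label enters the cross-entropy loss. The sketch below is illustrative: the dataset fields, placeholder examples, label count, and hyperparameters are assumptions and do not describe the exact training configuration of this release.

```python
# Minimal label-only fine-tuning sketch (no rationale supervision).
# Dataset fields ("text", "label"), examples, and hyperparameters are illustrative.
from datasets import Dataset
from transformers import (AutoTokenizer, AutoModelForSequenceClassification,
                          TrainingArguments, Trainer)

base_model = "bert-base-multilingual-cased"
tokenizer = AutoTokenizer.from_pretrained(base_model)
model = AutoModelForSequenceClassification.from_pretrained(base_model, num_labels=2)

# Placeholder Telugu examples; the actual fine-tuning data is the Telugu-Dataset.
train_data = Dataset.from_dict({
    "text": ["ఈ సినిమా చాలా బాగుంది", "ఈ సినిమా నచ్చలేదు"],
    "label": [1, 0],
})

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=128)

train_data = train_data.map(tokenize, batched=True)

args = TrainingArguments(
    output_dir="mbert_wor",
    per_device_train_batch_size=16,
    num_train_epochs=3,
    learning_rate=2e-5,
)

trainer = Trainer(model=model, args=args, train_dataset=train_data, tokenizer=tokenizer)
trainer.train()
```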

Intended Use

This model is intended for:

  • Telugu sentiment classification
  • Cross-lingual and multilingual NLP experiments
  • Baseline comparisons in explainability and rationale-supervision studies
  • Research in low-resource Telugu NLP settings

The model leverages shared multilingual representations, enabling effective cross-lingual transfer even when Telugu-specific labeled data is limited.


Performance Characteristics

Although mBERT is not explicitly optimized for Telugu, it has demonstrated stable and competitive performance in sentiment classification tasks due to its multilingual generalization capability.

Strengths

  • Strong cross-lingual transfer learning
  • Reliable and reproducible baseline
  • Widely adopted in academic research

Limitations

  • Not optimized for Telugu morphology or syntax
  • May underperform compared to Indic- and Telugu-specific models such as IndicBERT or L3Cube-Telugu-BERT
  • Limited ability to capture fine-grained, region-specific linguistic nuances

Use as a Baseline

Despite these limitations, mBERT_WOR remains a strong and widely accepted baseline, particularly for:

  • Comparing models trained with vs. without rationale supervision (see the sketch after this list)
  • Multilingual sentiment analysis pipelines
  • Benchmarking in low-resource Telugu NLP research
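For such comparisons, predictions from this model and from a rationale-supervised counterpart can be scored on the same held-out split. The sketch below is an assumption-laden illustration: the checkpoint id "DSL-13-SRMAP/mBERT_WR" is hypothetical, the test texts and gold labels are placeholders, and the metric choice (accuracy and macro-F1) is one common option.

```python
# Sketch of a with- vs. without-rationale baseline comparison on a shared test split.
# "DSL-13-SRMAP/mBERT_WR" is a hypothetical rationale-supervised counterpart, used only
# for illustration; texts and gold labels are placeholders.
from transformers import pipeline
from sklearn.metrics import accuracy_score, f1_score

test_texts = ["ఈ సినిమా చాలా బాగుంది", "ఈ సినిమా నచ్చలేదు"]  # placeholder test split
gold = [1, 0]

def predict(model_id):
    clf = pipeline("text-classification", model=model_id)
    label2id = clf.model.config.label2id  # map stored label names back to integer ids
    return [label2id[out["label"]] for out in clf(test_texts)]

for model_id in ["DSL-13-SRMAP/mBERT_WOR", "DSL-13-SRMAP/mBERT_WR"]:  # second id is hypothetical
    preds = predict(model_id)
    print(model_id,
          "accuracy:", accuracy_score(gold, preds),
          "macro-F1:", f1_score(gold, preds, average="macro"))
```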
