Dual-Encoder for Drug-ADR Relation Extraction

Model Description

This model uses a dual-encoder architecture with two PubMedBERT towers to score relations between drugs and adverse drug reactions (ADRs). The scores are intended for causal graph construction.

Architecture:

  • Drug Encoder: PubMedBERT
  • ADR Encoder: PubMedBERT (separate weights)
  • Fusion: Bilinear layer + MLP classifier
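
As a rough illustration of the fusion stage listed above, the sketch below combines two pooled embeddings with a bilinear layer followed by an MLP classifier. The class name BilinearFusionHead, the default dimensions, and the layer layout are assumptions made for illustration. This is not the released DualEncoderADE implementation, and its parameter names will not match the published weights. Random tensors stand in for the pooled outputs of the two PubMedBERT towers.

```python
import torch
import torch.nn as nn


class BilinearFusionHead(nn.Module):
    """Hypothetical fusion head: bilinear interaction followed by an MLP."""

    def __init__(self, hidden_dim: int = 768, fusion_dim: int = 256, dropout: float = 0.1):
        super().__init__()
        # Bilinear layer mixes the drug and ADR embeddings into fusion_dim features.
        self.bilinear = nn.Bilinear(hidden_dim, hidden_dim, fusion_dim)
        # MLP classifier maps the fused features to a single relation logit.
        self.classifier = nn.Sequential(
            nn.ReLU(),
            nn.Dropout(dropout),
            nn.Linear(fusion_dim, fusion_dim),
            nn.ReLU(),
            nn.Dropout(dropout),
            nn.Linear(fusion_dim, 1),
        )

    def forward(self, drug_emb: torch.Tensor, adr_emb: torch.Tensor) -> torch.Tensor:
        fused = self.bilinear(drug_emb, adr_emb)
        return self.classifier(fused).squeeze(-1)  # shape: (batch,)


torch.manual_seed(0)
head = BilinearFusionHead().eval()

# Stand-ins for pooled encoder outputs (assumed 768-dimensional, batch of 4).
drug_emb = torch.randn(4, 768)
adr_emb = torch.randn(4, 768)

with torch.no_grad():
    logits = head(drug_emb, adr_emb)
    scores = torch.sigmoid(logits)  # one relation score per pair
```

Keeping the towers separate and fusing only at the end means each side can be encoded once and reused across many candidate pairs.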

Training:

  • Phase 1: Contrastive pre-training (InfoNCE loss)
  • Phase 2: Binary classification fine-tuning
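
For readers unfamiliar with the Phase 1 objective, here is a minimal sketch of an InfoNCE loss over paired embeddings. It assumes in-batch negatives, cosine similarity, and a temperature of 0.07. The card does not state any of these choices, so treat them as placeholders, not as the settings used to train this model. The function name info_nce_loss is likewise hypothetical.

```python
import torch
import torch.nn.functional as F


def info_nce_loss(drug_emb: torch.Tensor, adr_emb: torch.Tensor,
                  temperature: float = 0.07) -> torch.Tensor:
    """InfoNCE with in-batch negatives: row i of drug_emb pairs with row i of adr_emb."""
    drug_emb = F.normalize(drug_emb, dim=-1)
    adr_emb = F.normalize(adr_emb, dim=-1)
    # Cosine similarity between every drug and every ADR in the batch.
    logits = drug_emb @ adr_emb.t() / temperature
    # The matching pair sits on the diagonal, so the target class for row i is i.
    targets = torch.arange(logits.size(0), device=logits.device)
    return F.cross_entropy(logits, targets)


torch.manual_seed(0)
anchor = torch.randn(8, 32)

# Nearly identical pairs should give a much lower loss than unrelated pairs.
aligned_loss = info_nce_loss(anchor, anchor + 0.01 * torch.randn(8, 32))
random_loss = info_nce_loss(anchor, torch.randn(8, 32))
```

The loss pulls each matching drug and ADR embedding together while pushing apart the other pairings in the batch, which gives the Phase 2 classifier a better starting point.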

Performance

  • Best Validation F1: 0.8831
  • Dataset: ADE Corpus v2 (drug_ade_relation)
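
The card does not say how the F1 score was computed. The sketch below assumes binary F1 on the positive class, with a 0.5 threshold applied to sigmoid scores. The labels and scores are made-up toy values, not outputs of this model, and binary_f1 is a hypothetical helper.

```python
def binary_f1(labels, scores, threshold=0.5):
    """F1 for the positive class after thresholding the scores."""
    preds = [1 if s >= threshold else 0 for s in scores]
    tp = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 1)
    fp = sum(1 for y, p in zip(labels, preds) if y == 0 and p == 1)
    fn = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 0)
    if tp == 0:
        return 0.0  # no true positives: precision and recall are both zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy data: 2 true positives, 1 false positive, 1 false negative.
toy_labels = [1, 0, 1, 1, 0, 0]
toy_scores = [0.91, 0.40, 0.35, 0.77, 0.62, 0.08]
toy_f1 = binary_f1(toy_labels, toy_scores)
```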

Usage

import json

import torch
from huggingface_hub import hf_hub_download
from transformers import AutoTokenizer

# DualEncoderADE is not part of transformers: its class definition must be
# imported or defined before running this snippet.

REPO_ID = "chrisvoncsefalvay/drug-adr-dual-encoder"

# Load tokenizer
tokenizer = AutoTokenizer.from_pretrained(REPO_ID)

# Download config and weights
config_path = hf_hub_download(REPO_ID, "config.json")
weights_path = hf_hub_download(REPO_ID, "pytorch_model.bin")

with open(config_path) as f:
    config = json.load(f)

# Initialize and load model
model = DualEncoderADE(
    config["model_name"],
    hidden_dim=config["hidden_dim"],
    fusion_dim=config["fusion_dim"],
    dropout=config["dropout"]
)
model.load_state_dict(torch.load(weights_path, map_location="cpu"))
model.eval()

# Score a drug-ADR pair
drug_context = "Patient was prescribed metformin for diabetes"
adr_context = "Patient developed lactic acidosis"

drug_enc = tokenizer(drug_context, return_tensors="pt", max_length=128, truncation=True, padding="max_length")
adr_enc = tokenizer(adr_context, return_tensors="pt", max_length=128, truncation=True, padding="max_length")

with torch.no_grad():
    logits, _, _ = model(
        drug_enc["input_ids"], drug_enc["attention_mask"],
        adr_enc["input_ids"], adr_enc["attention_mask"]
    )
    score = torch.sigmoid(logits).item()

print(f"Relation score: {score:.4f}")

Training Details

  • Base Model: microsoft/BiomedNLP-PubMedBERT-base-uncased-abstract-fulltext
  • Contrastive Epochs: 3
  • Classification Epochs: 5
  • Learning Rate: 2e-05
  • Batch Size: 32

Citation

If you use this model, please cite:

@misc{drug-adr-dual-encoder,
  author = {von Csefalvay, Chris},
  title = {Dual-Encoder for Drug-ADR Relation Extraction},
  year = {2024},
  publisher = {HuggingFace},
  url = {https://huggingface.co/chrisvoncsefalvay/drug-adr-dual-encoder}
}