UBio-MolFM: A Universal Molecular Foundation Model for Bio-Systems
Paper: arXiv:2602.17709
UBio-MolFM is a foundation model suite for molecular modeling of bio-systems. This checkpoint, UBio-MolFM-V1 (Stage 3), is built on the E2Former-V2 linear-scaling equivariant transformer architecture. Refer to the technical report for more details: UBio-MolFM (arXiv:2602.17709).
- `molfm-v1-stage-3.pt`: Pretrained model checkpoint.
- `config.yaml`: Model and inference configuration.

To use this model, you need to install the `molfm` codebase. Please refer to the official repository for installation instructions.
```python
from ase.build import molecule
from molfm.interface.ase.calculator.e2former_calculator import E2FormerCalculator

# 1. Set up atoms
atoms = molecule("H2O")
atoms.set_cell([10, 10, 10])
atoms.pbc = [True, True, True]

# 2. Load the model using the provided checkpoint and config
calc = E2FormerCalculator(
    checkpoint_path="path/to/molfm-v1-stage-3.pt",
    config_name="path/to/config.yaml",  # Or local config name if in search path
    head_name="omol25",
    device="cuda",
    use_tf32=True,
    use_compile=True,
)

# 3. Perform the calculation
atoms.calc = calc
energy = atoms.get_potential_energy()
forces = atoms.get_forces()
print(f"Energy: {energy} eV")
print(f"Forces:\n{forces}")
```
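ASE returns energies in eV and forces in eV/Å. If downstream analysis expects kcal/mol (a common chemistry convention), a small conversion helper is enough; the sketch below is not part of the molfm API, just the standard 1 eV ≈ 23.0605 kcal/mol factor:

```python
# Conversion factor: 1 eV ≈ 23.060548 kcal/mol (derived from CODATA constants)
EV_TO_KCAL_PER_MOL = 23.060548

def ev_to_kcal_per_mol(energy_ev: float) -> float:
    """Convert an energy from eV (ASE's default unit) to kcal/mol."""
    return energy_ev * EV_TO_KCAL_PER_MOL

# Example: convert a potential energy of -14.2 eV
print(ev_to_kcal_per_mol(-14.2))
```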
```python
from ase import units
from ase.md.langevin import Langevin
from ase.md.velocitydistribution import MaxwellBoltzmannDistribution

# Initialize velocities from a 300 K Maxwell-Boltzmann distribution
MaxwellBoltzmannDistribution(atoms, temperature_K=300)

# Set up the Langevin integrator (1 fs timestep, 300 K thermostat)
dyn = Langevin(atoms, 1 * units.fs, temperature_K=300, friction=0.01)

# Run MD for 100 steps
dyn.run(100)
```
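To sanity-check the thermostat during a run like this, the instantaneous temperature follows from equipartition: T = 2·E_kin / (3·N·k_B) for N unconstrained atoms. A minimal stand-alone sketch, assuming the total kinetic energy is given in eV:

```python
K_B_EV = 8.617333262e-5  # Boltzmann constant in eV/K

def temperature_from_kinetic(e_kin_ev: float, n_atoms: int) -> float:
    """Instantaneous temperature via equipartition:
    T = 2 * E_kin / (3 * N * k_B), for N atoms with no constraints."""
    return 2.0 * e_kin_ev / (3.0 * n_atoms * K_B_EV)

# At 300 K, 3 atoms carry E_kin = (3/2) * 3 * k_B * 300 ≈ 0.116 eV
print(temperature_from_kinetic(0.116, 3))
```

In ASE the same quantity is available directly as `atoms.get_temperature()`, so this is only useful for checking numbers by hand.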
- `use_tf32=True` enables TF32 on supported NVIDIA GPUs for higher throughput.
- `use_compile=True` enables `torch.compile` for faster execution.

The model was trained using a three-stage curriculum learning strategy on a combination of datasets.
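For context, TF32 in PyTorch is controlled by global backend switches; a flag like `use_tf32=True` presumably flips these under the hood (an assumption about the molfm internals, not confirmed by the source):

```python
import torch

# Standard PyTorch TF32 switches (effective on Ampere or newer GPUs).
# Assumption: use_tf32=True in E2FormerCalculator sets these internally.
torch.backends.cuda.matmul.allow_tf32 = True
torch.backends.cudnn.allow_tf32 = True

print(torch.backends.cuda.matmul.allow_tf32)  # True
```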
If you use UBio-MolFM-V1 in your research, please cite:
```bibtex
@misc{huang2026ubiomolfm,
  title={UBio-MolFM: A Universal Molecular Foundation Model for Bio-Systems},
  author={Lin Huang and Arthur Jiang and XiaoLi Liu and Zion Wang and Jason Zhao and Chu Wang and HaoCheng Lu and ChengXiang Huang and JiaJun Cheng and YiYue Du and Jia Zhang},
  year={2026},
  eprint={2602.17709},
  url={https://arxiv.org/abs/2602.17709},
  archivePrefix={arXiv},
  primaryClass={physics.chem-ph}
}

@misc{huang2026e2formerv2,
  title={E2Former-V2: On-the-Fly Equivariant Attention with Linear Activation Memory},
  author={Lin Huang and Chengxiang Huang and Ziang Wang and Yiyue Du and Chu Wang and Haocheng Lu and Yunyang Li and Xiaoli Liu and Arthur Jiang and Jia Zhang},
  year={2026},
  eprint={2601.16622},
  url={https://arxiv.org/abs/2601.16622},
  archivePrefix={arXiv},
  primaryClass={cs.LG}
}
```
This model and the associated code are released under the MIT License.