SCLM Option B - Deep Integration

Architecture

  • State dimension: 384
  • Injection layers: [4, 8, 12, 16, 20, 24]
  • EARCP params: 94.6M (5.25% overhead)
  • Experts: 3
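
The figures above can be collected into a small configuration object, which also makes the overhead arithmetic explicit. This is only a sketch: the class name `SCLMOptionBConfig`, its field names, and the helper methods are hypothetical and not taken from the model's code. It also assumes the 5.25% overhead is measured against the base model's parameter count (not the combined total), and it does not settle whether the injection layer indices are 0-based or 1-based.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SCLMOptionBConfig:
    """Hypothetical container for the numbers listed on this card."""
    state_dim: int = 384
    injection_layers: tuple = (4, 8, 12, 16, 20, 24)
    num_experts: int = 3
    earcp_params_millions: float = 94.6
    overhead_fraction: float = 0.0525  # 5.25%, assumed relative to the base model

    def implied_base_params_millions(self) -> float:
        # If EARCP adds 5.25% on top of the base model, then
        # base = EARCP params / overhead fraction.
        return self.earcp_params_millions / self.overhead_fraction

    def should_inject(self, layer_index: int) -> bool:
        # True when the state is injected at this transformer layer.
        return layer_index in self.injection_layers


config = SCLMOptionBConfig()
print(len(config.injection_layers), "injection layers")
print(round(config.implied_base_params_millions(), 1), "M implied base parameters")
```

Under that assumption the implied base model has roughly 1.8B parameters (94.6M / 0.0525).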

Features vs Option A

  • Deeper integration with attention and FFN
  • More injection points (6 layers)
  • Larger state dimension (384)
  • Multi-head attention pooling for encapsulation
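
To illustrate what multi-head attention pooling generally looks like, here is a minimal NumPy sketch that compresses a sequence of hidden states into one 384-dimensional state vector. It is not the model's actual implementation: the function `attention_pool`, the use of one learned query per head, the choice of 6 heads, the hidden size of 512, and the random weights are all assumptions made for illustration. Only the state dimension of 384 comes from this card.

```python
import numpy as np


def attention_pool(hidden, w_key, w_value, queries):
    """Pool (seq_len, hidden_dim) states into a single state vector.

    Each head owns one query vector, scores every position against it,
    and takes a softmax-weighted average of that head's value vectors.
    """
    seq_len, _ = hidden.shape
    num_heads, head_dim = queries.shape
    keys = (hidden @ w_key).reshape(seq_len, num_heads, head_dim)
    values = (hidden @ w_value).reshape(seq_len, num_heads, head_dim)

    # Scaled dot-product score of each position for each head: (seq_len, num_heads)
    scores = np.einsum("thd,hd->th", keys, queries) / np.sqrt(head_dim)
    scores = scores - scores.max(axis=0, keepdims=True)  # numerical stability
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=0, keepdims=True)  # softmax over positions

    pooled = np.einsum("th,thd->hd", weights, values)  # (num_heads, head_dim)
    return pooled.reshape(num_heads * head_dim), weights


rng = np.random.default_rng(0)
hidden_dim, state_dim, num_heads = 512, 384, 6  # hidden_dim and num_heads are assumed
head_dim = state_dim // num_heads

hidden = rng.standard_normal((10, hidden_dim))  # 10 token positions
w_key = rng.standard_normal((hidden_dim, state_dim)) / np.sqrt(hidden_dim)
w_value = rng.standard_normal((hidden_dim, state_dim)) / np.sqrt(hidden_dim)
queries = rng.standard_normal((num_heads, head_dim))

state, weights = attention_pool(hidden, w_key, w_value, queries)
print(state.shape)
```

Because each head attends with its own query, different heads can emphasize different positions before their outputs are concatenated into the state vector.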

Author

Mike Amega (Ame Web Studio)
