AIT-86M: Audio, Image, Text Embeddings (Depth-2)
AIT-86M maps image, audio, and text into a shared 1280-dim embedding space for cross-modal retrieval with a single vector index. All three modalities share one space with full Matryoshka truncation support down to 128 dims.
Built for edge deployment, with a single combined safetensors artifact.
Successor to TE-75M.
Also available in GGUF format for quantized edge deployment.
Why This Matters
The notable result for this family is that downstream variants add anchor-style decision behavior while preserving the shared semantic retrieval path. In practice, the hard part is usually keeping retrieval quality from regressing as new decision surfaces are added on top of the backbone.
File layout
AIT-86M.safetensors
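Since the release is a single combined safetensors file, it can help to list the tensors it contains before wiring it into a runtime. The sketch below reads only the file header, relying on the documented safetensors layout (an 8-byte little-endian length followed by a JSON header). The function names `read_safetensors_header` and `summarize_tensors` are hypothetical helpers, and nothing here assumes particular tensor names inside AIT-86M.

```python
import json
import struct


def read_safetensors_header(path):
    """Return the JSON header of a safetensors file as a dict.

    The file starts with an 8-byte little-endian unsigned integer N,
    followed by N bytes of UTF-8 JSON describing each tensor.
    """
    with open(path, "rb") as f:
        (header_len,) = struct.unpack("<Q", f.read(8))
        header = json.loads(f.read(header_len).decode("utf-8"))
    return header


def summarize_tensors(header):
    """Map tensor name -> (dtype, shape), skipping the optional metadata entry."""
    return {
        name: (info["dtype"], tuple(info["shape"]))
        for name, info in header.items()
        if name != "__metadata__"
    }
```

Calling `summarize_tensors(read_safetensors_header("AIT-86M.safetensors"))` on a local copy should give a quick inventory of names, dtypes, and shapes without loading any weights into memory. The local path is an assumption about where the file was downloaded.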
Notes
- shared trimodal embedding space
- Matryoshka truncation: 1280 / 768 / 512 / 256 / 128
- intended for retrieval and embedding use, not generation
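As a rough illustration of how Matryoshka truncation is typically applied, the sketch below keeps the leading coordinates of each embedding and re-normalizes before cosine search. It assumes embeddings arrive as NumPy arrays of width 1280. Whether the model's outputs are already unit-normalized is not stated in this card, so the re-normalization step is an assumption, and `truncate_embeddings` and `top_k` are hypothetical helper names, not part of any released API.

```python
import numpy as np

MATRYOSHKA_DIMS = (1280, 768, 512, 256, 128)  # sizes listed in this card


def truncate_embeddings(emb, dim):
    """Keep the first `dim` coordinates and re-normalize each row to unit length."""
    if dim not in MATRYOSHKA_DIMS:
        raise ValueError(f"dim must be one of {MATRYOSHKA_DIMS}, got {dim}")
    emb = np.asarray(emb, dtype=np.float32)
    cut = emb[..., :dim]
    norms = np.linalg.norm(cut, axis=-1, keepdims=True)
    return cut / np.clip(norms, 1e-12, None)


def top_k(query, index, k=5):
    """Indices of the k rows in `index` most similar to `query` (unit-norm inputs)."""
    scores = index @ query
    return np.argsort(-scores)[:k]


# Placeholder vectors standing in for real image, audio, or text embeddings.
rng = np.random.default_rng(0)
index = truncate_embeddings(rng.normal(size=(100, 1280)), 256)
hits = top_k(index[7], index, k=3)
```

Because all three modalities share one space, the same `index` array could hold image, audio, and text vectors side by side, as long as queries and index entries are truncated to the same width.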
Historical Local Gate Baseline
The exact local gate baseline that was previously attached under the TE-86M release directory is restored here for continuity in the AIT-86M artifact line.
Attached JSON:
teacher_dual_mn20whisper_exact_gate_baseline_20260424T155324Z.json
Seeded split-excluded baseline at 1280d:
| Slice and metric | Value |
|---|---|
| Speech holdout A->T R@1 | 0.5652 |
| Speech holdout T->A R@1 | 0.5992 |
| Speech holdout avg R@1 | 0.5822 |
| WavCaps FSD A->T R@1 | 0.1078 |
| WavCaps FSD T->A R@1 | 0.1030 |
| WavCaps FSD avg R@1 | 0.1054 |
| SALT A->I R@1 | 0.1692 |
| SALT I->A R@1 | 0.1261 |
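For readers unfamiliar with the metric, the following is a minimal sketch of how R@1 is commonly computed for paired retrieval. It assumes a one-to-one pairing (row i of the queries matches row i of the candidates) and unit-normalized embeddings. The actual gate evaluation may define hits differently, for example when an audio clip has several captions, so treat this as an illustration and not the evaluation code behind the table.

```python
import numpy as np


def recall_at_1(queries, candidates):
    """Fraction of queries whose top-scoring candidate is the paired one.

    Assumes row i of `queries` is paired with row i of `candidates`
    and that rows are L2-normalized, so the dot product is cosine similarity.
    """
    sims = np.asarray(queries) @ np.asarray(candidates).T
    best = sims.argmax(axis=1)
    return float((best == np.arange(len(sims))).mean())


def bidirectional_r1(a, b):
    """Return (a->b R@1, b->a R@1, mean of the two directions)."""
    ab = recall_at_1(a, b)
    ba = recall_at_1(b, a)
    return ab, ba, (ab + ba) / 2
```

The "avg" rows in the table are consistent with a simple mean of the two directions: (0.5652 + 0.5992) / 2 = 0.5822 for the speech holdout and (0.1078 + 0.1030) / 2 = 0.1054 for WavCaps FSD.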
Scope note:
- These are the canonical local gate numbers used for bounded continuation and recovery experiments in this model family.
- They are not a claim of broad public benchmark superiority.
- They are restored here because the prior card revision dropped the attached evaluation summary.
Evaluation Scope
Published evaluations for this model family include targeted retrieval and anchor-style discrimination tasks. Those targeted evaluations are useful, but they are not a substitute for a published adversarial or out-of-distribution benchmark. Downstream runtime validation remains application-specific.
Files
| File | Purpose |
|---|---|
| AIT-86M.safetensors | Base trimodal checkpoint |
| teacher_dual_mn20whisper_exact_gate_baseline_20260424T155324Z.json | Restored canonical local gate baseline summary |
License
Apache 2.0