AIT-86M: Audio, Image, Text Embeddings (Depth-2)

AIT-86M maps image, audio, and text into one shared 1280-dim embedding space, so cross-modal retrieval needs only a single vector index. Embeddings support Matryoshka truncation down to 128 dims.

Built for edge deployment, with a single combined safetensors artifact.

Successor to TE-75M.

Also available in GGUF format for quantized edge deployment.

Why This Matters

The notable result for this family is preserving the shared semantic retrieval path while adding anchor-style decision behavior in downstream variants. In practice, the hard part is usually keeping the retrieval backbone flat while adding new decision surfaces on top of it.

File layout

AIT-86M.safetensors
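
A safetensors file starts with an 8-byte little-endian header length, followed by a JSON header describing each tensor, so the contents of the checkpoint can be listed without loading any weights. The sketch below reads only that header with the Python standard library. The helper names `read_safetensors_header` and `summarize` are hypothetical, and since this card does not document the tensor names inside AIT-86M.safetensors, none are assumed.

```python
import json
import struct


def read_safetensors_header(path):
    """Return the JSON header of a safetensors file as a dict.

    Hypothetical helper for illustration; not part of any AIT-86M tooling.
    """
    with open(path, "rb") as f:
        # First 8 bytes: unsigned little-endian 64-bit header size.
        (header_len,) = struct.unpack("<Q", f.read(8))
        # Next header_len bytes: UTF-8 JSON mapping tensor name -> metadata.
        return json.loads(f.read(header_len).decode("utf-8"))


def summarize(header):
    """Return sorted (name, dtype, shape) tuples, skipping the metadata entry."""
    return sorted(
        (name, info["dtype"], tuple(info["shape"]))
        for name, info in header.items()
        if name != "__metadata__"
    )
```

Calling `summarize(read_safetensors_header("AIT-86M.safetensors"))` on a local copy of the file would list each stored tensor with its dtype and shape.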

Notes

  • shared trimodal embedding space
  • Matryoshka truncation: 1280 / 768 / 512 / 256 / 128
  • intended for retrieval and embedding use, not generation
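
Matryoshka truncation is commonly applied by keeping the leading components of the full vector and re-normalizing to unit length before comparing. The following is a minimal sketch under two assumptions that this card does not state: that each supported size is a prefix of the 1280-dim vector, and that embeddings are compared by cosine similarity. `truncate_embedding` and `cosine` are hypothetical helper names.

```python
import math

SUPPORTED_DIMS = (1280, 768, 512, 256, 128)  # sizes listed on this card


def truncate_embedding(vec, dim):
    """Keep the first `dim` components and L2-normalize the result."""
    if dim not in SUPPORTED_DIMS:
        raise ValueError(f"dim must be one of {SUPPORTED_DIMS}, got {dim}")
    if len(vec) < dim:
        raise ValueError("embedding is shorter than the requested dimension")
    head = list(vec[:dim])
    norm = math.sqrt(sum(x * x for x in head))
    if norm == 0.0:
        return head  # leave an all-zero vector unchanged
    return [x / norm for x in head]


def cosine(a, b):
    """Dot product; equals cosine similarity for unit-length inputs."""
    return sum(x * y for x, y in zip(a, b))
```

Queries and indexed items need to be truncated to the same dimension, otherwise their similarities are not comparable.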

Historical Local Gate Baseline

The exact local gate baseline that was previously attached under the TE-86M release directory is restored here for continuity in the AIT-86M artifact line.

Attached JSON:

  • teacher_dual_mn20whisper_exact_gate_baseline_20260424T155324Z.json

Seeded split-excluded baseline at 1280d:

Slice            Metric      Value
Speech holdout   A->T R@1    0.5652
Speech holdout   T->A R@1    0.5992
Speech holdout   avg R@1     0.5822
WavCaps FSD      A->T R@1    0.1078
WavCaps FSD      T->A R@1    0.1030
WavCaps FSD      avg R@1     0.1054
SALT             A->I R@1    0.1692
SALT             I->A R@1    0.1261
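
R@1 is the fraction of queries whose top-ranked candidate is the correct match, and each "avg" row is the mean of the two directions, for example (0.5652 + 0.5992) / 2 = 0.5822. The sketch below shows one common way to compute it from a query-by-candidate similarity matrix, assuming that query i's correct match is candidate i. The evaluation script and tie handling actually used for this baseline are not described in this card, so the function names and the tie rule here are illustrative only.

```python
def recall_at_1(sim):
    """Fraction of rows whose highest score lies on the diagonal.

    `sim[i][j]` is the similarity of query i to candidate j. Query i's
    correct match is assumed to be candidate i (an assumption of this
    sketch). Ties are resolved in favor of the lowest candidate index.
    """
    if not sim:
        raise ValueError("similarity matrix is empty")
    hits = 0
    for i, row in enumerate(sim):
        best = max(range(len(row)), key=row.__getitem__)
        if best == i:
            hits += 1
    return hits / len(sim)


def transpose(sim):
    """Swap query and candidate roles, e.g. A->T scores into T->A scores."""
    return [list(col) for col in zip(*sim)]


def bidirectional_r1(sim):
    """Return (forward R@1, reverse R@1, mean of the two)."""
    fwd = recall_at_1(sim)
    rev = recall_at_1(transpose(sim))
    return fwd, rev, (fwd + rev) / 2
```

With an audio-by-text similarity matrix, the three returned values would correspond to the A->T, T->A, and avg rows of a slice.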

Scope note:

  • These are the canonical local gate numbers used for bounded continuation and recovery experiments in this model family.
  • They are not a claim of broad public benchmark superiority.
  • They are restored here because the prior card revision dropped the attached evaluation summary.

Evaluation Scope

Published evaluations for this model family include targeted retrieval and anchor-style discrimination tasks. Those targeted evaluations are useful, but they are not a substitute for a published adversarial or out-of-distribution benchmark. Downstream runtime validation remains application-specific.

Files

File                                                                  Purpose
AIT-86M.safetensors                                                   Base trimodal checkpoint
teacher_dual_mn20whisper_exact_gate_baseline_20260424T155324Z.json    Restored canonical local gate baseline summary

License

Apache 2.0
