⚠️ DEPRECATED — Experimental Model

This model is an early experimental release from the kniv cascade research program and is no longer maintained. It predates the current 5-head cascade architecture (POS, NER, DEP, SRL, CLS) and the bottom-up layer-selective training methodology that produces our current production teacher.

Use the current production model instead: dragonscale-ai/kniv-deberta-nlp-base-en-large

The current model offers significantly better quality across all tasks, includes Semantic Role Labeling and Dialog Act Classification heads, and has reproducible benchmarks against standard public test sets.

This repository is preserved for reproducibility and historical reference. No further updates, bug fixes, or evaluation runs are planned.

kniv-deberta-v3-large-nlp-en

Multi-task NLP teacher model for English: NER + POS tagging + dependency parsing + dialog act classification in a single forward pass.

Part of the kniv-nlp-models project, powering the uniko cognitive memory system.

Model Details


Base model	microsoft/deberta-v3-large
Parameters	435M (24 layers, 1024 hidden)
Max sequence length	128 tokens
Format	PyTorch + ONNX (FP32 + INT8)
Training data	kniv-corpus-en (gold-filtered) + UD English EWT
License	Apache-2.0
Use	Server-side NLP; teacher for knowledge distillation

Results

Head	Task	Metric	Score
NER	Named entity recognition (18 types)	F1	0.725
POS	Part-of-speech tagging (17 UPOS)	Accuracy	0.984
DEP	Dependency parsing (dep2label)	UAS	0.871
CLS	Dialog act classification (9 labels)	Macro F1	0.493
		Composite	0.823

Architecture

Shared DeBERTa-v3-large encoder with four linear heads. One forward pass, four outputs.

DeBERTa-v3-large encoder (435M params, 24 layers, 1024 hidden)
  +-- NER head: Linear(1024, 37)     -- per-token BIO entity tags
  +-- POS head: Linear(1024, 17)     -- per-token UPOS tags
  +-- Dep head: Linear(1024, 1411)   -- per-token dep2label tags
  +-- CLS head: Linear(1024, 9)      -- per-sequence dialog act

NER Entity Types (18)

PERSON, ORG, GPE, LOC, DATE, TIME, MONEY, PERCENT, QUANTITY, ORDINAL, CARDINAL, NORP, FAC, PRODUCT, EVENT, WORK_OF_ART, LAW, LANGUAGE
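The 37 outputs of the NER head are consistent with a plain BIO scheme over these 18 types: one B- and one I- tag per type, plus a single O tag. The sketch below rebuilds such a label set to show the arithmetic. The function name and the label ordering are illustrative assumptions; the actual index-to-label mapping is whatever label_maps.json contains.

```python
# The 18 entity types listed above.
ENTITY_TYPES = [
    "PERSON", "ORG", "GPE", "LOC", "DATE", "TIME", "MONEY", "PERCENT",
    "QUANTITY", "ORDINAL", "CARDINAL", "NORP", "FAC", "PRODUCT", "EVENT",
    "WORK_OF_ART", "LAW", "LANGUAGE",
]


def build_bio_labels(entity_types):
    """Return ["O", "B-X", "I-X", ...] for the given entity types.

    The ordering is an assumption for illustration only; the real mapping
    is stored in label_maps.json.
    """
    labels = ["O"]
    for etype in entity_types:
        labels.append(f"B-{etype}")
        labels.append(f"I-{etype}")
    return labels


bio_labels = build_bio_labels(ENTITY_TYPES)
print(len(bio_labels))  # 2 * 18 + 1 = 37, matching Linear(1024, 37)
```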

CLS Dialog Act Labels (9)

inform, correction, agreement, question, plan_commit, request, feedback, social, filler
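As a rough illustration of how the 9-way CLS output could be turned into a label, the sketch below applies a softmax and picks the most probable class. The label order, the helper names, and the example logits are assumptions made for this example; in practice the order should come from label_maps.json.

```python
import math

# Label order is assumed for illustration; read the real order from label_maps.json.
CLS_LABELS = [
    "inform", "correction", "agreement", "question", "plan_commit",
    "request", "feedback", "social", "filler",
]


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def decode_dialog_act(logits, labels=CLS_LABELS):
    """Return (label, probability) for the highest-scoring class."""
    if len(logits) != len(labels):
        raise ValueError("expected one logit per label")
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return labels[best], probs[best]


# Made-up logits, not real model output.
example_logits = [0.1, -1.2, 0.3, 2.5, -0.4, 0.9, -0.7, 0.0, -2.0]
label, prob = decode_dialog_act(example_logits)
print(label)  # "question" under the assumed label order
```

Given the reported macro F1 of 0.493, the returned probability may be worth thresholding before acting on a predicted label.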

dep2label Encoding

Dependencies encoded as token labels using rel-pos (Strzyz et al., 2019):

+1@nsubj@VERB   ->  "1st VERB to the right, relation=nsubj"
-2@det@NOUN     ->  "2nd NOUN to the left, relation=det"
 0@root@ROOT    ->  "root of the sentence"
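To recover head indices from these labels, a decoder can walk left or right from each word and count words whose UPOS tag matches the one named in the label. The following is a minimal sketch of that idea, not the project's own decoder. It assumes the labels and UPOS tags are already aligned one per word (for example by keeping the first subword of each word), and the function name is hypothetical.

```python
def decode_rel_pos(labels, upos):
    """Turn rel-pos labels into (head, relation) pairs.

    labels: one "offset@relation@head_pos" string per word
    upos:   one UPOS tag per word (same length as labels)
    Heads are 1-based word positions; 0 means the root.
    A head of None marks a label that cannot be resolved.
    """
    decoded = []
    for i, label in enumerate(labels):
        offset_str, relation, head_pos = label.split("@")
        offset = int(offset_str)  # int() accepts "+1", "-2" and "0"
        if head_pos == "ROOT" or offset == 0:
            decoded.append((0, relation))
            continue
        step = 1 if offset > 0 else -1
        remaining = abs(offset)
        head = None
        j = i + step
        # Count matching UPOS tags in the indicated direction.
        while 0 <= j < len(upos):
            if upos[j] == head_pos:
                remaining -= 1
                if remaining == 0:
                    head = j + 1
                    break
            j += step
        decoded.append((head, relation))
    return decoded


# Hand-written example for "She reads books", not model output.
heads = decode_rel_pos(
    ["+1@nsubj@VERB", "0@root@ROOT", "-1@obj@VERB"],
    ["PRON", "VERB", "NOUN"],
)
print(heads)  # [(2, 'nsubj'), (0, 'root'), (2, 'obj')]
```

Because the labels are predicted independently per token, the decoded heads are not guaranteed to form a valid tree; unresolved labels surface as None here rather than being repaired.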

Training

Trained on gold-filtered kniv-corpus-en plus UD English EWT:

NER: 45,000 examples (gold-filtered, domain-balanced from 237K)
POS + DEP: 12,544 examples (UD English EWT v2.14, expert-annotated)
CLS: 57,544 examples (NER + UD combined, GPT-5.4-nano classified)

Parameter	Value
Batch size	64
Learning rate	1e-5
Epochs	5
Precision	fp32 (gradient checkpointing)
Warmup	10%
Loss weights	NER: 1.0, POS: 1.0, Dep: 1.0, CLS: 0.5
Hardware	NVIDIA A100 40GB
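The loss weights in the table suggest a weighted sum of the four per-task losses. The sketch below shows that combination under the assumption that the weights are applied as simple scalar multipliers; the source does not spell out the exact rule, and the function name and example values are hypothetical.

```python
# Weights copied from the table above.
LOSS_WEIGHTS = {"ner": 1.0, "pos": 1.0, "dep": 1.0, "cls": 0.5}


def combine_losses(task_losses, weights=LOSS_WEIGHTS):
    """Weighted sum of per-task losses (assumed combination rule).

    Works with plain floats, and with tensor types that support
    scalar multiplication and addition.
    """
    missing = set(weights) - set(task_losses)
    if missing:
        raise KeyError(f"missing task losses: {sorted(missing)}")
    return sum(weights[task] * task_losses[task] for task in weights)


# Made-up per-task loss values for one batch.
example_losses = {"ner": 0.40, "pos": 0.10, "dep": 0.80, "cls": 1.20}
total_loss = combine_losses(example_losses)
print(round(total_loss, 2))
```

With these weights, the CLS head contributes half as much gradient signal per unit of loss as the other three heads.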

Usage

Python (ONNX Runtime)

import json

import numpy as np
import onnxruntime as ort
from transformers import AutoTokenizer

# Load the quantized model, the tokenizer shipped with this repo, and the label maps
session = ort.InferenceSession("model-int8.onnx")
tokenizer = AutoTokenizer.from_pretrained(".")
with open("label_maps.json") as f:
    labels = json.load(f)

# Tokenize: pad to the 128-token window and truncate anything longer
text = "Caroline went to the hospital in New York."
enc = tokenizer(text, return_tensors="np", padding="max_length",
                truncation=True, max_length=128)

# Inference (single forward pass -> 4 outputs)
# Cast to int64, the usual dtype for ONNX token inputs
outputs = session.run(None, {
    "input_ids": enc["input_ids"].astype(np.int64),
    "attention_mask": enc["attention_mask"].astype(np.int64),
})
ner_logits, pos_logits, dep_logits, cls_logits = outputs

# Decode NER, skipping padding positions
tokens = tokenizer.convert_ids_to_tokens(enc["input_ids"][0])
ner_preds = [labels["ner_labels"][i] for i in ner_logits[0].argmax(axis=-1)]
for tok, ner, keep in zip(tokens, ner_preds, enc["attention_mask"][0]):
    if keep and ner != "O":
        print(f"  {tok}: {ner}")
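The loop above prints one tag per subword token. To get whole entities, consecutive B-/I- tags can be merged into spans. This is a small sketch of one way to do that; it assumes SentencePiece-style pieces where "▁" marks the start of a word, and the helper name and the example tokens and tags are hand-written rather than real model output.

```python
SPECIAL_TOKENS = {"[CLS]", "[SEP]", "[PAD]"}


def group_entities(tokens, tags):
    """Merge per-token BIO tags into (entity_type, text) spans."""
    spans = []
    current_type, pieces = None, []

    def flush():
        # Close the span being built, if any.
        nonlocal current_type, pieces
        if current_type is not None:
            text = "".join(pieces).replace("\u2581", " ").strip()
            spans.append((current_type, text))
        current_type, pieces = None, []

    for tok, tag in zip(tokens, tags):
        if tok in SPECIAL_TOKENS or tag == "O":
            flush()
            continue
        prefix, _, etype = tag.partition("-")
        # A B- tag, or a change of entity type, starts a new span.
        if prefix == "B" or etype != current_type:
            flush()
            current_type = etype
        pieces.append(tok)
    flush()
    return spans


# Hand-written tokens and tags for illustration.
tokens = ["[CLS]", "\u2581Caroline", "\u2581went", "\u2581to", "\u2581the",
          "\u2581hospital", "\u2581in", "\u2581New", "\u2581York", ".", "[SEP]", "[PAD]"]
tags = ["O", "B-PERSON", "O", "O", "O", "O", "O", "B-GPE", "I-GPE", "O", "O", "O"]
entities = group_entities(tokens, tags)
print(entities)  # [('PERSON', 'Caroline'), ('GPE', 'New York')]
```

An I- tag that appears without a preceding B- tag is treated as the start of a new span, which is a lenient choice; a stricter decoder might discard it instead.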

Rust (ONNX Runtime)

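use ort::{Session, Value};
use ndarray::Array2;
use tokenizers::Tokenizer;

let session = Session::builder()?
    .with_optimization_level(ort::GraphOptimizationLevel::Level3)?
    .commit_from_file("model-int8.onnx")?;

let tokenizer = Tokenizer::from_file("tokenizer.json")?;
let encoding = tokenizer.encode("Caroline went to the hospital.", true)?;

// The tokenizer returns u32 values at the sentence's natural length, while the
// (1, 128) tensors below need exactly 128 i64 values. Widen, then truncate or
// zero-pad (0 is assumed to be the [PAD] id).
let to_input = |xs: &[u32]| -> Vec<i64> {
    let mut v: Vec<i64> = xs.iter().take(128).map(|&x| i64::from(x)).collect();
    v.resize(128, 0);
    v
};

let outputs = session.run(ort::inputs![
    Array2::from_shape_vec((1, 128), to_input(encoding.get_ids()))?,
    Array2::from_shape_vec((1, 128), to_input(encoding.get_attention_mask()))?,
]?)?;

// outputs: ner_logits, pos_logits, dep_logits, cls_logits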

Files

File	Size	Description
`model.onnx`	1,663 MB	FP32 ONNX model
`model-int8.onnx`	612 MB	INT8 quantized (dynamic)
`model.pt`	1,670 MB	PyTorch weights
`label_maps.json`	<1 MB	NER/POS/DEP/CLS label vocabularies
`tokenizer.json`	8 MB	DeBERTa-v3 tokenizer

Important: Use This Model's Tokenizer

Always load the tokenizer from this repo, not from microsoft/deberta-v3-large. The upstream Hugging Face tokenizer may omit the BOS/EOS special tokens, which shifts every token position and produces incorrect results.

# Correct
tokenizer = AutoTokenizer.from_pretrained("dragonscale-ai/kniv-deberta-v3-large-nlp-en")

# WRONG — may omit special tokens
tokenizer = AutoTokenizer.from_pretrained("microsoft/deberta-v3-large")
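A quick way to guard against this is to check that an encoded sentence actually starts and ends with the special tokens. The helper below is a hypothetical sketch: it assumes the BOS/EOS tokens mentioned above correspond to the tokenizer's `cls_token_id` and `sep_token_id` attributes, which is how DeBERTa-style tokenizers in transformers usually expose them.

```python
def adds_special_tokens(tokenizer, text="Caroline went to the hospital."):
    """Return True if encoding `text` starts with [CLS] and ends with [SEP].

    `tokenizer` is expected to behave like a transformers tokenizer:
    callable on a string, returning a mapping with "input_ids", and
    exposing `cls_token_id` and `sep_token_id` (an assumption about
    which attributes carry the BOS/EOS ids).
    """
    ids = list(tokenizer(text)["input_ids"])
    return (
        len(ids) >= 2
        and ids[0] == tokenizer.cls_token_id
        and ids[-1] == tokenizer.sep_token_id
    )
```

Calling `adds_special_tokens(tokenizer)` right after loading and raising an error when it returns False would catch a misconfigured tokenizer before any predictions are made.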

Limitations

English only
Max 128 tokens: longer inputs are truncated
CLS labels are GPT-classified, not human-annotated; the low macro F1 (0.493) reflects rare, imbalanced labels
Server-side model: 435M params, not suited to edge or mobile. Use the distilled student for that.

Source

Code: rustic-ai/kniv-nlp-models
Dataset: dragonscale-ai/kniv-corpus-en

Citation

@misc{kniv-deberta-v3-large-2026,
  title={kniv-deberta-v3-large-nlp-en: Multi-task NLP Teacher Model},
  author={Dragonscale Industries Inc.},
  year={2026},
  url={https://huggingface.co/dragonscale-ai/kniv-deberta-v3-large-nlp-en}
}
