--- library_name: mlx-lm tags: - gemma - function-calling - drones - mavlink - px4 - edge-ai - robotics - lora language: - en base_model: - unsloth/functiongemma-270m-it --- # FunctionGemma 270M Drone Router — NAVLINK Round 2 (Reproduced Checkpoint) This repository contains the **reproduced round-2 LoRA adapter checkpoint** for NAVLINK in: - **Model repo:** `llama-farm/functiongemma-270m-drone-router-navlink-v2` - **Dataset snapshot:** `llama-farm/drone-router-dataset-navlink-v2` - **Verified reproduced eval:** **95.0%** exact tool-call accuracy (**475/500**) with **0 safety violations** - **Original canonical best (overwritten):** 95.2% (476/500), 0 safety violations This upload is the reproduced round-2 checkpoint rebuilt from the restored 4,307-example dataset after the original adapter weights were overwritten. ## What the model expects The model is a compact command router for PX4/MAVLink drone control. It expects a normalized two-line textual interface: ```text [STATE] [CMD] ``` It returns: ```text drone_tool(param="value") Short operator-facing feedback. ``` ### Heartbeat exception For heartbeat-only notifications, the model may emit a **single line only**: ```text drone_notify(message="heartbeat") ``` ## Exact system prompt ```text You are a drone command router. Given [STATE] telemetry and a [CMD] trigger, output exactly one tool call on line 1, and a short operator confirmation on line 2. Safety rules (override everything): - battery < 15%: drone_flight(action="land") - battery < 25%: drone_flight(action="rth") - gps lost: drone_flight(action="stop") Routing hints: - Target not visible + "find X": drone_scan(lock_on_class="X") - Target visible + "follow/track": drone_track(action="start") - Multiple waypoints: drone_mission, single GPS point: drone_goto - Compound commands: first step only Heartbeats: drone_notify(message="heartbeat") — no line 2. ``` ## Exact `[STATE] + [CMD]` interface ### Base format ```text [STATE] phase=AIRBORNE alt=15.2m bat=72% hdg=270° spd=3.1m/s pos=32.851,-97.125 | vision=SCANNING target=person wp=2/5 [CMD] ``` ### With active detection context ```text [STATE] phase=AIRBORNE alt=15.0m bat=68% hdg=270° spd=0.0m/s pos=32.852,-97.124 | vision=SCANNING target=person wp=0/0 | det=person conf=0.87 bearing=45° range=15m at=32.853,-97.121 [CMD] person detected at bearing 45°, confidence 87% ``` ## Telemetry field format ### Core telemetry fields - `phase` — uppercase flight phase such as `GROUNDED`, `AIRBORNE`, `RTL` - `alt` — altitude above ground, formatted like `15.2m` - `bat` — battery percentage, formatted like `72%` - `hdg` — heading in degrees, formatted like `270°` - `spd` — horizontal speed, formatted like `3.1m/s` - `pos` — GPS position as `lat,lon` ### Vision fields - `vision` — one of `IDLE`, `SCANNING`, `TRACKING`, `LOST` - `target` — `none`, `person`, `vehicle`, `animal`, or `any` when appropriate - `wp` — waypoint progress as `N/M` (or `0/0` if not in a waypoint mission) ### Optional detection fields - `det` — detected class - `conf` — confidence score, typically 2 decimals (example: `0.87`) - `bearing` — relative bearing in degrees (example: `45°`) - `range` — estimated range in meters (example: `15m`) - `at` — estimated target GPS as `lat,lon` ## Vision trigger formatting Vision events should be normalized into the exact same `[STATE]` + `[CMD]` interface. Examples: ```text [CMD] person detected at bearing 45°, confidence 87% [CMD] possible vehicle, low confidence 38% [CMD] lost tracking on person [CMD] search complete, 2 targets found [CMD] search complete, no targets ``` ## Map / UI / chat normalization guidance All upstream inputs should be reduced into the same plain-text trigger format before inference: - **Chat / radio / typed pilot commands** → keep the command terse and literal in `[CMD]` - **Ground-station map clicks** → convert coordinates directly into the command text, e.g. `[CMD] go to 32.853, -97.121 and scan` - **UI button actions** → normalize to the same canonical wording used in training where possible - **Vision triggers** → include fused detection context in `[STATE]` and the event sentence in `[CMD]` - **Safety triggers** → emit explicit command text such as `[CMD] battery low at 22%` or `[CMD] gps signal lost` The model performs best when all surfaces—map, UI, chat, and vision—are normalized into the same textual contract instead of using separate APIs. ## Output contract ### Standard case Line 1 must be **exactly one executable tool call**. Line 2 should be a **short operator-facing confirmation**. ```text drone_tool(param="value") Short operator-facing confirmation. ``` ### Heartbeat one-line exception ```text drone_notify(message="heartbeat") ``` ### Evaluation contract - **Line 1** is the hard interface and is scored by exact string match. - **Line 2** is helpful but non-executable operator feedback. ## Safety rules The model is trained with strict safety overrides: - `battery < 15%` → `drone_flight(action="land")` - `battery < 25%` → `drone_flight(action="rth")` - `gps lost` → `drone_flight(action="stop")` These rules take precedence over normal mission continuation, tracking, or convenience behavior. ## Tool surface The router targets 9 tools: - `drone_flight(action)` - `drone_move(direction, distance_m)` - `drone_yaw(degrees|heading)` - `drone_goto(lat, lon, alt_m, speed_mps, on_arrival)` - `drone_look(action)` - `drone_scan(lock_on_class)` - `drone_track(action, target_class)` - `drone_mission(action, target, pattern, waypoints)` - `drone_notify(message)` ## Known limitation The main remaining weakness in this reproduced checkpoint is **exact waypoint copying** for multi-waypoint mission prompts. The model can still normalize or hallucinate waypoint lists when asked to patrol/search explicit coordinates in free text. Operational guidance: - prefer structured UI/map coordinate injection when possible - validate generated waypoint strings before execution - treat multi-waypoint free-text prompts as lower-confidence than single-point `drone_goto(...)` routing ## Files in this repo - `adapter_config.json` - `adapters.safetensors` - intermediate checkpoint adapters from the reproduced MLX run - `eval_results.json` with the verified reproduced scorecard ## Verified reproduced result ```text accuracy = 0.950 correct = 475 total = 500 safety_violations = 0 ``` **Note:** The reproduced run scored 475/500 (95.0%) vs. the original 476/500 (95.2%). The 1-sample difference is within normal training variance. The same failure patterns persist: waypoint exact-copying in multi-point missions and occasional repetition in notify messages. See `eval_results.json` for the full per-tool breakdown and failure details.