jadechoghari committed on
Commit f99b895 · verified · 1 Parent(s): b56d216

Migrate policy to PolicyProcessorPipeline system


**Automated Policy Migration to PolicyProcessorPipeline**

This PR migrates your model to the new LeRobot policy format using the modern PolicyProcessorPipeline architecture.

## What Changed

### **New Architecture - PolicyProcessorPipeline**
Your model now uses external PolicyProcessorPipeline components for data processing instead of built-in normalization layers. This provides:
- **Modularity**: Separate preprocessing and postprocessing pipelines
- **Flexibility**: Easy to swap, configure, and debug processing steps
- **Compatibility**: Works with the latest LeRobot ecosystem
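The pipeline idea behind this design can be illustrated with a minimal, hypothetical sketch in plain Python (this is not the real `PolicyProcessorPipeline` API; the step names here are made up): a processor pipeline is an ordered list of steps, each a callable that transforms a batch, which makes individual steps easy to swap or inspect.

```python
# Minimal illustration of the processor-pipeline concept (hypothetical,
# not the actual LeRobot API): each step is a callable that takes a
# batch dict and returns a transformed batch dict.

def normalize_state(batch):
    # Example step: scale the state vector by a fixed factor.
    return {**batch, "observation.state": [x / 10.0 for x in batch["observation.state"]]}

def add_batch_dim(batch):
    # Example step: wrap each value in a list to emulate a batch dimension.
    return {k: [v] for k, v in batch.items()}

class SimplePipeline:
    """Ordered list of processing steps, applied in sequence."""

    def __init__(self, steps):
        self.steps = steps

    def __call__(self, batch):
        for step in self.steps:
            batch = step(batch)
        return batch

preprocessor = SimplePipeline([normalize_state, add_batch_dim])
out = preprocessor({"observation.state": [10.0, 20.0]})
```

Because the steps live in a plain list, swapping, reordering, or debugging a single step does not require touching the model itself.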

### **Normalization Extraction**
We've extracted normalization statistics from your model's state_dict and removed the built-in normalization layers:
- **Extracted patterns**: `normalize_inputs.*`, `unnormalize_outputs.*`, `normalize.*`, `unnormalize.*`, `input_normalizer.*`, `output_normalizer.*`
- **Statistics preserved**: Mean, std, min, max values for all features
- **Clean model**: State dict now contains only core model weights
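The extraction step can be sketched as a key-prefix split over the state dict. This is an illustrative simplification (the prefix names mirror the patterns listed above; the actual migration script may differ):

```python
# Illustrative sketch of normalization extraction: split a state_dict
# into normalization statistics and core model weights by key prefix.
# Prefixes mirror the extracted patterns listed above.
NORM_PREFIXES = (
    "normalize_inputs.", "unnormalize_outputs.",
    "normalize.", "unnormalize.",
    "input_normalizer.", "output_normalizer.",
)

def split_state_dict(state_dict):
    norm_stats, model_weights = {}, {}
    for key, value in state_dict.items():
        target = norm_stats if key.startswith(NORM_PREFIXES) else model_weights
        target[key] = value
    return norm_stats, model_weights

# Example with dummy tensors represented as lists:
stats, weights = split_state_dict({
    "normalize_inputs.observation.state.mean": [0.0] * 8,
    "backbone.layer1.weight": [1.0, 2.0],
})
```

The statistics dict is what gets serialized into the processor configs, while the weights dict becomes the clean `model.safetensors`.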

### **Files Added**
- **policy_preprocessor.json**: Configuration for the input preprocessing pipeline
- **policy_postprocessor.json**: Configuration for the output postprocessing pipeline
- **model.safetensors**: Clean model weights without normalization layers
- **config.json**: Updated model configuration
- **train_config.json**: Training configuration
- **README.md**: Updated model card with migration information

### **Benefits**
- **Backward Compatible**: Your model behavior remains identical
- **Future Ready**: Compatible with latest LeRobot features and updates
- **Debuggable**: Easy to inspect and modify processing steps
- **Portable**: Processors can be shared and reused across models

### **Usage**
```python
# Load your migrated model
from lerobot.policies import get_policy_class
from lerobot.processor import PolicyProcessorPipeline

# The preprocessor and postprocessor are now external
preprocessor = PolicyProcessorPipeline.from_pretrained("your-model-repo", config_filename="policy_preprocessor.json")
postprocessor = PolicyProcessorPipeline.from_pretrained("your-model-repo", config_filename="policy_postprocessor.json")
policy = get_policy_class("your-policy-type").from_pretrained("your-model-repo")

# Process data through the pipeline
processed_batch = preprocessor(raw_batch)
action = policy.select_action(processed_batch)
final_action = postprocessor(action)
```

*Generated automatically by the LeRobot policy migration script*

README.md ADDED
@@ -0,0 +1,62 @@
+ ---
+ datasets: unknown
+ library_name: lerobot
+ license: apache-2.0
+ model_name: xvla
+ pipeline_tag: robotics
+ tags:
+ - xvla
+ - lerobot
+ - robotics
+ ---
+
+ # Model Card for xvla
+
+ <!-- Provide a quick summary of what the model is/does. -->
+
+
+ _Model type not recognized — please update this template._
+
+
+ This policy has been trained and pushed to the Hub using [LeRobot](https://github.com/huggingface/lerobot).
+ See the full documentation at [LeRobot Docs](https://huggingface.co/docs/lerobot/index).
+
+ ---
+
+ ## How to Get Started with the Model
+
+ For a complete walkthrough, see the [training guide](https://huggingface.co/docs/lerobot/il_robots#train-a-policy).
+ Below is the short version on how to train and run inference/eval:
+
+ ### Train from scratch
+
+ ```bash
+ lerobot-train \
+ --dataset.repo_id=${HF_USER}/<dataset> \
+ --policy.type=xvla \
+ --output_dir=outputs/train/<desired_policy_repo_id> \
+ --job_name=lerobot_training \
+ --policy.device=cuda \
+ --policy.repo_id=${HF_USER}/<desired_policy_repo_id> \
+ --wandb.enable=true
+ ```
+
+ _Writes checkpoints to `outputs/train/<desired_policy_repo_id>/checkpoints/`._
+
+ ### Evaluate the policy/run inference
+
+ ```bash
+ lerobot-record \
+ --robot.type=so100_follower \
+ --dataset.repo_id=<hf_user>/eval_<dataset> \
+ --policy.path=<hf_user>/<desired_policy_repo_id> \
+ --episodes=10
+ ```
+
+ Prefix the dataset repo with **eval\_** and supply `--policy.path` pointing to a local or hub checkpoint.
+
+ ---
+
+ ## Model Details
+
+ - **License:** apache-2.0
config.json ADDED
@@ -0,0 +1,302 @@
+ {
+ "type": "xvla",
+ "n_obs_steps": 1,
+ "input_features": {
+ "observation.images.image": {
+ "type": "VISUAL",
+ "shape": [
+ 3,
+ 256,
+ 256
+ ]
+ },
+ "observation.images.image2": {
+ "type": "VISUAL",
+ "shape": [
+ 3,
+ 256,
+ 256
+ ]
+ },
+ "observation.state": {
+ "type": "STATE",
+ "shape": [
+ 8
+ ]
+ }
+ },
+ "output_features": {
+ "action": {
+ "type": "ACTION",
+ "shape": [
+ 20
+ ]
+ }
+ },
+ "device": "cuda",
+ "use_amp": false,
+ "push_to_hub": true,
+ "repo_id": "jadechoghari/X-VLA-Libero",
+ "private": null,
+ "tags": null,
+ "license": null,
+ "pretrained_path": null,
+ "chunk_size": 32,
+ "n_action_steps": 32,
+ "num_actions": 32,
+ "normalization_mapping": {
+ "VISUAL": "MEAN_STD",
+ "STATE": "IDENTITY",
+ "ACTION": "IDENTITY"
+ },
+ "florence_config": {},
+ "tokenizer_name": "facebook/bart-large",
+ "tokenizer_max_length": 64,
+ "tokenizer_padding_side": "right",
+ "pad_language_to": "max_length",
+ "hidden_size": 1024,
+ "depth": 24,
+ "num_heads": 16,
+ "mlp_ratio": 4.0,
+ "num_domains": 30,
+ "len_soft_prompts": 32,
+ "dim_time": 32,
+ "max_len_seq": 512,
+ "use_hetero_proj": false,
+ "action_mode": "ee6d",
+ "num_denoising_steps": 10,
+ "use_proprio": true,
+ "max_state_dim": 20,
+ "domain_feature_key": null,
+ "resize_imgs_with_padding": [
+ 518,
+ 518
+ ],
+ "num_image_views": 2,
+ "empty_cameras": 0,
+ "optimizer_lr": 0.0001,
+ "optimizer_betas": [
+ 0.9,
+ 0.95
+ ],
+ "optimizer_eps": 1e-08,
+ "optimizer_weight_decay": 0.0001,
+ "optimizer_grad_clip_norm": 10.0,
+ "scheduler_warmup_steps": 1000,
+ "scheduler_decay_steps": 30000,
+ "scheduler_decay_lr": 2.5e-06,
+ "vision_config": {
+ "_attn_implementation_autoset": false,
+ "_name_or_path": "",
+ "add_cross_attention": false,
+ "architectures": null,
+ "bad_words_ids": null,
+ "begin_suppress_tokens": null,
+ "bos_token_id": null,
+ "chunk_size_feed_forward": 0,
+ "cross_attention_hidden_size": null,
+ "decoder_start_token_id": null,
+ "depths": [
+ 1,
+ 1,
+ 9,
+ 1
+ ],
+ "dim_embed": [
+ 256,
+ 512,
+ 1024,
+ 2048
+ ],
+ "diversity_penalty": 0.0,
+ "do_sample": false,
+ "drop_path_rate": 0.1,
+ "early_stopping": false,
+ "enable_checkpoint": false,
+ "encoder_no_repeat_ngram_size": 0,
+ "eos_token_id": null,
+ "exponential_decay_length_penalty": null,
+ "finetuning_task": null,
+ "forced_bos_token_id": null,
+ "forced_eos_token_id": null,
+ "id2label": {
+ "0": "LABEL_0",
+ "1": "LABEL_1"
+ },
+ "image_feature_source": [
+ "spatial_avg_pool",
+ "temporal_avg_pool"
+ ],
+ "image_pos_embed": {
+ "max_pos_embeddings": 50,
+ "type": "learned_abs_2d"
+ },
+ "is_decoder": false,
+ "is_encoder_decoder": false,
+ "label2id": {
+ "LABEL_0": 0,
+ "LABEL_1": 1
+ },
+ "length_penalty": 1.0,
+ "max_length": 20,
+ "min_length": 0,
+ "model_type": "davit",
+ "no_repeat_ngram_size": 0,
+ "num_beam_groups": 1,
+ "num_beams": 1,
+ "num_groups": [
+ 8,
+ 16,
+ 32,
+ 64
+ ],
+ "num_heads": [
+ 8,
+ 16,
+ 32,
+ 64
+ ],
+ "num_return_sequences": 1,
+ "output_attentions": false,
+ "output_hidden_states": false,
+ "output_scores": false,
+ "pad_token_id": null,
+ "patch_padding": [
+ 3,
+ 1,
+ 1,
+ 1
+ ],
+ "patch_prenorm": [
+ false,
+ true,
+ true,
+ true
+ ],
+ "patch_size": [
+ 7,
+ 3,
+ 3,
+ 3
+ ],
+ "patch_stride": [
+ 4,
+ 2,
+ 2,
+ 2
+ ],
+ "prefix": null,
+ "problem_type": null,
+ "projection_dim": 1024,
+ "pruned_heads": {},
+ "remove_invalid_values": false,
+ "repetition_penalty": 1.0,
+ "return_dict": true,
+ "return_dict_in_generate": false,
+ "sep_token_id": null,
+ "suppress_tokens": null,
+ "task_specific_params": null,
+ "temperature": 1.0,
+ "tf_legacy_loss": false,
+ "tie_encoder_decoder": false,
+ "tie_word_embeddings": true,
+ "tokenizer_class": null,
+ "top_k": 50,
+ "top_p": 1.0,
+ "torch_dtype": null,
+ "torchscript": false,
+ "typical_p": 1.0,
+ "use_bfloat16": false,
+ "visual_temporal_embedding": {
+ "max_temporal_embeddings": 100,
+ "type": "COSINE"
+ },
+ "window_size": 12
+ },
+ "text_config": {
+ "_attn_implementation_autoset": true,
+ "_name_or_path": "",
+ "activation_dropout": 0.1,
+ "activation_function": "gelu",
+ "add_cross_attention": false,
+ "architectures": null,
+ "attention_dropout": 0.1,
+ "bad_words_ids": null,
+ "begin_suppress_tokens": null,
+ "bos_token_id": 0,
+ "chunk_size_feed_forward": 0,
+ "classifier_dropout": 0.0,
+ "cross_attention_hidden_size": null,
+ "d_model": 1024,
+ "decoder_attention_heads": 16,
+ "decoder_ffn_dim": 4096,
+ "decoder_layerdrop": 0.0,
+ "decoder_layers": 12,
+ "decoder_start_token_id": 2,
+ "diversity_penalty": 0.0,
+ "do_sample": false,
+ "dropout": 0.1,
+ "early_stopping": false,
+ "encoder_attention_heads": 16,
+ "encoder_ffn_dim": 4096,
+ "encoder_layerdrop": 0.0,
+ "encoder_layers": 12,
+ "encoder_no_repeat_ngram_size": 0,
+ "eos_token_id": 2,
+ "exponential_decay_length_penalty": null,
+ "finetuning_task": null,
+ "forced_bos_token_id": null,
+ "forced_eos_token_id": 2,
+ "id2label": {
+ "0": "LABEL_0",
+ "1": "LABEL_1",
+ "2": "LABEL_2"
+ },
+ "init_std": 0.02,
+ "is_decoder": false,
+ "is_encoder_decoder": true,
+ "label2id": {
+ "LABEL_0": 0,
+ "LABEL_1": 1,
+ "LABEL_2": 2
+ },
+ "length_penalty": 1.0,
+ "max_length": 20,
+ "max_position_embeddings": 4096,
+ "min_length": 0,
+ "model_type": "florence2_language",
+ "no_repeat_ngram_size": 0,
+ "num_beam_groups": 1,
+ "num_beams": 3,
+ "num_hidden_layers": 12,
+ "num_return_sequences": 1,
+ "output_attentions": false,
+ "output_hidden_states": false,
+ "output_scores": false,
+ "pad_token_id": 1,
+ "prefix": null,
+ "problem_type": null,
+ "pruned_heads": {},
+ "remove_invalid_values": false,
+ "repetition_penalty": 1.0,
+ "return_dict": true,
+ "return_dict_in_generate": false,
+ "scale_embedding": false,
+ "sep_token_id": null,
+ "suppress_tokens": null,
+ "task_specific_params": null,
+ "temperature": 1.0,
+ "tf_legacy_loss": false,
+ "tie_encoder_decoder": false,
+ "tie_word_embeddings": true,
+ "tokenizer_class": null,
+ "top_k": 50,
+ "top_p": 1.0,
+ "torch_dtype": null,
+ "torchscript": false,
+ "typical_p": 1.0,
+ "use_bfloat16": false,
+ "use_cache": true,
+ "vocab_size": 51289
+ }
+ }
model.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:2a0e29448111df56b9485cb1f964db799205794c49baeda48ef960589ce649ab
+ size 3519073692
policy_postprocessor.json ADDED
@@ -0,0 +1,31 @@
+ {
+ "name": "policy_postprocessor",
+ "steps": [
+ {
+ "registry_name": "unnormalizer_processor",
+ "config": {
+ "eps": 1e-08,
+ "features": {
+ "action": {
+ "type": "ACTION",
+ "shape": [
+ 20
+ ]
+ }
+ },
+ "norm_map": {
+ "VISUAL": "MEAN_STD",
+ "STATE": "IDENTITY",
+ "ACTION": "IDENTITY"
+ }
+ }
+ },
+ {
+ "registry_name": "device_processor",
+ "config": {
+ "device": "cpu",
+ "float_dtype": null
+ }
+ }
+ ]
+ }
policy_preprocessor.json ADDED
@@ -0,0 +1,74 @@
+ {
+ "name": "policy_preprocessor",
+ "steps": [
+ {
+ "registry_name": "rename_observations_processor",
+ "config": {
+ "rename_map": {}
+ }
+ },
+ {
+ "registry_name": "to_batch_processor",
+ "config": {}
+ },
+ {
+ "registry_name": "tokenizer_processor",
+ "config": {
+ "max_length": 64,
+ "task_key": "task",
+ "padding_side": "right",
+ "padding": "max_length",
+ "truncation": true,
+ "tokenizer_name": "facebook/bart-large"
+ }
+ },
+ {
+ "registry_name": "device_processor",
+ "config": {
+ "device": "cuda",
+ "float_dtype": null
+ }
+ },
+ {
+ "registry_name": "normalizer_processor",
+ "config": {
+ "eps": 1e-08,
+ "features": {
+ "observation.images.image": {
+ "type": "VISUAL",
+ "shape": [
+ 3,
+ 256,
+ 256
+ ]
+ },
+ "observation.images.image2": {
+ "type": "VISUAL",
+ "shape": [
+ 3,
+ 256,
+ 256
+ ]
+ },
+ "observation.state": {
+ "type": "STATE",
+ "shape": [
+ 8
+ ]
+ },
+ "action": {
+ "type": "ACTION",
+ "shape": [
+ 20
+ ]
+ }
+ },
+ "norm_map": {
+ "VISUAL": "MEAN_STD",
+ "STATE": "IDENTITY",
+ "ACTION": "IDENTITY"
+ }
+ }
+ }
+ ]
+ }