jadechoghari committed on
Commit f99b895 · verified · 1 Parent(s): b56d216

Migrate policy to PolicyProcessorPipeline system


**Automated Policy Migration to PolicyProcessorPipeline**

This PR migrates your model to the new LeRobot policy format using the modern PolicyProcessorPipeline architecture.

## What Changed

### **New Architecture - PolicyProcessorPipeline**
Your model now uses external PolicyProcessorPipeline components for data processing instead of built-in normalization layers. This provides:
- **Modularity**: Separate preprocessing and postprocessing pipelines
- **Flexibility**: Easy to swap, configure, and debug processing steps
- **Compatibility**: Works with the latest LeRobot ecosystem
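The pipeline idea behind this design can be illustrated with a minimal, hypothetical sketch in plain Python (this is not the real `PolicyProcessorPipeline` API; the step names here are made up): a processor pipeline is an ordered list of steps, each a callable that transforms a batch, which makes individual steps easy to swap or inspect.

```python
# Minimal illustration of the processor-pipeline concept (hypothetical,
# not the actual LeRobot API): each step is a callable that takes a
# batch dict and returns a transformed batch dict.

def normalize_state(batch):
    # Example step: scale the state vector by a fixed factor.
    return {**batch, "observation.state": [x / 10.0 for x in batch["observation.state"]]}

def add_batch_dim(batch):
    # Example step: wrap each value in a list to emulate a batch dimension.
    return {k: [v] for k, v in batch.items()}

class SimplePipeline:
    """Ordered list of processing steps, applied in sequence."""

    def __init__(self, steps):
        self.steps = steps

    def __call__(self, batch):
        for step in self.steps:
            batch = step(batch)
        return batch

preprocessor = SimplePipeline([normalize_state, add_batch_dim])
out = preprocessor({"observation.state": [10.0, 20.0]})
```

Because the steps live in a plain list, swapping, reordering, or debugging a single step does not require touching the model itself.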

### **Normalization Extraction**
We've extracted normalization statistics from your model's state_dict and removed the built-in normalization layers:
- **Extracted patterns**: `normalize_inputs.*`, `unnormalize_outputs.*`, `normalize.*`, `unnormalize.*`, `input_normalizer.*`, `output_normalizer.*`
- **Statistics preserved**: Mean, std, min, max values for all features
- **Clean model**: State dict now contains only core model weights
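The extraction step can be sketched as a key-prefix split over the state dict. This is an illustrative simplification (the prefix names mirror the patterns listed above; the actual migration script may differ):

```python
# Illustrative sketch of normalization extraction: split a state_dict
# into normalization statistics and core model weights by key prefix.
# Prefixes mirror the extracted patterns listed above.
NORM_PREFIXES = (
    "normalize_inputs.", "unnormalize_outputs.",
    "normalize.", "unnormalize.",
    "input_normalizer.", "output_normalizer.",
)

def split_state_dict(state_dict):
    norm_stats, model_weights = {}, {}
    for key, value in state_dict.items():
        target = norm_stats if key.startswith(NORM_PREFIXES) else model_weights
        target[key] = value
    return norm_stats, model_weights

# Example with dummy tensors represented as lists:
stats, weights = split_state_dict({
    "normalize_inputs.observation.state.mean": [0.0] * 8,
    "backbone.layer1.weight": [1.0, 2.0],
})
```

The statistics dict is what gets serialized into the processor configs, while the weights dict becomes the clean `model.safetensors`.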

### **Files Added**
- **policy_preprocessor.json**: Configuration for the input preprocessing pipeline
- **policy_postprocessor.json**: Configuration for the output postprocessing pipeline
- **model.safetensors**: Clean model weights without normalization layers
- **config.json**: Updated model configuration
- **train_config.json**: Training configuration
- **README.md**: Updated model card with migration information

### **Benefits**
- **Backward Compatible**: Your model behavior remains identical
- **Future Ready**: Compatible with latest LeRobot features and updates
- **Debuggable**: Easy to inspect and modify processing steps
- **Portable**: Processors can be shared and reused across models

### **Usage**
```python
# Load your migrated model
from lerobot.policies import get_policy_class
from lerobot.processor import PolicyProcessorPipeline

# The preprocessor and postprocessor are now external
preprocessor = PolicyProcessorPipeline.from_pretrained("your-model-repo", config_filename="policy_preprocessor.json")
postprocessor = PolicyProcessorPipeline.from_pretrained("your-model-repo", config_filename="policy_postprocessor.json")
policy = get_policy_class("your-policy-type").from_pretrained("your-model-repo")

# Process data through the pipeline
processed_batch = preprocessor(raw_batch)
action = policy.select_action(processed_batch)
final_action = postprocessor(action)
```

*Generated automatically by the LeRobot policy migration script*

README.md ADDED
@@ -0,0 +1,62 @@
+ ---
+ datasets: unknown
+ library_name: lerobot
+ license: apache-2.0
+ model_name: xvla
+ pipeline_tag: robotics
+ tags:
+ - xvla
+ - lerobot
+ - robotics
+ ---
+
+ # Model Card for xvla
+
+ <!-- Provide a quick summary of what the model is/does. -->
+
+
+ _Model type not recognized — please update this template._
+
+
+ This policy has been trained and pushed to the Hub using [LeRobot](https://github.com/huggingface/lerobot).
+ See the full documentation at [LeRobot Docs](https://huggingface.co/docs/lerobot/index).
+
+ ---
+
+ ## How to Get Started with the Model
+
+ For a complete walkthrough, see the [training guide](https://huggingface.co/docs/lerobot/il_robots#train-a-policy).
+ Below is the short version on how to train and run inference/eval:
+
+ ### Train from scratch
+
+ ```bash
+ lerobot-train \
+ --dataset.repo_id=${HF_USER}/<dataset> \
+ --policy.type=xvla \
+ --output_dir=outputs/train/<desired_policy_repo_id> \
+ --job_name=lerobot_training \
+ --policy.device=cuda \
+ --policy.repo_id=${HF_USER}/<desired_policy_repo_id> \
+ --wandb.enable=true
+ ```
+
+ _Writes checkpoints to `outputs/train/<desired_policy_repo_id>/checkpoints/`._
+
+ ### Evaluate the policy/run inference
+
+ ```bash
+ lerobot-record \
+ --robot.type=so100_follower \
+ --dataset.repo_id=<hf_user>/eval_<dataset> \
+ --policy.path=<hf_user>/<desired_policy_repo_id> \
+ --episodes=10
+ ```
+
+ Prefix the dataset repo with **eval\_** and supply `--policy.path` pointing to a local or hub checkpoint.
+
+ ---
+
+ ## Model Details
+
+ - **License:** apache-2.0
config.json ADDED
@@ -0,0 +1,302 @@
+ {
+ "type": "xvla",
+ "n_obs_steps": 1,
+ "input_features": {
+ "observation.images.image": {
+ "type": "VISUAL",
+ "shape": [
+ 3,
+ 256,
+ 256
+ ]
+ },
+ "observation.images.image2": {
+ "type": "VISUAL",
+ "shape": [
+ 3,
+ 256,
+ 256
+ ]
+ },
+ "observation.state": {
+ "type": "STATE",
+ "shape": [
+ 8
+ ]
+ }
+ },
+ "output_features": {
+ "action": {
+ "type": "ACTION",
+ "shape": [
+ 20
+ ]
+ }
+ },
+ "device": "cuda",
+ "use_amp": false,
+ "push_to_hub": true,
+ "repo_id": "jadechoghari/X-VLA-Libero",
+ "private": null,
+ "tags": null,
+ "license": null,
+ "pretrained_path": null,
+ "chunk_size": 32,
+ "n_action_steps": 32,
+ "num_actions": 32,
+ "normalization_mapping": {
+ "VISUAL": "MEAN_STD",
+ "STATE": "IDENTITY",
+ "ACTION": "IDENTITY"
+ },
+ "florence_config": {},
+ "tokenizer_name": "facebook/bart-large",
+ "tokenizer_max_length": 64,
+ "tokenizer_padding_side": "right",
+ "pad_language_to": "max_length",
+ "hidden_size": 1024,
+ "depth": 24,
+ "num_heads": 16,
+ "mlp_ratio": 4.0,
+ "num_domains": 30,
+ "len_soft_prompts": 32,
+ "dim_time": 32,
+ "max_len_seq": 512,
+ "use_hetero_proj": false,
+ "action_mode": "ee6d",
+ "num_denoising_steps": 10,
+ "use_proprio": true,
+ "max_state_dim": 20,
+ "domain_feature_key": null,
+ "resize_imgs_with_padding": [
+ 518,
+ 518
+ ],
+ "num_image_views": 2,
+ "empty_cameras": 0,
+ "optimizer_lr": 0.0001,
+ "optimizer_betas": [
+ 0.9,
+ 0.95
+ ],
+ "optimizer_eps": 1e-08,
+ "optimizer_weight_decay": 0.0001,
+ "optimizer_grad_clip_norm": 10.0,
+ "scheduler_warmup_steps": 1000,
+ "scheduler_decay_steps": 30000,
+ "scheduler_decay_lr": 2.5e-06,
+ "vision_config": {
+ "_attn_implementation_autoset": false,
+ "_name_or_path": "",
+ "add_cross_attention": false,
+ "architectures": null,
+ "bad_words_ids": null,
+ "begin_suppress_tokens": null,
+ "bos_token_id": null,
+ "chunk_size_feed_forward": 0,
+ "cross_attention_hidden_size": null,
+ "decoder_start_token_id": null,
+ "depths": [
+ 1,
+ 1,
+ 9,
+ 1
+ ],
+ "dim_embed": [
+ 256,
+ 512,
+ 1024,
+ 2048
+ ],
+ "diversity_penalty": 0.0,
+ "do_sample": false,
+ "drop_path_rate": 0.1,
+ "early_stopping": false,
+ "enable_checkpoint": false,
+ "encoder_no_repeat_ngram_size": 0,
+ "eos_token_id": null,
+ "exponential_decay_length_penalty": null,
+ "finetuning_task": null,
+ "forced_bos_token_id": null,
+ "forced_eos_token_id": null,
+ "id2label": {
+ "0": "LABEL_0",
+ "1": "LABEL_1"
+ },
+ "image_feature_source": [
+ "spatial_avg_pool",
+ "temporal_avg_pool"
+ ],
+ "image_pos_embed": {
+ "max_pos_embeddings": 50,
+ "type": "learned_abs_2d"
+ },
+ "is_decoder": false,
+ "is_encoder_decoder": false,
+ "label2id": {
+ "LABEL_0": 0,
+ "LABEL_1": 1
+ },
+ "length_penalty": 1.0,
+ "max_length": 20,
+ "min_length": 0,
+ "model_type": "davit",
+ "no_repeat_ngram_size": 0,
+ "num_beam_groups": 1,
+ "num_beams": 1,
+ "num_groups": [
+ 8,
+ 16,
+ 32,
+ 64
+ ],
+ "num_heads": [
+ 8,
+ 16,
+ 32,
+ 64
+ ],
+ "num_return_sequences": 1,
+ "output_attentions": false,
+ "output_hidden_states": false,
+ "output_scores": false,
+ "pad_token_id": null,
+ "patch_padding": [
+ 3,
+ 1,
+ 1,
+ 1
+ ],
+ "patch_prenorm": [
+ false,
+ true,
+ true,
+ true
+ ],
+ "patch_size": [
+ 7,
+ 3,
+ 3,
+ 3
+ ],
+ "patch_stride": [
+ 4,
+ 2,
+ 2,
+ 2
+ ],
+ "prefix": null,
+ "problem_type": null,
+ "projection_dim": 1024,
+ "pruned_heads": {},
+ "remove_invalid_values": false,
+ "repetition_penalty": 1.0,
+ "return_dict": true,
+ "return_dict_in_generate": false,
+ "sep_token_id": null,
+ "suppress_tokens": null,
+ "task_specific_params": null,
+ "temperature": 1.0,
+ "tf_legacy_loss": false,
+ "tie_encoder_decoder": false,
+ "tie_word_embeddings": true,
+ "tokenizer_class": null,
+ "top_k": 50,
+ "top_p": 1.0,
+ "torch_dtype": null,
+ "torchscript": false,
+ "typical_p": 1.0,
+ "use_bfloat16": false,
+ "visual_temporal_embedding": {
+ "max_temporal_embeddings": 100,
+ "type": "COSINE"
+ },
+ "window_size": 12
+ },
+ "text_config": {
+ "_attn_implementation_autoset": true,
+ "_name_or_path": "",
+ "activation_dropout": 0.1,
+ "activation_function": "gelu",
+ "add_cross_attention": false,
+ "architectures": null,
+ "attention_dropout": 0.1,
+ "bad_words_ids": null,
+ "begin_suppress_tokens": null,
+ "bos_token_id": 0,
+ "chunk_size_feed_forward": 0,
+ "classifier_dropout": 0.0,
+ "cross_attention_hidden_size": null,
+ "d_model": 1024,
+ "decoder_attention_heads": 16,
+ "decoder_ffn_dim": 4096,
+ "decoder_layerdrop": 0.0,
+ "decoder_layers": 12,
+ "decoder_start_token_id": 2,
+ "diversity_penalty": 0.0,
+ "do_sample": false,
+ "dropout": 0.1,
+ "early_stopping": false,
+ "encoder_attention_heads": 16,
+ "encoder_ffn_dim": 4096,
+ "encoder_layerdrop": 0.0,
+ "encoder_layers": 12,
+ "encoder_no_repeat_ngram_size": 0,
+ "eos_token_id": 2,
+ "exponential_decay_length_penalty": null,
+ "finetuning_task": null,
+ "forced_bos_token_id": null,
+ "forced_eos_token_id": 2,
+ "id2label": {
+ "0": "LABEL_0",
+ "1": "LABEL_1",
+ "2": "LABEL_2"
+ },
+ "init_std": 0.02,
+ "is_decoder": false,
+ "is_encoder_decoder": true,
+ "label2id": {
+ "LABEL_0": 0,
+ "LABEL_1": 1,
+ "LABEL_2": 2
+ },
+ "length_penalty": 1.0,
+ "max_length": 20,
+ "max_position_embeddings": 4096,
+ "min_length": 0,
+ "model_type": "florence2_language",
+ "no_repeat_ngram_size": 0,
+ "num_beam_groups": 1,
+ "num_beams": 3,
+ "num_hidden_layers": 12,
+ "num_return_sequences": 1,
+ "output_attentions": false,
+ "output_hidden_states": false,
+ "output_scores": false,
+ "pad_token_id": 1,
+ "prefix": null,
+ "problem_type": null,
+ "pruned_heads": {},
+ "remove_invalid_values": false,
+ "repetition_penalty": 1.0,
+ "return_dict": true,
+ "return_dict_in_generate": false,
+ "scale_embedding": false,
+ "sep_token_id": null,
+ "suppress_tokens": null,
+ "task_specific_params": null,
+ "temperature": 1.0,
+ "tf_legacy_loss": false,
+ "tie_encoder_decoder": false,
+ "tie_word_embeddings": true,
+ "tokenizer_class": null,
+ "top_k": 50,
+ "top_p": 1.0,
+ "torch_dtype": null,
+ "torchscript": false,
+ "typical_p": 1.0,
+ "use_bfloat16": false,
+ "use_cache": true,
+ "vocab_size": 51289
+ }
+ }
model.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:2a0e29448111df56b9485cb1f964db799205794c49baeda48ef960589ce649ab
+ size 3519073692
policy_postprocessor.json ADDED
@@ -0,0 +1,31 @@
+ {
+ "name": "policy_postprocessor",
+ "steps": [
+ {
+ "registry_name": "unnormalizer_processor",
+ "config": {
+ "eps": 1e-08,
+ "features": {
+ "action": {
+ "type": "ACTION",
+ "shape": [
+ 20
+ ]
+ }
+ },
+ "norm_map": {
+ "VISUAL": "MEAN_STD",
+ "STATE": "IDENTITY",
+ "ACTION": "IDENTITY"
+ }
+ }
+ },
+ {
+ "registry_name": "device_processor",
+ "config": {
+ "device": "cpu",
+ "float_dtype": null
+ }
+ }
+ ]
+ }
policy_preprocessor.json ADDED
@@ -0,0 +1,74 @@
+ {
+ "name": "policy_preprocessor",
+ "steps": [
+ {
+ "registry_name": "rename_observations_processor",
+ "config": {
+ "rename_map": {}
+ }
+ },
+ {
+ "registry_name": "to_batch_processor",
+ "config": {}
+ },
+ {
+ "registry_name": "tokenizer_processor",
+ "config": {
+ "max_length": 64,
+ "task_key": "task",
+ "padding_side": "right",
+ "padding": "max_length",
+ "truncation": true,
+ "tokenizer_name": "facebook/bart-large"
+ }
+ },
+ {
+ "registry_name": "device_processor",
+ "config": {
+ "device": "cuda",
+ "float_dtype": null
+ }
+ },
+ {
+ "registry_name": "normalizer_processor",
+ "config": {
+ "eps": 1e-08,
+ "features": {
+ "observation.images.image": {
+ "type": "VISUAL",
+ "shape": [
+ 3,
+ 256,
+ 256
+ ]
+ },
+ "observation.images.image2": {
+ "type": "VISUAL",
+ "shape": [
+ 3,
+ 256,
+ 256
+ ]
+ },
+ "observation.state": {
+ "type": "STATE",
+ "shape": [
+ 8
+ ]
+ },
+ "action": {
+ "type": "ACTION",
+ "shape": [
+ 20
+ ]
+ }
+ },
+ "norm_map": {
+ "VISUAL": "MEAN_STD",
+ "STATE": "IDENTITY",
+ "ACTION": "IDENTITY"
+ }
+ }
+ }
+ ]
+ }