codemichaeld committed
Commit 2ebf369 · verified · 1 Parent(s): c6b6c70

Upload README.md with huggingface_hub

Files changed (1)
  1. README.md +118 -0
README.md ADDED
@@ -0,0 +1,118 @@
---
library_name: diffusers
tags:
- fp8
- safetensors
- precision-recovery
- mixed-method
- converted-by-gradio
---

# FP8 Model with Per-Tensor Precision Recovery

- **Source**: `https://huggingface.co/MiaoshouAI/Florence-2-base-PromptGen-v1.5`
- **Original File**: `model.safetensors`
- **FP8 Format**: `E5M2`
- **FP8 File**: `model-fp8-e5m2.safetensors`
- **Recovery File**: `model-recovery.safetensors`

## Recovery Rules Used

```json
[
  {
    "key_pattern": "vae",
    "dim": 4,
    "method": "diff"
  },
  {
    "key_pattern": "encoder",
    "dim": 4,
    "method": "diff"
  },
  {
    "key_pattern": "decoder",
    "dim": 4,
    "method": "diff"
  },
  {
    "key_pattern": "text",
    "dim": 2,
    "min_size": 10000,
    "method": "lora",
    "rank": 64
  },
  {
    "key_pattern": "emb",
    "dim": 2,
    "min_size": 10000,
    "method": "lora",
    "rank": 64
  },
  {
    "key_pattern": "attn",
    "dim": 2,
    "min_size": 10000,
    "method": "lora",
    "rank": 128
  },
  {
    "key_pattern": "conv",
    "dim": 4,
    "method": "diff"
  },
  {
    "key_pattern": "resnet",
    "dim": 4,
    "method": "diff"
  },
  {
    "key_pattern": "all",
    "method": "none"
  }
]
```

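Each rule appears to pair a key pattern with shape constraints and a recovery method: `key_pattern` as a substring of the parameter name (`"all"` as a catch-all), `dim` as the tensor's number of dimensions, and `min_size` as a minimum element count. The sketch below illustrates one plausible first-match-wins interpretation; the `select_rule` helper, the substring matching, and the abbreviated rule list are assumptions for illustration, not the converter's documented behaviour.

```python
# Hypothetical sketch of rule selection; first-match-wins semantics and
# substring key matching are assumptions, not documented behaviour.
import torch

RULES = [  # abbreviated; the full list is shown in "Recovery Rules Used" above
    {"key_pattern": "encoder", "dim": 4, "method": "diff"},
    {"key_pattern": "attn", "dim": 2, "min_size": 10000, "method": "lora", "rank": 128},
    {"key_pattern": "all", "method": "none"},
]

def select_rule(key: str, tensor: torch.Tensor, rules=RULES) -> dict:
    """Return the first rule whose pattern and shape constraints all match."""
    for rule in rules:
        if rule["key_pattern"] != "all" and rule["key_pattern"] not in key:
            continue  # parameter name does not contain the pattern
        if "dim" in rule and tensor.dim() != rule["dim"]:
            continue  # wrong number of dimensions for this rule
        if "min_size" in rule and tensor.numel() < rule["min_size"]:
            continue  # tensor too small to be worth recovering
        return rule
    return {"method": "none"}  # nothing matched: keep plain FP8

# A large 2-D attention projection would get rank-128 LoRA recovery:
weight = torch.zeros(1024, 1024)
print(select_rule("model.encoder.layers.0.attn.q_proj.weight", weight))
# -> {'key_pattern': 'attn', 'dim': 2, 'min_size': 10000, 'method': 'lora', 'rank': 128}
```
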
## Usage (Inference)

```python
import os

import torch
from safetensors.torch import load_file

# Load the FP8 model weights
fp8_state = load_file("model-fp8-e5m2.safetensors")

# Load recovery weights if the recovery file is present
recovery_path = "model-recovery.safetensors"
recovery_state = load_file(recovery_path) if os.path.exists(recovery_path) else {}

# Reconstruct higher-precision weights
reconstructed = {}
for key in fp8_state:
    fp8_weight = fp8_state[key].to(torch.float32)  # convert to float32 for computation

    # Apply LoRA recovery if available
    lora_a_key = f"lora_A.{key}"
    lora_b_key = f"lora_B.{key}"
    if lora_a_key in recovery_state and lora_b_key in recovery_state:
        A = recovery_state[lora_a_key].to(torch.float32)
        B = recovery_state[lora_b_key].to(torch.float32)
        # Add back the low-rank approximation of the quantization error
        lora_weight = B @ A
        fp8_weight = fp8_weight + lora_weight

    # Apply difference recovery if available
    diff_key = f"diff.{key}"
    if diff_key in recovery_state:
        diff = recovery_state[diff_key].to(torch.float32)
        fp8_weight = fp8_weight + diff

    reconstructed[key] = fp8_weight

# Load the reconstructed weights into your instantiated model
model.load_state_dict(reconstructed)
```
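Continuing from the snippet above, the reconstructed float32 weights can optionally be written back to a single safetensors file so standard loaders can consume them without repeating the recovery step. The output filename and the bfloat16 target dtype below are illustrative choices, not files shipped in this repository.

```python
from safetensors.torch import save_file
import torch

# Cast to a compact dtype and make tensors contiguous before serializing
restored = {k: v.to(torch.bfloat16).contiguous() for k, v in reconstructed.items()}
save_file(restored, "model-restored-bf16.safetensors")  # hypothetical output name
```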

> **Note**: For best results, use the same recovery configuration during inference as was used during extraction.
> Requires PyTorch ≥ 2.1 for FP8 support.
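
A quick way to confirm that the installed PyTorch build exposes the E5M2 FP8 dtype used by this checkpoint (the check itself is just a suggestion, not part of the conversion tooling):

```python
import torch

# torch.float8_e5m2 was introduced around PyTorch 2.1; older builds lack it
assert hasattr(torch, "float8_e5m2"), "PyTorch >= 2.1 with FP8 support is required"
```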

## Statistics

- **Total layers**: 667
- **Layers with recovery**: 177
  - LoRA recovery: 125
  - Difference recovery: 52