---
base_model: zai-org/GLM-4.6
tags:
- rust
- Hyperswitch
- LoRA
- CPT
- Fine-Tuned
- Causal-LM
pipeline_tag: text-generation
language:
- en
datasets:
- AdityaNarayan/HyperSwitch-Repo-CPT-Dataset
---
# GLM-4.6-CPT-LoRA-HyperSwitch-v1

A LoRA fine-tuned model based on **zai-org/GLM-4.6**, specialized for the [Hyperswitch](https://github.com/juspay/hyperswitch) Rust codebase. It is tuned to understand payment processing patterns, Hyperswitch architecture, and Rust development practices.

## 🎯 Model Description

This LoRA adapter was trained on **16,731 samples** extracted from the Hyperswitch codebase to enhance code understanding, explanation, and generation within the payment processing domain.

- **Base Model**: zai-org/GLM-4.6
- **Training Type**: Causal Language Modeling (CLM) with LoRA
- **Domain**: Payment Processing, Rust Development
- **Specialization**: Hyperswitch codebase patterns and architecture

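As a quick usage sketch (assuming the adapter is published under the repo id below and loads through the standard PEFT API), the adapter attaches to the base model like this:

```python
# Hedged usage sketch — the adapter repo id, prompt, and generation settings
# are assumptions, not taken from the card; GLM-4.6 may also need
# trust_remote_code=True depending on your transformers version.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

BASE_ID = "zai-org/GLM-4.6"
ADAPTER_ID = "AdityaNarayan/GLM-4.6-CPT-LoRA-HyperSwitch-v1"  # assumed repo id

tokenizer = AutoTokenizer.from_pretrained(BASE_ID)
model = AutoModelForCausalLM.from_pretrained(
    BASE_ID,
    torch_dtype=torch.bfloat16,  # matches the card's stated precision
    device_map="auto",
)
model = PeftModel.from_pretrained(model, ADAPTER_ID)  # attach the LoRA adapter

prompt = "Explain how connector routing works in Hyperswitch."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
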
## 📊 Training Details

### LoRA Configuration
```yaml
r: 16          # LoRA rank
alpha: 32      # LoRA alpha (2*r)
dropout: 0.05  # LoRA dropout
target_modules:
  - "q_proj"
  - "k_proj"
  - "v_proj"
  - "o_proj"

exclude_modules:
  - "block_sparse_moe"
  - "w1"
  - "w2"
  - "w3"
  - "gate"
```
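
A rough PEFT equivalent of this configuration (a sketch assuming the `peft` library's `LoraConfig`; since only the attention projections are targeted, the MoE exclusions fall out implicitly):

```python
# Hedged sketch of the LoRA setup above using peft.LoraConfig.
# Targeting only the attention projections implicitly excludes the MoE
# modules (block_sparse_moe, w1/w2/w3, gate) listed in the card.
from peft import LoraConfig

lora_config = LoraConfig(
    r=16,                      # LoRA rank
    lora_alpha=32,             # alpha = 2 * r
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)
```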

### Training Hyperparameters
- **Epochs**: 5
- **Learning Rate**: 2e-4 (cosine schedule)
- **Hardware**: 8 x NVIDIA H200

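These map onto `transformers.TrainingArguments` roughly as follows (a hypothetical sketch, not the author's actual training script; batch-size and accumulation values are illustrative assumptions):

```python
# Hedged sketch of the stated hyperparameters; values marked "assumed"
# are not given in the card.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="glm-4.6-hyperswitch-lora",  # assumed
    num_train_epochs=5,
    learning_rate=2e-4,
    lr_scheduler_type="cosine",
    bf16=True,                          # bfloat16, per the card
    per_device_train_batch_size=1,      # assumed
    gradient_accumulation_steps=8,      # assumed
)
```
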
## 🛠️ Technical Specifications

- **Precision**: bfloat16
- **Inference Speed**: Optimized with Flash Attention 2

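For inference, these two settings correspond to the following `from_pretrained` flags (assuming the `flash-attn` package is installed and the GPU supports it):

```python
# Hedged loading sketch: bfloat16 precision + Flash Attention 2.
import torch
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained(
    "zai-org/GLM-4.6",
    torch_dtype=torch.bfloat16,
    attn_implementation="flash_attention_2",  # requires the flash-attn package
    device_map="auto",
)
```
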
## 🙏 Acknowledgments

- **Zai Team** for the excellent GLM-4.6 base model
- **Hyperswitch Team** for the open-source payment processing platform
- **Hugging Face** for the transformers and PEFT libraries

## 📞 Citation

```bibtex
@misc{GLM-4.6-CPT-LoRA-HyperSwitch-v1,
  title={AdityaNarayan/GLM-4.6-CPT-LoRA-HyperSwitch-v1},
  author={Aditya Narayan},
  year={2025},
  publisher={Hugging Face},
  url={https://huggingface.co/AdityaNarayan/GLM-4.6-CPT-LoRA-HyperSwitch-v1}
}
```