cycloneboy/SLM-SQL-0.6B

@@ -1,20 +1,21 @@
 ---
-pipeline_tag: text-generation
 library_name: transformers
 license: cc-by-nc-4.0
 tags:
 - text-to-sql
 - reinforcement-learning
 ---
 # SLM-SQL: An Exploration of Small Language Models for Text-to-SQL
 ### Important Links
-📖[Arxiv Paper](https://arxiv.org/abs/2507.22478) |
-🤗[HuggingFace](https://huggingface.co/collections/cycloneboy/slm-sql-688b02f99f958d7a417658dc) |
-🤖[ModelScope](https://modelscope.cn/collections/SLM-SQL-624bb6a60e9643) |
 ## News
@@ -36,24 +37,94 @@ tags:
 > and generalizability of our method, SLM-SQL. On the BIRD development set, the five evaluated models achieved an
 > average
 > improvement of 31.4 points. Notably, the 0.5B model reached 56.87\% execution accuracy (EX), while the 1.5B model
-> achieved 67.08\% EX. We will release our dataset, model, and code to github: https://github.com/CycloneBoy/slm_sql.
 ### Framework
-<img src="https://raw.githubusercontent.com/CycloneBoy/slm_sql/main/data/image/slmsql_framework.png"  height="500" alt="slmsql_framework">
 ### Main Results
-<img src="https://raw.githubusercontent.com/CycloneBoy/slm_sql/main/data/image/slmsql_bird_result.png"  height="500" alt="slm_sql_result">
-<img src="https://raw.githubusercontent.com/CycloneBoy/slm_sql/main/data/image/slmsql_bird_main.png"  height="500" alt="slmsql_bird_main">
-<img src="https://raw.githubusercontent.com/CycloneBoy/slm_sql/main/data/image/slmsql_spider_main.png"  height="500" alt="slmsql_spider_main">
 Performance Comparison of different Text-to-SQL methods on BIRD dev and test dataset.
-<img src="https://raw.githubusercontent.com/CycloneBoy/slm_sql/main/data/image/slmsql_ablation_study.png"  height="300" alt="slmsql_ablation_study">
 ## Model
@@ -115,5 +186,4 @@ Performance Comparison of different Text-to-SQL methods on BIRD dev and test dat
       archivePrefix={arXiv},
       primaryClass={cs.CL},
       url={https://arxiv.org/abs/2505.13271},
-}
-```

 ---
 library_name: transformers
 license: cc-by-nc-4.0
+pipeline_tag: text-generation
 tags:
 - text-to-sql
 - reinforcement-learning
 ---
 # SLM-SQL: An Exploration of Small Language Models for Text-to-SQL
 ### Important Links
+📖[Hugging Face Paper](https://huggingface.co/papers/2507.22478) |
+📚[arXiv Paper](https://arxiv.org/abs/2507.22478) |
+💻[GitHub Repository](https://github.com/CycloneBoy/slm_sql) |
+🤗[Hugging Face Models Collection](https://huggingface.co/collections/cycloneboy/slm-sql-688b02f99f958d7a417658dc) |
+🤖[ModelScope Models Collection](https://modelscope.cn/collections/SLM-SQL-624bb6a60e9643) |
 ## News
 > and generalizability of our method, SLM-SQL. On the BIRD development set, the five evaluated models achieved an
 > average
 > improvement of 31.4 points. Notably, the 0.5B model reached 56.87\% execution accuracy (EX), while the 1.5B model
+> achieved 67.08\% EX.
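The abstract reports results as execution accuracy (EX): a prediction counts as correct when running it returns the same rows as the reference query. The sketch below illustrates that idea on a toy in-memory SQLite database. It is a simplified illustration, not the official BIRD evaluation script, which may differ in details such as timeouts. The helper names `execution_match` and `execution_accuracy`, the table, and the query pairs are all made up for this example.

```python
import sqlite3


def execution_match(conn, predicted_sql, gold_sql):
    """Return True when both queries produce the same set of rows."""
    try:
        predicted_rows = conn.execute(predicted_sql).fetchall()
    except sqlite3.Error:
        # A prediction that fails to execute is counted as incorrect.
        return False
    gold_rows = conn.execute(gold_sql).fetchall()
    # Compare as sets, so row order and duplicate rows are ignored.
    return set(predicted_rows) == set(gold_rows)


def execution_accuracy(conn, pairs):
    """Percentage of (predicted, gold) pairs whose results match."""
    if not pairs:
        return 0.0
    hits = sum(execution_match(conn, pred, gold) for pred, gold in pairs)
    return 100.0 * hits / len(pairs)


# Toy database (hypothetical data, for illustration only).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employees (employee_id INTEGER PRIMARY KEY, name TEXT, salary REAL)"
)
conn.executemany(
    "INSERT INTO employees VALUES (?, ?, ?)",
    [(1, "Ada", 72000.0), (2, "Bob", 41000.0), (3, "Cy", 58000.0)],
)

# Each pair is (predicted SQL, gold SQL).
pairs = [
    # Same rows, different literal spelling: correct.
    ("SELECT name FROM employees WHERE salary > 50000",
     "SELECT name FROM employees WHERE salary > 50000.0"),
    # Missing filter returns an extra row: incorrect.
    ("SELECT name FROM employees",
     "SELECT name FROM employees WHERE salary > 50000"),
    # Misspelled column fails to execute: incorrect.
    ("SELECT nme FROM employees",
     "SELECT name FROM employees"),
    # Different SQL text, same result: correct.
    ("SELECT COUNT(*) FROM employees",
     "SELECT COUNT(employee_id) FROM employees"),
]

print(f"EX = {execution_accuracy(conn, pairs):.2f}%")  # EX = 50.00%
```

Because the comparison is on query results rather than SQL text, two differently written queries can both count as correct, which is why EX is usually preferred over exact string match for this task.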
 ### Framework
+<img src="https://raw.githubusercontent.com/CycloneBoy/slm_sql/main/data/image/slmsql_framework.png" height="500" alt="slmsql_framework">
 ### Main Results
+<img src="https://raw.githubusercontent.com/CycloneBoy/slm_sql/main/data/image/slmsql_bird_result.png" height="500" alt="slm_sql_result">
+<img src="https://raw.githubusercontent.com/CycloneBoy/slm_sql/main/data/image/slmsql_bird_main.png" height="500" alt="slmsql_bird_main">
+<img src="https://raw.githubusercontent.com/CycloneBoy/slm_sql/main/data/image/slmsql_spider_main.png" height="500" alt="slmsql_spider_main">
 Performance Comparison of different Text-to-SQL methods on BIRD dev and test dataset.
+<img src="https://raw.githubusercontent.com/CycloneBoy/slm_sql/main/data/image/slmsql_ablation_study.png" height="300" alt="slm_sql_ablation_study">
+## Usage
+This model can be used with the Hugging Face `transformers` library for text-to-SQL generation.
+```python
+import torch
+from transformers import AutoTokenizer, AutoModelForCausalLM
+# Load model and tokenizer
+# Replace "cycloneboy/SLM-SQL-0.5B" with the specific model checkpoint you want to use.
+model_id = "cycloneboy/SLM-SQL-0.5B"
+tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
+model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16, device_map="auto", trust_remote_code=True)
+# Set the model to evaluation mode
+model.eval()
+# Define the natural language question and database schema (replace with your data)
+user_query = "What are the names of all employees who earn more than 50000?"
+database_schema = """
+CREATE TABLE employees (
+    employee_id INT PRIMARY KEY,
+    name VARCHAR(255),
+    salary DECIMAL(10, 2)
+);
+"""
+# Construct the conversation using the model's chat template
+# The model expects schema and question to generate the SQL query.
+# The prompt format below is a common way to combine schema and question for Text-to-SQL.
+full_prompt = f"""
+You are a Text-to-SQL model.
+Given the following database schema:
+{database_schema}
+Generate the SQL query for the question:
+{user_query}
+"""
+messages = [
+    {"role": "user", "content": full_prompt.strip()}
+]
+# Apply the chat template and tokenize inputs
+input_text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
+inputs = tokenizer(input_text, return_tensors="pt").to(model.device)
+# Generate the SQL query
+with torch.no_grad():
+    outputs = model.generate(**inputs, max_new_tokens=256, temperature=0.6, top_p=0.9, do_sample=True,
+                             eos_token_id=[tokenizer.eos_token_id, tokenizer.convert_tokens_to_ids("<|im_end|>")])
+# Decode the generated text and extract the assistant's response
+generated_text = tokenizer.decode(outputs[0], skip_special_tokens=False)
+# The Qwen-style chat template wraps the assistant's response between <|im_start|>assistant and <|im_end|>
+assistant_prefix = "<|im_start|>assistant\n"
+if assistant_prefix in generated_text:
+    sql_query = generated_text.split(assistant_prefix, 1)[1].strip()
+    # Remove any trailing special tokens like <|im_end|>
+    sql_query = sql_query.split("<|im_end|>", 1)[0].strip()
+else:
+    sql_query = generated_text # Fallback in case prompt format differs unexpectedly
+print(f"User Query: {user_query}\nGenerated SQL: {sql_query}")
+# Example of a potential output for the given query and schema:
+# Generated SQL: SELECT name FROM employees WHERE salary > 50000;
+```
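The prompt construction and response extraction steps in the usage snippet can be factored into small helpers that are testable without loading a model. The following is a minimal sketch under assumptions: `build_prompt` and `extract_sql` are hypothetical names, the prompt wording mirrors the example above rather than a confirmed training format, and the sample `decoded` string is hand-written to resemble Qwen-style chat markup, not real model output.

```python
def build_prompt(schema: str, question: str) -> str:
    """Combine a database schema and a question into a single user message."""
    return (
        "You are a Text-to-SQL model.\n"
        "Given the following database schema:\n"
        f"{schema.strip()}\n"
        "Generate the SQL query for the question:\n"
        f"{question.strip()}"
    )


def extract_sql(
    decoded: str,
    assistant_prefix: str = "<|im_start|>assistant\n",
    end_token: str = "<|im_end|>",
) -> str:
    """Return the assistant's reply from text decoded with special tokens kept."""
    _, found, tail = decoded.partition(assistant_prefix)
    if not found:
        # Fallback when the expected chat markup is absent.
        return decoded.strip()
    # Drop the end-of-turn token and anything after it.
    return tail.split(end_token, 1)[0].strip()


schema = "CREATE TABLE employees (employee_id INT PRIMARY KEY, name VARCHAR(255), salary DECIMAL(10, 2));"
question = "What are the names of all employees who earn more than 50000?"
prompt = build_prompt(schema, question)

# Hand-written stand-in for tokenizer.decode(outputs[0], skip_special_tokens=False).
decoded = (
    f"<|im_start|>user\n{prompt}<|im_end|>\n"
    "<|im_start|>assistant\n"
    "SELECT name FROM employees WHERE salary > 50000;<|im_end|>"
)

print(extract_sql(decoded))
```

Keeping the parsing in one function makes it easier to adjust if a given checkpoint's chat template uses different markers.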
 ## Model
       archivePrefix={arXiv},
       primaryClass={cs.CL},
       url={https://arxiv.org/abs/2505.13271},
+}