---
base_model: unsloth/Qwen3-0.6B
library_name: peft
license: mit
datasets:
- unsloth/OpenMathReasoning-mini
- mlabonne/FineTome-100k
language:
- en
pipeline_tag: text-generation
tags:
- math
- transformers
- unsloth
- sft
- trl
---

# Model Card for Qwen3-0.6B-OpenMathReason

### Model Description

This model is a fine-tuned version of unsloth/Qwen3-0.6B, trained with the Unsloth library using LoRA for parameter-efficient fine-tuning.
It was trained on two datasets:
- unsloth/OpenMathReasoning-mini: to strengthen mathematical reasoning.
- mlabonne/FineTome-100k: to improve general conversational ability.

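
To make "parameter-efficient" concrete, the sketch below compares the number of weights LoRA trains for a single linear layer with the number of weights in the frozen layer itself. LoRA keeps the original weight matrix fixed and learns two low-rank factors, `A` (rank × d_in) and `B` (d_out × rank). The layer size and rank used here are illustrative assumptions, not values taken from this model's training run, and the helper names are hypothetical.

```python
def lora_param_count(d_in: int, d_out: int, rank: int) -> int:
    """Trainable weights LoRA adds to one linear layer: A (rank x d_in) plus B (d_out x rank)."""
    return rank * d_in + d_out * rank


def full_param_count(d_in: int, d_out: int) -> int:
    """Weights in the original (frozen) linear layer, ignoring any bias term."""
    return d_in * d_out


# Illustrative values only: the card does not state the LoRA rank or target layers.
d_in, d_out, rank = 1024, 1024, 16

lora = lora_param_count(d_in, d_out, rank)
full = full_param_count(d_in, d_out)
print(f"LoRA trains {lora:,} weights next to {full:,} frozen ones")
```

With these assumed sizes the adapter is a small fraction of the layer, which is why only the adapter weights need to be stored and shared.
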
#### Model Details

- **Developed by:** Rustam Shiriyev
- **Language(s) (NLP):** English
- **License:** MIT
- **Finetuned from model:** unsloth/Qwen3-0.6B

## Uses

### Direct Use

This model can be used as a lightweight assistant for solving basic to intermediate math problems of the kind covered by OpenMathReasoning-mini.

### Downstream Use

- Can be integrated into educational chatbots for STEM learning.

### Out-of-Scope Use

- Not suitable for high-stakes decision-making.

## Bias, Risks, and Limitations

- Mathematical reasoning is limited to the scope of the OpenMathReasoning-mini dataset.
- Conversational quality may degrade with complex or multi-turn inputs.

## How to Get Started with the Model

```python
from transformers import AutoModelForCausalLM, AutoTokenizer, TextStreamer
from peft import PeftModel

base_id = "unsloth/Qwen3-0.6B"
adapter_id = "Rustamshry/Qwen3-0.6B-OpenMathReason"

# If a repository requires authentication, run `huggingface-cli login` first
# rather than hard-coding a token in the script.
tokenizer = AutoTokenizer.from_pretrained(base_id)
base_model = AutoModelForCausalLM.from_pretrained(
    base_id,
    device_map={"": 0},  # place the whole model on GPU 0
)

# Attach the LoRA adapter on top of the base weights.
model = PeftModel.from_pretrained(base_model, adapter_id)

question = "Solve (x + 2)^2 = 0"
messages = [{"role": "user", "content": question}]

# Build the prompt with the chat template; enable_thinking=True lets the model
# write out its reasoning before the final answer.
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
    enable_thinking=True,
)

# Stream the generated tokens to stdout as they are produced.
inputs = tokenizer(text, return_tensors="pt").to(model.device)
_ = model.generate(
    **inputs,
    max_new_tokens=2048,
    do_sample=True,  # sampling must be enabled for temperature/top_p/top_k to apply
    temperature=0.6,
    top_p=0.95,
    top_k=20,
    streamer=TextStreamer(tokenizer, skip_prompt=True),
)
```
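
With `enable_thinking = True`, Qwen3 models write their reasoning between `<think>` and `</think>` tags before the final answer. The snippet above only streams text to stdout, so the following is a minimal sketch of how the two parts could be separated if you decode the generated tokens into a string instead. It assumes the tags survive decoding; `split_thinking` is a hypothetical helper, and the sample string is hand-written for illustration, not real model output.

```python
def split_thinking(output: str) -> tuple[str, str]:
    """Split decoded model output into (reasoning, answer).

    Everything before the first closing tag is treated as reasoning; the rest is the answer.
    """
    open_tag, close_tag = "<think>", "</think>"
    head, sep, tail = output.partition(close_tag)
    if not sep:
        # No closing tag found: treat the whole output as the answer.
        return "", output.strip()
    reasoning = head.replace(open_tag, "", 1).strip()
    return reasoning, tail.strip()


# Hand-written sample in the expected shape (not actual model output).
sample = "<think>\n(x + 2)^2 = 0 means x + 2 = 0.\n</think>\n\nx = -2"
reasoning, answer = split_thinking(sample)
print("Reasoning:", reasoning)
print("Answer:", answer)
```

Keeping the reasoning separate makes it easy to show only the final answer in a chat interface while still logging the full trace.
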
## Training Details

### Training Data

- unsloth/OpenMathReasoning-mini: 10k+ instruction-following examples focused on math.
- mlabonne/FineTome-100k: 100k examples of diverse, high-quality chat data.

### Training Procedure

- batch size: 8
- gradient accumulation steps: 2
- optimizer: adamw_torch
- learning rate: 2e-5
- warmup steps: 100
- fp16: True
- dataloader_num_workers: 16
- num_train_epochs: 1
- weight_decay: 0.01
- lr_scheduler_type: "linear"

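
As a rough illustration of what the hyperparameters above imply, the sketch below computes the effective batch size and the learning rate at a few steps under a linear schedule with warmup. It follows the usual definition of that schedule (linear ramp from 0 to the peak rate, then linear decay to 0). The single-device setup and the total step count are assumptions, since the card does not report them, and both helper functions are hypothetical.

```python
def effective_batch_size(batch_size: int, grad_accum_steps: int, num_devices: int = 1) -> int:
    """Number of examples contributing to each optimizer update."""
    return batch_size * grad_accum_steps * num_devices


def linear_warmup_lr(step: int, base_lr: float, warmup_steps: int, total_steps: int) -> float:
    """Linear warmup from 0 to base_lr, then linear decay to 0 at total_steps."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


# Values from the list above; a single device is assumed.
print("effective batch size:", effective_batch_size(8, 2))

# TOTAL_STEPS is a made-up number for illustration; the card does not state it.
TOTAL_STEPS = 1000
for step in (0, 50, 100, 550, 1000):
    lr = linear_warmup_lr(step, base_lr=2e-5, warmup_steps=100, total_steps=TOTAL_STEPS)
    print(f"step {step:>4}: lr = {lr:.2e}")
```

Under these assumptions each update sees 16 examples, and the learning rate peaks at 2e-5 once the 100 warmup steps are complete.
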
### Results

- Loss value: 0.56

### Framework versions

- PEFT 0.14.0