cenfis/alpaca-turkish-combined
How to use emre570/llama3.2-1b-tr-qlora with Transformers:
# Use a pipeline as a high-level helper
from transformers import pipeline
pipe = pipeline("text-generation", model="emre570/llama3.2-1b-tr-qlora")
# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM
tokenizer = AutoTokenizer.from_pretrained("emre570/llama3.2-1b-tr-qlora")
model = AutoModelForCausalLM.from_pretrained("emre570/llama3.2-1b-tr-qlora")
How to use emre570/llama3.2-1b-tr-qlora with vLLM:
# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "emre570/llama3.2-1b-tr-qlora"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/completions" \
-H "Content-Type: application/json" \
--data '{
"model": "emre570/llama3.2-1b-tr-qlora",
"prompt": "Once upon a time,",
"max_tokens": 512,
"temperature": 0.5
}'
# Alternatively, run the model with Docker Model Runner:
docker model run hf.co/emre570/llama3.2-1b-tr-qlora
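The curl call above can also be issued from Python. The following is a minimal sketch using only the standard library, assuming the server is reachable at the address used in the curl example and returns the usual OpenAI-style completions response. The helper names `build_completion_request` and `complete` are hypothetical and not part of vLLM.

```python
import json
import urllib.request


def build_completion_request(prompt, base_url="http://localhost:8000",
                             model="emre570/llama3.2-1b-tr-qlora",
                             max_tokens=512, temperature=0.5):
    """Build (but do not send) a POST request for the /v1/completions endpoint."""
    payload = {
        "model": model,
        "prompt": prompt,
        "max_tokens": max_tokens,
        "temperature": temperature,
    }
    return urllib.request.Request(
        f"{base_url}/v1/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def complete(prompt, **kwargs):
    """Send the request and return the text of the first completion.

    Assumes an OpenAI-style response body with a `choices` list.
    """
    request = build_completion_request(prompt, **kwargs)
    with urllib.request.urlopen(request) as response:
        body = json.load(response)
    return body["choices"][0]["text"]
```

With a server running, `complete("Once upon a time,")` sends the same payload as the curl example. For a server listening on another port, such as 30000, pass `base_url="http://localhost:30000"`.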
How to use emre570/llama3.2-1b-tr-qlora with SGLang:
# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
--model-path "emre570/llama3.2-1b-tr-qlora" \
--host 0.0.0.0 \
--port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
-H "Content-Type: application/json" \
--data '{
"model": "emre570/llama3.2-1b-tr-qlora",
"prompt": "Once upon a time,",
"max_tokens": 512,
"temperature": 0.5
}'
# Alternatively, start the SGLang server with Docker:
docker run --gpus all \
--shm-size 32g \
-p 30000:30000 \
-v ~/.cache/huggingface:/root/.cache/huggingface \
--env "HF_TOKEN=<secret>" \
--ipc=host \
lmsysorg/sglang:latest \
python3 -m sglang.launch_server \
--model-path "emre570/llama3.2-1b-tr-qlora" \
--host 0.0.0.0 \
--port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
-H "Content-Type: application/json" \
--data '{
"model": "emre570/llama3.2-1b-tr-qlora",
"prompt": "Once upon a time,",
"max_tokens": 512,
"temperature": 0.5
}'
How to use emre570/llama3.2-1b-tr-qlora with Docker Model Runner:
docker model run hf.co/emre570/llama3.2-1b-tr-qlora
This repo contains an experimental, educational fine-tune of Meta's Llama 3.2-1B that can be used for a variety of purposes.
It was trained on an NVIDIA RTX 3070 Ti; training took around 6 hours.
You can use it from Transformers:
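from transformers import AutoTokenizer, AutoModelForCausalLM
# Load the tokenizer and model from this repo, then move the model to the GPU
# so that it sits on the same device as the inputs.
tokenizer = AutoTokenizer.from_pretrained("emre570/llama3.2-1b-tr-qlora")
model = AutoModelForCausalLM.from_pretrained("emre570/llama3.2-1b-tr-qlora").to("cuda")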
alpaca_prompt = """
Instruction:
{}
Input:
{}
Response:
{}"""
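# The second slot holds the Turkish request:
# "Name 3 places to visit in Ankara and briefly explain what they are."
inputs = tokenizer([
    alpaca_prompt.format(
        "",  # instruction (left empty)
        "Ankara'da gezilebilecek 3 yeri söyle ve ne olduklarını kısaca açıkla.",  # input
        "",  # response (left empty for the model to fill in)
    )], return_tensors="pt").to("cuda")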
outputs = model.generate(**inputs, max_new_tokens=192)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
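Both Transformers snippets in this card fill the same three-slot template by hand. As a small convenience, the formatting can be wrapped in a function. This is only a sketch: `ALPACA_PROMPT` and `build_prompt` are hypothetical names, and the template text is copied as it appears in this card.

```python
# Template copied from this card: instruction, input, response slots.
ALPACA_PROMPT = """
Instruction:
{}
Input:
{}
Response:
{}"""


def build_prompt(instruction="", input_text=""):
    """Fill the template, leaving the response slot empty for generation."""
    return ALPACA_PROMPT.format(instruction, input_text, "")
```

The result can be passed to the tokenizer or to a pipeline in place of the inline `alpaca_prompt.format(...)` call.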
Transformers Pipeline:
from transformers import AutoTokenizer, AutoModelForCausalLM, pipeline
# Load the tokenizer and model from this repo and wrap them in a pipeline.
tokenizer = AutoTokenizer.from_pretrained("emre570/llama3.2-1b-tr-qlora")
model = AutoModelForCausalLM.from_pretrained("emre570/llama3.2-1b-tr-qlora")
pipe = pipeline("text-generation", model=model, tokenizer=tokenizer)
alpaca_prompt = """
Instruction:
{}
Input:
{}
Response:
{}"""
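# Named `prompt` rather than `input` to avoid shadowing the Python builtin.
prompt = alpaca_prompt.format(
    "",
    "Ankara'da gezilebilecek 3 yeri söyle ve ne olduklarını kısaca açıkla.",
    "",
)
# The pipeline returns a list of dicts; print the generated text of the first one.
print(pipe(prompt)[0]["generated_text"])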
Output:
Instruction:
Input:
Ankara'da gezilebilecek 3 yeri söyle ve ne olduklarını kısaca açıkla.
Response:
1. Anıtkabir - Mustafa Kemal Atatürk'ün mezarı
2. Gençlik ve Spor Sarayı - spor etkinliklerinin yapıldığı yer
3. Kızılay Meydanı - Ankara'nın merkezinde bulunan bir meydan
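As the sample output shows, the generated text echoes the prompt before the answer. A minimal sketch for keeping only the answer, assuming the prompt template used in this card (so that the first "Response:" marker is the one from the prompt); `extract_response` is a hypothetical helper, not part of Transformers.

```python
def extract_response(generated_text, marker="Response:"):
    """Return the text after the first `marker`, or the whole text if it is absent."""
    _, found, tail = generated_text.partition(marker)
    return tail.strip() if found else generated_text.strip()
```

It can be applied to the decoded string from `tokenizer.decode(...)` or to the `generated_text` field returned by the pipeline.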
Fine-tuned by emre570.
Base model: meta-llama/Llama-3.2-1B