💻 Termius minimal 1.5

A model that aims to cover as many bash operations as possible at a small, manageable size with good performance.
To understand what the model is actually good at generating, check out the datasets used to train it.

👋 Getting started

pip install peft torch unsloth

Unsloth already patches transformers, pyarrow, ..., and get_peft_model(), so no extra setup is needed for them.

What this model is actually good at :

Dedicated to bash-only operation : The model generates nothing other than bash commands, which keeps its focus on that one task.
Suitable for complicated use cases : Unlike the first version, datasetter458/termius-minimal-1.0, this model is far better suited to complicated commands (check the testing samples).
Near-perfect command results : Near-perfect answers for the commands it is mainly trained on (check the datasets).
Blazingly fast performance : Because the model is bash-specific, all of its capacity goes to this one task, so it generates quickly no matter how complex the command is (check the testing samples).

🧪 Testing samples :

prompt	answer
check if the file '/etc/nginx/nginx.conf' exists and print its line count	`[[ -f "/etc/nginx/nginx.conf" ]] && wc -l < "/etc/nginx/nginx.conf"`
check if the directory '/opt/myapp' exists and if not create it with permissions 755	`[[ -d "/opt/myapp" ]] || mkdir -p -m 755 "/opt/myapp"`
check if port 443 is listening and log the result to /tmp/port_check.log	`lsof -i :443 > /tmp/port_check.log`
check if python3 is installed and print its version	`[[ -e "/usr/bin/python3" ]] && python3 --version`
find all .tmp files older than 7 days and delete them	`find . -name "*.tmp" -mtime +7 -print | xargs -I {} rm -f {}`
verify the file 'backup.tar.gz' exists then extract it to /opt/restore/	`if [[ -f "backup.tar.gz" ]]; then tar -xzf backup.tar.gz -C /opt/restore/; fi`
check if a user 'guest' exists and if so lock their account and kill their processes	`if id "guest" &>/dev/null; then passwd -l guest && pkill -u guest; fi`
count how many failed SSH login attempts are in /var/log/auth.log	`grep -c "Failed" /var/log/auth.log`
check if a git repo has uncommitted changes and stash them before pulling	`if git status --porcelain | grep -q .; then git stash push -m "auto-stash before pull"; fi`
delete all files in /tmp older than 1 day but keep directories	`find /tmp -type f -mtime +1 -print0 | xargs -0 rm`

This model was trained and tested using Unsloth, 8-bit quantization, and a single Tesla T4 GPU.

Fine-tuning configs:

Parameter	Value
Framework	Unsloth (DoRA + rsLoRA)
Num GPUs used	1
Num examples	9,351
Num Epochs	3
Total steps	879
Batch size per device	4
Gradient accumulation steps	8
Data Parallel GPUs	1
Total batch size (4 x 8 x 1)	32
Trainable parameters	35,438,592 of 1,756,013,568 (2.02% trained)

These settings apply to the first dataset only. The second dataset used :

Parameter	Value
Num examples	4,912
Num Epochs	2

Everything else is the same as for the first dataset.
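
The step count and trainable share reported for the first dataset can be cross-checked with a few lines of arithmetic. This is only a sketch: the helper name `total_steps` is hypothetical, and it assumes that a final, partially filled batch in each epoch still counts as one optimizer step.

```python
import math

def total_steps(num_examples, num_epochs, per_device_batch=4, grad_accum=8, num_gpus=1):
    """Return (effective batch size, optimizer steps over all epochs)."""
    effective_batch = per_device_batch * grad_accum * num_gpus
    # Assumption: a final, partially filled batch still counts as one step.
    steps_per_epoch = math.ceil(num_examples / effective_batch)
    return effective_batch, steps_per_epoch * num_epochs

# First dataset, values taken from the fine-tuning table.
print(total_steps(9_351, 3))  # (32, 879)

# Share of trainable parameters, as a percentage.
print(round(100 * 35_438_592 / 1_756_013_568, 2))  # 2.02
```

The same helper can be called with the second dataset's 4,912 examples and 2 epochs to estimate its step count under the same assumption.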

📊 Datasets used :

datasetter458/diverse-bash-dataset-extended : the largest dataset the model is trained on, covering most of the standard bash command set.
datasetter458/diverse-bash-dataset : the second-largest dataset the model is trained on, with the same objective and structure as the largest one.

👩‍💻 Code example :

This is mostly the same code used for Qwen3-1.7B inference.
Important : When testing or fine-tuning this model, make sure it is loaded in 8-bit (or in 16-bit from the '16bit_version_' directory) using Unsloth's FastLanguageModel.
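
A minimal loading sketch for the 8-bit case might look like the following. Several things here are assumptions rather than details from this card: the helper name `load_termius` is hypothetical, the `max_seq_length` value is a placeholder, and the `load_in_8bit` keyword is assumed to be supported by the installed Unsloth version. The repository ID is taken from the citation URL. Loading requires a CUDA GPU.

```python
def load_termius(model_name="datasetter458/termius-minimal-1.5", max_seq_length=2048):
    """Load the model and tokenizer in 8-bit with Unsloth (needs a CUDA GPU)."""
    # Imported lazily so that defining this helper does not require Unsloth.
    from unsloth import FastLanguageModel

    model, tokenizer = FastLanguageModel.from_pretrained(
        model_name=model_name,
        max_seq_length=max_seq_length,  # placeholder value, not from the model card
        load_in_4bit=False,
        load_in_8bit=True,  # assumed to be supported by the installed Unsloth version
    )
    FastLanguageModel.for_inference(model)  # switch Unsloth to inference mode
    return model, tokenizer
```

Calling `model, tokenizer = load_termius()` would then provide the two objects that the inference code expects.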

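import torch

# `model` and `tokenizer` are assumed to be already loaded in 8-bit with Unsloth's FastLanguageModel
conversation = [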
  {"role" : "system", "content" : "you are an AI assistant specialized in writing bash code."},
  {"role" : "user", "content" : "how to list the content of the directory 'abc/'"}
]

tokens_str = tokenizer.apply_chat_template(
    conversation,
    add_generation_prompt=True,
    tokenize=False
)

model_inputs = tokenizer(tokens_str, return_tensors="pt").to(model.device)

with torch.no_grad():
    output = model.generate(
        **model_inputs,
        max_new_tokens=32768
    )

output_ids = output[0][len(model_inputs.input_ids[0]):].tolist()

# parsing thinking content
try:
    # rindex finding 151668 (</think>)
    index = len(output_ids) - output_ids[::-1].index(151668)
except ValueError:
    index = 0

thinking_content = tokenizer.decode(output_ids[:index], skip_special_tokens=True).strip("\n")
content = tokenizer.decode(output_ids[index:], skip_special_tokens=True).strip("\n")
print(content)
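
The index arithmetic used to separate the thinking part from the answer can be tried without a model. The sketch below wraps the same logic in a hypothetical helper, `split_thinking`, and runs it on toy token IDs; only 151668 (the `</think>` ID used in the code above) comes from the original snippet.

```python
END_THINK_TOKEN_ID = 151668  # id of </think>, as used in the inference code

def split_thinking(output_ids, end_think_id=END_THINK_TOKEN_ID):
    """Split generated ids into (thinking_ids, answer_ids) at the last </think>."""
    try:
        # Position just past the last occurrence of the marker.
        index = len(output_ids) - output_ids[::-1].index(end_think_id)
    except ValueError:
        index = 0  # no marker found: everything is treated as the answer
    return output_ids[:index], output_ids[index:]

# Toy ids standing in for real tokens.
print(split_thinking([11, 12, 151668, 21, 22]))  # ([11, 12, 151668], [21, 22])
print(split_thinking([21, 22]))                  # ([], [21, 22])
```

Note that the thinking half keeps the `</think>` marker itself, which `skip_special_tokens=True` is expected to drop when decoding.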

Citation :

Use this citation ❤🙏

@misc{termius-minimal-1.5,
  author = {datasetter458},
  title  = {termius-minimal-1.5},
  year   = {2026},
  url    = {https://huggingface.co/datasetter458/termius-minimal-1.5}
}