japanese-soseki-gpt2-1b
This repository provides a 1.3B-parameter fine-tuned Japanese GPT-2 model. It was fine-tuned by jweb from a base model trained by rinna Co., Ltd. Both PyTorch (pytorch_model.bin) and Rust (rust_model.ot) weights are provided.
How to use the model
NOTE: Use T5Tokenizer to instantiate the tokenizer.
```python
import torch
from transformers import T5Tokenizer, AutoModelForCausalLM

tokenizer = T5Tokenizer.from_pretrained("jweb/japanese-soseki-gpt2-1b")
model = AutoModelForCausalLM.from_pretrained("jweb/japanese-soseki-gpt2-1b")

if torch.cuda.is_available():
    model = model.to("cuda")

text = "夏目漱石は、"
token_ids = tokenizer.encode(text, add_special_tokens=False, return_tensors="pt")

with torch.no_grad():
    output_ids = model.generate(
        token_ids.to(model.device),
        max_length=128,
        min_length=40,
        do_sample=True,
        repetition_penalty=1.6,
        early_stopping=True,
        num_beams=5,
        temperature=1.0,
        top_k=500,
        top_p=0.95,
        pad_token_id=tokenizer.pad_token_id,
        bos_token_id=tokenizer.bos_token_id,
        eos_token_id=tokenizer.eos_token_id,
    )

output = tokenizer.decode(output_ids.tolist()[0])
print(output)
# sample output: 夏目漱石は、明治時代を代表する文豪です。夏目漱石の代表作は「吾輩は猫である」や「坊っちゃん」、「草枕」「三四郎」、それに「虞美人草(ぐびじんそう)」などたくさんあります。
```
```rust
use rust_bert::gpt2::GPT2Generator;
use rust_bert::pipelines::common::{ModelType, TokenizerOption};
use rust_bert::pipelines::generation_utils::{GenerateConfig, LanguageGenerator};
use rust_bert::resources::{RemoteResource, ResourceProvider};
use tch::Device;

fn main() -> anyhow::Result<()> {
    let model_resource = Box::new(RemoteResource {
        url: "https://huggingface.co/jweb/japanese-soseki-gpt2-1b/resolve/main/rust_model.ot".into(),
        cache_subdir: "japanese-soseki-gpt2-1b/model".into(),
    });
    let config_resource = Box::new(RemoteResource {
        url: "https://huggingface.co/jweb/japanese-soseki-gpt2-1b/resolve/main/config.json".into(),
        cache_subdir: "japanese-soseki-gpt2-1b/config".into(),
    });
    let vocab_resource = Box::new(RemoteResource {
        url: "https://huggingface.co/jweb/japanese-soseki-gpt2-1b/resolve/main/spiece.model".into(),
        cache_subdir: "japanese-soseki-gpt2-1b/vocab".into(),
    });
    let vocab_resource_token = vocab_resource.clone();
    let merges_resource = vocab_resource.clone();

    let generate_config = GenerateConfig {
        model_resource,
        config_resource,
        vocab_resource,
        merges_resource, // not used
        device: Device::Cpu,
        repetition_penalty: 1.6,
        min_length: 40,
        max_length: 128,
        do_sample: true,
        early_stopping: true,
        num_beams: 5,
        temperature: 1.0,
        top_k: 500,
        top_p: 0.95,
        ..Default::default()
    };

    let tokenizer = TokenizerOption::from_file(
        ModelType::T5,
        vocab_resource_token.get_local_path().unwrap().to_str().unwrap(),
        None,
        true,
        None,
        None,
    )?;
    let mut gpt2_model = GPT2Generator::new_with_tokenizer(generate_config, tokenizer.into())?;
    gpt2_model.set_device(Device::cuda_if_available());

    let input_text = "夏目漱石は、";
    let t1 = std::time::Instant::now();
    let output = gpt2_model.generate(Some(&[input_text]), None);
    println!("{}", output[0].text);
    println!("Elapsed Time(ms):{}", t1.elapsed().as_millis());
    Ok(())
}
// sample output: 夏目漱石は、明治から大正にかけて活躍した日本の小説家です。彼は「吾輩は猫である」や「坊っちゃん」、「草枕」「三四郎」、あるいは「虞美人草」などの小説で知られていますが、「明暗」のような小説も書いていました。
```
Model architecture
A 24-layer, 2048-hidden-size transformer-based language model.
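As a rough sanity check, the layer count and hidden size above can be related to the 1.3B parameter figure. The sketch below assumes a standard GPT-2 layout (4x MLP expansion, biases, tied input/output embeddings). The vocabulary size (44,928) and context length (1,024) are assumptions for illustration and are not stated in this card; the actual values should be read from the repository's config.json. The function name `gpt2_param_count` is hypothetical.

```python
def gpt2_param_count(n_layer: int, d_model: int, vocab_size: int, n_positions: int) -> int:
    """Count parameters of a standard GPT-2 style decoder with tied embeddings."""
    # Attention: fused QKV projection (3*d*d + 3*d) plus output projection (d*d + d).
    attention = 4 * d_model * d_model + 4 * d_model
    # MLP: d -> 4d (4*d*d + 4*d) and 4d -> d (4*d*d + d).
    mlp = 8 * d_model * d_model + 5 * d_model
    # Two layer norms per block, each with a weight and a bias vector.
    layer_norms = 4 * d_model
    per_layer = attention + mlp + layer_norms

    # Token and position embeddings; the output head shares the token embedding.
    embeddings = vocab_size * d_model + n_positions * d_model
    final_layer_norm = 2 * d_model
    return n_layer * per_layer + embeddings + final_layer_norm


# 24 layers and hidden size 2048 come from this card;
# vocab_size and n_positions are assumed values (check config.json).
total = gpt2_param_count(n_layer=24, d_model=2048, vocab_size=44928, n_positions=1024)
print(f"{total:,} parameters (~{total / 1e9:.2f}B)")
```

Under these assumptions the transformer blocks account for about 1.2B parameters, so the total stays close to 1.3B for any plausible vocabulary size.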
Training
The model was trained on Japanese C4, Japanese CC-100 and Japanese Wikipedia to optimize a traditional language modelling objective. It reaches around 14 perplexity on a chosen validation set from the same data.
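For readers unfamiliar with the metric, perplexity is the exponential of the mean per-token negative log-likelihood (in nats). The sketch below is a minimal illustration of that definition only; it is not the evaluation script used for this model, and the `perplexity` helper and the sample values are hypothetical.

```python
import math
from typing import Sequence


def perplexity(token_nlls: Sequence[float]) -> float:
    """Return exp(mean negative log-likelihood) over the evaluated tokens."""
    if not token_nlls:
        raise ValueError("need at least one token loss")
    return math.exp(sum(token_nlls) / len(token_nlls))


# Illustrative values: a mean loss of ln(14) nats per token
# corresponds to a perplexity of 14.
mean_loss = math.log(14.0)
print(round(perplexity([mean_loss] * 100), 6))
```

In practice the per-token losses would come from the model's cross-entropy loss on held-out text, averaged over tokens before exponentiating.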
Finetuning
The model was fine-tuned on texts from Aozora Bunko, with an emphasis on the works of Natsume Soseki.
Tokenization
The model uses a sentencepiece-based tokenizer. The vocabulary was first trained on a selected subset from the training data using the official sentencepiece training script, and then augmented with emojis and symbols.
License
