Christopher L.
cluebbers
AI & ML interests
LLM alignment (DPO, RLHF, SFT), red-teaming & adversarial robustness, evaluation metrics & reproducibility, paraphrase generation, AI safety & incident forecasting, LoRA fine-tuning on consumer hardware
Organizations
None yet
Enhancing Paraphrase Type Generation
Enhancing Paraphrase Type Generation: The Impact of DPO and RLHF Evaluated with Human-Ranked Data
- cluebbers/Llama-3.1-8B-paraphrase-type-generation-apty-ipo
  Text Generation • 8B • Updated • 18
- cluebbers/Llama-3.1-8B-paraphrase-type-generation-etpc
  Text Generation • 8B • Updated • 3
- cluebbers/Llama-3.1-8B-paraphrase-type-generation-etpc-apty-reward
  Updated • 5
- cluebbers/Llama-3.1-8B-paraphrase-type-generation-apty-sigmoid
  Text Generation • 8B • Updated • 7
Algoverse Shutdown Resistance
Adversarial Paraphrasing
Models created using the repo https://github.com/cluebbers/adverserial-paraphrasing