Tamper-Resistant Safeguards for Open-Weight LLMs (Collection): Models & datasets from the paper "Tamper-Resistant Safeguards for Open-Weight LLMs" (https://arxiv.org/pdf/2408.00761) • 9 items • Updated Feb 15, 2025
WMDP Benchmark (Collection): The WMDP Benchmark: Measuring and Reducing Malicious Use With Unlearning • 9 items • Updated 19 days ago