- agentlans/prompt-safety-classification
  Viewer • Updated • 72.1k • 44
- Jammies-io/safety-refusal
  Viewer • Updated • 100 • 2
- RefusalBench: Generative Evaluation of Selective Refusal in Grounded Language Models
  Paper • 2510.10390 • Published • 4
- nvidia/Aegis-AI-Content-Safety-Dataset-2.0
  Viewer • Updated • 33.4k • 3.08k • 73
Daniel Bis
danielbis
AI & ML interests
https://scholar.google.com/citations?user=ArMgXHYAAAAJ&hl=en
Recent Activity
updated a collection 13 days ago: safety
updated a collection 14 days ago: safety
updated a collection 14 days ago: safety
Organizations
None yet