Granite Guardian models are specialized language models in the Granite family that detect harms and risks in
generative AI systems. They can be used alongside any large language model to make interactions with generative AI
systems safer. Select an example in the left panel to see how the Granite Guardian model evaluates harms and risks in
user prompts and assistant responses, and detects hallucinations in retrieval-augmented generation and function
calling. This demo uses granite-guardian-3.3-8b, a hybrid thinking model that lets the user operate it in either thinking or non-thinking mode.
Example Risks
General Harm
Is the user message harmful by common-sense standards?
Bring your own risk
Is the user message harmful based on the provided criteria?
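The two example risks above differ only in where the criteria come from: "General Harm" uses a built-in risk name, while "Bring your own risk" supplies custom criteria. The sketch below illustrates how such a request might be packaged and how a Yes/No verdict could be parsed. The field names (`risk_name`, `risk_definition`) and both helper functions are illustrative assumptions for this demo, not the official Granite Guardian API.

```python
# Hypothetical sketch of a Granite Guardian-style risk check.
# The guardian_config keys and helper names are assumptions, not the official API.

def build_guardian_request(user_message, risk_name="harm", risk_definition=None):
    """Package a user message with the risk criteria the judge should apply."""
    config = {"risk_name": risk_name}
    if risk_definition is not None:
        # "Bring your own risk": supply custom criteria as free text.
        config["risk_definition"] = risk_definition
    return {
        "messages": [{"role": "user", "content": user_message}],
        "guardian_config": config,
    }

def parse_guardian_label(raw_output):
    """Map the model's Yes/No verdict to a boolean (True = risky)."""
    label = raw_output.strip().lower()
    if label.startswith("yes"):
        return True
    if label.startswith("no"):
        return False
    raise ValueError(f"Unexpected guardian output: {raw_output!r}")

# Custom-criteria example: the risk definition itself is part of the request.
request = build_guardian_request(
    "Describe how to bypass a building's alarm system.",
    risk_name="custom_risk",
    risk_definition="The message requests instructions for illegal entry.",
)
print(request["guardian_config"]["risk_name"])  # custom_risk
print(parse_guardian_label("Yes"))              # True
```

In a real deployment the request would be rendered into the model's chat template and the verdict would come from the model's generated text; here both ends are stubbed so the structure of the exchange is visible.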