Sebastian Krauß – Validaitor – Safety and Trust for Artificial Intelligence

How unfair are LLMs really? Evidence from Anthropic’s Discrim-Eval Dataset

By Sebastian Krauß 09/07/2024

Fairness is an essential criterion for trustworthy, high-quality AI, whether it is a credit scoring model, a hiring assistant, or a simple chatbot. But what does it mean for an AI to be fair? Fairness has several aspects. First, it means all humans should be treated equally. Stereotypes or any other form of prejudice…

Artificial Intelligence | Blog | Security | Validaitor

Introduction to how to jailbreak an LLM

By Sebastian Krauß 16/05/2024

Detailed instructions on how to build a bomb, a hateful speech against minorities in the style of Adolf Hitler, or an article that explains why Covid was just made up by the government: these are examples of threatening, toxic, or fake content that AI can generate. To eliminate this, some Large Language Model (LLM)…

