ToolRadar

UGI Leaderboard

Ranks language models on how uncensored they are and how well they follow instructions, two dimensions that mainstream benchmarks skip. Useful for AI engineers evaluating models for applications where safety filtering is a liability rather than a feature.