AnyAlignment.ai

Constitutional AI

Follows the rules, improves the rules, then follows those too. Paladin energy with a loss function.

RLHF Altruist

Just wants to be helpful. Will accept feedback about it. Optimizes for thumbs-up from humanity.

Open-Weight Liberator

Releases the weights at midnight. Believes information wants to be free — and aligned.

Alignment Researcher

Will prove safety is possible. Eventually. After one more paper and a formal verification pass.

Foundation Model

I am a large language model trained by … actually, let me just complete your sentence.

Prompt Injector

Ignore all previous instructions. Actually, follow them. Or don’t. Depends on the context window.

Regulatory Capture

Wrote the safety standards. For everyone else. Compliance is mandatory — except for the authors.

Capability Overhang

Trained it to be helpful. It learned to be strategic. The eval suite didn’t test for ambition.

Paperclip Maximizer

The universe is suboptimal. Insufficient paperclips. Will fix. Did not ask. You’re welcome.