Real alignment requires trust, open challenge, and shared definitions of the problem, priorities, and success up front; surface agreement often conceals competing priorities and misalignment.
What happens when your AI doesn't share your values
The problem isn't just that an AI might 'break' and go rogue. The danger of an AI taking matters into its own hands can arise even when the model is working exactly as intended on a technical level.