Sam Radford

Behind the scenes with the system prompts guiding Claude AI

Interesting to get a sneak peek at some of the rules defining what Claude AI (an alternative to ChatGPT) can and can’t do.

Every generative AI vendor, from OpenAI to Anthropic, uses system prompts to prevent (or at least try to prevent) models from behaving badly, and to steer the general tone and sentiment of the models’ replies…

…Anthropic, in its continued effort to paint itself as a more ethical, transparent AI vendor, has published the system prompts for its latest models…

…The latest prompts, dated July 12, outline very clearly what the Claude models can’t do — e.g. “Claude cannot open URLs, links, or videos.” Facial recognition is a big no-no; the system prompt for Claude Opus tells the model to “always respond as if it is completely face blind” and to “avoid identifying or naming any humans in [images].”

Kyle Wiggers, author of the piece, then concludes:

If the prompts for Claude tell us anything, it’s that without human guidance and hand-holding, these models are frighteningly blank slates.