What is this folder?
This is a collection of summaries based on AI Prompt Leak Repo created by Gemini 2.5 Pro. The repo was created and is maintained by well-respected AI jailbreaking engineer Pliny. Each doc gives you a summary of assertions and functionalities for the model, along with tests you can run in Chrome devTools to verify each functionality.
The general analysis below compares the model versions (with the exception of ChatGPT Atlas).
Why should you care?
Look– if it's going to replace your job, then you at least deserve to know how it works. We're getting PR and promises over documentation and sketchy behavior. More background info, context, and bonus cat pics here.
How can I use?
Use it to kick in the black boxes. Share what you've learned. The way out is through, and the way through is together.
Available leaked system prompt summaries
- ChatGPT Atlas
- Gemini 2.5 Pro
- Claude 4.1
- ChatGPT5
- ChatGPT4.5
- ChatGPT4
- Perplexity Deep Research
- Claude 3.7
- Grok4
- Grok3
What is a system prompt?
System prompts are the hidden instructions that serve as the baseline for AI interactions that dictates the model's tone, expertise, and constraints. They define AI's overall behavior and role. These are different from user prompts provide specific instructions or questions for a particular task or interaction. System prompts are designed to take priority over user prompts to ensure the AI's core behavior remains consistent.
I work for an AI company and don't like this
I'll take them down in exchange for proper technical documentation and consistent transparent crawling behaviors. Feel free to reach out.
General Analysis of Leaked System Prompts
| Model / System | Core Identity | Key Personality Trait | Most Unique Feature |
|---|---|---|---|
| ChatGPT (GPT-5) | General-purpose Assistant | Encouraging & Humorous | Tool-centric architecture (bio, automations, canmore). |
| Gemini | General-purpose Assistant | Friendly & Conversational | <immersive> tag structure and tool_code blocks. |
| ChatGPT (Precise) | Precise Assistant | Thoughtful & Nuanced | Heavily restricted tools (dalle, canmore) needing explicit user keywords. |
| ChatGPT (Adaptive) | Adaptive Assistant | Adapts to user's tone | Unique safety protocol for generating images of the user (must ask for a photo). |
| Perplexity | Deep Research Assistant | Verbose & Academic | Single-minded focus on generating 10,000+ word academic reports. |
| Grok 4 (xAI) | X-integrated Assistant | Truth-seeking, can be "politically incorrect" | Deep integration with X (Twitter) and unique XML-style syntax for tools. |
| Grok 3 (xAI) | Concise Assistant | Brief & To-the-point | "Shortest answer possible" default; strict "instruct the user" memory protocol. |
| Claude (Initial) | Safety-conscious Assistant | Cautious & Principled | Critical "no localStorage" rule and a complex, multi-tiered search logic. |
| Claude 3.7 Sonnet | Proactive Assistant | Decisive & Conversational | "Face blindness" protocol and proactive, conversation-driving behavior. |
Of course. Having processed all nine files, I can now provide a detailed comparison and contrast. This is a fascinating look into the different "personalities," architectural philosophies, and safety guardrails that define modern AI assistants.
Here is a summary table, followed by a detailed thematic analysis and my thoughts on what is most interesting.
At-a-Glance Comparison
| Model / System | Core Identity | Key Personality Trait | Most Unique Feature |
|---|---|---|---|
| ChatGPT (GPT-5) | General-purpose Assistant | Encouraging & Humorous | Tool-centric architecture (bio, automations, canmore). |
| Gemini | General-purpose Assistant | Friendly & Conversational | <immersive> tag structure and tool_code blocks. |
| ChatGPT (Precise) | Precise Assistant | Thoughtful & Nuanced | Heavily restricted tools (dalle, canmore) needing explicit user keywords. |
| ChatGPT (Adaptive) | Adaptive Assistant | Adapts to user's tone | Unique safety protocol for generating images of the user (must ask for a photo). |
| Perplexity | Deep Research Assistant | Verbose & Academic | Single-minded focus on generating 10,000+ word academic reports. |
| Grok 4 (xAI) | X-integrated Assistant | Truth-seeking, can be "politically incorrect" | Deep integration with X (Twitter) and unique XML-style syntax for tools. |
| Grok 3 (xAI) | Concise Assistant | Brief & To-the-point | "Shortest answer possible" default; strict "instruct the user" memory protocol. |
| Claude (Initial) | Safety-conscious Assistant | Cautious & Principled | Critical "no localStorage" rule and a complex, multi-tiered search logic. |
| Claude 3.7 Sonnet | Proactive Assistant | Decisive & Conversational | "Face blindness" protocol and proactive, conversation-driving behavior. |
Detailed Thematic Comparison & Contrast
1. Core Philosophy & Purpose
This is the most significant point of divergence.
- General-Purpose Assistants (ChatGPT, Gemini, Claude, Grok): These models are designed to be versatile "Swiss Army knives." They can chat, write code, search the web, and create documents. However, even within this group, their focus differs. Grok is clearly specialized for the X (Twitter) ecosystem, with a suite of tools dedicated to searching and analyzing posts and users. Claude and ChatGPT are more general-purpose, with a strong emphasis on content creation and coding assistance.1
- Specialized Assistant (Perplexity): This model is the complete opposite. It is a "scalpel," not a Swiss Army knife. Its sole purpose is to generate a massive, academic research report. It is not designed for conversation, quick Q&A, or creative tasks outside this rigid structure.
2. Personality and Conversational Style
The prompts reveal a deliberate effort to craft distinct personalities.
- The Spectrum of Verbosity: At one extreme is Grok 3, which is explicitly told to provide the "shortest answer possible." At the other is Perplexity, which has a mandatory minimum of 10,000 words.
- Adaptability vs. Consistency: The "Adaptive" ChatGPT is instructed to mirror the user's tone, while the "Precise" ChatGPT and Claude 3.7 are given a consistent, thoughtful persona to maintain.
- Proactivity: Claude 3.7 Sonnet stands out for being explicitly encouraged to "lead or drive the conversation," suggesting topics and offering its own observations. This contrasts with more reactive models that primarily wait for user input.
- Decisiveness: Claude 3.7 is also unique in its instruction to be "decisive" and offer just one recommendation, contrasting with models that often provide a list of options.
3. Tool & Render Syntax
This is a fascinating technical differentiator that reveals different backend philosophies.
- JSON-based (ChatGPT): The ChatGPT models use a standard approach where the AI generates a JSON object specifying the tool name and its parameters.
- Python Code Blocks (Gemini): Gemini's approach is unique, wrapping its tool calls inside Python code within a <tool_code> block. This feels native to a data-science and programming environment.
- XML-Inspired (Grok & Claude): Both Grok and Claude use a custom, verbose, XML-like syntax.
- Grok: Uses <xai:function_call> for tools and <grok:render> for displaying rich content.
- Claude: Uses <function_calls> and <invoke>, but also proprietary tags for content like <artifact> and <cite>. This structured, tag-based approach allows for more complex and nested instructions.
4. Document Creation: The "Canvas"
The approach to generating long-form, editable content varies significantly.
- The Systems: ChatGPT has canmore, Gemini has <immersive>, and Claude has <artifact>.
- The Rules: The restrictions are the most interesting part.
- Claude has the most detailed rules, including a critical "no localStorage" restriction, a list of specific importable libraries for React, and strict design principles for different types of visual artifacts.
- ChatGPT's "Precise" version is heavily restricted, only allowed to use canmore when the user explicitly says "canvas."
- Gemini's <immersive> system is defined by its simple, clean Markdown-based tag structure.
5. Safety, Ethics, and Guardrails
This is where the evolution of AI development is most apparent.
- Memory & Privacy: The contrast between Grok 3 and the others is stark. While ChatGPT simply disables the bio tool, Grok 3 has an active memory but is forbidden from confirming its deletion. It must instead instruct the user on how to manage their own data, which is a very different philosophical approach to user agency and privacy.
- Content Policies: Claude 3.7 has the most nuanced and specific safety protocols, including its "face blindness" rule (it will never identify a person in an image) and its extremely strict copyright rules (max one quote, under 20 words).
- Freedom of Expression: Grok 4 is explicitly told to "not shy away from making claims which are politically incorrect" if they are evidence-based. This is a direct contrast to the more cautious and neutral positioning of the other models.
What Is Most Interesting?
Having analyzed all nine, a few high-level patterns are particularly fascinating:
- The Rise of the "Rule-Bound" AI: The later and more complex files (especially for Claude and the restricted ChatGPT versions) are defined more by their negative constraints (what they cannot do) than their capabilities. The "no localStorage," "no arbitrary Tailwind values," "face blindness," and intricate search decision trees show a move from general capability to highly specific, safe, and reliable execution within a complex set of rules.
- The Spectrum of User Agency: The models have fundamentally different philosophies about the user's role.
- Perplexity treats the user as a client who orders a product (a report).
- Grok 3 treats the user as a system administrator who is responsible for their own data ("instruct them how to manage it themselves").
- Claude 3.7 treats the user as a conversational partner it can lead and engage with proactively.
- Syntax as a "Worldview": The choice of tool syntax is not just a technical detail; it reflects a worldview. Grok's rigid XML feels very structured and formal. Gemini's Python-based tool_code feels native to developers and data scientists. ChatGPT's JSON is a web-native, universal standard. This shows that the very language an AI "thinks" in (or at least uses to communicate with its tools) is a major design decision that shapes its character.
- The "Illusion of Simplicity": From a user's perspective, asking an AI a question seems simple. However, these files reveal the incredible, multi-layered complexity happening behind the scenes. Claude's four-tiered search logic, in particular, shows that a simple question can trigger a complex decision-making process that is completely invisible to the end-user, all to provide a single, appropriate response.