The Evolution of Prompts: From Completion to Systems

Prompts have evolved from pattern-based completion to outcome-focused instructions. The practical takeaway: provide the simplest, clearest description of the finished product and its success criteria, and let the model work out how to get there.

What Is a Prompt, Really? How Our Relationship With AI Instructions Has Evolved

“Prompt” is an interesting word, and its meaning has changed a lot over time.

When I first started at OpenAI, my job title was “creative applications,” which was pretty meaningless. It mostly reflected that I was the person running around trying a bunch of different ways you could use the model—building demo stuff on different devices and in different applications, just to see what was possible.

Internally, though, I was also known as “the prompt whisperer.” I took some pride in that, because I’d spent a lot of time thinking about how the models worked, how they interacted with text, and how you might coax them into doing specific things. I became the person people went to for questions like, “Hey, how would you do this?” or “What would you try here?”

In all honesty, it was probably because I was the only person who had the time to sit with it and experiment.

Over time, I’ve come to define a prompt as:

The most efficient set of instructions you need to get a specific outcome from a model.

That sounds simple, but what counts as “efficient” and what counts as “instructions” has changed dramatically as the models have evolved.

In this article, I’ll walk through that evolution:

  1. Early prompting with base models: patterns, not instructions
  2. The shift to conversational models: talking to it like a coworker
  3. The age of tools and agents: describing outcomes instead of steps
  4. What a prompt is now: the simplest, clearest definition
  5. Working with models like you work with people
  6. Where prompting is heading

1. Early Prompting With Base Models: Patterns, Not Instructions

When people use ChatGPT today, they’re interacting with a model that has been trained to behave like an assistant in a user–assistant role. It has seen countless examples of Q&A, back-and-forth dialogue, and helpful explanations.

That’s not how it was in the beginning.

Early on, we worked with base models. These weren’t trained to be chatbots. They were trained on a giant corpus of text and were essentially doing one thing: continuing a pattern.

To get them to do what you wanted, you didn’t say:

“Explain how quartz works.”

You’d have to do something more like:

“You are an article in a Western science journal explaining the properties of quartz.
Western Science Journal – Properties of Quartz
1.”

You were basically writing the first few bars of a song and asking the model to keep singing in key. If you gave it a good “first three lines,” it would try to complete that pattern in a way that felt consistent with the text it had seen during training.

That was what early prompting was:
Not “giving instructions,” but creating a starting point for a pattern the model could recognize and continue.
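To make the “continue the pattern” idea concrete, here’s a toy sketch of my own (a bigram table, nothing like the internals of a real base model): given the opening of a pattern, it keeps emitting whatever token most often followed in its “training” text.

```python
from collections import Counter, defaultdict

# Toy "base model": a bigram table built from a tiny corpus.
# Real base models are vastly more sophisticated, but the core job
# is the same - predict a likely continuation of the text so far.
corpus = (
    "1. Quartz is a hard crystalline mineral. "
    "2. Quartz is composed of silicon and oxygen. "
    "3. Quartz is common in the continental crust."
).split()

follows = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    follows[prev][nxt] += 1

def continue_pattern(prompt, n_tokens=5):
    """Greedily extend the prompt with the most frequent next token."""
    tokens = prompt.split()
    for _ in range(n_tokens):
        options = follows.get(tokens[-1])
        if not options:  # nothing ever followed this token
            break
        tokens.append(options.most_common(1)[0][0])
    return " ".join(tokens)

# A good "first few bars" locks the toy model into the pattern:
print(continue_pattern("2. Quartz"))
```

The point of the toy: there is no instruction-following anywhere in that code, only continuation. Give it a numbered-list opening and it produces a numbered-list item, because that is what the pattern implies.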

Why It Was So Weird

Back then, you could tell a model:

“Write me a story.”

And it might respond:

“No, write it yourself.”

Not because it was stubborn, but because “write me a story” didn’t clearly tell it what pattern it was in. That phrase could easily appear:

  • In a forum thread (“Stop being lazy, write it yourself.”)
  • In a chat log
  • In a public domain book
  • In some random discussion online

The model was just trying to find the most likely way to complete that pattern. It didn’t have an explicit rule like:

If user asks for a story, always produce a story.

It wasn’t trained to think, “I am an assistant. My job is to fulfill requests.”
It was just trying to predict: “What text usually comes next after text like this?”

So early prompting was about figuring out the patterns you could feed it to get the behavior you wanted, rather than simply telling it what to do.

The “Professor Trick”

One of the things we discovered was that if you shaped the pattern differently, you got much better answers.

For example:

  • If I asked the model to explain something directly, I might get a very basic explanation.
  • But if I wrote:

    “Professor So-and-so, head of Stanford’s Department of Chemistry, is explaining to his students how [X] works…”
    and then let it continue,
    the explanation would suddenly be much smarter, more detailed, and more coherent.

People found this confusing:
“Why doesn’t the model just give the good answer in the first place?”

The answer is: it wasn’t trained to be a chatbot. It was trained to continue text. When you gave it the “professor explaining to students” pattern, it could lock into that part of its internal space of knowledge and behavior and continue in a way that matched it.

This didn’t mean it “thinks” or “doesn’t think” in some human sense. It just meant:

  • It does interesting things with information.
  • It’s very sensitive to the pattern of the input.
  • Your job, as a “prompt whisperer,” was to find the patterns that unlock specific capabilities.
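The trick was purely textual: wrap the question in a framing the model would want to continue in an expert register. A minimal sketch of that wrapping (the function name and wording are my own illustration, not a canonical template):

```python
def professor_frame(topic, professor="Professor So-and-so",
                    affiliation="Stanford’s Department of Chemistry"):
    """Wrap a topic in an authoritative 'lecture' framing so a base
    model continues in that register, rather than completing a bare
    question however the internet usually completes it."""
    return (
        f"{professor}, head of {affiliation}, "
        f"is explaining to his students how {topic} works. "
        f"He begins:\n\n"
    )

# Bare prompt vs. pattern-framed prompt for the same question:
bare = "Explain how quartz forms."
framed = professor_frame("quartz formation")
```

Fed the `framed` version, a base model would continue the lecture; fed the `bare` version, it might continue with anything that plausibly follows such a sentence online.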

2. From Alien Text Engine to Conversational Partner

As time went on, models were trained not just on raw text, but also on examples of conversations, instructions, Q&A, and so on. They learned:

  • When a human asks a question, you answer it.
  • When given a task, you try to complete it.
  • When given a role (“You are a helpful assistant…”), you behave accordingly.

We moved from:

  • “Communicating with a very alien text engine”
    to
  • “Talking to something that understands the idea of being helpful.”

Now, instead of constructing a rich pattern like a fake journal article just to get a decent answer, you could give the model general instructions—the kind you’d give a capable coworker or friend:

“Explain this to me at a high school level.”
“Compare these two approaches and recommend one.”
“Write an outline for a 5,000-word article about X.”

Because the model had seen so many examples of interactions like that, it had a much better idea of:

  • What you meant
  • What kind of answer you were expecting
  • How to format it in a way that felt familiar and useful

Prompting became less about “impersonating a genre” and more about plain-language instructions.

Where It Still Goes Wrong

Even now, models are still fundamentally trying to find the most likely good answer. That works very well in the middle of a domain, but can fall apart at the edges.

Some reasons:

  • Ambiguous terms. For example, “NLP” can mean:

    • Natural Language Processing in computer science
    • Neuro-Linguistic Programming in a completely different context

    If the model can’t tell which world you’re in, it might mix them up or give an answer from the wrong domain.

  • Edge cases. If you’re working in lesser-known territory or using unusual definitions, the model may simply not understand what you want, because most of its training data points toward something else.

So even in the “assistant” era, a good prompt still requires clarity:

  • What domain are we in?
  • How should the model interpret key terms?
  • What role should it be taking?

3. The Age of Tools, Skills, and Agents: Prompts as Outcomes

Now we’re in a very different phase.

Models don’t just:

  • Continue patterns, or
  • Act like assistants in a chat.

They can also:

  • Use tools (like web search, code execution, or other APIs)
  • Coordinate with “skills” or external functions
  • Behave more like agents that can plan and execute multi-step processes

This shifts what a “prompt” needs to be.

Instead of painstakingly telling the model:

  • What pattern to mimic, and
  • Step-by-step how to solve the problem,

you can increasingly just tell it:

“This is the outcome I want.
This is how we’ll know you succeeded.”

For example, instead of:

“Write a fun game.”

You might say:

“Design a game with:

  • 10 levels
  • Increasing difficulty
  • A scoring system
  • Clear win and loss conditions

I’ll consider it good if:

  • Each level introduces a new mechanic
  • The difficulty curve feels fair
  • A beginner can understand the rules within 2 minutes.”

Or instead of:

“Write me an article about X.”

You might say:

“Write a 5,000-word article on X that:

  • Is structured with an intro, 3–5 main sections, and a conclusion
  • Uses clear subheadings
  • Explains key concepts with concrete examples
  • Ends with 3 actionable recommendations.”

In other words, a modern prompt is less about:

  • “Here’s how to do the task,” and more about
  • “Here’s what done looks like.”

Because the models—and especially the more advanced reasoning models—are good at:

  • Figuring out what resources and tools they have
  • Deciding how to approach the problem
  • Iterating internally until they get close enough to the requested outcome

We even measure models now in terms of how long they can “think” about something—how many steps of internal reasoning they can take. That metric matters because:

  • If you give a model a really good description of the desired outcome,
  • And it has the capacity to think through several steps,
  • It can keep adjusting its reasoning until it gets closer to what you asked for.

So prompting has evolved from:

  1. Providing the pattern (“Here’s the first few lines, now keep going in this style”)
  2. Giving instructions (“Explain X, step by step, in this role…”)
  3. Describing the outcome and success criteria (“Here’s what finished looks like; you figure out how to get there”)
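That third style is regular enough to standardize. Here’s a small helper in that spirit (my own illustrative convention, not any official format):

```python
def outcome_prompt(outcome, requirements, success_criteria):
    """Build an outcome-focused prompt: what to produce, what it
    must contain, and how the result will be judged."""
    lines = [outcome, "", "Requirements:"]
    lines += [f"- {r}" for r in requirements]
    lines += ["", "I'll consider it good if:"]
    lines += [f"- {c}" for c in success_criteria]
    return "\n".join(lines)

prompt = outcome_prompt(
    "Design a game.",
    ["10 levels", "Increasing difficulty", "A scoring system"],
    ["Each level introduces a new mechanic",
     "A beginner can understand the rules within 2 minutes"],
)
print(prompt)
```

Notice what the helper never asks for: how to design the game. It only states the deliverable and the bar it has to clear.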

4. So What Is a Prompt Now?

Putting it succinctly:

A prompt is the simplest, most efficient set of instructions you can give a model to get the outcome you want.

The key elements there are:

  • Simplest – Don’t overload it with unnecessary detail.
  • Most efficient – Enough information to be clear, but not so much that you confuse or over-constrain it.
  • For an outcome – You’re not just talking; you’re trying to get something specific back.

As models have become more capable, what counts as “efficient” has shifted:

  • Early on, you had to provide:

    • The starting pattern
    • Some hints about how to solve the problem
    • Some hints about what the result should look like

    And you hoped it would get somewhere close.

  • Now, you often don’t have to tell it:

    • What kind of problem it’s solving
    • Or how to solve it step by step

    You can just say:

    “This is the finished product I want.
    This is how I’ll judge whether it’s good.”

And if the model can think long enough (internally) and has access to the right tools, it will keep iterating until it reaches something that matches your description reasonably well.


5. Working With Models Like You Work With People

One of the most useful ways to think about prompting is to compare it to working with other people.

Imagine you have an employee:

  • If they’re new, you give them more detailed instructions.
  • If you’ve worked with them for years and know their strengths, you can use a shorthand.
  • If they remember your preferences, you don’t have to restate them every time.

We naturally:

  • Spend more time being explicit with people who are younger, newer, or from different backgrounds.
  • Relax into shorthand and higher-level requests with people we’ve collaborated with a lot.

It’s similar with models:

  • If you’re doing something new or complex, be more explicit.
  • If you’re working within a familiar pattern, you can use simpler prompts.
  • With features like memory, over time the model can “learn” your preferences, and your prompts can become more concise.

A lot of frustration comes from assuming:

“The model should just know what I mean.”

But it doesn’t live in your head. It lives in a giant space of possibilities, trained on countless “worlds” of text. Your job, with a prompt, is to:

  • Narrow down which “world” you want it to operate in
  • Clarify what success looks like
  • Do so with as few, but as clear, instructions as you can

That’s the art.


6. Where Prompting Is Heading

We’ve gone from:

  • Pattern-starting with base models
  • To instruction-giving with assistant-like models
  • To outcome-specifying with tool-using, reasoning models and agents

And we’re now in an era where:

  • You often don’t need to micromanage how the model works.
  • You do need to be very clear about what “good” looks like.
  • A strong prompt is, more than anything, a good description of the finished product and the metrics of success.

So when you think about prompts today, think less about magic incantations and more about this:

“If this were a very capable coworker,
what is the smallest, clearest set of instructions I could give them
so they can deliver exactly what I need?”

That’s what a prompt is now:
the most efficient set of instructions you need for an outcome, evolving right along with the models themselves.