The normative form for interacting with what we think of as "AI" is something like this: there's a chat you type a question you wait for a few seconds you start seeing an answer. you start reading it you read or scan some more tens of seconds longer, while the rest of the response appears you maybe study the response in more detail you respond the loop continues
I recently implemented a feature here on my own blog that uses OpenAI's GPT to help me correct spelling and punctuation in posted blog comments. Because I was curious, and because the scale is so small, I take the same prompt and fire it off three times. The pseudo code looks like this: for model in ("gpt-5", "gpt-5-mini", "gpt-5-nano"): response = completion( model=model, api_key=settings.OPENAI_API_KEY, messages=messages, ) record_response(response)
The first time I built an agentic workflow, it was like watching magic, i.e., until it took 38 seconds to answer a simple customer query and cost me $1.12 per request.