This is their first model to only be available via the new Responses API - if you have code that uses Chat Completions you'll need to upgrade to Responses in order to support this.
Could take me a while to add support for it to my LLM tool: https://github.com/simonw/llm/issues/839
It cost me 94 cents to render a pelican riding a bicycle SVG with this one!
Notes and SVG output here: https://simonwillison.net/2025/Mar/19/o1-pro/
Assuming a highly motivated office worker spends 6 hours per day listening or speaking, at a salary of $160k per year, that works out to a cost of ≈$10k per 1M tokens.
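As a rough sanity check of that estimate — the words-per-minute, tokens-per-word, and working-days figures below are my own assumptions, not from the comment:

```python
# Back-of-envelope check of the human "token cost" estimate.
# Assumed: ~150 spoken words/min, ~1.3 tokens/word, 250 working days/year.
WORDS_PER_MIN = 150
TOKENS_PER_WORD = 1.3
HOURS_PER_DAY = 6
SALARY = 160_000
WORK_DAYS = 250

tokens_per_day = WORDS_PER_MIN * TOKENS_PER_WORD * 60 * HOURS_PER_DAY  # ~70k tokens
cost_per_day = SALARY / WORK_DAYS                                      # $640/day
cost_per_million = cost_per_day / tokens_per_day * 1_000_000
print(f"${cost_per_million:,.0f} per 1M tokens")  # lands in the ~$9-10k range
```

Under those assumptions it comes out around $9k per 1M tokens, consistent with the ≈$10k figure.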
OpenAI's frontier model pricing is now within an order of magnitude of a highly skilled human's cost. o3 pro may change this, but at the same time I don't think they would have shipped this if o3 was right around the corner.
It has a 2023 knowledge cut-off and a 200k context window...? That's pretty underwhelming.
o1-pro still holds up against every other release, including Grok 3 Think and Claude 3.7 Think (haven't tried Max out though), and it came out over 3 months ago, practically an eternity in AI time.
Ironic since I was getting ready to cancel my Pro subscription, but 4.5 is too nice for non-coding/math tasks.
God I can't wait for o3 pro.
Those who have tested it and liked it. I feel very confident with Sonnet 3.7 right now; if I could wish for something, it would be for it to be faster. Most of the problems I'm facing are execution problems: I just want AI to do the work faster than I could code everything on my own.
To me it seems like o1-pro would be better used as a switch-in tool, or to double-check your codebase, rather than as a constant coding assistant (even at a lower price)? I assume it would need to get a tremendous amount of work done, including the domain-knowledge parts, to make up for Sonnet's roughly 10x speed advantage (estimated).
I have always suspected that o1-pro is some kind of workflow on top of the o1 model. Is it possible that it dispatches the prompt to, say, 8 instances of o1 and then does some type of aggregation over the results?
Did not know it was that expensive to run. I'm going to use it more in my Pro subscription now. I frankly do not notice a huge difference between o1 Pro and o3-mini-high - both fail on the fairly straightforward practical problems I give them.
At first I thought, great, we can add it to our platform now. Now that I have seen the price, I am hesitant to enable the model for the majority of users (except rich enterprises), as they will most certainly shoot themselves in the foot.
> $150/Mtok input, $600/Mtok output
What use case could possibly justify this price?
o1-pro doesn't support streaming, so it's reasonable to assume they're doing some kind of best-of-n technique to search over multiple answers.
I think you can probably get similar results for a much lower price using llm-consortium. This lets you prompt as many models as you can afford and then chooses or synthesises the best response from all of them. And it can loop until a confidence threshold is reached.
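A minimal version of that idea, written generically rather than against llm-consortium's actual API (the `judge` callback, confidence scoring, and feedback loop here are illustrative assumptions):

```python
def consortium(prompt, models, judge, threshold=0.8, max_rounds=3):
    """Query several models, have a judge pick (or synthesise) the best answer,
    and loop until the judge's confidence clears a threshold."""
    context = prompt
    for _ in range(max_rounds):
        answers = [m(context) for m in models]      # fan out to each model
        best, confidence = judge(prompt, answers)   # choose or synthesise one
        if confidence >= threshold:
            return best
        # Feed the candidate answers back in for another round of refinement.
        context = prompt + "\n\nPrevious attempts:\n" + "\n".join(answers)
    return best  # give up after max_rounds and return the last best pick
```

Whether this matches o1-pro's quality is an open question, but the structure — parallel sampling plus selection, optionally looped — is cheap to experiment with using smaller models.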
Seems underwhelming when OpenAI's best model, o3, was demoed almost 4 months ago.
DeepSeek R1 is much better than this.
Pricing: $150 / 1M input tokens, $600 / 1M output tokens. (Not a typo.)
Very expensive, but I've been using it with my ChatGPT Pro subscription and it's remarkably capable. I'll give it 100,000 token codebases and it'll find nuanced bugs I completely overlooked.
(Now I almost feel bad considering the API price vs. the price I pay for the subscription.)