Product & Pricing
What is the price of ChatGPT Pro?
OpenAI has set ChatGPT Pro’s price at $200 per month. This new tier offers significantly more computing power than the $20 ChatGPT Plus subscription. The pricing reflects its positioning as a premium service for power users who need extensive computing resources.
What’s included in the ChatGPT Pro subscription?
ChatGPT Pro includes unlimited access to all OpenAI models, including O1 and GPT-4o. Subscribers get access to advanced voice mode and the exclusive O1 Pro Mode, which offers enhanced performance on complex problems. OpenAI plans to add more features to the Pro tier during their “12 Days of OpenAI” initiative.
What’s the difference between ChatGPT Plus and Pro?
While ChatGPT Plus offers access to OpenAI’s models with usage limits for $20 per month, ChatGPT Pro provides unlimited access to all models and includes the exclusive O1 Pro Mode for enhanced performance. The Pro tier is designed for power users who need more computing power for complex tasks in math, programming, and writing. Pro users also get advanced voice mode and upcoming features that will be announced during the 12-day launch event.
When will these new features be available?
OpenAI has made ChatGPT Pro and the full version of O1 immediately available as part of their Day 1 announcement. Additional features for ChatGPT Pro will be rolled out during the remaining days of the “12 Days of OpenAI” initiative. The company is also adding tools like web browsing and file uploads to O1, and working on bringing O1 to their API with developer features like structured outputs.
O1 Model
What are the main improvements in the full version of O1 compared to O1 Preview?
OpenAI’s O1 makes 34% fewer major mistakes while thinking about 50% faster than O1 Preview. The model now includes multimodal capabilities, processing both images and text, and demonstrates significantly improved performance on competition math, coding, and the GPQA Diamond benchmark. O1 also features intelligent response timing, no longer spending excessive time on simple queries.
How much faster and more accurate is the new O1?
In one offline comparison, O1 responded about 60% faster than O1 Preview, with the demonstration showing a history query answered in 14 seconds versus O1 Preview’s 33 seconds. In OpenAI’s human evaluations, the model makes 34% fewer major mistakes while thinking about 50% faster overall.
What types of tasks is O1 best suited for?
O1 excels at technical tasks including complex mathematics, coding, and scientific problems. The model demonstrates state-of-the-art performance on multimodal benchmarks like MMMU and MathVista, handles multimodal reasoning with diagrams and calculations, and performs well on history, chemistry, and engineering problems.
Does O1 support multimodal inputs?
O1 supports multimodal input, allowing users to submit images and text together. OpenAI researchers demonstrated this through a space data center problem in which O1 analyzed a hand-drawn diagram, read numerical values from the image, and performed thermodynamic calculations.
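For readers who want the physics behind that demo, the radiator estimate reduces to the Stefan-Boltzmann law: a panel at absolute temperature T radiates at most εσT⁴ watts per square meter, so the minimum area is A = P / (εσT⁴). Here is a minimal sketch of the arithmetic (the 1 GW load and the roughly room-temperature panel come from the demo; the ideal emissivity of 1 is our assumption):

```python
# Lower-bound radiator area for a 1 GW data center in space, mirroring the
# demo's reasoning. In vacuum the only way to shed heat is radiation, and the
# Stefan-Boltzmann law caps a panel's output at emissivity * sigma * T^4
# watts per square meter.

SIGMA = 5.670e-8    # Stefan-Boltzmann constant, W / (m^2 K^4)
P_WATTS = 1e9       # 1 GW of heat to reject (from the demo)
T_KELVIN = 292.0    # panel temperature; the demo assumed roughly room temperature
EMISSIVITY = 1.0    # ideal black body -- an optimistic assumption for a lower bound

area_m2 = P_WATTS / (EMISSIVITY * SIGMA * T_KELVIN ** 4)
print(f"minimum radiator area: {area_m2 / 1e6:.2f} million m^2")
# -> about 2.43 million m^2, in line with the 2.42 million figure in the demo
```

The result is sensitive to the assumed panel temperature, which is why the demo called out the unspecified temperature as the critical parameter.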
How does O1’s response time work for different types of queries?
O1 implements an adaptive response system where simple queries like greetings receive immediate responses, while complex problems can take several minutes of computation time. The model intelligently allocates thinking time based on the query’s complexity, eliminating the previous issue where O1 Preview would spend excessive time on basic questions.
O1 Pro Mode
What is O1 Pro Mode and how is it different from regular O1?
O1 Pro Mode is an enhanced way of using O1 that applies more compute and longer thinking time to complex problems. OpenAI designed this mode to provide even better performance than standard O1, showing improved results on competition math and GPQA Diamond benchmarks, with particularly significant improvements in reliability for complex workflows.
Who should consider using O1 Pro Mode?
OpenAI targets O1 Pro Mode at power users who consistently push the limits of AI models in technical work, particularly those working on complex math, programming, and writing tasks. These users typically need more computational resources than standard users and require higher reliability in their outputs.
What are the specific performance improvements in Pro Mode?
O1 Pro Mode demonstrates enhanced reliability compared to standard O1, with OpenAI showing benchmark improvements on competition math and GPQA Diamond tests. During the demonstration, Jason Wei showcased Pro Mode solving a challenging chemistry problem, identifying a protein from a specific set of criteria, that O1 Preview usually got wrong.
How long can Pro Mode think about complex problems?
According to Jason Wei’s demonstration, O1 Pro Mode can think for several minutes on complex problems; the demonstrated chemistry task typically takes between one and three minutes of computation. This extended thinking capability allows the model to tackle more challenging problems that require deeper analysis.
Technical & Developer
What new API features are coming?
OpenAI announced several API features in development, including structured outputs, function calling, developer messages, and API image understanding, designed to help developers build more sophisticated applications. Separately, tools such as web browsing and file uploads are being added to O1 in ChatGPT.
Will O1 be available via API?
OpenAI confirmed they are actively working to bring O1 to their API platform. The team mentioned this is a priority development area, aiming to unlock new possibilities for developers to build advanced AI applications using their most capable model yet.
What kind of structured outputs and function calling will be supported?
The announcement didn’t provide specific details about the structured outputs and function calling features. However, OpenAI positioned these as key developer features that will be detailed later in the “12 Days of OpenAI” initiative, starting with the developer-focused announcement scheduled for the next day.
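Until those details arrive, a reasonable guess is that the features will follow the conventions OpenAI already uses elsewhere in its API. The sketch below is hypothetical on that basis; in particular, the `o1` model id and its support for these parameters are assumptions, not anything the announcement confirmed:

```python
# Hypothetical sketch only: the announcement named structured outputs and
# function calling for the O1 API but gave no details. The request shapes
# below follow OpenAI's existing Chat Completions conventions; the "o1"
# model id and its support for these parameters are assumptions.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# Structured outputs: constrain the reply to a JSON schema.
structured = client.chat.completions.create(
    model="o1",  # assumed model id
    messages=[{"role": "user",
               "content": "Extract the city and temperature from: 'It is 18 C in Paris.'"}],
    response_format={
        "type": "json_schema",
        "json_schema": {
            "name": "weather_reading",
            "strict": True,
            "schema": {
                "type": "object",
                "properties": {
                    "city": {"type": "string"},
                    "temperature_c": {"type": "number"},
                },
                "required": ["city", "temperature_c"],
                "additionalProperties": False,
            },
        },
    },
)
print(structured.choices[0].message.content)  # e.g. {"city":"Paris","temperature_c":18}

# Function calling: declare a tool the model may ask the application to invoke.
called = client.chat.completions.create(
    model="o1",  # assumed model id
    messages=[{"role": "user", "content": "What's the weather in Paris?"}],
    tools=[{
        "type": "function",
        "function": {
            "name": "get_weather",  # hypothetical application-side function
            "description": "Look up current weather for a city.",
            "parameters": {
                "type": "object",
                "properties": {"city": {"type": "string"}},
                "required": ["city"],
            },
        },
    }],
)
print(called.choices[0].message.tool_calls)  # the model's requested call, if any
```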
How does the image understanding capability work?
OpenAI demonstrated O1’s image understanding capabilities through a space data center problem, where the model successfully analyzed hand-drawn diagrams, interpreted numerical values, and performed complex calculations. The model can process text and images together, showing strong performance on standard benchmarks like MMMU and MathVista.
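The announcement did not detail how image input will reach O1 through the API, but OpenAI’s existing vision-capable models accept images as content parts in a chat request. A minimal sketch under that assumption (the `o1` model id, its API image support, and the file name are all hypothetical here):

```python
# Hypothetical sketch: sending an image alongside text, using the existing
# Chat Completions image content-part format. The announcement said API image
# understanding is coming for O1 but gave no details; the "o1" model id, its
# image support in the API, and the file path are assumptions.
import base64
from openai import OpenAI

client = OpenAI()

with open("hand_drawn_diagram.png", "rb") as f:  # hypothetical local file
    image_b64 = base64.b64encode(f.read()).decode("utf-8")

response = client.chat.completions.create(
    model="o1",  # assumed model id
    messages=[{
        "role": "user",
        "content": [
            {"type": "text",
             "text": "Estimate the radiator area this data center sketch would need."},
            {"type": "image_url",
             "image_url": {"url": f"data:image/png;base64,{image_b64}"}},
        ],
    }],
)
print(response.choices[0].message.content)
```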
What are the upcoming developer-focused features?
OpenAI plans to reveal more developer-centric features in their Day 2 announcement. The current announcement names structured outputs, function calling, developer messages, and API image understanding as upcoming API features, with web browsing and file uploads coming to O1 as tools in ChatGPT.
Performance & Capabilities
How does O1 handle complex mathematical and scientific problems?
OpenAI’s demonstration showed O1 tackling complex problems like thermodynamic calculations for a space data center and intricate protein identification challenges. The model shows significant improvements in competition math, with Jason Wei demonstrating how O1 Pro Mode can spend up to 3 minutes thinking through complex scientific problems that O1 Preview typically failed to solve correctly.
What are the multimodal capabilities of O1?
O1 can now process both images and text simultaneously, as demonstrated by Hyung Won Chung’s space data center problem. The model successfully interpreted hand-drawn diagrams, recognized power specifications, and performed complex thermodynamic calculations while handling an intentionally underspecified parameter (the panel temperature). OpenAI reports that O1 achieves state-of-the-art performance on multimodal benchmarks like MMMU and MathVista.
How reliable are the responses compared to previous versions?
OpenAI’s testing shows that O1 makes 34% fewer major mistakes while thinking about 50% faster than O1 Preview. The reliability improvements are particularly noticeable in complex workflows, with O1 Pro Mode showing even better reliability on competition math and GPQA Diamond benchmarks.
What are the real-world applications and use cases?
O1 demonstrates exceptional performance in technical tasks including engineering calculations, scientific analysis, coding, and complex mathematical problems. The model’s improved speed and reliability make it particularly valuable for scientists, engineers, and coders who need accurate results for complex technical work, while also handling everyday tasks more efficiently.
Future Updates & Timeline
What other features are planned for ChatGPT Pro?
OpenAI has confirmed they’re working on more compute-intensive features for Pro tier users, designed to enable longer and bigger tasks. The team said these additions will be revealed throughout the “12 Days of OpenAI” initiative, targeting users who want to push the model’s capabilities even further.
When will the API features be available?
OpenAI is actively working on bringing O1 to their API platform, though no specific release date was mentioned. The upcoming API features include structured outputs, function calling, developer messages, and image understanding, while tools like web browsing and file uploads are coming to O1 in ChatGPT.
What can we expect in the remaining days of the “12 Days of OpenAI”?
OpenAI announced that Day 2 will focus on developer-specific announcements and features. Sam Altman mentioned they have “a lot more stuff to come” and promised something “great for developers” in the next announcement, with additional features being revealed each weekday.
Will there be more tiers or versions in the future?
While OpenAI didn’t explicitly discuss future tiers beyond ChatGPT Pro, they emphasized their commitment to continued development and enhancement of their existing offerings. The team is focusing on expanding capabilities within the current Pro tier rather than announcing additional subscription levels.
Accessibility & Availability
Who can access these new features?
OpenAI has made O1 immediately available to all ChatGPT Plus subscribers, with ChatGPT Pro tier users getting additional access to O1 Pro Mode. The multimodal capabilities, improved speed, and enhanced reliability are available to all Plus and Pro subscribers as part of the Day 1 rollout.
When will O1 be available to different user groups?
OpenAI has already rolled out O1 to replace O1 Preview for all Plus subscribers, with Pro tier access launching simultaneously. The development team mentioned they’re actively working on bringing O1 to their API platform, though no specific timeline was provided for API availability.
Is there a waitlist for ChatGPT Pro?
OpenAI launched ChatGPT Pro immediately as part of their Day 1 announcement, with no mention of a waitlist system. Users can directly subscribe to the Pro tier for $200 per month to access all the premium features, including O1 Pro Mode and advanced voice capabilities.
Will these features be available globally?
While OpenAI didn’t explicitly address global availability in their announcement, they presented ChatGPT Pro and O1 as part of their main platform rollout. The team focused on technical capabilities and features rather than discussing regional availability or restrictions.
Full Transcript
(00:03) [Music]
(00:06) Sam: Hello, welcome to the 12 Days of OpenAI. We’re going to try something that, as far as we know, no tech company has done before, which is every day for the next 12—every weekday—we are going to launch or demo some new thing that we built. And we think we’ve got some great stuff for you, starting today. We hope you’ll really love it. And, you know, we’ll try to make this fun and fast and not take too long, but it’ll be a way to show you what we’ve been working on, and a little holiday present from us. So we’ll jump right into this first day.
(00:46) Sam: Today, we actually have two things to launch. The first one is the full version of O1. We have been very hard at work. We’ve listened to your feedback. You want—you like O1 Preview, but you want it to be smarter and faster and be multimodal and be better at instruction following, a bunch of other things. So we’ve put a lot of work into this. And for scientists, engineers, coders, we think they will really love this new model. I’d like to quickly show you how it performs. So you can see the jump from GPT-4o to O1 Preview across competition math, coding, and GPQA Diamond, and you can see that O1 is a pretty big step forward. It’s also much better in a lot of other ways, but raw intelligence is something that we care about. Coding performance, in particular, is an area where people are using the model a lot. So in just a minute, these guys will demo some things about O1. They’ll show you how it does at speed, how it does at really hard problems, how it does with multimodality. But first, I want to talk just for a minute about the second thing we’re launching today.
(01:43) Sam: A lot of people—power users of ChatGPT at this point—they really use it a lot, and they want more compute than $20 a month can buy. So we’re launching a new tier, ChatGPT Pro. And Pro has unlimited access to our models, and also things like advanced voice mode. It also has a new thing called O1 Pro mode. So O1 is the smartest model in the world now, except for O1 being used in Pro mode. And for the hardest problems that people have, O1 Pro mode lets you do even a little bit better. So you can see on competition math, you can see on GPQA Diamond, and these boosts may look small, but in complex workflows where you’re really pushing the limits of these models, it’s pretty significant.
(02:15) Sam: I’ll show you one more thing about Pro, about the Pro mode. So one thing that people have really said they want is reliability. And here you can see how the reliability of an answer from Pro mode compares to O1. And this is an even stronger delta. And again, for our Pro users, we’ve heard a lot about how much people want this. ChatGPT Pro is $200 a month, launches today. Over the course of these 12 days, we have some other things to add to it that we think you’ll also really love, but unlimited model use and this new O1 Pro mode. So I want to jump right in, and we’ll show some of those demos that we talked about. And these are some of the guys that helped build O1, with many other people behind them on the team. Thanks, Sam.
(03:15) Hyung Won: Hi, I’m Hyung Won.
(03:15) Jason: I’m Jason.
(03:15) Max: And I’m Max. We’re all research scientists who worked on building O1. O1 is really distinctive because it’s the first model we’ve trained that thinks before it responds, meaning it gives much better and often more detailed and more correct responses than other models you might have tried. O1 is being rolled out today to all Plus and soon-to-be Pro subscribers on ChatGPT, replacing O1 Preview. The O1 model is faster and smarter than the O1 Preview model, which we launched in September. After the launch, many people asked about multimodal input, so we added that. So now the O1 model live today is able to reason through both images and text jointly.
(03:48) Jason: As Sam mentioned, today we’re also going to launch a new tier of ChatGPT called ChatGPT Pro. ChatGPT Pro offers unlimited access to our best models, like O1, GPT-4o, and advanced voice. ChatGPT Pro also has a special way of using O1 called O1 Pro mode. With O1 Pro mode, you can ask the model to use even more compute to think even harder on some of the most difficult problems. We think the audience for ChatGPT Pro will be the power users of ChatGPT, those who are already pushing the models to the limits of their capabilities on tasks like math, programming, and writing.
(04:25) Max: It’s been amazing to see how much people are pushing O1 Preview, how much people who do technical work all day get out of this, and we’re really excited to let them push it further.
(04:36) Jason: Yeah, sure. We also really think that O1 will be much better for everyday use cases, not necessarily just really hard math and programming problems. In particular, one piece of feedback we received about O1 Preview constantly was that it was way too slow. It would think for 10 seconds if you said “Hi” to it, and we fixed that.
(04:51) Max: That was really annoying. It was kind of funny, honestly. It really thought—it cared—really thought hard about saying “Hi” back. Yeah, and so we fixed that. O1 will now think much more intelligently. If you ask it a simple question, it’ll respond really quickly, and if you ask it a really hard question, it’ll think for a really long time. We ran a pretty detailed suite of human evaluations for this model, and what we found was that it made major mistakes about 34% less often than O1 Preview while thinking about 50% faster. And we think this will be a really, really noticeable difference for all of you.
(05:20) Max: So I really enjoy just talking to these models. I’m a big history buff, and I’ll show you a really quick demo of the sort of question that I might ask one of these models. So right here, on the left I have O1, on the right I have O1 Preview, and I’m just asking it a really simple history question: list the Roman emperors of the second century, tell me about their dates, what they did. Not hard, but, you know, GPT-4o actually gets this wrong a reasonable fraction of the time. So I’ve asked O1 this, I’ve asked O1 Preview this. I tested this offline a few times, and I found that O1 on average responded about 60% faster than O1 Preview. This could be a little bit atypical because right now we’re in the process of swapping all our GPUs from O1 Preview to O1. So actually, O1 thought for about 14 seconds; O1 Preview is still going.
(06:12) Jason: There’s a lot of Roman emperors.
(06:14) Max: There’s a lot of Roman emperors. Yeah, GPT-4o actually gets this wrong a lot of the time. There are a lot of folks who ruled for like 6 days, 12 days, a month, and it sometimes forgets those.
(06:21) Jason: Can you do them all from memory, including the six-day people?
(06:23) Max: No. Yep, so here we go. O1 thought for about 14 seconds, Preview thought for about 33 seconds. These should both be faster once we finish deploying, but we wanted this to go live right now. Exactly. So yeah, we think you’ll really enjoy talking to this model. We found that it gave great responses, it thought much faster; it should just be a much better user experience for everyone.
(06:48) Max: So one other feature we know that people really wanted for everyday use cases, one that’s been requested a lot, is multimodal input and image understanding. And Hyung Won is going to talk about that now.
(07:00) Hyung Won: Yep. To illustrate the multimodal input and reasoning, I created this toy problem with some hand-drawn diagrams and so on. So here it is. It’s hard to see, so I already took a photo of this, so let’s look at the photo on a laptop. Once you upload the image into ChatGPT, you can click on it and see the zoomed-in version. So this is a system of a data center in space. Maybe in the future, we might want to train AI models in space.
(07:25) Jason: I think we should do that, but the power number looks a little low. One gigawatt?
(07:29) Hyung Won: Okay, but the general idea…
(07:30) Jason: Rookie numbers.
(07:31) Hyung Won: Rookie numbers, yes. Okay, yeah. So we have a sun right here, taking in power on this solar panel, and then there’s a small data center here.
(07:40) Jason: It’s exactly what they look like.
(07:41) Hyung Won: Yeah, GPU racks, and then a pump—nice pump here. And one interesting thing about operating in space is that on Earth, we can do air cooling, water cooling to cool the GPUs, but in space, there’s nothing there. So we have to radiate this heat into deep space, and that’s why we need this giant radiator cooling panel. And this problem is about finding the lower-bound estimate of the cooling panel area required to operate this 1 GW data center.
(08:16) Jason: Probably going to be very big.
(08:17) Hyung Won: Yeah, let’s see how big it is. So that’s the problem. And going to this prompt, yeah, this is essentially asking for that. So let me hit go, and the model will think for a few seconds.
(08:31) Jason: By the way, most people don’t know this. I’ve been working with Hyung Won for a long time. Hyung Won actually has a PhD in thermodynamics, which is totally unrelated to AI, and you always joke that you haven’t been able to use your PhD work in your job until today. So you can trust Hyung Won on this analysis.
(08:46) Hyung Won: Finally, finally. Thanks for hyping me up. Now I really have to get this right. Okay, so the model finished thinking, only 10 seconds. It’s a simple problem. So let’s see how the model did. So, power input: first of all, this 1 GW was only drawn on the paper, and the model was able to pick that up nicely. And then radiative heat transfer only, that’s the thing I mentioned: in space, there’s nothing else. And then some simplifying choices. One critical thing is that I intentionally made this problem underspecified, meaning that the critical parameter, the temperature of the cooling panel, I left out so that we can test the model’s ability to handle ambiguity. The model was able to recognize that this is actually an unspecified but important parameter, and it picked the right range of temperature, which is about room temperature. And with that, it continues to the analysis, does a whole bunch of things, and finds the area, which is 2.42 million square meters. Just to get a sense of how big this is, it’s about 2% of the land area of San Francisco. This is huge.
(10:13) Jason: Not that bad.
(10:15) Hyung Won: Not that bad. Yeah, okay. Um, yeah, so I guess this is reasonable. I’ll skip through the rest of the details, but I think the model did a great job making nice, consistent assumptions that make the required area as small as possible. So yeah, this is the demonstration of the multimodal reasoning, and this is a simple problem. But O1 is actually very strong, and on standard benchmarks like MMMU and MathVista, O1 actually has state-of-the-art performance. Now Jason will showcase the Pro mode.
(11:00) Jason: Great. So I want to give a short demo of ChatGPT O1 Pro mode. People will find O1 Pro mode the most useful for, say, hard math, science, or programming problems. So here I have a pretty challenging chemistry problem that O1 Preview usually gets incorrect, and so I will let the model start thinking. One thing we’ve learned with these models is that for these very challenging problems, the model can think for up to a few minutes. I think for this problem, the model usually thinks anywhere from 1 minute up to 3 minutes. And so we have to provide some entertainment for people while the model is thinking. So I’ll describe the problem a little bit, and then if the model’s still thinking when I’m done, I’ve prepared a dad joke for us to fill the rest of the time.
(11:48) Jason: So I hope it thinks for a long time. You can see the problem asks for a protein that fits a very specific set of criteria. There are six criteria, and the challenge is each of them asks for pretty domain-specific chemistry knowledge that the model would have to recall. And the other thing to know about this problem is that none of these criteria actually gives away what the correct answer is. For any given criterion, there could be dozens of proteins that might fit it, and so the model has to think through all the candidates and then check if they fit all the criteria. Okay, so you can see the model actually was faster this time. It finished in 53 seconds. You can click and see some of the thought process that the model went through to get the answer. You can see it’s thinking about different candidates, like neuroligin initially, and then it arrives at the correct answer, which is retinoschisin, which is great.
(12:51) Jason: Okay, so to summarize, we saw from Max that O1 is smarter and faster than O1 Preview. We saw from Hyung Won that O1 can now reason over both text and images. And then finally, we saw that with O1 Pro mode, you can use O1 to reason about the hardest science and math problems.
(13:21) Max: Yep, there’s more to come for the ChatGPT Pro tier. We’re working on even more compute-intensive tasks to power longer and bigger tasks for those who want to push the model even further. And we’re still working on adding tools to the O1 model, such as web browsing, file uploads, and things like that. We’re also hard at work to bring O1 to the API. We’re going to be adding some new features for developers: structured outputs, function calling, developer messages, and API image understanding, which we think you’ll really enjoy. We expect this to be a great model for developers and to really unlock a whole new frontier of agentic things you can build. We hope you love it as much as we do.
(14:04) Sam: That was great. Thank you guys so much. Congratulations to you and the team on getting this done. We really hope that you’ll enjoy O1 and Pro mode, or the Pro tier. We have a lot more stuff to come. Tomorrow, we’ll be back with something great for developers, and we’ll keep going from there. Before we wrap up, can we hear your joke?
(14:23) Jason: Yes. So I made this joke this morning. The joke is this: So Santa was trying to get his large language model to do a math problem, and he was prompting it really hard, but it wasn’t working. How did he eventually fix it?
(14:36) Sam: No idea.
(14:37) Jason: He used reindeer enforcement learning.
(14:40) Sam: Thank you very much. Thank you.