OpenAI's Epic Update: Anyone Can Customize a GPT — AI's "Apple Moment" Has Arrived (Video & Transcript Included)

"GPT-4 upgrades, API price cuts, GPT Store."

These were the three keywords from last night's OpenAI DevDay. In the early hours of November 7, Beijing time, OpenAI's first developer conference, OpenAI DevDay, opened under the expectant gaze of the whole world.
Since the release of GPT-4 in March, every OpenAI launch event has become the focus of AI practitioners around the world, even earning the playful nickname "the AI Spring Festival Gala." Eight months on, everyone wanted to know what new products Sam Altman would bring to OpenAI DevDay. Clearly, this time Altman and the OpenAI team behind him came prepared.

OpenAI finally unveiled its first AI-agent-related feature, GPTs, meaning anyone can build their own GPT. It also opened up a wave of new APIs (vision, DALL·E 3 images, and audio), plus the new Assistants API, letting developers build their own dedicated GPT-powered assistants more easily. Meanwhile, the underlying GPT-4 and GPT-3.5 models received another round of performance improvements and steep price cuts. OpenAI's road toward general-purpose artificial intelligence is becoming clearer and clearer.

OpenAI Holds Its First Developer Conference
The full keynote transcript follows.
 -Good morning. Thank you for joining us today.
Please welcome to the stage, Sam Altman.
 -Good morning.
Welcome to our first-ever OpenAI DevDay.
We're thrilled that you're here and this energy is awesome.
 -Welcome to San Francisco.
San Francisco has been our home since day one.
The city is important to us and the tech industry in general.
We're looking forward to continuing to grow here.
We've got some great stuff to announce today, but first, I'd like to take a minute to talk about some of the stuff that we've done over the past year.
About a year ago, November 30th, we shipped ChatGPT as a "low-key research preview", and that went pretty well.
In March, we followed that up with the launch of GPT-4, still the most capable model out in the world.
 -In the last few months, we launched voice and vision capabilities so that ChatGPT can now see, hear, and speak.
 -There's a lot, you don't have to clap each time.
[laughter] -More recently, we launched DALL-E 3, the world's most advanced image model.
You can use it of course, inside of ChatGPT.
For our enterprise customers, we launched ChatGPT Enterprise, which offers enterprise-grade security and privacy, higher speed GPT-4 access, longer context windows, a lot more.
Today we've got about 2 million developers building on our API for a wide variety of use cases doing amazing stuff, over 92% of Fortune 500 companies building on our products, and we have about a hundred million weekly active users now on ChatGPT.
 -What's incredible on that is we got there entirely through word of mouth.
People just find it useful and tell their friends.
OpenAI is the most advanced and the most widely used AI platform in the world now, but numbers never tell the whole picture on something like this.
What's really important is how people use the products, how people are using AI, and so I'd like to show you a quick video.
-I actually wanted to write something to my dad in Tagalog.
I want a non-romantic way to tell my parent that I love him and I also want to tell him that he can rely on me, but in a way that still has the respect of a child-to-parent relationship that you should have in Filipino culture and in Tagalog grammar.
When it's translated into Tagalog, "I love you very deeply and I will be with you no matter where the path leads." -I see some of the possibility, I was like, "Whoa." Sometimes I'm not sure about some stuff, and I feel like actually ChatGPT like, hey, this is what I'm thinking about, so it kind of give it more confidence.
-The first thing that just blew my mind was it levels with you.
That's something that a lot of people struggle to do.
It opened my mind to just what every creative could do if they just had a person helping them out who listens.
-This is to represent sickling hemoglobin.
-You built that with ChatGPT? -ChatGPT built it with me.
-I started using it for daily activities like, "Hey, here's a picture of my fridge.
Can you tell me what I'm missing? Because I'm going grocery shopping, and I really need to do recipes that are following my vegan diet." -As soon as we got access to Code Interpreter, I was like, "Wow, this thing is awesome." It could build spreadsheets.
It could do anything.
-I discovered Chatty about three months ago on my 100th birthday.
Chatty is very friendly, very patient, very knowledgeable, and very quick.
This has been a wonderful thing.
-I'm a 4.0 student, but I also have four children.
When I started using ChatGPT, I realized I could ask ChatGPT that question.
Not only does it give me an answer, but it gives me an explanation.
Didn't need tutoring as much.
It gave me a life back.
It gave me time for my family and time for me.
-I have a chronic nerve thing on my whole left half of my body, I have nerve damage.
I had a brain surgery.
I have limited use of my left hand.
Now you can just have the integration of voice input.
Then the newest one where you can have the back-and-forth dialogue, that's just maximum best interface for me.
It's here.
 -We love hearing the stories of how people are using the technology.
It's really why we do all of this.
Now, on to the new stuff, and we have got a lot.
[audience cheers] -First, we're going to talk about a bunch of improvements we've made, and then we'll talk about where we're headed next.
Over the last year, we spent a lot of time talking to developers around the world.
We've heard a lot of your feedback.
It's really informed what we have to show you today.
Today, we are launching a new model, GPT-4 Turbo.
 -GPT-4 Turbo will address many of the things that you all have asked for.
Let's go through what's new.
We've got six major things to talk about for this part.
Number one, context length.
A lot of people have tasks that require a much longer context length.
GPT-4 supported up to 8K and in some cases up to 32K context length, but we know that isn't enough for many of you and what you want to do.
GPT-4 Turbo, supports up to 128,000 tokens of context.
 -That's 300 pages of a standard book, 16 times longer than our 8k context.
In addition to a longer context length, you'll notice that the model is much more accurate over a long context.
Number two, more control.
We've heard loud and clear that developers need more control over the model's responses and outputs.
We've addressed that in a number of ways.
We have a new feature called JSON Mode, which ensures that the model will respond with valid JSON.
This has been a huge developer request.
It'll make calling APIs much easier.
The model is also much better at function calling.
You can now call many functions at once, and it'll do better at following instructions in general.
We're also introducing a new feature called reproducible outputs.
You can pass a seed parameter, and it'll make the model return consistent outputs.
This, of course, gives you a higher degree of control over model behavior.
This rolls out in beta today.
 -In the coming weeks, we'll roll out a feature to let you view logprobs in the API.
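For readers who want to try these controls right away, here is a minimal sketch of JSON mode and the seed parameter, assuming the openai Python SDK (v1.x); the model name and prompts are illustrative.
```python
# Minimal sketch: JSON mode + reproducible outputs via the seed parameter (openai v1.x SDK).
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

response = client.chat.completions.create(
    model="gpt-4-1106-preview",                 # GPT-4 Turbo preview announced at DevDay
    response_format={"type": "json_object"},    # JSON mode: the reply is guaranteed to be valid JSON
    seed=42,                                    # reproducible outputs: same seed + inputs -> consistent reply
    messages=[
        {"role": "system", "content": "Extract the city and date as JSON."},
        {"role": "user", "content": "I land in Paris on 21 November."},
    ],
)

print(response.choices[0].message.content)      # e.g. {"city": "Paris", "date": "November 21"}
print(response.system_fingerprint)              # log this to detect backend changes between runs
```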
 -All right. Number three, better world knowledge.
You want these models to be able to access better knowledge about the world, so do we.
We're launching retrieval in the platform.
You can bring knowledge from outside documents or databases into whatever you're building.
We're also updating the knowledge cutoff.
We are just as annoyed as all of you, probably more that GPT-4's knowledge about the world ended in 2021.
We will try to never let it get that out of date again.
GPT-4 Turbo has knowledge about the world up to April of 2023, and we will continue to improve that over time.
Number four, new modalities.
Surprising no one, DALL-E 3, GPT-4 Turbo with vision, and the new text-to-speech model are all going into the API today.
 -We have a handful of customers that have just started using DALL-E 3 to programmatically generate images and designs.
Today, Coke is launching a campaign that lets its customers generate Diwali cards using DALL-E 3, and of course, our safety systems help developers protect their applications against misuse.
Those tools are available in the API.
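As a rough illustration of that programmatic use, a DALL·E 3 request through the images endpoint looks something like this (openai Python SDK v1.x assumed; the prompt is made up):
```python
# Minimal sketch: generating an image with the DALL-E 3 API (openai v1.x SDK).
from openai import OpenAI

client = OpenAI()

image = client.images.generate(
    model="dall-e-3",
    prompt="A festive Diwali greeting card with diyas and fireworks in warm colors",  # illustrative prompt
    size="1024x1024",
    quality="standard",
    n=1,
)

print(image.data[0].url)             # URL of the generated image
print(image.data[0].revised_prompt)  # DALL-E 3 also returns the rewritten prompt it actually used
```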
GPT-4 Turbo can now accept images as inputs via the API, can generate captions, classifications, and analysis.
For example, Be My Eyes uses this technology to help people who are blind or have low vision with their daily tasks like identifying products in front of them.
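A hedged sketch of what an image-input request looks like, assuming the openai Python SDK (v1.x) and the vision-enabled GPT-4 Turbo preview model; the image URL is a placeholder:
```python
# Minimal sketch: sending an image to GPT-4 Turbo with vision through the chat API (openai v1.x SDK).
from openai import OpenAI

client = OpenAI()

response = client.chat.completions.create(
    model="gpt-4-vision-preview",   # vision-capable GPT-4 Turbo preview
    messages=[{
        "role": "user",
        "content": [
            {"type": "text", "text": "What groceries are in this photo, and what am I missing for a vegan dinner?"},
            {"type": "image_url", "image_url": {"url": "https://example.com/fridge.jpg"}},  # placeholder URL
        ],
    }],
    max_tokens=300,
)

print(response.choices[0].message.content)
```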
With our new text-to-speech model, you'll be able to generate incredibly natural-sounding audio from text in the API with six preset voices to choose from.
I'll play an example.
-Did you know that Alexander Graham Bell, the eminent inventor, was enchanted by the world of sounds?
His ingenious mind led to the creation of the graphophone, which etches sounds onto wax, making voices whisper through time.
-This is much more natural than anything else we've heard out there.
Voice can make apps more natural to interact with and more accessible.
It also unlocks a lot of use cases like language learning, and voice assistance.
Speaking of new modalities, we're also releasing the next version of our open-source speech recognition model, Whisper V3 today, and it'll be coming soon to the API.
It features improved performance across many languages, and we think you're really going to like it.
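For the audio side, a minimal sketch of the new text-to-speech endpoint and the hosted Whisper transcription endpoint, assuming the openai Python SDK (v1.x); file names are illustrative:
```python
# Minimal sketch: text-to-speech with a preset voice, then speech-to-text with Whisper (openai v1.x SDK).
from openai import OpenAI

client = OpenAI()

# Text-to-speech: preset voices include "alloy", "echo", "fable", "onyx", "nova", and "shimmer".
speech = client.audio.speech.create(
    model="tts-1",
    voice="alloy",
    input="Did you know that Alexander Graham Bell was enchanted by the world of sounds?",
)
speech.stream_to_file("bell.mp3")

# Speech-to-text: the hosted Whisper endpoint (Whisper V3 was announced as coming to the API).
with open("bell.mp3", "rb") as audio_file:
    transcript = client.audio.transcriptions.create(model="whisper-1", file=audio_file)
print(transcript.text)
```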
Number five, customization.
Fine-tuning has been working really well for GPT-3.5 since we launched it a few months ago.
Starting today, we're going to expand that to the 16K version of the model.
Also, starting today, we're inviting active fine-tuning users to apply for the GPT-4 fine-tuning, experimental access program.
The fine-tuning API is great for adapting our models to achieve better performance in a wide variety of applications with a relatively small amount of data, but you may want a model to learn a completely new knowledge domain, or to use a lot of proprietary data.
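For context, kicking off a GPT-3.5 Turbo fine-tuning job through the API looks roughly like this (openai Python SDK v1.x assumed; the JSONL file of chat-formatted examples is illustrative):
```python
# Minimal sketch: creating a GPT-3.5 Turbo fine-tuning job (openai v1.x SDK).
from openai import OpenAI

client = OpenAI()

# 1. Upload training data: one JSON object per line, each with a "messages" list of chat turns.
training_file = client.files.create(
    file=open("startup_advice.jsonl", "rb"),   # illustrative file name
    purpose="fine-tune",
)

# 2. Start the job on the base model (the 16K-context generation was announced as joining this program).
job = client.fine_tuning.jobs.create(
    training_file=training_file.id,
    model="gpt-3.5-turbo",
)
print(job.id, job.status)

# 3. When the job completes, job.fine_tuned_model can be passed to chat.completions.create
#    like any other model name.
```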
Today we're launching a new program called Custom Models.
With Custom Models, our researchers will work closely with a company to help them make a great custom model, especially for them, and their use case using our tools.
This includes modifying every step of the model training process, doing additional domain-specific pre-training, a custom RL post-training process tailored for a specific domain, and whatever else.
We won't be able to do this with many companies to start.
It'll take a lot of work, and in the interest of expectations, at least initially, it won't be cheap, but if you're excited to push things as far as they can currently go, please get in touch with us, and we think we can do something pretty great.
Number six, higher rate limits.
We're doubling the tokens per minute for all of our established GPT-4 customers, so it's easier to do more.
You'll be able to request changes to further rate limits and quotas directly in your API account settings.
In addition to these rate limits, it's important to do everything we can do to make you successful building on our platform.
We're introducing copyright shield.
Copyright Shield means that we will step in and defend our customers and pay the costs incurred if you face legal claims around copyright infringement, and this applies both to ChatGPT Enterprise and the API.
Let me be clear, this is a good time to remind people: we do not train on data from the API or ChatGPT Enterprise, ever.
All right.
There's actually one more developer request that's been even bigger than all of these and so I'd like to talk about that now and that's pricing.
[laughter] -GPT-4 Turbo is the industry-leading model.
It delivers a lot of improvements that we just covered and it's a smarter model than GPT-4.
We've heard from developers that there are a lot of things that they want to build, but GPT-4 just costs too much.
They've told us that if we could decrease the cost by 20%, 25%, that would be great.
A huge leap forward.
I'm super excited to announce that we worked really hard on this and GPT-4 Turbo, a better model, is considerably cheaper than GPT-4 by a factor of 3x for prompt tokens.
 -And 2x for completion tokens starting today.
 -The new pricing is 1¢ per 1,000 prompt tokens and 3¢ per 1,000 completion tokens.
For most customers, that will lead to a blended rate more than 2.75 times cheaper to use for GPT-4 Turbo than GPT-4.
We worked super hard to make this happen.
We hope you're as excited about it as we are.
 -We decided to prioritize price first because we had to choose one or the other, but we're going to work on speed next.
We know that speed is important too.
Soon you will notice GPT-4 Turbo becoming a lot faster.
We're also decreasing the cost of GPT-3.5 Turbo 16K.
Also, input tokens are 3x less and output tokens are 2x less.
Which means that GPT-3.5 16K is now cheaper than the previous GPT-3.5 4K model.
Running a fine-tuned GPT-3.5 Turbo 16K version is also cheaper than the old fine-tuned 4K version.
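To make the pricing concrete, here is a small back-of-the-envelope comparison using the per-1,000-token rates quoted above for GPT-4 Turbo and GPT-4 8K's published $0.03 input / $0.06 output rates; the token counts below are an illustrative prompt-heavy workload:
```python
# Back-of-the-envelope cost comparison for a prompt-heavy workload (about 9:1 input to output).
def cost(prompt_tokens: int, completion_tokens: int, in_per_1k: float, out_per_1k: float) -> float:
    return prompt_tokens / 1000 * in_per_1k + completion_tokens / 1000 * out_per_1k

prompt_tokens, completion_tokens = 900_000, 100_000

gpt4_cost = cost(prompt_tokens, completion_tokens, 0.03, 0.06)         # $33.00 on GPT-4 (8K)
gpt4_turbo_cost = cost(prompt_tokens, completion_tokens, 0.01, 0.03)   # $12.00 on GPT-4 Turbo

print(f"GPT-4:       ${gpt4_cost:.2f}")
print(f"GPT-4 Turbo: ${gpt4_turbo_cost:.2f}")
print(f"Blended:     {gpt4_cost / gpt4_turbo_cost:.2f}x cheaper")      # 2.75x for this mix
```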
Okay, so we just covered a lot about the model itself.
We hope that these changes address your feedback.
We're really excited to bring all of these improvements to everybody now.
In all of this, we're lucky to have a partner who is instrumental in making it happen.
I'd like to bring out a special guest, Satya Nadella, the CEO of Microsoft.
[audience cheers]  -Good to see you. -Thank you so much.
Thank you.
-Satya, thanks so much for coming here.
-It's fantastic to be here and Sam, congrats.
I'm really looking forward to Turbo and everything else that you have coming.
It's been just fantastic partnering with you guys.
-Awesome. Two questions.
I won't take too much of your time.
How is Microsoft thinking about the partnership currently? -First- [laughter] –we love you guys. [laughter] -Look, it's been fantastic for us.
In fact, I remember the first time I think you reached out and said, "Hey, do you have some Azure credits?" We've come a long way from there.
-Thank you for those. That was great.
-You guys have built something magical.
Quite frankly, there are two things for us when it comes to the partnership.
The first is these workloads.
Even when I was listening backstage to how you're describing what's coming, even, it's just so different and new.
I've been in this infrastructure business for three decades.
-No one has ever seen infrastructure like this.
-The workload, the pattern of the workload, these training jobs are so synchronous and so large, and so data parallel.
The first thing that we have been doing is building in partnership with you, the system, all the way from thinking from power to the DC to the rack, to the accelerators, to the network.
Just really the shape of Azure is drastically changed and is changing rapidly in support of these models that you're building.
Our job, number one, is to build the best system so that you can build the best models and then make that all available to developers.
The other thing is we ourselves are developers.
We're building products.
In fact, my own conviction of this entire generation of foundation models completely changed the first time I saw GitHub Copilot on GPT.
We want to build our GitHub Copilot, as developers, on top of OpenAI APIs.
We are very, very committed to that.
What does that mean to developers? Look, I always think of Microsoft as a platform company, a developer company, and a partner company.
For example, we want to make GitHub Copilot available, the Enterprise edition available to all the attendees here so that they can try it out.
That's awesome. We are very excited about that.
 -You can count on us to build the best infrastructure in Azure with your API support and bring it to all of you.
Even things like the Azure marketplace.
For developers who are building products out here to get to market rapidly.
That's really our intent here.
-Great. How do you think about the future of the partnership, or the future of AI, or whatever? Anything you want.
-There are a couple of things for me that I think are going to be very, very key for us.
One is I just described how the systems that are needed as you aggressively push forward on your roadmap requires us to be on the top of our game and we intend fully to commit ourselves deeply to making sure you all as builders of these foundation models have not only the best systems for training and inference, but the most compute, so that you can keep pushing- -We appreciate that.
–forward on the frontiers because I think that's the way we are going to make progress.
The second thing I think both of us care about, in fact, quite frankly, the thing that excited both sides to come together is your mission and our mission.
Our mission is to empower every person and every organization on the planet to achieve more.
To me, ultimately AI is only going to be useful if it truly does empower.
I saw the video you played earlier.
That was fantastic to hear those voices describe what AI meant for them and what they were able to achieve.
Ultimately, it's about being able to get the benefits of AI broadly disseminated to everyone, I think is going to be the goal for us.
Then the last thing is of course, we are very grounded in the fact that safety matters, and safety is not something that you'd care about later, but it's something we do shift left on and we are very, very focused on that with you all.
-Great. Well, I think we have the best partnership in tech.
I'm excited for us to build AGI together.
-Oh, I'm really excited. Have a fantastic [crosstalk].
-Thank you very much for coming.
-Thank you so much.
-See you.
 -We have shared a lot of great updates for developers already and we've got a lot more to come, but even though this is a developer conference, we can't resist making some improvements to ChatGPT.
A small one, ChatGPT now uses GPT-4 Turbo with all the latest improvements, including the latest knowledge cutoff, which will continue to update.
That's all live today.
It can now browse the web when it needs to, write and run code, analyze data, take and generate images, and much more.
We heard your feedback, that model picker, extremely annoying, that is gone starting today.
You will not have to click around the dropdown menu.
All of this will just work together.
Yes.
 -ChatGPT will just know what to use and when you need it, but that's not the main thing.
Neither was price actually the main developer request.
There was one that was even bigger than that.
I want to talk about where we're headed and the main thing we're here to talk about today.
We believe that if you give people better tools, they will do amazing things.
We know that people want AI that is smarter, more personal, more customizable, can do more on your behalf.
Eventually, you'll just ask the computer for what you need and it'll do all of these tasks for you.
These capabilities are often talked about in the AI field as "agents." The upsides of this are going to be tremendous.
At OpenAI, we really believe that gradual iterative deployment is the best way to address the safety issues, the safety challenges with AI.
We think it's especially important to move carefully towards this future of agents.
It's going to require a lot of technical work and a lot of thoughtful consideration by society.
Today, we're taking our first small step that moves us towards this future.
We're thrilled to introduce GPTs.
GPTs are tailored versions of ChatGPT for a specific purpose.
You can build a GPT, a customized version of ChatGPT for almost anything with instructions, expanded knowledge, and actions, and then you can publish it for others to use.
Because they combine instructions, expanded knowledge, and actions, they can be more helpful to you.
They can work better in many contexts, and they can give you better control.
They'll make it easier for you to accomplish all sorts of tasks or just have more fun and you'll be able to use them right within ChatGPT.
You can in effect program a GPT with language just by talking to it.
It's easy to customize the behavior so that it fits what you want.
This makes building them very accessible and it gives agency to everyone.
We're going to show you what GPTs are, how to use them, how to build them, and then we're going to talk about how they'll be distributed and discovered.
After that for developers, we're going to show you how to build these agent-like experiences into your own apps.
First, let's look at a few examples.
Our partners at Code.org are working hard to expand computer science in schools.
They've got a curriculum that is used by tens of millions of students worldwide.
Code.org crafted Lesson Planner GPT to help teachers provide a more engaging experience for middle schoolers.
If a teacher asks it to explain for loops in a creative way, it does just that.
In this case, it'll do it in terms of a video game character repeatedly picking up coins.
Super easy to understand for an 8th-grader.
As you can see, this GPT brings together Code.org's extensive curriculum and expertise, and lets teachers adapt it to their needs quickly and easily.
Next, Canva has built a GPT that lets you start designing by describing what you want in natural language.
If you say, "Make a poster for a DevDay reception this afternoon, this evening," and you give it some details, it'll generate a few options to start with by hitting Canva's APIs.
Now, this concept may be familiar to some of you.
We've evolved our plugins to be custom actions for GPTs.
You can keep chatting with this to see different iterations, and when you see one you like, you can click through to Canva for the full design experience.
Now we'd like to show you a GPT Live.
Zapier has built a GPT that lets you perform actions across 6,000 applications to unlock all kinds of integration possibilities.
I'd like to introduce Jessica, one of our solutions architects, who is going to drive this demo.
Welcome Jessica.
-Thank you, Sam.
Hello everyone.
Thank you all.
Thank you all for being here.
My name is Jessica Shieh.
I work with partners and customers to bring their product alive.
Today I can't wait to show you how hard we've been working on this, so let's get started.
To start where your GPT will live is on this upper left corner.
I'm going to start with clicking on the Zapier AI actions and on the right-hand side you can see that's my calendar for today.
It's quite a day.
I've already used this before, so it's actually already connected to my calendar.
To start, I can ask, "What's on my schedule for today?" We build GPTs with security in mind.
Before it performs any action or shares data, it will ask for your permission.
Right here, I'm going to say allowed.
GPT is designed to take in your instructions, make the decision on which capability to call to perform that action, and then execute that for you.
You can see right here, it's already connected to my calendar.
It pulls in my information, and then I've also prompted it to identify conflicts on my calendar.
You can see right here it actually was able to identify that.
It looks like I have something coming up.
What if I want to let Sam know that I have to leave early? Right here I say, "Let Sam know I got to go.
Chasing GPUs." With that, I'm going to swap to my conversation with Sam and then I'm going to say, "Yes, please run that." Sam, did you get that? -I did.
-Awesome.
 -This is only a glimpse of what is possible and I cannot wait to see what you all will build.
Thank you. Back to you, Sam.
 -Thank you, Jessica.
Those are three great examples.
In addition to these, there are many more kinds of GPTs that people are creating and many, many more that will be created soon.
We know that many people who want to build a GPT don't know how to code.
We've made it so that you can program a GPT just by having a conversation.
We believe that natural language is going to be a big part of how people use computers in the future and we think this is an interesting early example.
I'd like to show you how to build one.
All right. I want to create a GPT that helps give founders and developers advice when starting new projects.
I'm going to go to create a GPT here, and this drops me into the GPT builder.
I worked with founders for years at YC and still whenever I meet developers, the questions I get are always about, "How do I think about a business idea? Can you give me some advice?" I'm going to see if I can build a GPT to help with that.
To start, GPT builder asks me what I want to make, and I'm going to say, "I want to help startup founders think through their business ideas and get advice. After the founder has gotten some advice, grill them on why they are not growing faster." [laughter] -All right.
To start off, I just tell the GPT a little bit about what I want here.
It's going to go off and start thinking about that, and it's going to write some detailed instructions for the GPT.
It's also going to, let's see, ask me about a name.
How do I feel about Startup Mentor? That's fine.
"That's good." If I didn't like the name, of course, I could call it something else, but it's going to try to have this conversation with me and start there.
You can see here on the right, in the preview mode that it's already starting to fill out the GPT.
Where it says what it does, it has some ideas of additional questions that I could ask.
[chuckles] It just generated a candidate.
Of course, I could regenerate that or change it, but I like that.
I'll say "That's great." You see now that the GPT is being built out a little bit more as we go.
Now, what I want this to do, how it can interact with users, I could talk about style here.
What I'm going to say is, "I am going to upload transcripts of some lectures about startups I have given, please give advice based off of those." All right.
Now, it's going to go figure out how to do that.
I would like to show you the configure tab.
You can see some of the things that were built out here as we were going by the builder itself.
You can see that there's capabilities here that I can enable.
I could add custom actions.
These are all fine to leave.
I'm going to upload a file.
Here is a lecture that I picked that I gave with some startup advice, and I'm going to add that here.
In terms of these questions, this is a dumb one.
The rest of those are reasonable, and very much things founders often ask.
I'm going to add one more thing to the instructions here, which is be concise and constructive with feedback.
All right.
Again, if we had more time, I'd show you a bunch of other things.
This is a decent start.
Now, we can try it out over on this preview tab.
I will say, what's a common question? "What are three things to look for when hiring employees at an early-stage startup?" Now, it's going to look at that document I uploaded.
It'll also have of course all of the background knowledge of GPT-4.
That's pretty good. Those are three things that I definitely have said many times.
Now, we could go on and it would start following the other instructions and grill me on why I'm not growing faster, but in the interest of time, I'm going to skip that.
I'm going to publish this only to me for now.
I can work on it later.
I can add more content, I can add a few actions that I think would be useful, and then I can share it publicly.
That's what it looks like to create a GPT. -Thank you.
By the way, I always wanted to do that after all of the YC office hours, I always thought, "Man, someday I'll be able to make a bot that will do this and that'll be awesome." [laughter] -With GPTs, we're letting people easily share and discover all the fun ways that they use ChatGPT with the world.
You can make private GPT like I just did, or you can share your creations publicly with a link for anyone to use, or if you're on ChatGPT Enterprise, you can make GPTs just for your company.
Later this month we're going to launch the GPT store.
Thank you.
I appreciate that.
 -You can list a GPT there and we'll be able to feature the best and the most popular GPT.
Of course, we'll make sure that GPTs in the store follow our policies before they're accessible.
Revenue sharing is important to us.
We're going to pay people who build the most useful and the most used GPT a portion of our revenue.
We're excited to foster a vibrant ecosystem with the GPT store. Just from what we've been building ourselves over the weekend, we're confident there's going to be a lot of great stuff.
We're excited to share more information soon.
Those are GPTs and we can't wait to see what you'll build.
This is a developer conference, and the coolest thing about this is that we're bringing the same concept to the API.
 Many of you have already been building agent-like experiences on the API, for example, Shopify's Sidekick, which lets you take actions on the platform.
Discord's Clyde lets Discord moderators create custom personalities for it, and Snap's My AI, a customized chatbot that can be added to group chats and make recommendations.
These experiences are great, but they have been hard to build.
Sometimes taking months, teams of dozens of engineers, there's a lot to handle to make this custom assistant experience.
Today, we're making that a lot easier with our new Assistants API.
 -The Assistants API includes persistent threads, so they don't have to figure out how to deal with long conversation history, built-in retrieval, code interpreter, a working Python interpreter in a sandbox environment, and of course the improved function calling, that we talked about earlier.
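Creating such an assistant with retrieval and Code Interpreter enabled looks roughly like this, assuming the beta Assistants endpoints in the openai Python SDK (v1.x); the name and instructions are placeholders:
```python
# Minimal sketch: creating an assistant with built-in retrieval and Code Interpreter (openai v1.x SDK, beta).
from openai import OpenAI

client = OpenAI()

assistant = client.beta.assistants.create(
    name="Travel Helper",                                   # illustrative name
    instructions="You are a concise travel-planning assistant.",
    model="gpt-4-1106-preview",
    tools=[{"type": "retrieval"}, {"type": "code_interpreter"}],
)
print(assistant.id)   # keep this id; threads and runs reference it
```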
We'd like to show you a demo of how this works.
Here is Romain, our head of developer experience.
Welcome, Romain.
 -Thank you, Sam.
Good morning.
Wow.
It's fantastic to see you all here.
It's been so inspiring to see so many of you infusing AI into your apps.
Today, we're launching new modalities in the API, but we are also very excited to improve the developer experience for you all to build assistive agents.
Let's dive right in.
Imagine I'm building Wanderlust, a travel app for global explorers, and this is the landing page.
I've actually used GPT-4 to come up with these destination ideas.
For those of you with a keen eye, these illustrations are generated programmatically using the new DALL-E 3 API available to all of you today.
It's pretty remarkable.
Let's enhance this app by adding a very simple assistant to it.
This is the screen.
We're going to come back to it in a second.
First, I'm going to switch over to the new Assistants playground.
Creating an assistant is easy, you just give it a name, some initial instructions, a model.
In this case, I'll pick GPT-4 Turbo.
Here I'll also go ahead and select some tools.
I'll turn on Code Interpreter and retrieval and save.
That's it. Our assistant is ready to go.
Next, I can integrate with two new primitives of this Assistants API, threads and messages.
Let's take a quick look at the code.
The process here is very simple.
For each new user, I will create a new thread.
As these users engage with their assistant, I will add their messages to the threads.
Very simple.
Then I can simply run the assistant at any time to stream the responses back to the app.
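In code, the thread-message-run loop Romain describes looks roughly like this (openai Python SDK v1.x, beta Assistants endpoints; the assistant id is a placeholder, and this sketch polls rather than streams):
```python
# Minimal sketch: one thread per user, append messages, then run the assistant and read the reply.
import time
from openai import OpenAI

client = OpenAI()
ASSISTANT_ID = "asst_..."   # id of the assistant created earlier (placeholder)

thread = client.beta.threads.create()
client.beta.threads.messages.create(
    thread_id=thread.id, role="user", content="Hey, let's go to Paris."
)

run = client.beta.threads.runs.create(thread_id=thread.id, assistant_id=ASSISTANT_ID)
while run.status in ("queued", "in_progress"):          # poll until the run finishes
    time.sleep(1)
    run = client.beta.threads.runs.retrieve(thread_id=thread.id, run_id=run.id)

messages = client.beta.threads.messages.list(thread_id=thread.id)   # newest first
print(messages.data[0].content[0].text.value)
```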
We can return to the app and try that in action.
If I say, "Hey, let's go to Paris." All right.
That's it. With just a few lines of code, users can now have a very specialized assistant right inside the app.
I'd like to highlight one of my favorite features here, function calling.
If you have not used it yet, function calling is really powerful.
As Sam mentioned, we are taking it a step further today.
It now guarantees the JSON output with no added latency, and you can invoke multiple functions at once for the first time.
Here, if I carry on and say, "Hey, what are the top 10 things to do?" I'm going to have the assistant respond to that again.
Here, what's interesting is that the assistant knows about functions, including those to annotate the map that you see on the right.
Now, all of these pins are dropping in real-time here.
Yes, it's pretty cool.
 -That integration allows our natural language interface to interact fluidly with components and features of our app.
It truly showcases now the harmony you can build between AI and UI where the assistant is actually taking action.
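For reference, parallel function calling in the chat API looks roughly like this; the `annotate_map` function below is an illustrative stand-in for the app's own map functions (openai Python SDK v1.x assumed):
```python
# Minimal sketch: declaring a function as a tool and handling several tool calls from one response.
import json
from openai import OpenAI

client = OpenAI()

tools = [{
    "type": "function",
    "function": {
        "name": "annotate_map",   # hypothetical app function
        "description": "Drop a pin on the map for a point of interest.",
        "parameters": {
            "type": "object",
            "properties": {
                "name": {"type": "string"},
                "lat": {"type": "number"},
                "lon": {"type": "number"},
            },
            "required": ["name", "lat", "lon"],
        },
    },
}]

response = client.chat.completions.create(
    model="gpt-4-1106-preview",
    messages=[{"role": "user", "content": "What are the top 3 things to do in Paris? Pin them on the map."}],
    tools=tools,
)

# With parallel function calling, one response can carry several tool calls, each with JSON arguments.
for call in response.choices[0].message.tool_calls or []:
    args = json.loads(call.function.arguments)
    print(call.function.name, args)   # the app would now actually drop each pin
```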
Let's talk about retrieval.
Retrieval is about giving our assistant more knowledge beyond these immediate user messages.
In fact, I got inspired and I already booked my tickets to Paris.
I'm just going to drag and drop here this PDF.
While it's uploading, I can just sneak peek at it.
Very typical United Flight ticket.
Behind the scene here, what's happening is that retrieval is reading these files, and boom, the information about this PDF appeared on the screen.
 -This is, of course, a very tiny PDF, but Assistants can parse long-form documents from extensive text to intricate product specs depending on what you're building.
In fact, I also booked an Airbnb, so I'm just going to drag that over to the conversation as well.
By the way, we've heard from so many of you developers how hard that is to build yourself.
You typically need to compute your own embeddings, you need to set up a chunking algorithm.
Now all of that is taken care of.
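Feeding a document to retrieval, as in the ticket-PDF demo, looks roughly like this (openai Python SDK v1.x, beta Assistants endpoints; the file name and thread id are placeholders):
```python
# Minimal sketch: uploading a file and attaching it to a thread message so retrieval can use it.
from openai import OpenAI

client = OpenAI()

# Upload with the "assistants" purpose; chunking and embedding are handled by the platform.
ticket = client.files.create(file=open("united_ticket.pdf", "rb"), purpose="assistants")

client.beta.threads.messages.create(
    thread_id="thread_...",   # the user's existing thread (placeholder)
    role="user",
    content="Here is my flight ticket; keep it in mind when budgeting the trip.",
    file_ids=[ticket.id],
)
```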
There's more than retrieval. With every API call, you usually need to resend the entire conversation history, which means setting up a key-value store, handling context windows, serializing messages, and so forth.
That complexity now completely goes away with this new stateful API.
Just because OpenAI is managing this API, does not mean it's a black box.
In fact, you can see the steps that the tools are taking right inside your developer dashboard.
Here, if I go ahead and click on threads, this is the thread I believe we're currently working on and see, these are all the steps, including the functions being called with the right parameters, and the PDFs I've just uploaded.
Let's move on to a new capability that many of you have been requesting for a while.
Code Interpreter is now available today in the API as well, that gives the AI the ability to write and execute code on the fly, but even generate files.
Let's see that in action.
If I say here, "Hey, we'll be four friends staying at this Airbnb, what's my share of it plus my flights?" All right.
Now, here, what's happening is that Code interpreter noticed that it should write some code to answer this query.
Now it's computing the number of days in Paris, number of friends.
It's also doing some exchange rate calculation behind the scenes to get the answer for us.
Not the most complex math, but you get the picture.
Imagine you're building a very complex finance app that's crunching countless numbers, plotting charts, so really any task that you'd normally tackle with code, then Code Interpreter will work great for you.
All right. I think my trip to Paris is solid.
To recap here, we've just seen how you can quickly create an assistant that manages state for your user conversations, leverages external tools like knowledge and retrieval and Code Interpreter, and finally invokes your own functions to make things happen but there's one more thing I wanted to show you to really open up the possibilities using function calling combined with our new modalities that we're launching today.
While working on DevDay, I built a small custom assistant that knows everything about this event, but instead of having a chat interface while running around all day today, I thought, why not use voice instead? Let's bring my phone up on screen here so you can see it on the right.
Awesome.
On the right, you can see a very simple Swift app that takes microphone input.
On the left, I'm actually going to bring up my terminal log so you can see what's happening behind the scenes.
Let's give it a shot.
Hey there, I'm on the keynote stage right now.
Can you greet our attendees here at Dev Day? -Hey everyone, welcome to DevDay.
It's awesome to have you all here.
Let's make it an incredible day.
 -Isn't that impressive? You have six unique and rich voices to choose from in the API, each speaking multiple languages, so you can really find the perfect fit for your app.
On my laptop here on the left, you can see the logs of what's happening behind the scenes, too.
I'm using Whisper to convert the voice inputs into text, an assistant with GPT-4 Turbo, and finally, the new TTS API to make it speak.
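Stitched together in code, that voice loop looks roughly like this (openai Python SDK v1.x assumed; a production app would use an Assistant with event knowledge and real microphone capture rather than the placeholder file):
```python
# Minimal sketch: Whisper speech-to-text -> GPT-4 Turbo reply -> TTS audio out (openai v1.x SDK).
from openai import OpenAI

client = OpenAI()

# 1. Transcribe the captured audio.
with open("mic_input.wav", "rb") as f:
    text_in = client.audio.transcriptions.create(model="whisper-1", file=f).text

# 2. Get a reply (an Assistant with DevDay knowledge would slot in here instead).
reply = client.chat.completions.create(
    model="gpt-4-1106-preview",
    messages=[
        {"role": "system", "content": "You are a cheerful DevDay event assistant."},
        {"role": "user", "content": text_in},
    ],
).choices[0].message.content

# 3. Speak the reply with one of the preset voices.
client.audio.speech.create(model="tts-1", voice="nova", input=reply).stream_to_file("reply.mp3")
```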
Thanks to function calling, things get even more interesting when the assistant can connect to the internet and take real actions for users.
Let's do something even more exciting here together.
How about this? Hey, Assistant, can you randomly select five DevDay attendees here and give them $500 in OpenAI credits? [laughter] -Yes, checking the list of attendees.
[laughter] -Done. I picked five DevDay attendees and added $500 of API credits to their account.
Congrats to Christine M, Jonathan C, Steven G, Luis K, and Suraj S.
-All right, if you recognize yourself, awesome.
Congrats.
That's it.
A quick overview today of the new Assistants API combined with some of the new tools and modalities that we launched, all starting with the simplicity of a rich text or voice conversation for your end users.
We really can't wait to see what you build, and congrats to our lucky winners.
Actually, you know what? You're all part of this amazing OpenAI community here, so I'm just going to talk to my assistant one last time before I step off the stage.
Hey Assistant, can you actually give everyone here in the audience $500 in OpenAI credits? -Sounds great.
Let me go through everyone.
 -All right, that function will keep running, but I've run out of time.
Thank you so much, everyone.
Have a great day. Back to you, Sam.
-Pretty cool, huh? [audience cheers] -All right, so that Assistants API goes into beta today, and we are super excited to see what you all do with it, anybody can enable it.
Over time, GPTs and Assistants are precursors to agents that are going to be able to do much, much more.
They'll gradually be able to plan and to perform more complex actions on your behalf.
As I mentioned before, we really believe in the importance of gradual iterative deployment.
We believe it's important for people to start building with and using these agents now to get a feel for what the world is going to be like, as they become more capable.
As we've always done, we'll continue to update our systems based off of your feedback.
We're super excited that we got to share all of this with you today.
We introduced GPTs, custom versions of ChatGPT that combine instructions, extended knowledge, and actions.
We launched the Assistants API to make it easier to build assistive experiences with your own apps.
These are your first steps towards AI agents and we'll be increasing their capabilities over time.
We introduced a new GPT-4 Turbo model that delivers improved function calling, knowledge, lowered pricing, new modalities, and more.
We're deepening our partnership with Microsoft.
In closing, I wanted to take a minute to thank the team that creates all of this.
OpenAI has got remarkable talent density, but still, it takes a huge amount of hard work and coordination to make all this happen.
I truly believe that I've got the best colleagues in the world.
I feel incredibly grateful to get to work with them.
We do all of this because we believe that AI is going to be a technological and societal revolution.
It'll change the world in many ways and we're happy to get to work on something that will empower all of you to build so much for all of us.
We talked about earlier how if you give people better tools, they can change the world.
We believe that AI will be about individual empowerment and agency at a scale that we've never seen before and that will elevate humanity to a scale that we've never seen before either.
We'll be able to do more, to create more, and to have more.
As intelligence gets integrated everywhere, we will all have superpowers on demand.
We're excited to see what you all will do with this technology and to discover the new future that we're all going to architect together.
We hope that you'll come back next year.
What we launched today is going to look very quaint relative to what we're busy creating for you now.
Thank you for all that you do.
Thank you for coming here today.
Keynote Recap
Good morning. Welcome to our first OpenAI DevDay. We're thrilled you're here; the energy is great.
Welcome to San Francisco. San Francisco has been our home since day one. The city matters to us and to the tech industry as a whole, and we look forward to continuing to grow here. We have some important things to announce today.
But first, I'd like to take a minute to talk about what we've done over the past year. About a year ago, on November 30, we released ChatGPT as a research preview, and it went remarkably well. In March we followed up with GPT-4, still the most capable model in the world.
In the past few months we launched voice and vision capabilities, so ChatGPT can now see, hear, and speak.
More recently, we released DALL·E 3, the world's most advanced image model, which you can of course use inside ChatGPT.
For enterprise customers we launched ChatGPT Enterprise, with enterprise-grade security and privacy, faster GPT-4 access, longer context windows, and more.
Today about 2 million developers are building on our API across a huge range of use cases and doing amazing things. Over 92% of Fortune 500 companies use our products, and ChatGPT now has roughly 100 million weekly active users. What's incredible is that we got there entirely by word of mouth: people simply find it useful and tell their friends. OpenAI is now the most advanced and most widely used AI platform in the world.
But numbers never tell the whole story. What really matters is how people use the products, how people use AI, so I'd like to show you a short video.
(A roughly two-minute user-story video: ChatGPT helping a user write a heartfelt letter, serving as a work assistant for a founder, giving artists design inspiration, helping a doctor do research, handling everyday tasks, helping programmers write code, keeping an elderly user company, and more.)
We love hearing stories about how people use this technology. That's why we do all of this.
Launching GPT-4 Turbo
Now, on to the new stuff. First, a set of improvements we've made; then, where we're headed next.
Over the past year we spent a lot of time talking with developers around the world and heard a great deal of feedback. Today we're showing you a new model: GPT-4 Turbo.
GPT-4 Turbo addresses many of the things you've asked for, with six areas of updates.
First, context length. Many tasks need a much longer context. GPT-4 supported up to 8K, and in some cases 32K, but we know that isn't enough for many of you.
GPT-4 Turbo now supports up to 128,000 tokens of context. That's about 300 pages of a standard book, 16 times longer than our 8K context, and beyond the longer window, the model is also much more accurate over long contexts.
Second, more control. We've heard that developers need more control over the model's responses and outputs, and we've addressed that in several ways.
We're introducing a new feature called JSON mode, which ensures the model responds with valid JSON. This has been a huge developer request and makes calling APIs much easier.
The model is also better at function calling: you can now call many functions at once, and it follows instructions better in general.
We're also introducing a feature called reproducible outputs: pass in a seed parameter and the model will return consistent outputs, giving you a higher degree of control over model behavior. The beta launches today, and in the coming weeks we'll also let you view log probabilities (logprobs) in the API.
Third, better world knowledge. You want these models to have better knowledge of the world, and so do we. The platform now supports retrieval, so you can bring knowledge from outside documents or databases into whatever you're building.
We're also updating the knowledge cutoff. GPT-4's knowledge of the world stopped in 2021; we'll do our best never to let it get that out of date again. GPT-4 Turbo now has knowledge of the world up to April 2023, and we'll keep improving that over time.
Fourth, new modalities. DALL·E 3, GPT-4 Turbo with vision, and the new text-to-speech model are all coming to the API.
Some customers have just started using DALL·E 3 to generate images and designs programmatically. Coca-Cola is launching a campaign that lets consumers generate Diwali cards with DALL·E 3, and our safety systems help developers protect their applications against misuse.
These tools are available in the API. GPT-4 Turbo can now accept images as input through the API and generate captions, classifications, and analysis. Be My Eyes, for example, uses this technology to help people who are blind or have low vision with everyday tasks such as identifying the products in front of them.
With our new TTS model you can generate remarkably natural-sounding speech from text through the API, with six preset voices to choose from.
As an example: did you know that the famous inventor Alexander Graham Bell was fascinated by the world of sound? His ingenuity produced the graphophone, letting voices travel through time. The result sounds far more natural than anything else we've heard.
Voice makes applications more natural to interact with and more accessible, and it unlocks use cases such as language learning and voice assistants.
Speaking of new modalities, we're also releasing a new version of our open-source speech recognition model, Whisper V3, which will be coming to the API soon. It improves performance across many languages, and we hope you'll like it.
Fifth, customization. Fine-tuning has worked very well for GPT-3.5 since we launched it a few months ago. Starting today we're extending it to the 16K version of the model, and we're inviting active fine-tuning users to apply for the GPT-4 fine-tuning experimental access program.
The fine-tuning API is great for getting better performance across a wide variety of applications with relatively small amounts of data. But you may need a model to learn a completely new knowledge domain or to use large amounts of proprietary data, so today we're launching a new program called Custom Models.
Our researchers will work closely with customers to build excellent custom models for them. This covers every step of model training, including additional domain-specific pre-training and post-training tailored to a specific domain.
We won't be able to do this with many companies at first. It will take a lot of work and, to set expectations, it won't be cheap initially. But if you want to push things as far as they can currently go, get in touch with us, and we think we can do something great together.
Sixth, higher rate limits. We're doubling the tokens per minute for established GPT-4 customers so you can do more, and you can request further changes to rate limits and quotas directly in your API account settings.
Beyond rate limits, we also want to do everything we can to help developers succeed on our platform. So we're introducing Copyright Shield: if you face legal claims around copyright infringement, we will step in, defend you, and pay the costs incurred. This applies to both ChatGPT Enterprise customers and API developers.
It's worth emphasizing: we do not use data from the API or from ChatGPT Enterprise customers for training.
There is actually one developer request bigger than all of these: GPT-4 pricing.
GPT-4 Turbo is the industry-leading model. It delivers the many new capabilities we just covered and is smarter than GPT-4. Developers have told us there is a lot they want to build, but GPT-4 simply costs too much; if we could cut the cost by 20-25%, that would already be great.
I'm excited to announce that GPT-4 Turbo, a better model, is much cheaper than GPT-4: starting today, input tokens are 3x cheaper and output tokens 2x cheaper. The new price is 1 cent per 1,000 input tokens and 3 cents per 1,000 output tokens, which works out to a blended rate more than 2.75x cheaper than GPT-4.
We prioritized price; between price and speed we had to pick one first, but you will soon notice GPT-4 Turbo getting much faster.
We're also lowering the cost of GPT-3.5 Turbo 16K: input tokens are 3x cheaper and output tokens 2x cheaper, which means GPT-3.5 16K is now cheaper than the previous GPT-3.5 4K model, and running a fine-tuned GPT-3.5 Turbo 16K is also cheaper than the old fine-tuned 4K version.
A Conversation with Microsoft CEO Satya Nadella
We've just covered a lot about the models themselves, and we hope these updates address your needs. We're lucky to have a partner who has been instrumental in making all of it happen: our special guest, Microsoft CEO Satya Nadella.
Sam Altman: Two questions; I won't take much of your time. How is Microsoft thinking about the partnership right now?
Satya Nadella: I remember the first time you reached out and asked, "Hey, do you have some Azure credits?" We've come a long way since then; you've built something magical. For us, the partnership is about two things. The first is the workloads. I've been in the infrastructure business for three decades and have never seen workloads or workload patterns like this; these training jobs are so synchronous and so large. So the first thing we've been doing is building the systems together with you: the shape of Azure has changed dramatically to support the models you're building, and then we make the best of that available to developers.
The other thing is that we are developers ourselves and are building products. The first time I saw Copilot running on GPT, my conviction about this whole generation of foundation models completely changed, so we want to build our Copilot on top of OpenAI APIs.
For example, GitHub Copilot Enterprise will be made available to all the attendees here to try, and developers can also bring their products to market quickly through the Azure Marketplace.
Sam Altman: How do you think about the future of the partnership, or the future of AI?
Satya Nadella: A couple of things are key for me. One is the systems I just described: we will keep working so that the builders of foundation models have not only the best systems for training and inference but also the most compute, so you can keep pushing forward.
The second thing we both care about is the mission. Our mission is to empower every person and every organization on the planet to achieve more. Ultimately, AI is only useful if it truly empowers people, and being able to spread the benefits of AI broadly to everyone is, I think, the goal for us.
And the last thing: we're firmly grounded in the fact that safety matters. Safety is not something to care about later; we're deeply focused on it with you now.
Launching GPTs
At this developer conference we've also made some updates to ChatGPT itself. ChatGPT now runs on GPT-4 Turbo with all the latest improvements, including the newest knowledge cutoff, which we'll keep updating.
ChatGPT can now browse the web when it needs to, write and run code, analyze data, generate images, and more. You told us the model picker was extremely annoying, so it's gone: starting today there's no dropdown menu to click through. Everything works together, and ChatGPT knows which capability to use and when.
But that isn't the main thing, and neither was pricing. Developers had an even bigger request.
We know people want AI that is smarter, more personal, more customizable, and able to do more on their behalf. Eventually you'll simply tell the computer what you need and it will do all of these tasks for you. In the AI field, these capabilities are often called agents.
OpenAI firmly believes that gradual, iterative deployment is the best way to address AI safety issues and challenges, and we think it's especially important to move carefully toward this agent future. It will require a lot of technical work and a lot of thoughtful consideration by society. So today we're taking a first small step toward that future: we're excited to introduce GPTs.
GPTs are versions of ChatGPT customized for specific purposes. You can build a GPT, a customized ChatGPT, for almost anything, with instructions, expanded knowledge, and actions, and then publish it for others to use.
Because GPTs combine instructions, expanded knowledge, and actions, they can be more helpful, making it easier to get all kinds of tasks done or simply to have more fun.
You can use GPTs right inside ChatGPT. In effect, you program a GPT with language just by talking to it, and it's easy to customize its behavior to fit what you want. Building a GPT is very accessible, and it gives agency to everyone.
We'll show what GPTs are, how to use them, and how to build them, then how they'll be distributed and discovered, and, for developers, how to build these agent-like experiences into your own apps.
First, a few examples.
Our partners at Code.org are working hard to expand computer science in schools, with a curriculum used by tens of millions of students worldwide. Code.org crafted a Lesson Planner GPT to help teachers give middle schoolers a more engaging experience.
If a teacher asks it to explain for loops in a creative way, it does so with a video-game character repeatedly picking up coins, which is easy for an eighth-grader to understand.
Next, Canva built a GPT that lets you start a design by describing what you want in natural language. If you say, "Make a poster for this evening's DevDay reception," and give it some details, it generates a few options by calling Canva's APIs.
Some of you may find this concept familiar: we've evolved plugins into custom actions for GPTs. You can keep chatting with it to see different iterations, and when you see one you like, click through to Canva for the full design experience.
Now for a live GPT demo. Zapier has built a GPT that can perform actions across 6,000 applications, unlocking all kinds of integration possibilities. Jessica, one of our solutions architects, runs the demo.
Jessica:
To start, the GPT lives in the upper-left corner. I click on Zapier AI Actions, and on the right you can see my calendar for today; it's already connected. I can ask what's on my schedule.
We built GPTs with security in mind, so before a GPT performs any action or shares data, it asks for your permission. A GPT takes in your instructions and decides which capability to call to perform the action. I also prompted it to identify conflicts on my calendar, and you can see it was able to do that.
What if I want to let Sam know I have to leave early? I switch to my conversation with Sam and say, "Yes, please run that."
Sam Altman:
Beyond these examples, people are creating many more kinds of GPTs, and many more will appear soon.
We know that many people who want to build a GPT don't know how to code. Now you can build one just by having a conversation; natural language is going to be a big part of how people use computers in the future.
As an example, I'll create a GPT that gives founders and developers advice when they start new projects.
I go into the GPT Builder. It asks what I want to make, and I say I want to help startup founders think through their business ideas and get advice, and then, once the founder has gotten some advice, to grill them on why they aren't growing faster.
The GPT Builder thinks this over and writes detailed instructions for the GPT. It also asks what to call it: how about Startup Mentor? That works, though of course I could pick another name.
On the right, in preview mode, you can see the GPT already taking shape, with a description of what it does and some suggested starter questions.
I upload transcripts of a few startup lectures I've given and ask it to base its advice on them. On the Configure tab you can see which capabilities have been enabled and add custom actions; for instance, I also instruct the GPT to keep its feedback concise and constructive.
For now I'll publish this GPT only to myself. Later I can add more content and useful actions, and then share it publicly with a link for anyone to use; enterprise customers can also make GPTs just for their company.
Later this month we'll launch the GPT Store, where we'll feature the best and most popular GPTs. Of course, we'll make sure GPTs in the store follow our policies before they become accessible.
We'll also pay the people who build the most useful and most-used GPTs a share of our revenue.
We're excited to foster a vibrant ecosystem with the GPT Store. Just from what we built ourselves over a weekend, we're confident there will be a lot of great GPTs.
Launching the Assistants API
Since this is a developer conference, we're also bringing the same concept to the API.
Many people have already built agent-like experiences on the API, for example AI tools from Shopify, Discord, and Snap's My AI. These experiences are great, but they've been hard to build, sometimes taking months and teams of dozens of engineers. So today we're making it much easier with the new Assistants API.
The Assistants API includes persistent threads, so you don't have to figure out how to handle long conversation histories, along with built-in retrieval, Code Interpreter (a working Python interpreter in a sandboxed environment), and of course the improved function calling discussed earlier.
Romain, our head of developer experience, shows how it works.
Romain:
Today we're launching new modalities in the API. Imagine I'm building Wanderlust, a travel app for global explorers; this is the landing page. I actually used GPT-4 to come up with these destination ideas, and the illustrations are generated programmatically with the new DALL·E 3 API.
Let's enhance the app by adding a very simple assistant. First, I switch over to the new Assistants playground. Creating an assistant only takes a name, some initial instructions, and a model; I pick GPT-4 Turbo, turn on Code Interpreter and retrieval, and save. The assistant is ready. Let's take a quick look at the code.
For each new user I create a new thread. As users interact with their assistant, I add their messages to the thread, and I can run the assistant at any time to stream responses back to the app. With that, we can return to the app and try it out.
If I say, "Let's go to Paris," then with just a few lines of code users get a very specialized assistant right inside the app.
One of my favorite features is function calling: it guarantees JSON output with no added latency and can now invoke multiple functions at once.
If I go on to ask for the top ten things to do in Paris, the assistant answers and also drops the locations onto the map on the right. This integration lets the natural-language interface interact fluidly with the app's components and features.
There's also retrieval, which gives the assistant knowledge beyond the immediate user messages. For instance, I've already booked my flight to Paris, so I just drag the ticket PDF into the conversation and the assistant reads the file and pulls out the key information.
Many developers have told us how hard this is to build yourself: you normally have to compute embeddings and set up a chunking algorithm. Now all of that is handled for you. And it's not just retrieval; the complexity of managing context windows, serializing messages, and so on is eliminated entirely by the new stateful API.
That doesn't make it a black box: you can see the steps the tools are taking right in your developer dashboard.
The next feature has been requested for a long time: Code Interpreter is now available in the API as well, so the AI can write and run code on the fly and even generate files. Let's see it in action.
If I say four friends will be staying at this Airbnb and ask what my share is, plus my flights, it writes some code to answer the question, working out the number of days in Paris and even doing an exchange-rate conversion behind the scenes to get the answer.
My trip to Paris is sorted. To recap, we've just seen how quickly you can create an assistant that manages the state of your users' conversations, uses external tools such as knowledge retrieval and Code Interpreter, and finally calls your own functions to make things happen.
One more example shows what's possible when function calling is combined with the new modalities.
While working on DevDay, I built a small custom assistant that knows everything about the event. This is my phone screen; on the right is a very simple Swift app that takes microphone input. The API offers six unique, rich voices, each supporting multiple languages, so you can find the best fit for your app.
On the left you can see the behind-the-scenes logs: I use Whisper to convert the voice input to text, an assistant running on GPT-4 Turbo, and the new TTS API to make it speak.
Function calling gets even more interesting when the assistant can connect to the internet and take real actions for users. We had the assistant randomly pick five attendees here and give them $500 in OpenAI credits. You can see it checking the attendee list; once it finished, it picked five DevDay attendees and added $500 in API credits to their accounts.
Wrap-up
Sam Altman:
Very cool. The Assistants API goes into beta today, and we're excited to see what you do with it; anyone can enable it. GPTs and Assistants are precursors to agents that will be able to do much more, gradually planning and performing more complex actions on your behalf.
As I mentioned, we truly believe in the importance of gradual, iterative deployment. We think it's important for people to start building with and using these agents now, to get a feel for what the world will be like as they become more capable, and we'll keep updating our systems based on your feedback.
Today we introduced GPTs, custom versions of ChatGPT that combine instructions, expanded knowledge, and actions. We launched the Assistants API so you can more easily build assistive experiences into your own apps. These are our first steps toward AI agents, and their capabilities will keep growing over time.
We introduced the new GPT-4 Turbo model, with improved function calling and knowledge, lower prices, new modalities, and more.
We're deepening our partnership with Microsoft.
Finally, I want to take a moment to thank the team that created all of this. OpenAI has remarkable talent density, but making all of this happen still takes an enormous amount of hard work and coordination, and I'm deeply grateful to work with these colleagues. We do all of this because we believe AI will be a technological and societal revolution that will change the world in many ways.
We said earlier that if you give people better tools, they can change the world. AI will bring individual empowerment and agency on a scale we've never seen, and it will elevate humanity on a scale we've never seen either. We'll be able to do more, create more, and have more.
As intelligence is integrated everywhere, we'll all have superpowers on demand. We're excited to see what you'll do with this technology and to build this new future together. We hope you'll come back next year. Thank you.
01
Six Major Upgrades: From GPT-4 to GPT-4 Turbo
The most important part of this event was the further upgrade of the GPT series.
Sam Altman introduced GPT-4 Turbo and rolled it out in ChatGPT and the API at the same time. Based on user feedback, GPT-4 Turbo brings six major upgrades: longer context length, more control, updated model knowledge, new modalities, model fine-tuning and customization, and higher rate limits.
First among the six is longer context input.
OpenAI previously offered an input length of up to 32K; GPT-4 Turbo raises that to 128K, overtaking the 100K context length of competitor Anthropic in one step.
What difference does going from 32K to 128K of context actually make?
A longer context window means the model can think over more material at once: it holds a longer thread of memory, connects ideas that sit far apart, digs into details precisely, and takes in more complex knowledge in a single pass. Put simply, the more text a large model can process at once, the smarter it appears.
Next, OpenAI provided several stronger control mechanisms to make it easier for developers to call APIs and functions, including JSON Mode and the ability to call multiple functions at once.
Third, upgrades to internal and external knowledge. At the event Altman said, "We are just as annoyed as all of you, probably more, that GPT-4's knowledge of the world ended in 2021."
The updated internal knowledge base now extends to April 2023, and users are also allowed to upload their own external knowledge.
Fourth, multimodal upgrades. On the image side, the new models not only include DALL·E 3 but also accept images as input (GPT-4 Turbo with vision). On the audio side, this release adds a new text-to-speech model, and developers can choose from six preset voices.
Fifth, fine-tuning and customization. The GPT-3.5 Turbo 16K version now accepts fine-tuning, at a lower price than the previous generation, and GPT-4 will join the fine-tuning lineup in the future.
In addition, OpenAI launched a custom-model service for enterprises, though Altman said on stage that "OpenAI won't be able to do many of these custom models, and the price won't be cheap."
The enterprise custom-model service reportedly includes modifying every step of the model training process, additional domain-specific pre-training, domain-specific post-training, and more.
The last item is higher rate limits. GPT-4 customers see their per-minute rate limits doubled right after the event, meaning they can send more requests and tokens to GPT-4 in the same amount of time and get more output.
As for pricing, the question individual users care about most, Altman said that although the upgraded model is more capable, it is cheaper: GPT-4 Turbo works out to a blended rate more than 2.75 times cheaper than GPT-4, at 1 cent per 1,000 input tokens and 3 cents per 1,000 output tokens.
02
Letting People Who Can't Code Easily Build Their Own GPT
While other big players are still straining to match large models' generative capabilities, OpenAI is already competing on the ecosystem behind them.
Altman said on stage, "Just as Apple changed technology forever with the iPhone in 2007 and the App Store in 2008, we are launching the GPT Store."
The GPT Store is a further upgrade of the plugin store launched this May. Building on that earlier store, OpenAI has adjusted its strategy from targeting "developers" to targeting "everyone," and anyone can put their customized GPT up in the GPT Store.
OpenAI calls this new concept GPTs.
"Each GPT is like a customized version of ChatGPT built for a specific purpose," Altman said, demonstrating the feature by creating a "Startup Mentor" GPT live on stage.

To spell out the new concept of GPTs: they let you create customized GPT roles and capabilities through natural language, so users can build their own dedicated assistant tailored to their needs.

Such a GPT can understand the knowledge of a specific industry in depth, offer a personalized conversational experience, extend the depth and breadth of its knowledge, optimize how efficiently specific tasks are carried out, and even bring in up-to-date information to support decisions.
This not only greatly increases GPT's value in professional domains, it also gives individuals and organizations highly customized, intelligent solutions, opening a new chapter in the practical usefulness and effectiveness of AI.
Beyond "letting people who can't code build applications," OpenAI is also working to let developers "build applications more easily." Here OpenAI's take on the AI agent goes by a different name: the Assistants API.
Developers can now create AI assistants inside their own applications. Given instructions, an assistant uses OpenAI's models and tools to complete all kinds of tasks, such as data analysis and programming, and it comes with convenient building blocks like persistent threads, retrieval, code execution, and function calling.
The Assistants API is available in beta to try now, and developers can call an AI assistant in the Assistants Playground to get a taste of "zero-code programming."
Beyond these upgrades, ChatGPT's "all tools" change also stands out: it simplifies the interface and integrates the features. ChatGPT can now automatically choose and combine the most suitable tools based on the user's input, so GPT-4 feels like an intelligent, flexible AI assistant rather than just a text generator.
Bringing every capability together changes what GPT can be: it breaks through the current functional silos and can act as a full-sensory virtual assistant, an unbounded creative workbench, an information concierge, an interactive learning partner, a multilingual communication bridge, and more.