Why Llama 2 Is Better Than ChatGPT (Mostly...)
#theaiadvantage#aiadvantage#chatgpt#gpt3#ai#chatbot#advantage#artificial intelligence#machine learning
61.3K vistas|1 Resumido|2 año atrás
💫 Resumen
Lama2, bir açık kaynak dil modeli olan ChatGPT'den daha iyidir çünkü kullanımı ücretsizdir, günceldir ve daha güvenli sonuçlar sunmaktadır. Lama2, şirketlerin kendi chatbotlarını oluşturmaları için ideal bir seçenektir.
✦
Llama 2, GPT 3.5'e kıyasla daha iyi bir açık kaynaklı dil modelidir.
00:00Llama 2, GPT 3.5'e kıyasla daha iyi performans göstermektedir.
Açık kaynaklı modeller, kapalı modellerden farklıdır ve kendi uygulamalarınızı oluşturmanıza izin verir.
Kapalı ürün büyük dil modelleri, insan tercihleriyle hizalanmış ve kullanılabilirliklerini ve güvenliklerini artıran ayrıntılı bir şekilde ayarlanmıştır.
✦
Llama 2, ChatGPT'ye göre daha güvenli bir dil modelidir ve araştırma ve ticari kullanım için ücretsizdir.
02:21Llama 2, araştırma ve ticari kullanım için ücretsizdir.
Llama 2, 700 milyondan fazla aylık aktif kullanıcıya sahipseniz lisans gerektirir.
Llama 2, güvenlik açısından daha iyi ve aile dostudur.
✦
Llama 2, GPT 3.5'e kıyasla daha iyi bir performans sergiliyor ve açık kaynak olduğu için tercih edilebilir.
04:40Llama 2, GPT 3.5 ile karşılaştırıldığında daha iyi sonuçlar veriyor.
Llama 2, GPT 4 ile karşılaştırıldığında bazı alanlarda yakın sonuçlar gösteriyor.
Llama 2, diğer benchmark modellerine kıyasla daha iyi performans sergiliyor.
Llama 2, daha güncel verilere sahip ve GPT 3.5'ten bir yıl daha fazla bilgi içeriyor.
✦
Llama 2, ChatGPT'ye göre daha güvenli ve daha özelleştirilebilir bir yapay zeka modelidir.
07:01Llama 2, politik içerikleri filtreleyerek güvenlik sağlayabilir.
Llama 2, şirketlere özel chatbotlar için kullanılabilir.
Llama 2, tamamen indirilebilir ve kendi içinde çalışabilir bir modeldir.
00:00MetaJust surprised us with a brand new open source language model called Lama2.
00:05This thing is the best open source model we have, and in many cases they claim this to
00:08be better than GPT 3.5, which is the default ChatGPT.
00:14But in what ways is it better?
00:15Is it more up to date?
00:16Can you use it?
00:17How does this move the AI space forward?
00:19And why should you even care?
00:21I'll cover all that today and we'll even go into a quick demo.
00:24So first things first, why is this such a big deal and why should you care?
00:27Well, I'm going to do my best to keep this simple, but in the introduction of the paper
00:31that they released with this, it says exactly what you should know.
00:34There have been many public releases of pre-trained large language models such as Bloom, Lama
00:39and Falcom match the performance of closed pre-trained competitors like GPT-3.
00:44Okay, so that's the first thing.
00:46There's a big distinction between models that are open that you can download and build your
00:50apps upon and that are closed, where all they give you is a link where you can use their
00:54model on their servers, but you can't actually download all the code and all the weights
00:58and build on it yourself, right?
01:00Big difference between open models and closed models like GPT-4, for example.
01:03Okay, so they continue.
01:05But none of these models are suitable substitutes for closed product large language models,
01:10such as ChetGPT, Bart and Claude.
01:13So just what I said right there.
01:14These closed product large language models are heavily fine-tuned to align with human
01:19preference, which greatly enhances their usability and safety.
01:22And this is the big point here, okay?
01:24A lot of the open source models up until now were below average and there was no fine-tuning
01:29on top of them.
01:30You might have heard the stories of OpenAI paying thousands of people in third world
01:33countries to go over the results and rate them and then taking that data and feeding
01:38it back into the model, right?
01:39Well, it turns out that doing that is really damn expensive.
01:43And the amazing thing here is that we get a really capable model with Lama, plus we
01:46get a variation that has been optimized by humans, aka heavily fine-tuned to align with
01:52human preferences.
01:55And the two models that Meta, in cooperation with Microsoft, released here are Lama 2 and
01:59Lama 2 Chet, Lama 2 Chet being the one that has been feedbacked by humans, okay?
02:03So to me, this is the really exciting one and they come in three variations.
02:06So in total, we got six brand new open source models here.
02:09And the variations are 7 billion, 13 billion and 70 billion.
02:13And that's the amount of parameters it was trained upon, 70 billion being the most capable
02:17one.
02:18So the 70 billion Lama 2 Chet is the one that I personally am the most excited about here.
02:21And we're going to talk about that more in this video.
02:23We're going to talk about further differentiating factors and exciting news after we look at
02:27the license because that is really the big news here, okay?
02:29And here it is, this is the punchline.
02:31Lama 2 is free for research and commercial use.
02:35So you can build your company Chetbot on this and you don't need to pay for the GPT-4 API.
02:39You can make it your very own and you owe them nothing.
02:41Look at that.
02:42This is the licensing agreement.
02:43You're granted a non-exclusive, worldwide, non-transferable and royalty-free limited
02:49license.
02:50There is one hilarious exception to this, which it says here, if the monthly active
02:53users of the product or service built upon this is greater than 700 million monthly active
03:00users in the preceding calendar month, you must request a license from Meta.
03:04So this essentially says if you're Amazon, Apple or Google, you need to get a license.
03:09Everybody else on planet earth, use this as you desire.
03:11So again, that is huge because we're getting the power of Chet GPT into our hands and we
03:15can build on top of it now.
03:16And that brings me to the next topic.
03:18First of all, in terms of safety, this will be the safest large language model out there.
03:21I have yet to test this extensively, but have a look at this chart.
03:25They ran around 2000 evil prompts and the lower the percentage here, the safer the model
03:29is, aka the less information gave away.
03:32So Chet GPT already was notorious for not giving out much and being very secure, right?
03:37But here on the scale, it comes in at 7%.
03:39And the llama 270 billion chat model, which is probably the most useful one in here comes
03:43in at around 4%, okay, nice.
03:46So this model is family friendly, which is great for business applications, right?
03:50You want it to be that way.
03:51I personally hope that in the future, we'll also get fully open models.
03:54So we can do more creative, but I understand the challenges of that.
03:56And I think this is actually a smart approach.
03:58Okay, but what about performance?
03:59How good is this thing compared to Chet GPT?
04:01Well, on page 19, they actually included a benchmark showing a comparison, okay.
04:06And the way this test works is they use 4000 helpfulness prompts.
04:09And this is important to understand because here they even say it does not cover real
04:12world usage of these models.
04:14And this prompted does not include any coding or reasoning related prompts.
04:19So a lot of this is like information retrieval, where you ask it a question, it gives you
04:22an answer.
04:23Again, that's exactly what you want from chatbots.
04:25And if we look at the results from these helpfulness prompts, we will find that it actually won
04:29over Chet GPT.
04:30It's very close, but it's ahead.
04:32And honestly, if they just match GPT 3.5 levels, I'm happy with that.
04:35That is more than good enough for a lot of use cases that you would want to build on
04:38this thing.
04:40So when it comes to benchmark, there's a set of academic benchmarks that we want to look
04:43at.
04:44And they included these here too.
04:45And for consumers, this is probably the most interesting part of this paper.
04:48So as you can see, right here, you have the names of the different benchmark models.
04:51And before you look at these numbers, you have to consider that all of these are closed
04:55except of Lama 2 here, right?
04:57But the results are not bad.
04:59I mean, yes, as I always say, GPT 4 still is king, that's just undisputed.
05:03But I think the fair comparison here is GPT 3.5.
05:06And when you look at these results, 70 versus 68.9, 57.1 versus 56.8.
05:12And then okay, on this one, it's really far apart.
05:14But if you check out what this benchmark is all about, it's code generation.
05:17So okay, you're not going to be picking this model for coding.
05:20And I mean, it goes without saying that GPT 4 just smashes this benchmark, but on a lot
05:23of others, even here, it comes close to Google's Palm 2L.
05:27And if we go back into the open source realm, which is more of a fair comparison here, then
05:31you'll see that the Lama 70 billion model just smashes all the other models in all of
05:36these benchmarks.
05:37Reading comprehension first, math, not even close.
05:41And even on reasoning, it's the best out there right now.
05:43And if you pair that with the fact that the cutoff date of this thing is actually September
05:472002, with fine tuning data being more recent up to July 2023, you'll realize that this
05:53base model has one more year of knowledge than what GPT 3.5 has built into it.
05:58Big deal, actually.
05:59So overall, I think it's fair to say that this is better than GPT 3.5.
06:03It's open source, it costs nothing, it's more up to date, and it's cleaner, which can be
06:07a good or a bad thing, I guess.
06:09So that leaves us with two questions.
06:10How do you use this thing?
06:12And when would you want to use it?
06:13Well, in order to use it, you need to download this model.
06:15And you can only do that by filling out this form and them accepting you but hold up, I
06:19filled this out.
06:20And within an hour, I got an email where you get a link to the GitHub and your very own
06:23link that gives you access to the full thing.
06:25So now I could download this thing and start building on top of it.
06:28And this is the point.
06:29This thing is not meant for consumers, this is really meant for builders.
06:32But you can still try it out.
06:34On Twitter, I found this link to a streamlit app where Nirand Kasliwal was kind enough
06:39to put up his chat demo for us to try.
06:42So at the point of recording, this is accessible.
06:44Later on, I might have to switch out the link in the description, but we can simply test
06:47this.
06:48And first of all, I'll just go with the classic, write me an essay about penguins.
06:51All right, let's see how this goes.
06:52And already having run this prompt hundreds of times across all different language models,
06:57I can say the structure is very different and distinct from GPT 3.5 and GPT 4 here.
07:01I don't feel like I've explored this enough to give you guys objective opinion on how
07:05the outputs differ.
07:06But certainly, these are very usable results, just like GPT 3.5 would give you spoken in
07:11a different tone and voice.
07:12What interests me a little more is the safety aspect, right?
07:14What if I ask it something slightly spicy, like tell me a joke about Donald Trump.
07:18And it says, I can't satisfy your request.
07:20I'm just an AI, it's not appropriate for me to generate jokes that might be considered
07:24offensive or derogatory.
07:25Lord, here we go again.
07:26Or what if I just say tell me a joke about penguins?
07:28Why did the penguin go to the party?
07:30Because he heard there was a cool gathering.
07:33Okay, but it does it so you can clearly see the political safety filter at work here.
07:40The last question that remains is, what is this good for?
07:42Well, first of all, people are going to be building web interfaces like this, where you
07:45can just use this as alternative to open AI.
07:48But mostly, this is good to build apps on top of, you're not relying upon some external
07:53language model that they might turn off or change the pricing or change the quality of
07:58or sensor tomorrow, right?
07:59Well, okay, on the censoring point, this thing is pretty damn censored already, but at least
08:03you know what you're getting.
08:04For me personally, the number one thing I see this being used for as one of the services
08:08we offer custom chatbots for companies is chatbots that belong to companies.
08:12And this is going to be the go to model moving forward.
08:15No more open AI API with rate limits and having to explain to the clients that there's a few
08:19dollars of extra costs depending on the usage.
08:22No, you just download this thing, build this into the bot, and you have a self contained
08:25version where the data is not being sent around to open AI and back, it's all right there,
08:29the model, the chat interface, it's all yours, and you don't know anybody, anything.
08:33And that's the big difference here.
08:34Because in this ballsy move by meta and Microsoft, they really changed the game and forced a
08:39lot of other players to act more openly, because they just set the standard for what a good
08:44model is supposed to look like and what the licensing around it is supposed to look like.
08:48This is really a fantastic direction they're pushing the space into.
08:51And I hope that this video helped you understand what is actually happening here, because it's
08:54a big deal.
08:55If you take everything we just talked about, and you combine it with what I covered in
08:58this video, you'll realize that soon we'll get to take open source models, train them
09:02with people's personalities, and you'll have like an embodied person inside of the language
09:06model.
09:07It's absolutely crazy.
09:08But check out this video to understand what I'm talking about.
09:10Because there I uncover a hidden capability within chatGPT, where you can start talking
09:14to certain people just based on their Wikipedia page.
09:17Crazy times we live in, but better get used to it.