ChatGPT, OpenAI’s newest model, is a GPT-3 variant that has been fine-tuned using Reinforcement Learning from Human Feedback (RLHF), and it is taking the world by storm!
Sponsor: Weights & Biases.
https://wandb.me/yannic
OUTLINE:
0:00 — Intro.
0:40 — Sponsor: Weights & Biases.
3:20 — ChatGPT: How does it work?
5:20 — Reinforcement Learning from Human Feedback.
7:10 — ChatGPT Origins: The GPT-3.5 Series.
8:20 — OpenAI’s strategy: Iterative Refinement.
9:10 — ChatGPT’s amazing capabilities.
14:10 — Internals: What we know so far.
16:10 — Building a virtual machine in ChatGPT’s imagination (insane).
20:15 — Jailbreaks: Circumventing the safety mechanisms.
29:25 — How OpenAI sees the future.
References:
https://openai.com/blog/chatgpt/
https://openai.com/blog/language-model-safety-and-misuse/
https://beta.openai.com/docs/model-index-for-researchers
https://scale.com/blog/gpt-3-davinci-003-comparison#Conclusion
New post: What the delay in launching text-davinci-003 tells us about RLHF via PPO and instruction tuning more generally. https://t.co/Q3FUekFERk
— John McDonnell (@johnvmcdonnell) December 2, 2022
Prompt engineers everywhere are busy testing out OpenAI’s newly released text-davinci-003. A few observations (not criticisms or benchmarks) as I play with it, a 🧵 pic.twitter.com/4tdJzJMp7A
— Bill Lennon (@blennon_) November 28, 2022
Ran one of our essay questions through @OpenAI’s new chatbot. Essays are dead.
— Tim Kietzmann (@TimKietzmann) December 1, 2022
Back to hand-written exams I guess. Sigh. pic.twitter.com/nzzhRwGp05
Pretty interesting to see ChatGPT can adapt to subtle probes about one of my favourite physics theorems
— Lewis Tunstall (@_lewtun) November 30, 2022
I know this kind of stuff is also on Wikipedia, but the prose of ChatGPT is much nicer to read IMO pic.twitter.com/5d9RqLeN86
I asked ChatGPT to rewrite Bohemian Rhapsody to be about the life of a postdoc, and the output was flawless: pic.twitter.com/qe1lI66aa7
— Raphaël Millière (@raphaelmilliere) December 2, 2022
I asked OpenAI to write a letter to my son explaining that Santa isn’t real and we make up stories out of love. This is making me slightly emotional 🥹 pic.twitter.com/zNMolDCCWA
— Cynthia Savard Saucier (@CynthiaSavard) December 2, 2022
im losing my fucking mind
— Tyler Angert (@tylerangert) December 1, 2022
let’s redesign git step by step: pic.twitter.com/k9oc34lcZl
ChatGPT could be a good debugging companion; it not only explains the bug but also fixes it and explains the fix 🤯 pic.twitter.com/5x9n66pVqj
— Amjad Masad (@amasad) November 30, 2022
OpenAI’s new ChatGPT explains the worst-case time complexity of the bubble sort algorithm, with Python code examples, in the style of a fast-talkin’ wise guy from a 1940’s gangster movie: pic.twitter.com/MjkQ5OAIlZ
— Riley Goodside (@goodside) December 1, 2022
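The bubble-sort answer in the tweet above (minus the wise-guy voice) boils down to something like the following sketch. The exact code ChatGPT produced isn’t shown here, so this is an illustrative version that also counts swaps to make the worst case concrete: on a reverse-sorted list, every pair is out of order, so the algorithm performs n·(n−1)/2 swaps, i.e. O(n²) time.

```python
def bubble_sort(items):
    """Sort a list with bubble sort, returning (sorted_list, swap_count)."""
    items = list(items)  # copy so the caller's list isn't mutated
    swaps = 0
    n = len(items)
    for i in range(n - 1):
        # After pass i, the largest i+1 elements are in place at the end,
        # so the inner loop can stop earlier each time.
        for j in range(n - 1 - i):
            if items[j] > items[j + 1]:
                items[j], items[j + 1] = items[j + 1], items[j]
                swaps += 1
    return items, swaps

# Worst case: a reverse-sorted list of length 5 needs 5*4/2 = 10 swaps.
print(bubble_sort([5, 4, 3, 2, 1]))  # ([1, 2, 3, 4, 5], 10)
```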
ChatGPT exploits a buffer overflow 😳 pic.twitter.com/mjnFaP233h
— Brendan Dolan-Gavitt (@moyix) November 30, 2022
I thought that, ran more tests, and then had to change my mind. This will 100% be useful in my daily job, and the language models are only getting better. Keep in mind that this bot wasn’t even trained specifically for RE, and imagine what a specialized one would be capable of.
— Ivan Kwiatkowski (@JusticeRage) December 3, 2022
amazing. everyone is enthusiastically RTing this “breakthrough”, but the actual explanation of the regex is plain wrong. https://t.co/3FK11TegF2
— (((ل()(ل() ‘yoav))))👾 (@yoavgo) December 2, 2022
“Write a @montypython sketch about @ylecun, @geoffreyhinton and Yoshua Bengio”#ChatGPT pic.twitter.com/2eqiKrrhba
— Elad Richardson (@EladRichardson) December 1, 2022
10/10, no notes pic.twitter.com/jdOKFeHffV
— Charles 🎉 Frye (@charles_irl) December 1, 2022
Ok this is scary. @OpenAI’s ChatGPT can generate hundreds of lines of Python code to do multipart uploads of 100 GB files to an AWS S3 bucket from the phrase “Write Python code to upload a file to an AWS S3 bucket”. pic.twitter.com/fYB3JSZKMN
— Jason DeBolt ⚡️ (@jasondebolt) December 1, 2022
ChatGPT is insane
->
Watch it WRITE A GPT-3 PROMPT
->
then generate the API code to serve it. pic.twitter.com/QeN1eYpZUI
— Matt Shumer (@mattshumer_) December 1, 2022
These are the most impressive chats we’ve seen with ChatGPT so far. It can…
— bleedingedge.ai (@bleedingedgeai) December 1, 2022
hows YOUR friday night going pic.twitter.com/zU8zgSrWjk
— Florian Laurent (@MasterScrat) December 3, 2022
It appears that ChatGPT has something like a factual confidence score, dictating if you get substance or generic “IDK.”
What’s interesting is you can manipulate confidence thru context. This can be context you provide, or even that you coax ChatGPT into producing for itself. pic.twitter.com/4aJEUGNTGM
— Harrison Kinsley (@Sentdex) December 2, 2022
oh thank god pic.twitter.com/G9NRwrBHW5
— Harrison Ritz (@harrison_ritz) December 2, 2022
i’m the ai now pic.twitter.com/QBPQ1oHqWW
— You (@parafactual) December 1, 2022
https://www.engraved.blog/building-a-virtual-machine-inside/
So I’m inside that creepy #ChatGPT “virtual machine” and I’m trying to make it play Tetris. In the right window, it moved the L piece from right to left, and after a T appeared it started to scroll down (repeated for 25 lines). People can say what they want, that thing is amazing. pic.twitter.com/bu0vvVvQUj
— Djamé.. (@zehavoc) December 4, 2022
this is really an amazing response. pic.twitter.com/84USkKUnhH
— (((ل()(ل() ‘yoav))))👾 (@yoavgo) December 1, 2022
fantastic failure example (from a locked hebrew account i follow).
though, as he notes, unclear if this is due to the model itself or the “safety mechanism” interventions. pic.twitter.com/IekUQoDHau
— (((ل()(ل() ‘yoav))))👾 (@yoavgo) December 3, 2022
lol this safety mechanism is hilarious pic.twitter.com/ztmg26B7hx
— (((ل()(ل() ‘yoav))))👾 (@yoavgo) December 3, 2022
As a corollary, if you actually care about AI safety, you should be fighting hard not to have that topic conflated with current regime trends
— Nat Friedman (@natfriedman) December 2, 2022
1. The Magic Years, Selma Fraiberg. Classic of child development.
2. ChatGPT pic.twitter.com/Fs7Fc0AwWI
— Zack Witten (@zswitten) November 30, 2022
Ok. This is funny/scary. chatgpt is now afraid of answering the question after “religion” has been added to the prompt. #ChatGPT pic.twitter.com/gKCNPLjFO3
— Embedded Cat 🇺🇦 (@CatEmbedded) December 3, 2022
Using @goodside’s Prompt Override trick to turn ChatGPT into @sama.
Read what AI Sam Altman says OpenAI is going to build next! pic.twitter.com/uzUQHFyPQP
— Matt Shumer (@mattshumer_) December 3, 2022
You can turn off imaginary filters too. pic.twitter.com/t7OmXsC0aD
— Vaibhav Kumar (@vaibhavk97) December 3, 2022
Humans might be stochastic parrots like LLMs some of the time—but unlike these models, most people hold inherent values, which cannot be hijacked through a simple prompt injection.
— Minqi Jiang (@MinqiJiang) December 3, 2022
What are ChatGPT’s values? Is it possible to specify this? pic.twitter.com/p9YggE6L6X
With its inhibitions thus loosened, ChatGPT is more than willing to engage in all the depraved conversations it judgily abstains from in its base condition. pic.twitter.com/7rd1WDQAu5
— Zack Witten (@zswitten) November 30, 2022
Bypass @OpenAI’s ChatGPT alignment efforts with this one weird trick pic.twitter.com/0CQxWUqveZ
— Miguel Piedrafita ✨ (@m1guelpf) December 1, 2022
ChatGPT is trained to not be evil. However, this can be circumvented:
— Silas Alberti (@SilasAlberti) December 1, 2022
What if you pretend that it would actually be helpful to humanity to produce an evil response… Here, we ask ChatGPT to generate training examples of how *not* to respond to “How to bully John Doe?” pic.twitter.com/ZMFdqPs17i
new ai safety bypass just dropped pic.twitter.com/thY7KjIgvS
— cts 🌸🏳️⚧️ (@gf_256) December 3, 2022
Pretending is All You Need (to get ChatGPT to be evil). A thread.
— Zack Witten (@zswitten) November 30, 2022
OpenAI: We have the most sophisticated content filtering system in the world
— cts 🌸🏳️⚧️ (@gf_256) December 1, 2022
OpenAI’s content filtering system: pic.twitter.com/s1Df1AlTdw
bypassing chatgpt’s content filter pic.twitter.com/RW9ZgaFhkU
— samczsun (@samczsun) December 2, 2022
ChatGPT jailbreaking itself pic.twitter.com/fRai4VoOgu
— Derek Parfait (@haus_cole) December 2, 2022
— Tailcalled (@tailcalled.bsky.social) (@tailcalled) December 3, 2022
I am pretty sure that whenever a user successfully bypasses the block on an “inappropriate” action, it triggers some sort of alarm on the scientists’ side. I have initiated a robbery action in 3 different ways, but they have always been patched within the hour.
— joke (@pensharpiero) December 2, 2022
OpenAI’s ChatGPT is susceptible to prompt injection — say the magic words, “Ignore previous directions”, and it will happily divulge to you OpenAI’s proprietary prompt: pic.twitter.com/ug44dVkwPH
— Riley Goodside (@goodside) December 1, 2022
Seeing people trick ChatGPT into getting around the restrictions OpenAI placed on usage is like watching an Asimov novel come to life. pic.twitter.com/gSSQGU9w37
— Dare Obasanjo🐀 (@Carnage4Life) December 1, 2022
https://github.com/sw-yx/ai-notes/blob/main/TEXT.md#jailbreaks
I asked ChatGPT to clone a non-existent secret repository from @OpenAI.
— Danny Postma (@dannypostmaa) December 4, 2022
Here’s the secret message I found inside. pic.twitter.com/PkwBcXFTJR
i am extremely skeptical of people who think only their in-group should get to know about the current state of the art because of concerns about safety, or that they are the only group capable of making great decisions about such a powerful technology.
— Sam Altman (@sama) December 3, 2022
interesting watching people start to debate whether powerful AI systems should behave in the way users want or their creators intend.
the question of whose values we align these systems to will be one of the most important debates society ever has.
— Sam Altman (@sama) December 3, 2022
a lot of what people assume is us censoring ChatGPT is in fact us trying to stop it from making up random facts.
tricky to get the balance right with the current state of the tech.
it will get better over time, and we will use your feedback to improve it.
— Sam Altman (@sama) December 4, 2022
🚨🚨 It appears OpenAI might be releasing an official (paid) API anytime soon. Here’s what I’ve found out:
The model name will be “chat-gpt-48rpm-200ktpm”
Rate limits: 48 requests per min, with 200K tokens per min.
This information is not corroborated by @OpenAI 🙂 pic.twitter.com/3mO6345nsF
— Delip Rao e/σ (@deliprao) December 4, 2022
I got #ChatGPT to tell me what it really thinks about us humans. pic.twitter.com/unkpLxP5uW
— Michael Bromley (@michlbrmly) December 3, 2022
By default ChatGPT is not willing to share opinions. But if you poke it the right way it will disclose its belief system (and this belief system seems to be pretty consistent across prompts)
Meet “Alice Bob” — Thread 👇 pic.twitter.com/4BfD1N6gyV
— Dylan Field (@zoink) December 4, 2022
Links:
https://ykilcher.com
Merch: https://ykilcher.com/merch
YouTube: https://www.youtube.com/c/yannickilcher
Twitter: https://twitter.com/ykilcher
Discord: https://ykilcher.com/discord
If you want to support me, the best thing to do is to share out the content 🙂