What can and can't language models do? Lessons learned from BIGBench
By a mysterious writer
Description
So what exactly can and can’t language models do? What’s the least impressive thing GPT-4 won’t be able to do?
BIGBench offers one way to find out. BIGBench, short for the “Beyond the Imitation Game Benchmark”, is an attempt to explore the capabilities of large language models across a wide variety of tasks. All the tasks are enumerated here.
I looked through every BIGBench task and picked out the ones that compared both GPT-3 and PaLM against humans.
* Spreadsheet