DeepMind's New AI System Can Perform Over 600 Tasks - TechCrunch


The ultimate achievement for some in the artificial intelligence industry is the creation of a system with artificial general intelligence (AGI), or the ability to understand and learn any task that a human can. Long relegated to the realm of science fiction, it has been suggested that AGI would bring about systems with the ability to reason, plan, learn, represent knowledge and communicate in natural language.

Not every expert is convinced that AGI is a realistic goal, or even a feasible one. But DeepMind, the Alphabet-backed research lab, arguably took a step toward it this week with the release of an AI system called Gato.

Gato is what DeepMind describes as a “general-purpose” system, one that can be taught to perform many different kinds of tasks. Researchers at DeepMind trained Gato to complete 604 of them, to be exact, including captioning images, engaging in dialogue, stacking blocks with a real robot arm and playing Atari games.

Jack Hessel, a research scientist at the Allen Institute for Artificial Intelligence, points out that a single AI system that can solve many tasks isn’t new. For example, Google recently began using a system in Google Search called the multitask unified model, or MUM, which can handle text, images and videos to perform tasks, from finding interlingual variations in the spelling of a word to relating a search query to an image. But what’s arguably newer here, Hessel says, is the diversity of the tasks that are tackled and the training method.

DeepMind’s Gato architecture. Image Credits: DeepMind

“We’ve seen evidence previously that single models can handle surprisingly diverse sets of inputs,” Hessel told TechCrunch via email. “In my view, the crucial question when it comes to multitask learning … is whether the tasks complement one another or not. You could envision a more boring case where the model implicitly separates the tasks before solving them, e.g., ‘If I detect task A as an input, I will use subnetwork A. If I instead detect task B, I will use a different subnetwork B.’ For that null hypothesis, similar performance could be attained by training A and B separately, which is underwhelming. In contrast, if training A and B jointly improves either (or both!), then things are more exciting.”

Like all AI systems, Gato learned by example, ingesting billions of words, images from real-world and simulated environments, button presses, joint torques and more, all in the form of tokens. These tokens served to represent data in a way Gato could understand, enabling the system to, for example, tease out the mechanics of Breakout, or which combination of words in a sentence might make grammatical sense.
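To make the token idea concrete, here is a minimal sketch of how mixed-modality data might be flattened into one token sequence. This is an illustration of the general technique, not DeepMind’s code; the vocabulary, the bin count and the token-ID offset are all assumptions.

```python
# Illustrative sketch (not Gato's actual implementation): serializing
# text and continuous control data into one flat token sequence.

def tokenize_text(text, vocab):
    # Words map to integer IDs from a fixed vocabulary.
    return [vocab[word] for word in text.split()]

def tokenize_continuous(values, num_bins=1024, offset=50000):
    # Continuous values (e.g. joint torques in [-1, 1]) are discretized
    # into bins, then shifted into their own ID range so they can never
    # collide with text-token IDs.
    return [offset + min(int((v + 1) / 2 * num_bins), num_bins - 1)
            for v in values]

def build_sequence(text, torques, vocab):
    # Everything becomes a single flat sequence for the Transformer.
    return tokenize_text(text, vocab) + tokenize_continuous(torques)

vocab = {"stack": 0, "the": 1, "red": 2, "block": 3}
seq = build_sequence("stack the red block", [0.1, -0.4], vocab)
# seq mixes word tokens and discretized torque tokens in one list.
```

Because both modalities end up as integers in one sequence, a single model can be trained on all of them at once, which is the core of the approach described above.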

Gato doesn’t necessarily do these tasks well. For example, when chatting with a person, the system often responds with a superficial or factually incorrect reply (such as “Marseille” in response to “What is the capital of France?”). When captioning pictures, Gato misgenders people. And the system correctly stacks blocks using a real-world robot only 60% of the time.

But in 450 of the 604 aforementioned tasks, DeepMind claims that Gato performs better than an expert more than half the time.

“If you think we need general [systems], which is lots of folks in the AI and machine learning space, then [Gato is] a big deal,” Matthew Guzdial, an assistant professor of computing science at the University of Alberta, told TechCrunch via email. “I think people saying it’s a major step toward AGI are overhyping it somewhat, as we’re still not at human intelligence and likely not to get there soon (in my opinion). I’m personally more in the camp of many small models [and systems] being more useful, but there are definitely benefits to these general models in terms of their performance on tasks outside their training data.”

Oddly enough, from an architectural standpoint, Gato isn’t dramatically different from many of the AI systems in production today. It shares characteristics with OpenAI’s GPT-3 in the sense that it’s a “Transformer.” Dating back to 2017, the Transformer has become the architecture of choice for complex reasoning tasks, demonstrating an aptitude for summarizing documents, generating music, classifying objects in images and analyzing protein sequences.

The various tasks that Gato learned to complete. Image Credits: DeepMind

Perhaps most importantly, Gato is orders of magnitude smaller than single-task systems such as GPT-3 in terms of the number of parameters. Parameters are the parts of the system learned from training data, and they essentially define the system’s skill at a problem, such as generating text. Gato has just 1.2 billion, while GPT-3 has more than 170 billion.

DeepMind researchers kept Gato purposefully small so the system could control a robot arm in real time. But they hypothesize that, if scaled up, Gato could tackle any “task, behavior, and embodiment of interest.”

Assuming that is the case, several other hurdles would have to be overcome for Gato to beat state-of-the-art single-task systems at specific tasks, such as Gato’s inability to learn continuously. Like most Transformer-based systems, Gato’s knowledge of the world is grounded in its training data and stays static. If you ask Gato a date-sensitive question, such as who the current president of the U.S. is, chances are it will answer incorrectly.

The Transformer (and Gato, by extension) has another limitation in its context window, or the amount of information the system can “remember” in the context of a given task. Even the best Transformer-based models can’t write a lengthy essay, much less a book, without failing to remember key details and thus losing track of the plot. The forgetting happens in any task, whether writing or controlling a robot, which is why some experts have called it the “Achilles’ heel” of machine learning.

For these and other reasons, Mike Cook, a member of the Knives & Paintbrushes research group, cautions against assuming Gato is a true path to general-purpose AI.

“I think the result is open to misinterpretation, somewhat. It sounds exciting that the AI is able to do all of these tasks that seem very different, because to us it sounds like writing text is very different from controlling a robot. But in reality this isn’t all too different from GPT-3 understanding the difference between ordinary English text and Python code,” Cook told TechCrunch via email. “Gato receives specific training data about these tasks, just like any other AI of its type, and it learns how patterns in the data relate to one another, including learning to associate certain kinds of inputs with certain kinds of outputs. That’s not to say this is easy, but to the outside observer it might sound like the AI can also make a cup of tea or easily learn ten or fifty other tasks, and it can’t. We know that current approaches to large-scale modeling let it learn multiple tasks at once. I think it’s a nice bit of work, but it doesn’t strike me as a major stepping stone on the path to anything.”

2022-05-13 14:31:00
