How the academic division of AI limits AI progress

One of the fundamental problems with current AI approaches is that they typically throw out much of the structure of the world before they start.

To understand language, for example, one must understand how language is about the world, but academic divisions within AI treat language as divorced from vision, voice, planning, and the world generally.

Language is not independent of the rest of the world. It arises from a species’ need to communicate about the world around it, and languages became more expressive and complex as that need grew. Language is a way to represent thought and information about the world, so it is highly dependent on the structure and features of the world it aims to describe. An alien species that cannot perceive pain will not have a word for pain in its language.

Yet current AI approaches usually treat language as pure syntax, so there is little hope of recovering the semantics.

Similarly, computer vision is often treated as independent of real-world physics, language, and human knowledge about the physical world. For example, a computer vision system that can recognize objects such as chairs and tables won’t know that a human can sit on a chair. Almost all computer vision systems today can’t differentiate between a toy car and a real car, because they don’t know anything about real-world physics!

Take common-sense knowledge, for example. It is not written down somewhere in an encyclopedia like Wikipedia; it’s a collection of (almost) universal knowledge that is acquired by interacting with and observing the world around us. We don’t learn in school that a crocodile can’t run the 100-meter hurdles, but once we see a crocodile we know that it can’t. We infer that by estimating its weight from an image and working out the possible ways it can move, based on its limb structure and our prior knowledge of how limbs move and function. Such knowledge can only be acquired by interacting with the world.

Those divisions are useful in the engineering and product-building sense: to solve a problem (e.g. ImageNet or SQuAD) we don’t need to care about anything outside its scope. But that is not how the real world works, and those divisions limit overall progress in AI. By dividing up the world into academic subdisciplines such as language, vision, and planning, we’re throwing out the structure and relationships that the human mind exploits. I believe that in order to achieve real AGI we must aim for a more holistic approach that combines and treats all sensory information together.
