Monday, August 20, 2012

ApeWorm Brain Overview

(part of the series Aping As the Basis of Intelligence)

Figure 2 gives an overview of the ApeWorm "central nervous system" or control system. 

Two eyes, or retinas, gather information about the exemplar, which is analyzed to form an exemplar inner model. The ApeWorm also analyzes its internal sensors to construct a self inner model, or self image. If the two models match, the Ape control function registers success and signals the muscle control functions to stop movement. It may also reinforce the "synapses" or weights that led to the success, including any temporal sequence involved. If there is No Match, muscle control allows movement to continue.
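As a sketch, this match/no-match loop might look like the following, in Python. The function name, the model representations, and the reinforcement callback are all hypothetical illustrations, not part of the design itself:

```python
def ape_control_step(exemplar_model, self_model, weights, reinforce):
    """One step of the match/no-match control loop described above.

    Compares the exemplar inner model to the self inner model.
    On a match: reinforce the weights that led here and stop movement.
    On no match: allow movement to continue.
    """
    if exemplar_model == self_model:
        reinforce(weights)   # strengthen the "synapses" behind the success
        return "stop"        # muscle control halts movement
    return "continue"        # No Match: keep moving
```

How the two inner models are actually encoded, and what reinforcement does to the weights, is exactly the hard part this series is working toward; the loop itself is the easy outer shell.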

It should be noted that the compartmentalization of functions in Figure 2 need not necessarily correspond to how the system is constructed. A neural model might make it possible, or even necessary, to tightly integrate two or more, or even all, of the functions that seem separate in this analysis.

Wednesday, July 11, 2012

Constructing the ApeWorm World

In this draft I will try to avoid using any particular programming language to illustrate the mechanisms of action. If necessary, I will use pseudocode. When the analysis is finished and it is time to try to run simulations I will try to remember to insert links to the code samples.

The ApeWorm world (or dual worlds) does not need to be defined explicitly. The upper and lower limits for the ApeWorm coordinates will suffice. As shown in Figure 1, the allowable coordinates run from 0 to 4 on both axes.

Figure 1: ApeWorm

ApeWorms themselves have four segments, but require five coordinate points to describe fully. Each point has an x1 component and an x2 component. The points will be called A0(x1,x2), A1(x1,x2), A2(x1,x2), A3(x1,x2), and A4(x1,x2).
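As a minimal sketch, the five points can be kept as a list of (x1, x2) tuples and checked against the world limits from Figure 1 (the names here are just for illustration):

```python
GRID_MAX = 4  # allowable coordinates run from 0 to 4 on both axes

def in_world(points):
    """True if all five points A0..A4 lie within the ApeWorm world."""
    return all(0 <= x1 <= GRID_MAX and 0 <= x2 <= GRID_MAX
               for x1, x2 in points)
```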

The data we gods will use to track ApeWorms are not the same data that they themselves keep track of. An ApeWorm will know where its head, or segment one, is on the world grid. In other words, it will know A0(x1,x2) and A1(x1,x2). Each intersegment node (joint) will be able to convey to the control system one of three states: left, center, or right (L, C, R). Handedness will be determined from the higher-numbered segment. In other words, for joint 1, we look toward segment 1 from segment 2. Thus in the curled ApeWorm example in Figure 1, the joints are all in the right or R configuration.
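One plausible way to compute a joint's state from the three points around it is the sign of a cross product. A sketch in Python, with the caveat that which sign counts as L versus R depends on the orientation of the figure, so the sign convention below is my own assumption:

```python
def joint_state(a_low, a_mid, a_high):
    """State of the joint at a_mid, looking toward the lower-numbered
    segment from the higher-numbered one, as described above.
    E.g. for joint 1, pass A0, A1, A2.  Which sign counts as 'L'
    is an assumed convention, not fixed by the text."""
    # direction of view: from the higher segment toward the joint
    v1 = (a_mid[0] - a_high[0], a_mid[1] - a_high[1])
    # continuation along the lower-numbered segment
    v2 = (a_low[0] - a_mid[0], a_low[1] - a_mid[1])
    cross = v1[0] * v2[1] - v1[1] * v2[0]
    if cross > 0:
        return "L"   # counterclockwise turn
    if cross < 0:
        return "R"   # clockwise turn
    return "C"       # collinear: centered
```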

The joints are controlled by sets of two opposing virtual muscles. When there is no signal to a muscle, it is relaxed, and it contracts as signals increase. An algorithm will determine which of the three states the joint is in based on the relative strengths of the control signals to the muscles.
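Such an algorithm might be as simple as comparing the two opposing signals against a dead band. A sketch in Python; the threshold value is an arbitrary assumption for illustration:

```python
def joint_from_muscles(left_signal, right_signal, threshold=0.1):
    """Map the two opposing muscle signals to a joint state.
    Signals within `threshold` of each other leave the joint
    centered; otherwise the stronger muscle wins."""
    if left_signal - right_signal > threshold:
        return "L"
    if right_signal - left_signal > threshold:
        return "R"
    return "C"
```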

The ApeWorm's ability to "view" another ApeWorm is stereoscopic but otherwise simple. The virtual retinas coincide with the x1 and x2 axes. Each unit segment of these axes can only "see" a segment in its row or column, and if there are two segments in that row or column, it can only see the closer one. [But a second segment in a column might be inferred by the control system.] The sensor can see the segment number, if any, and the distance, that is, which cross-coordinate the segment lies on. Thus in the upper-left example in Figure 1, the sensors on the x1 (horizontal) axis will record nothing. The x2 sensor closest to the origin will record a segment 4 at x1 = 1. The x2 sensor between 1 and 2 will record a segment 3 at x1 = 2, and so on.
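A sketch of how the x2-axis retina might compute its readings, in Python. The data layout (a dict from segment number to its two endpoints, and a reading of (segment number, x1) per unit row) is my own assumption:

```python
def left_retina(segments, grid=4):
    """Readings for the retina along the x2 axis, looking in the +x1
    direction.  `segments` maps segment number -> (endpoint, endpoint).
    Each unit row sees only the nearest broadside (vertical) segment,
    reported as (segment number, x1); empty rows read None."""
    readings = [None] * grid
    for num, ((x1a, x2a), (x1b, x2b)) in segments.items():
        if x1a != x1b:
            continue  # horizontal segment: edge-on to this eye, invisible
        row = min(x2a, x2b)  # the unit row this vertical segment spans
        if readings[row] is None or x1a < readings[row][1]:
            readings[row] = (num, x1a)  # keep only the closest segment
    return readings
```

Run on the example from the text (segment 4 standing at x1 = 1 in the row nearest the origin, segment 3 at x1 = 2 in the next row), the first two sensors report those segments and the rest report nothing.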

Next: ApeWorm Brain Overview (to be constructed)

Tuesday, July 10, 2012

ApeWorm, an Initial Machine Ape Project

The core of the system we seek is quite complex. We would like to minimize the inputs and outputs in order to focus our first project on the core aping system. At the same time, we want a general solution that is scalable to projects with larger, more complex input/output and memory requirements.

We will call our first attempt ApeWorms. These creature constructs will exist on a 5 x 5 grid and will have, for bodies, four connected line segments, the ends of which line up on the grid. Any two adjacent line segments will be able to have three relative positions: left, straight, and right {L, S, R}. An ApeWorm cannot have two segments overlapping each other. Figure 1 shows a few possible ApeWorm configurations.
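As a sketch, a worm configuration can be generated from a start point, an initial heading, and the three relative positions, then checked for validity. Unit-length segments and the particular turn convention are assumptions here, not part of the specification:

```python
TURNS_LEFT = {(1, 0): (0, 1), (0, 1): (-1, 0),
              (-1, 0): (0, -1), (0, -1): (1, 0)}
TURNS_RIGHT = {v: k for k, v in TURNS_LEFT.items()}

def worm_points(start, heading, turns):
    """Compute the five points A0..A4 from a start point, an initial
    unit heading (dx, dy), and three joint turns from {L, S, R}."""
    points = [start, (start[0] + heading[0], start[1] + heading[1])]
    for turn in turns:
        if turn == "L":
            heading = TURNS_LEFT[heading]
        elif turn == "R":
            heading = TURNS_RIGHT[heading]
        last = points[-1]
        points.append((last[0] + heading[0], last[1] + heading[1]))
    return points

def valid_worm(points, grid_max=4):
    """A worm is valid if every point lies on the 5 x 5 grid and no two
    of its four segments coincide; touching at a point (a closed curl)
    is allowed, overlapping segments are not."""
    on_grid = all(0 <= x <= grid_max and 0 <= y <= grid_max
                  for x, y in points)
    segments = {frozenset(pair) for pair in zip(points, points[1:])}
    return on_grid and len(segments) == 4
```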

Figure 1: ApeWorm

While ApeWorms are physically simple, their neural networks, or control software, can be as complex as we need. What should an ApeWorm know about itself, and what should it be allowed to know about an exemplar ApeWorm, if aping it is to be a non-trivial task requiring an extensible aping machine?

The ApeWorm will have one end designated the zero {0} end, with joints {1, 2, 3} and a {4} end. Each segment will be designated by its terminal joint, {1, 2, 3, 4}. Each joint will signal its relative position, going from the lower to the higher numbers, as L, S, or R (after training). The zero end and 1 joint will be able to sense their positions in the grid, again after training. For illustration purposes joints may be color-designated: 1. Black, 2. Green, 3. Blue, 4. Red. [Training adds a degree of complexity, which I suspect will be important as we move to systems more complex than ApeWorm.]

How would the ApeWorm know about its exemplar? Note that it would be a trivial task, if exact information about the exemplar is known, to assign that information to the ApeWorm directly in software. But if we built two ApeWorms in the flesh, so to speak, we would have autonomous systems that require such information to cross space, so that information about position must become encoded in some other form.

We will allow the ApeWorm to have two "eyes" on the Exemplar worm. The eyes (retinas) will be at the bottom and left sides of the grid. Each eye can detect segments that are broadside to it, and their orientation by row or column and by distance from the eye.

It is tempting to ignore the temporal aspect in the first build of ApeWorm. It seems difficult enough to construct a general aping ability when the minimum requirement is to detect the position of the Exemplar worm and then have the ApeWorm conform itself to a copy of that position. However, going forward we will want temporal input and temporal aping.

Consider that the aping instinct evolved in animals only after basic lower functions were in place. An animal has to sense the environment, coordinate itself, catch food and escape predators just to survive. All that puts a high premium on temporal capabilities. Aping sits atop all those capabilities, just as language and culture sit atop aping. We have freed our ApeWorms from the need to eat, but we run the danger of making our aping system hard to integrate into robots and more complex programs if it is unable to at least deal with time sequences.

Next: Constructing the ApeWorm World

Monday, June 11, 2012

Specifications for the Language Machine

Aping as the Basis of Intelligence (cont.)

Typically in systems of artificial intelligence designed for language there is a front-end feature detection system. Thus the slight fluctuations in air pressure we call sound are analyzed for features. In the case of human language these features are often quite complex, but at this point they are well-studied. Thus detectors have been devised for common syllables and voice ranges.

In a developing human there are likely some very generalized feature detectors, but they are also very flexible. This would also be true in mammals and birds that have shown they can learn some human words. Thus a human baby can learn a click-based language from southern Africa, the syllables of modern English, or a tonal Asian language. In effect, feature detectors develop based on exposure to language.

Voicing is also complex, controlled by a wide range of muscles. It too is learned, and requires considerable practice to achieve perfection. Aping the voices of other humans is the primary method of learning to speak so as to be understood.

Four major input/output streams can be defined for a human-like language machine. There is the audio input from the ears. There is output to a variety of muscles that produce sounds and speech. There are other inputs ultimately external to the body, necessary to provide positive and negative behavior reinforcement, such as touch. There are internal desire (or rejection) type inputs, notably hunger and other discomforts or wants. There is also a need for decision making: given all the other inputs, deciding what sounds to make and when. This decision making could be incorporated into the language machine or it could be external, and probably is some combination of both in humans.

Tuesday, June 5, 2012

Aping as the Basis of Intelligence

Logic was thought to be the key paradigm for human intelligence from Aristotle to Wittgenstein, and then in the early days of Artificial Intelligence as implemented on computers.

Careful consideration shows that aping, the ability to copy an action of another individual, is more central to intelligence. It is as much a mammal trait as a human one. Formal exercises in logic are not needed for one animal to learn behavior from another. The complex neural circuitry created by evolutionary pressures to enhance aping potential eventually became the basis of human language and its formalized reasoning skills known as logic.

Aping consists of observing (usually visually) what some other individual is doing, and then copying that behavior. Humans do this so naturally that we seldom think about it consciously. Practicing logic, unless one is trained to do so (by aping others), is a much harder skill. Yet logic circuits are easy to implement with electronic parts. Aping another individual requires very complex systems within the brain.

This essay, and the accompanying computer programs and physical systems, is an attempt to create machines capable of aping, which then should be able to handle other complex behaviors usually associated with human intelligence, including language, mechanical skills, reasoning, and possibly consciousness.

The advantage of aping skills goes beyond helping to ensure the survival of an individual animal. With aping, a behavior with survival benefits can become cultural, spanning an unlimited number of generations of animals, without any genetic change specific to the behavior itself. A generalized pool of neurons can become capable of generating a wide and flexible variety of behaviors, giving a species a heavy advantage over animals that can only engage in behavior based on pre-programmed neural systems.

Pre-language example. Consider a simple aping process and how it differs from paradigms like ordinary remembered (learned) behavior. A young wolf has always crossed a cold stream by simply plunging through it. One day it sees an older wolf stop at the stream and proceed along the bank to a point where there is a rock in the center of the stream. The older wolf jumps to the rock, then to the far bank, thereby avoiding getting wet while crossing. The young wolf then apes that behavior. Afterwards, when crossing the stream in that area (when not in hot pursuit of prey), the young wolf diverts to the special crossing point.

Note how the aping behavior differs from what might be other reasons for using the rock to cross. The young wolf could learn of the rock crossing by exploration, then would remember the location of the crossing when appropriate. You could argue that following an older animal might be an instinct, so this is not a real example of aping, just a side effect of following.

Aping behavior might have originated in following behavior, and the basic ability to learn from experience, but as more complex situations are involved, it becomes clear that aping is a special case requiring special capabilities.

Supposing this simple example constitutes a form of aping, what can we say about how it is accomplished by the wolf brain? The wolf is capable of recognizing an externalized self. The example wolf, to the aping wolf, represents a possibility for itself. Metaphorically, the aping wolf can see itself crossing the stream using the rock when it sees the example wolf. So the wolf-brain is capable of contemplating (in a very simple, non-philosophic way) the future. It can choose to ape the example, or it can refuse to cross the stream, or it can ignore the example and get wet and cold crossing the stream.

Aping, in the infants of actual apes including humans, is probably automatic. Only at a later time will the young apes start choosing whether or not to ape particular behaviors. There is typically outside support for aping in human children. We reward them when we like their aping efforts, but may discourage them when we don't like their efforts (as when they engage in behaviors culturally reserved for adults).

Consider the "simple" modern act of making a piece of toast with a bread toasting appliance. A child is not likely to learn to do this by accidentally taking a slice of bread, sticking it into the appliance, and pushing down the lever. A child learns this by watching an example or exemplar. The child understands that doing what an exemplar does leads to the same result. The child sees that taking a piece of bread, placing it in a certain way in the appliance, and pressing the lever results in the making of toast. Usually, if sufficiently coordinated, a child can do this on the first try. Aping often results in some level of success in achieving a goal on the first try.

To ape an exemplar the aping child must already have similar levels of muscle control and sense of where its body is in relation to its surroundings. It must understand that aping the exemplar's movements (or other behaviors like vocalizing) achieves a desired result. If an aping attempt is made that does not achieve the desired results, a child may conclude that something is missing from the aping process, a subtlety or trick.

To ape, the brain must have a number of capabilities, which might be summed up as the ability to coordinate the body of the subject with the body of the exemplar.

In humans, when children are quick to ape adults, we call them intelligent. If they are slow to imitate adult behaviors, we call them slow or less intelligent.

Language learning example. Human babies start learning to imitate vocalizations soon after birth. Vocal fussing often results in rewards, like feeding, encouraging further vocalizations. The process may seem like a long one, considering that mechanical devices capable of recording and playing back sounds and language have long existed. But there is a fundamental difference between the mechanical playback model and the aping vocalization model, and that difference is critical to reconstructing how humans are capable of intelligent behavior, including understanding how the world works. [I prefer Machine Understanding as a term for what we are working towards with computerized robots capable of aping, because Artificial Intelligence has largely ignored a set of crucial problems that logic circuits have difficulty handling.]

The human brain does not appear to be a recording medium in the sense that a phonograph, magnetic tape machine, or silicon-memory based sound recording and playing device is. The human ability to immediately repeat a phrase of language or hum a melody does not arise from a like mechanism. Humans have fairly good memories, but a weak ability to memorize.

Consider the problem, for the human brain, of learning to repeat its first few spoken words, say "mama, papa, no, puppy, bye, milk." In the ear there are a number of sensory cells that respond to different frequencies of sound. They send nerve impulses towards the brain. At the other end we have the sound-making machinery: lungs, vocal cords, and mouth. An infant screaming soon after birth indicates that machinery works, but is nothing like language. Months of hearing and experimental vocalizations follow. There may be feedback, positive or negative, from parents and other exemplars, as well as careful repetition of simple words in the hope this will induce aping by the child.

In the end, however, success is the ability of the child to say a word that sounds, to its own ear, like the kind of words spoken by the exemplars. It will not be identical, nor will it ever be identical in its fine structure; in fact it will be identifiably different, an individual voice. Six words mastered, many more will follow, all aped after exemplars. In addition, the words will have meaning: they will coordinate with objects or actions in the external world (or internal world, as with "more" or "hurts").

The neural machine that coordinates heard words and the speaking of words with an "ego" must be very complex and very capable. Recreate that machine and we would have a model that, perhaps with some specialized variations, would be able to ape visual input to body positions, or anything else required of it.

Sunday, January 22, 2012

Predictive Memory and Particle Filters

I keep thinking about the Particle Filters AI method, so I might as well write some of it down. Note that in AI, Particle Filters has a particular meaning, different from the physics of particles in a real-world simulation or in a video game. In AI, particle filtering is a method for localization of an object, which could be a large physical object like a robot. It involves probabilistic methods (Monte Carlo techniques) for predicting where the object might be, combined with sensory feedback. Typically it requires a pre-existing map of the territory (which for a robot is real, but which could be highly abstract in other cases) and a method for matching the feedback to the map. It has been most notably used as an essential software component of the Google self-driving car. A somewhat more extensive overview can be found at Particle Filters.
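As a toy illustration, here is one predict/measure/resample cycle of a one-dimensional particle filter, localizing a walker along a hallway with a single landmark at a known position. All the parameters and the scenario are made up for the sketch; this shows the general method, not any particular implementation:

```python
import math
import random

def particle_filter_step(particles, move, measurement, landmark, noise=0.2):
    """One predict/measure/resample cycle of a 1-D particle filter.

    particles:   list of candidate positions for the walker
    move:        commanded motion this step
    measurement: sensed distance to the landmark
    landmark:    known landmark position (the 'map')
    """
    # Predict: move every particle, adding motion noise (Monte Carlo step).
    particles = [p + move + random.gauss(0, noise) for p in particles]
    # Measure: weight each particle by how well the sensed distance
    # agrees with that particle's predicted distance to the landmark.
    weights = []
    for p in particles:
        error = abs(landmark - p) - measurement
        weights.append(math.exp(-error * error / (2 * noise * noise)))
    total = sum(weights)
    if total == 0:
        return particles  # degenerate case: keep the bare prediction
    # Resample: draw a new particle set in proportion to the weights.
    weights = [w / total for w in weights]
    return random.choices(particles, weights=weights, k=len(particles))
```

Started from a uniform cloud of guesses, a few cycles of this collapse the particles around the true position, which is the "predict, then check by touch" behavior described below.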

The reason I keep thinking about it is because I started noticing how I consciously (and probably subconsciously) localize myself in the physical world. Most notably, when I am navigating in the dark in my house. I do this to avoid backtracking: turning on a light, going back to the prior switch & turning it off, etc. In any case I know my house well, but in just a few feet in the dark it is easy to get a bit off course and bump into furniture or a wall where I expected a doorway. The stairs can be tricky too.

So I walk a few feet in what I guess is the right direction based on experience. Then I reach out to touch what I expect is a wall or piece of furniture. Or I note a change from rug to bare floor, or spot an LED. Usually I get the feedback I expect, but sometimes I find I am off course and need to make a correction.

In Particle Filter Localization terms, I project a probabilistic range of particles representing where I, the human robot, might be next. Then I check. Often I know exactly where I am after a check, but sometimes all I know is that I am not near a wall or furniture piece. Then I make another mental projection, take a step or two depending on my expectations and level of caution, and try again to touch something. I go through similar mental gymnastics when driving a car or taking a daylight walk, but the feedback is visual rather than touch.

I went through a similar analysis process after reading Jeff Hawkins's theory of predictive memory in On Intelligence. The theory is that the brain largely operates by making constant predictions, based on memory of historical patterns, and checking them against sensory input. When input varies from predictions, neurons send special signals about detecting novelty to higher brain centers that deal with novelty. For instance, as you read this, your brain knows a number of words might be included in this zebra. See, you expected "sentence" or maybe "essay" or the like; you were not expecting "zebra", and so the normal flow of processing was interrupted.

I have not seen a neural network model for particle filters. It could be that how human brains seem to work resembles particle filters only superficially. More generally we are talking about feedback systems, which can take a number of forms. Even very primitive animals that lack differentiated neurons have feedback systems, so there could be many types of neural feedback schemes operating in the human brain.

I often notice my dog doing something that makes me think that most mammals, including pre-sapiens homo species, can do that. Yet when these processes, say chasing a rabbit, are abstracted, they can become quite complex. I can imagine chasing a rabbit. I can plan to chase a rabbit into a trap. I can pursue something more abstract, say a solution to a construction problem or even a math problem, in a way that resembles chasing a rabbit. Let's see, the solution to this differential equation has disappeared; it is not obvious. I'll beat the bushes, using three techniques known to help solve this type of equation, most-likely technique first. Hopefully one technique will put the problem in a form that I recognize from the standard map for solving such equations.

How would particle filters be implemented with neural networks? Let me know your thoughts.