Animal Learning and Intelligence

To read up on animal learning and intelligence, refer to pages 284–304 of Eysenck’s A2 Level Psychology.

Ask yourself

· How do animals learn?

· What cognitive abilities do animals possess?

· Can animals recognise themselves when they look in a mirror?

What you need to know

CLASSICAL AND OPERANT CONDITIONING
· The role of classical and operant conditioning in animal learning
· The ecological perspective

EVIDENCE FOR INTELLIGENCE IN NON-HUMAN ANIMALS
· Self-recognition
· Social learning
· Machiavellian intelligence as evidence of intelligence in non-human animals

Classical and operant conditioning

John Watson established behaviourism, a psychological approach that seeks to study only that which is observable and measurable because it can be studied scientifically. The key assumptions were that it is important to study behaviour under controlled conditions and that learning can be accounted for by conditioning principles. Two major types of conditioning were identified: classical conditioning and operant conditioning. According to behaviourism, people are born as a tabula rasa (blank slate) and so the individual is solely a product of learning from the environment.

Research evidence on classical conditioning

· Classical conditioning is the association of two stimuli. It was discovered by the Russian physiologist Pavlov (1849–1936), who experimented with dogs. He observed that dogs learnt to associate new stimuli with stimuli that produced an automatic response, e.g. noise and the food bucket became associated with food. Pavlov found he could train dogs to salivate to stimuli other than the food they automatically salivated to. Over a number of trials, he presented the dogs with a tone just before food. The tone is initially a neutral stimulus; it becomes the conditioned stimulus (CS) once it has been paired with the food, the unconditioned stimulus (UCS) that automatically produces salivation, and it then produces the same response. Salivation in response to the food is the unconditioned response (UCR), whereas salivation in response to the tone is the conditioned response (CR). Thus, classical conditioning is learning by the association of new stimuli with stimuli that cause innate bodily reflexes.

· Pavlov stressed the importance of timing in the learning process through the law of temporal contiguity, which means the two stimuli must be presented close together in time for the association to be learned. He also observed that forward conditioning (the presentation of the CS just before the UCS) was much stronger than backward conditioning (the presentation of the CS after the UCS).

· Pavlov also researched generalisation, which refers to the fact that the conditioned response can be generalised to similar stimuli, and that the strength of the conditioned response (e.g. salivation) depends on the similarity between the test stimulus and the original conditioned stimulus. Pavlov also identified the phenomenon of discrimination. This can occur once a second tone has been introduced and generalisation has occurred. If the first tone is paired with food several more times, but the second tone is never paired with food, then salivation to the first tone increases, whereas salivation to the second tone decreases. Thus, the dog has discriminated between the two tones.

· Another key feature of classical conditioning is experimental extinction. The repeated presentation of the conditioned stimulus in the absence of the unconditioned stimulus removes the conditioned response. Extinction does not mean that the dog has completely lost the conditioned reflex. If brought back into the experimental situation after extinction has occurred, the dog will reproduce some salivation in response to the tone. This is spontaneous recovery. It shows that the salivary response to the tone was inhibited (rather than lost) during extinction.

· Pavlov also identified higher order conditioning, in which a new CS can be associated with the original CS until the new CS can produce the CR.
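
The acquisition and extinction described above can be sketched numerically. The following is only an illustrative model, not Pavlov's own: it assumes a hypothetical "associative strength" between 0 and 1 that moves a fixed fraction of the way towards 1 on each trial where the tone is followed by food, and towards 0 on each trial where the tone is presented alone. The function names and the learning rate of 0.3 are assumptions made for the example.

```python
def update_strength(strength, ucs_present, rate=0.3):
    """Move the (assumed) associative strength towards 1 when the UCS
    follows the CS, and towards 0 when the CS is presented alone."""
    target = 1.0 if ucs_present else 0.0
    return strength + rate * (target - strength)


def run_trials(strength, ucs_present, n_trials, rate=0.3):
    """Apply the update over n_trials; return the strength after each trial."""
    history = []
    for _ in range(n_trials):
        strength = update_strength(strength, ucs_present, rate)
        history.append(strength)
    return history


# Acquisition: tone (CS) followed by food (UCS) on every trial.
acquisition = run_trials(0.0, ucs_present=True, n_trials=10)
# Extinction: tone presented repeatedly without food.
extinction = run_trials(acquisition[-1], ucs_present=False, n_trials=10)

print("acquisition:", [round(s, 2) for s in acquisition])
print("extinction: ", [round(s, 2) for s in extinction])
```

Note that this simple sketch treats extinction as the association fading away, so it cannot reproduce spontaneous recovery, which suggests the response is inhibited rather than lost.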

EVALUATION OF THE RESEARCH ON CLASSICAL CONDITIONING

· Experimental extinction is contradicted by spontaneous recovery. According to classical conditioning, experimental extinction occurs because there is an unlearning of the association between the conditioned and unconditioned stimuli. However, the existence of spontaneous recovery (re-emergence of conditioned responses after extinction) indicates that the association has not been unlearned.

· Reductionist. Classical conditioning is surprisingly complex, and the model fails to take account of some of these complexities. It is also oversimplified in that it considers only learning and ignores other factors that influence behaviour. In particular, cognitive and innate factors are ignored, and these are key influences on behaviour.

· Learning is not always observable. The key assumption that what has been learned will always reveal itself in performance has been disproved. Research has shown that learning can take place without being immediately evident in behaviour.

· Deterministic. Classical conditioning can be seen as deterministic because it suggests that the environment determines behaviour, which ignores the individual’s free will to act upon their environment.

· Ignores nature. Classical conditioning considers only nurture and so ignores nature, which, as the ecological perspective and preparedness reveal, does play a role in learning.

· Learning is limited. Classical conditioning (CC) requires a UCS that produces an innate reflex response. Because animals have a limited number of reflexes, the generalisability of CC is limited, and it can only account for a limited range of learned behaviours.

The Ecological Perspective

According to the ecological perspective, animals have inherited behavioural tendencies, helping them to survive in their natural environment. Thus, certain forms of learning are more useful than others, and so are acquired more easily because the animal is innately prepared to do so. This approach is very different from the conventional behaviourist one, which completely ignores the role of nature.
An example of preparedness is food-aversion learning. The Garcia effect was demonstrated using three conditioned stimuli presented at the same time: saccharin-flavoured water, light, and sound (Garcia & Koelling, 1966, see A2 Level Psychology page 289). Rats given electric shocks learned to avoid the light and sound stimuli, whereas rats made nauseous by X-rays learned to avoid the flavoured water. Thus, the animals learned to associate being sick with taste, and they learned to associate shock with light and sound stimuli.
These findings show that animals are “biologically prepared” to learn some behaviours more readily than others. This has an evolutionary basis because there is an obvious survival value in learning rapidly to develop a taste aversion to any food followed by illness.

EVALUATION OF THE ECOLOGICAL PERSPECTIVE

· Less reductionist. Biological preparedness takes into account the influence of nature and nurture and so is less reductionist.

· Not all learning is biologically prepared. Whilst biological preparedness provides a good account of a range of behaviour, not all learned behaviour depends on an innate predisposition.

Operant Conditioning (OC)

Operant conditioning is based on the law of reinforcement, which means that learning is based on the consequences of behaviour. The probability of a given response occurring increases if it is followed by a reward or positive reinforcer, such as food or praise. This is known as positive reinforcement. The behaviour is also more likely if it is followed by negative reinforcement, i.e. the removal or avoidance of an unpleasant consequence, which likewise “stamps in” the behaviour. The probability of a given response decreases if it is followed by punishment. The original behaviour is called the operant, which is a voluntary behaviour that initially occurs at random.
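
As a rough illustration of the law of reinforcement, the sketch below tracks the probability of an operant as consequences follow it. It is a toy model under stated assumptions, not Skinner's formulation: the function name, the consequence labels, and the step size of 0.2 are all hypothetical choices made for the example.

```python
def update_probability(p, consequence, step=0.2):
    """Return the new probability of the operant after one consequence.

    Both kinds of reinforcement raise the probability; punishment lowers it.
    The proportional update keeps p between 0 and 1 (an assumed rule).
    """
    if consequence in ("positive_reinforcement", "negative_reinforcement"):
        return p + step * (1.0 - p)
    if consequence == "punishment":
        return p - step * p
    raise ValueError(f"unknown consequence: {consequence}")


# The operant starts out rare, is rewarded five times, then punished five times.
p = 0.1
for consequence in ["positive_reinforcement"] * 5 + ["punishment"] * 5:
    p = update_probability(p, consequence)
    print(f"{consequence:<24} p = {p:.2f}")
```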

Research evidence on operant conditioning

· Skinner’s well-known research involved the Skinner box, in which a rat was placed without food. To receive food it had to press a lever, and it would then be rewarded with a food pellet. The probability of a response decreases if it is not followed by a positive reinforcer. This is experimental extinction. As with classical conditioning, there is usually some spontaneous recovery after extinction has occurred. There are two major types of positive reinforcers or rewards: primary reinforcers and secondary reinforcers. Primary reinforcers are stimuli needed for survival (e.g. food, water, sleep, air). Secondary reinforcers are rewarding because the person or animal has learned to associate them with primary reinforcers. Secondary reinforcers include money, praise, and attention.

· Further research investigated shaping, in which the animal’s behaviour moves slowly towards the desired response through reinforcement of successive approximations. For example, if we wanted to teach pigeons to play table tennis, they would first be rewarded for making any contact with the table-tennis ball. Over time, their actions would need to become more and more like those involved in playing table tennis to be rewarded. Skinner did teach pigeons to play table tennis using this method.

· Another feature of OC is chaining whereby a secondary reinforcer can become associated with the primary one and so complex sequences of responses develop. Thus, a rat could learn to climb a step to hear a tone that would lead to lever pressing to receive the primary reinforcer of food. The learned sequence of responses is known as chaining.

· Generalisation, discrimination, extinction, and the law of contiguity all apply in the same way as with classical conditioning.

· Schedules of reinforcement have also been researched as these can affect how resistant the behaviour is to extinction. For example, if continuous reinforcement is provided, the behaviour extinguishes rapidly once the reinforcement is withdrawn, whereas under a partial reinforcement schedule, in which reinforcement is provided only some of the time, the behaviour is less vulnerable to extinction. In a variable ratio schedule the number of responses needed for each reinforcement varies, and so one cannot predict when the behaviour will be reinforced. A real-life example of this is gambling, as you never know when the next win will happen, and this schedule means the behaviour is least likely to be extinguished.
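
The difference between these schedules can be made concrete with a short sketch that marks which responses in a sequence are reinforced. This is an illustration only: the function names are hypothetical, the fixed schedule is included as one assumed form of partial reinforcement, and the unpredictable schedule is modelled as a fixed chance of reward on every response (much like a gambling machine), which is an assumption rather than something stated in the text.

```python
import random


def continuous(n_responses):
    """Continuous reinforcement: every response is reinforced."""
    return [True] * n_responses


def fixed_partial(n_responses, every):
    """Partial reinforcement: only every `every`-th response is reinforced."""
    return [(i + 1) % every == 0 for i in range(n_responses)]


def variable_partial(n_responses, mean_every, rng):
    """Unpredictable partial reinforcement: each response has a 1/mean_every
    chance of reward, so the next reward cannot be predicted."""
    return [rng.random() < 1.0 / mean_every for _ in range(n_responses)]


rng = random.Random(0)  # seeded so the illustration is repeatable
schedules = {
    "continuous": continuous(20),
    "fixed partial": fixed_partial(20, every=4),
    "variable partial": variable_partial(20, mean_every=4, rng=rng),
}
for name, rewards in schedules.items():
    # "R" marks a reinforced response, "." an unreinforced one.
    print(f"{name:<17}", "".join("R" if r else "." for r in rewards))
```

In the printed rows, the continuous and fixed schedules show a regular pattern, whereas the variable schedule gives no cue as to when the next reward is due.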

Punishment: Positive and Negative

Operant conditioning in which a response is followed by an aversive stimulus is known as positive punishment (often simply called punishment). If the aversive stimulus occurs shortly after the response, it reduces the likelihood the response will be produced in future. Responses are also avoided if they are followed by the removal of a reward or something pleasant (negative punishment).
Skinner argued that punishment can suppress certain responses for a while, but it doesn’t produce new learning. Punishment often has a more lasting effect when animals can obtain positive reinforcement with an alternative response. Thus, if you want to remove an undesirable behaviour, then it is better to reward an alternative behaviour so that the undesirable one is less likely to occur.

Avoidance Learning

Avoidance learning occurs when a new behaviour is stamped in because it prevents an aversive stimulus from being presented. The new behaviour is learned in order to avoid the aversive stimulus, and this avoidance of an unpleasant consequence is called negative reinforcement. For example, when a child eats dinner so that he/she can go out to play, the desired behaviour (eating dinner) is stamped in because it avoids the unpleasant consequence of not being allowed out to play.

The Ecological Perspective

Skinner’s view was that virtually any response can be conditioned in any stimulus situation—this is known as equipotentiality. In fact, this is not true. Some behaviours are harder to learn than others and even harder to maintain because of “instinctive drift”, i.e. the preference for a behaviour that is instinctive may replace or modify the conditioned behaviour. This is shown in research by Gaffan, Hansel, and Smith (1983, see A2 Level Psychology page 295) in which rats in a T-shaped maze had to decide whether to turn left or right. According to conditioning, if the rat was rewarded for turning left it should turn left on the following trial. However, in the natural environment the rat would not return to where it had just removed food. Gaffan et al. found that early in training rats tended to avoid the arm of the maze in which they had previously found food, as predicted from the ecological perspective. This shows the influence of their instinctive behaviour.

EVALUATION OF OPERANT CONDITIONING

· Applications. The principles of operant conditioning (OC) have been used effectively in the training of animals and this has generated many positive applications. For example, OC has been used to teach language to chimps, and the learning process involved has informed understanding of children who have learning difficulties. Thus, OC has made significant contributions both theoretically and in applied psychology.

· Equipotentiality lacks validity. Skinner’s concept of equipotentiality is incorrect, as is his assumption that operant conditioning is uninfluenced by instinctive behaviour, as instinctive drift shows animals have a preference for instinctive behaviour.

· Circularity. We only know that a stimulus is a reinforcer because it reinforces! It is not a scientific statement, because there is no way that it can be tested. This means the concept of reinforcement is circular and lacks scientific validity.

· Individual differences. Food is a fairly universal source of reinforcement for animals—all animals need food! However, other forms of reinforcement may have a less consistent effect, as whether the reinforcement is rewarding or punishing is open to interpretation.

· Ignores other forms of learning. Other forms of learning are possible. Humans often engage in observational learning—they learn simply by observing someone else being rewarded for behaving in a certain way. Operant conditioning does not account for the more sophisticated social learning.

· Ignores nature. Skinner exaggerated the importance of external or environmental factors as influences on behaviour and minimised the role of internal factors. For example, apes have innate personalities and instinctive behaviours.

· Difficult to distinguish between CC and OC. A key problem is that it can be difficult to distinguish between CC and OC in the learning process. For example, if we take Pavlov’s research, was the food the UCS or was it a source of positive reinforcement?

· Scientific validity. The research on OC is conducted in the controlled conditions of the laboratory, so it can be replicated to check reliability. The laboratory setting also gives greater control of confounding variables and therefore higher internal validity.