In Defence of Dogs, by John Bradshaw

Dog training, as opposed to mere learning, primarily relies on the other major type of associative learning, instrumental or operant conditioning. This kind of conditioning links together an action that the dog performs and a specific reward. (The reward might also be the avoidance of a punishment.) The action is usually something that the dog would do in other circumstances, but not, until trained, specifically in order to obtain that reward. For example, a dog can be trained to sit down to obtain a morsel of food, even though sitting down is not a normal part of canid hunting or feeding behaviour. Behaviour that does not come naturally is much harder to bring about; for example, it is much easier to train a dog than a horse to retrieve sticks, since the natural model – bringing food back home to feed the young – is an essential part of the canid repertoire, but not that of an animal that grazes grass. Dogs are naturally motivated to perform a diverse range of tasks. For dogs, like all animals, food can be an important reward, but dogs are unusual in that most also regard contact with their owner as rewarding in itself. Some types of dog also find the opportunity to go exploring and/or hunting rewarding in its own right, independent of any food that might eventually result; this tendency appears to be especially well developed in sled-dogs, for example. Others find play rewarding in itself, over and above the contact with their owner that is often also involved, and thus it is used in some types of sniffer-dog training.

Not all training is deliberate; often a dog will ‘train itself’, learning by trial and error that when it does one particular thing, something good follows. This is often how the simplest forms of attention-seeking behaviour arise. For example, young dogs will often ‘try out’ fragments of adult behaviour when playing with their littermates, or with their human family. One of these behaviours is mounting, a component of sexual behaviour that is also routinely seen in play. When, by chance, the puppy attempts to mount the leg of one of the human members of the household, everyone else in the room will laugh in a slightly embarrassed way, and gently push the dog off. In the young dog’s rather unsophisticated interpretation of human behaviour, this is just play (that is, rewarding) and so it will repeat the performance, subsequently extending it to visitors, to the even greater embarrassment of the owner. The more entrenched the behaviour becomes, the harder it is to eradicate. Boisterous young dogs may even interpret the owner’s smack on the nose as just another part of the game, not the punishment that the owner intended.

Dogs can behave badly – as far as their owners are concerned – not just spontaneously but also due to inadvertent reinforcement: here, too, learning theory can help when attempting to eradicate such unwanted behaviour. Although the association between mounting and praise can be extinguished by simply ignoring it whenever it happens, this can be a difficult tactic in real life, because to begin with the dog may continue trying to obtain the reward by persisting in, and even increasing the intensity of, the unwanted behaviour. Some kind of distraction technique (technically, omission training) is usually needed in such instances, the aim being to direct the dog to do something else that is equally rewarding, but more acceptable. For example, a dog that chases cyclists can be rewarded with a game played with the owner whenever a cyclist appears on the horizon. But the game must stop as soon as the dog starts to react to the bicycle; otherwise, the dog could end up interpreting the game as an encouragement to start chasing. We should not be too surprised that some dogs want to chase joggers and cyclists; chasing things that run away is a natural part of hunting behaviour – although this is no excuse for us to refrain from training the dog not to do it.

Dogs being dogs, they will often not make quite the same associations that we would in the same situation. In most real-life examples of training, the action is preceded by a cue that triggers the behaviour, followed by the reward: owner says ‘sit’, dog sits, dog is praised by owner. However, the ‘cue’ may not be as simple and straightforward as the owner thinks it is (see box – ‘I said, “Sit!”’). Some aspect of the surroundings may be included in the (composite) cue that the dog builds up; consider, for example, the young dog that is obedient in the ‘puppy party’,14 where it has been trained consistently under the gaze of the organizer, but is disobedient in other places, where its owner may have been less consistent in delivering rewards.

I SAID, ‘SIT!’

As part of his 1994 UK television series Dogs with Dunbar, well-known American veterinarian and dog behaviour expert Dr Ian Dunbar set up a fascinating demonstration of how dogs can fool us into thinking they’ve learned one thing, while actually having learned another. Most owners believe that their dogs know the word ‘sit’. On camera, he asked several such owners to command their dogs to sit just by saying the word: no body-language, no gestures, just the word. Most of the dogs hadn’t a clue as to what they were supposed to be doing. They’d learned the cues that came easiest to them – not the word ‘sit’, which dogs, with their limited repertoire of vocal signals, must find difficult to distinguish from other similar-sounding words, but rather the gestures that the owner invariably used to accompany the word ‘sit’.

Inspired by this demonstration, I set out to find what my Labrador Bruno had actually learned when I had trained him to sit. It turned out that in his case it was a sound – but all he was picking up was the final ‘t’ sound, along with my intonation. If I said ‘cricket bat’ in the right way, his backside hit the ground instantaneously.

As with classical conditioning, the timing of the delivery of the reward is crucial. There must be no more than a second or two between the dog performing the desired action and the arrival of the reward. Longer than this, and not only will the learning be slower to establish, but there is also an increased chance that the dog will make unwanted associations with something else. Take the example of an inexperienced owner attempting to teach a young dog the command ‘sit’. When the dog does finally sit, the owner is so relieved that he praises the dog over and over again – ‘Good dog, good dog, good dog …’ Meanwhile, the excitable young dog has got up from the sit and is attempting to bounce around. Thus the sound ‘good dog’, intended as the reward for sitting, instead becomes the cue for this bouncing around, an activity that is highly pleasurable for the dog. The next time the owner says ‘good dog’, he is surprised to find that the dog instantly increases its activity, ignoring whatever command he is trying to teach at the time.

Delivering a reward is easy when the dog is beside you, but not so easy when it is some distance away – for example when it is chasing after another dog and the owner wants it to learn a command to stop. This problem is especially acute when training aquatic animals – there is usually too much of a delay between the successful performance of a trick presented in the middle of the pool, and the delivery of the rewarding fish by the trainer standing on the edge – and so it was dolphin trainers, rather than dog trainers, who first looked to learning theory for a solution.15 For dolphins, that solution was a whistle. First the dolphin was taught, at the edge of the pool, that the sound of the whistle was immediately followed by a fish. Subsequently, when it performed a particularly spectacular jump in the middle of the pool, the trainer could, with a quick blast of the whistle, signify that this was the jump she wanted, even before the dolphin hit the water – the delivery of the fish could then follow in slow time. Here, the whistle serves as a secondary reinforcer, an initially arbitrary event that, by being reliably and immediately followed by a genuine reward, not only becomes associated with it in the animal’s mind – a case of straightforward classical conditioning – but also, somehow, becomes rewarding in its own right.

Whistles were already being used by some dog trainers as cues, so a different arbitrary sound was needed for dogs. Owners can now make use of a commercially available and highly effective secondary reinforcer known as the ‘clicker’, a tensioned flat metal spring in a plastic case that goes ‘click-clack’ when pressed and quickly released. Actually any kind of very rapid yet obvious noise will work – including the softer click of a retractable ballpoint pen for a sound-sensitive dog, or a single flash from a bright LED flashlight for a deaf dog. There is nothing magic about any of these; what matters is just that they are both convenient and easily recognized by the dog.

Clicker training

The ‘click’ is meaningless until it has become linked in the dog’s mind to a reward. Many trainers use tiny pieces of something tasty as the reward at first, because there are few dogs which do not respond to food. But as the training advances, it is a good idea to link the click to other kinds of reward as well, such as play with a toy, or petting, otherwise the dog may not respond to the click when it is not hungry. Of course, it is harder to gauge the effectiveness of these alternative rewards. It is easy to see whether or not food is working – dogs reliably enjoy food and will happily eat it – but the effect of the other rewards may be missed. The trainer needs to see the dog’s tail wagging, as a check that the dog has registered the reward.

Eventually, the click alone should be enough to secure the dog’s attention, and thereby function as a reward in its own right. For instance, in the early stages of training a dog to come back to the owner, the click can be used on its own as an instant reward as soon as the dog responds to the owner’s preferred recall signal (‘Fido, come!’), by moving in the right direction. The advantage is that the click can be given while the dog is still some distance away, and then repeated, closely followed by the treat when it reaches the owner. The important principle is that once the clicker has become rewarding in its own right, it does not always have to be followed immediately by the treat, although if it is never followed by the treat (the primary reinforcer), its value will eventually dwindle to nothing.

In real life, the dog does not just learn the sound of the click; it also learns a lot about the circumstances in which it hears it. This was brought home to me forcibly the first time I witnessed mass clicker-training, in the kennels at the Waltham Centre for Pet Nutrition in Leicestershire. These are the dogs that check the efficacy of all Pedigree® foods, and they are all looked after like pet dogs, with plenty of exercise and contact with people. Clickers are used all the time, sometimes with several dogs simultaneously, yet it is rare for any dog to respond to the wrong clicker. On the rare occasions that it does, of course it does not get rewarded – the dog quickly learns whose clicker is the one that is rewarding, and whose is not. Since the clickers have all come from the same manufacturer, they probably sound identical, even to a dog, so what’s most likely is that the dog has also memorized who is carrying ‘its’ clicker.

Provided they can all be linked to the delivery of a real reward at the end, there is no reason why a dog should not learn several secondary reinforcers. Professional trainers can use this basic principle to teach dogs complex tricks and tasks such as assisting a blind owner to avoid obstacles, adding complexity by joining several learned associations together. The final stage is usually rewarded first, and then once that has been established the earlier stages can be added one at a time, as performance of each stage becomes rewarding in its own right. This backward chaining is easier than forward chaining, because the final reward is always linked to the same action, rather than being delayed every time a new element is added to the chain. A dog that not only comes to its owner when called but also always sits down beside him or her has learned a behaviour chain.

Widely employed in more advanced training is a technique known as shaping. Here, the first step is to reward the dog whenever it spontaneously behaves in a way that approximates to part of the task that needs to be taught – in the case of the guide dog, this would probably be a simple turn away from an obstacle. Once this connection is established, the dog gets rewarded only if it starts to walk around the obstacle, and finally only if it turns, walks round the obstacle and then returns to the original path. In this way, a series of actions that would rarely happen spontaneously, and would therefore be difficult to train in one step, can be progressively built up from any spontaneous behaviour that approximates, however loosely, to the desired result.

Shaping is an extraordinarily powerful technique, one that can be used even to alter completely the dog’s body-language, which is supposedly ‘instinctive’ (according to the supporters of lupomorphism). Gwen Bailey, who pioneered the use of behaviour modification in the United Kingdom to help rescued dogs become fulfilling pets, originally convinced the rehoming charity she worked for that this was possible, by rehabilitating an aggressive dog, Beau. With a history of biting people, dogs and cats, Beau had been abandoned by his owners, but Gwen was able to rebuild his confidence and remove the fears and anxieties that were causing him to bite. Indeed, by the time a television company wanted to make a documentary about this achievement (an extraordinary one for its time), Beau was far too well adjusted to even snap at anyone, let alone bite. So Gwen set about training Beau to snap (in the air) on a very specific cue, using standard shaping techniques, just so that his original snappy disposition could be faked for the documentary. In this way she was able to change what had been a species-typical ‘innate’ signal of aggression into a meaningless response to an arbitrary cue that she produced only when she wanted it to happen.

Shaping is not, however, the exclusive preserve of the expert trainer – in fact, far from it, since dogs spontaneously, if unconsciously, ‘shape’ their own behaviour. If my own experience is anything to go by, there are plenty of dog owners who have shaped their dogs without even knowing it. For instance, I have encountered several dogs that use growling and snapping as (relatively) harmless ways of getting their owners to make a fuss of them. What has probably happened in such cases is that the dog once used this display for real, when it was irritated about something, but the owner accidentally reinforced it, perhaps by simply laughing aloud, and then making a fuss of the dog. Dogs are always looking for ways to get affection from their owners, so when they are presented with such an opportunity, they can switch the ‘meaning’ of this display into this new context. Growling has become a signal for ‘Play with me!’ – which is almost a reversal of its original meaning as a warning. (This is not to say, however, that the same dog will not still use growling and snapping as a prelude to real aggression as well. Indeed, unintentional shaping of this kind can set up the possibility that the owner will, perhaps when a child is nearby, misunderstand what is going on, and unwittingly put the child in danger.)

It is thus possible to get dogs to do most of the things we want by rewarding them. Such techniques are especially easy with dogs because there are several types of reward available – food, attention, play. It is also possible to use rewards to reorganize much unwanted behaviour, especially if that consists of natural behaviour occurring in what we humans consider the wrong context. A straightforward set of training methods has been devised from the science of reward-based learning, itself established on the basis of thousands of experiments done on rats, mice, pigeons and many other animals, including dogs; these methods eliminate the need to strike a dog, even once. Unfortunately, however, our relationship with dogs predates the science of learning theory by many thousands of years, and so they come with much historical baggage attached, including the mistaken idea that training can best be achieved by physical punishment.

So far, I have mainly discussed how dogs learn based on things they like to do – eating, playing, getting praise from their owners. Dogs must have learned in this way ever since they were domesticated, and indeed the most modern training methods are largely based on setting up associations between rewards and things that the owner wants the dog to do. However, as we have seen, dogs also learn to avoid things that they do not like. Until recently this was the main principle behind the craft of dog training, which was largely based on the selective application of physical punishment.

Confusion can arise from the difference between the everyday use of the word ‘punishment’ and the psychologists’ use of the term. Punishment-based dog training implies the everyday meaning, ‘rough physical treatment’, which refers to actions that can produce pain or discomfort, such as choking the dog’s windpipe, pinching its ear, beating it with a stick, giving it electric shocks and so on. Psychologists use the term ‘punishment’ to describe all of these, but they also include other sensations that the dog does not like – indeed, any that lead to negative emotion such as fear or anxiety. (For a sensitive dog, this could be something as slight and momentary as its owner’s raised eyebrow.) However, most of the arguments about what is and is not acceptable in dog training revolve around physical punishment.

Learning that occurs as a result of physical pain or discomfort is classified by psychologists as positive punishment. The commonplace ‘choke chain’ is a good example of a positive punisher; the discomfort of being choked is intended to reduce the dog’s desire to pull on the leash. Dogs have sensitive necks, and so the neck is an obvious target for inflicting pain. The traditional ‘choke chain’, also known as a ‘slip-collar’ or ‘check chain’, and its hardcore variant the ‘prong-collar’, are designed to inflict momentary pain on the dog’s neck when it pulls on the leash. The dog should then learn, via positive punishment, to avoid the pain by not pulling on the lead. But this kind of training is ultimately ineffective: most dogs whose owners require them to wear such a collar do continue to pull, whenever their motivation to move towards something outweighs the pain. They may also habituate to the discomfort of the collar. In the case of the ‘citronella spray collar’, which is designed to suppress barking by dispensing an aversive odour whenever the dog vocalizes, a study has revealed that although it is effective for a week or so, dogs then habituate to the odour, and after two or three weeks, bark almost as often as they did before the collar was fitted.16
