An Accidental Statistician


by George E P Box


  At that time I had a little movie camera. One morning, over the mid-Atlantic, there was a rather pretty sunrise. I was filming it when I noticed that one of the four engines of our aircraft had stopped and a propeller was rotating slowly in the wind. When a fellow passenger asked me what I was doing, I said I was photographing the engine that had stopped working. Soon the captain made the announcement that although we could fly perfectly well with three engines, we were going to land at Shannon in Ireland. So we did. The problem at Shannon was that they didn't want us to wander off, so they made us stay for over 12 hours in a large “duty free” area. There, the only popular item for sale was Irish whisky, so some of the passengers were rather high when we finally did get a rescue plane.

  Arriving in Lancaster from these journeys was always a welcome experience. Gwilym's house, which stood on its own with large gardens and a fish pond, was surrounded by gorgeous countryside. At first, Gwilym and I used to work in the mornings and go for long walks in the afternoons. After a time, Gwilym was too ill to come with me and I walked on my own. In the valley was the beautiful river Lune, in which I sometimes swam. Most of the time it was deserted except for an occasional salmon fisherman. (In times gone by, the town had been named for the castle on the river Lune—“Lunecaster”—and over the years it had become Lancaster.)

  Gwilym was Welsh. He said he didn't speak any English until he was seven years old. His grandmother, who died when she was 102, had not bothered to learn English until she was over 60—she said that hardly anyone spoke English in her village before that. Gwilym had married an English girl, to the disapproval of some of his Welsh relatives. But Meg was a wonderful person and was particularly concerned about Gwilym's health. She believed that the vegetables she grew in her expansive garden would be good for him. Meg's father, Bert Bellingham, another enthusiastic gardener, used to visit periodically to help her, and he and I became great friends.

  In the evening, Bert and I would often go to the pub where we'd drink a couple of half-pints of bitter beer accompanied by some crisps—that is, potato chips that were sold in packets for tuppence. We'd been doing this for a year or two during my visits when we discovered that the pub had been taken over by new management that wanted to change it into a “high class establishment.” So when Bert went to the bar as usual and ordered, “two halves of bitter and a packet of crisps,” the proprietor said, in a somewhat snooty voice, “We don't sell crisps.” So Bert said, “What do you sell?” The proprietor said, “We sell sandwiches.” Bert asked, “What kind of sandwiches do you sell?” When the proprietor answered, “We sell salmon sandwiches,” Bert asked, “How much is a salmon sandwich?” The proprietor then named a price that, compared with our tuppenny crisps, seemed astronomical. So Bert then asked, “What other sandwiches do you have?” And when the proprietor said, “We have lobster sandwiches,” Bert immediately asked, “How much are they?” And so it went on. After a series of inquiries of this kind, Bert said, “We don't want any sandwiches,” and came back to our table and sat down. When we finished our beer, Bert said, “George, it's your turn to go up.” So I went and I said, “We'd like two halves of bitter and a packet of crisps.” The proprietor looked at me rather strangely and said, “We don't sell crisps.” I said, “Oh! What do you sell?” and he said, “We have sandwiches.” I said, “What kind of sandwiches do you have?” and we went through the list, and I asked him each time how much each sandwich cost, and so forth. When it was all over, I said, “We don't want any sandwiches” and went and sat down again. After that we decided to find a different pub.

  Bert had a remarkable ability to make friends. One night when we were going out, Meg asked us to try to get her some tomato plants. At a new-found pub, Bert quickly made himself at home and said, “Does anyone know where we can get some tomato plants?” There was some consultation among those present, and then someone said, “Who you want is old Charlie—comes in around 8:30—he'll be along in about 15 minutes I shouldn't wonder.” Sure enough, we went home that evening with some very nice plants at a very reasonable price.

  The pub goers often amused themselves with a variety of games. Darts, for example, was a very popular game, and the skill shown by the regulars was impressive. On another of our pub excursions, Bert and I were intrigued by a group of people gathered around the contestants in a game of dominoes. Unexpectedly we saw that the dominoes had nine spots rather than the sort we were used to with six spots. We asked about this, and it turned out that the nine-spot preference existed in certain villages but not in others. This led to lively arguments about the various domino preferences in certain remote hamlets. Whatever the game, our pub keeper often awarded prizes to the victors, which, no doubt, inspired interest as well in the sale of beer.

  Bert and I remained good friends long after the summers spent in Lancaster, and whether I visited him at his home, or at the senior housing where he lived later in his life, he always greeted me the same way: “Here's my mate George!”

  There was a fish hatchery just down the road from the Jenkins' house, and Meg had acquired about 40 trout there for the fish pond. It was pleasant to watch them swimming around. At that time Meg used to get up early to tend to her youngest child. One morning she said, “Oh Gwilym, I looked out of the window this morning and there was this huge bird catching and eating our fish.” Well, Meg was a country girl and she went into the village the next day and came back with a rifle she had borrowed. The farmer who had lent it to her told her that the bird she had seen was a heron and that she should wait until the bird came again and then fire the gun in the air to frighten it away. Gwilym was horrified. He said, “Oh but Meg, we can't have a gun in the house.” And they had a disagreement. So I said, “I've been in the Army and if you like, I'll fire the gun.” So that was agreed upon, and a few mornings later Meg woke me up at about 6 a.m. and said, “Oh George it's here.” I looked out of the window, and sure enough, there it was quietly standing very tall on the edge of the pond catching our fish. So I fired the gun into the air and the heron took off, and so far as I know, it never came back.

  This got Meg thinking that some trout might make a very good dinner, so we set about catching some. This quickly became a joke. So far as I can remember, Gwilym had a worm on a bent pin on the end of a piece of string. Meg had a net, and I had a rectangular piece of wire netting. My plan was to wait until a fish got into a corner and then put my netting down so that it couldn't get out. We all had a great time trying, but as we quickly found out, none of this was any good: The fish were much too fast for us. The problem was solved when Meg discovered that the village postman was an avid angler. He quickly caught us some fish, and we had some delicious dinners.

  A later problem that Gwilym and I worked on concerned the forecasting of seasonal time series. At that time, I was doing some work for Arthur D. Little, the well-known consulting firm, and they were forecasting using exponential discounting with polynomial models, but they were not producing very good results. For practice we studied some data of R. G. Brown, which showed for each month over a period of several years the number of passengers traveling on transatlantic airlines. This series had a 12-month developing pattern that was low in the winter and high in the summer and varied somewhat every year. This led us to devise what came to be called the “airline model.” We got a big bang for our buck from this kind of model because even though it contained only two parameters (unknown constants to be estimated from past data), it produced a seasonal pattern made up of 12 sine waves of different frequencies. These tracked the data very well and could develop and change as new data became available.
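
  As an illustrative aside, the airline model is usually written as (1 - B)(1 - B^12) z_t = (1 - theta B)(1 - Theta B^12) a_t, where B is the backshift operator, z_t is the (often logged) monthly series, a_t is a random shock, and theta and Theta are the two parameters. The sketch below is a minimal Python rendering of that equation, not the original programs: the function names are hypothetical, and shocks before the start of the series are simply set to zero, which is a simplifying assumption.

```python
import numpy as np


def _lag(x, t, k):
    """Return x[t - k], or 0.0 when that index falls before the series starts."""
    return x[t - k] if t >= k else 0.0


def airline_difference(z, s=12):
    """Apply (1 - B)(1 - B^s) to z: one regular and one seasonal difference."""
    z = np.asarray(z, dtype=float)
    w = z[1:] - z[:-1]        # regular difference, (1 - B)
    return w[s:] - w[:-s]     # seasonal difference, (1 - B^s)


def simulate_airline_w(a, theta, Theta, s=12):
    """Build w_t = (1 - theta B)(1 - Theta B^s) a_t from shocks a_t.

    Shocks before the first observation are taken as zero (an assumption).
    """
    a = np.asarray(a, dtype=float)
    w = np.empty(len(a))
    for t in range(len(a)):
        w[t] = (a[t]
                - theta * _lag(a, t, 1)
                - Theta * _lag(a, t, s)
                + theta * Theta * _lag(a, t, s + 1))
    return w


def airline_residuals(w, theta, Theta, s=12):
    """Invert the moving-average filter to recover the shocks a_t from w_t.

    Uses the same zero pre-sample shocks as simulate_airline_w.
    """
    w = np.asarray(w, dtype=float)
    a = np.zeros(len(w))
    for t in range(len(w)):
        a[t] = (w[t]
                + theta * _lag(a, t, 1)
                + Theta * _lag(a, t, s)
                - theta * Theta * _lag(a, t, s + 1))
    return a
```

  In use, one would difference the logged series with airline_difference and then choose theta and Theta to make the sum of squared values from airline_residuals small. The double differencing removes a straight-line trend and any fixed 12-month pattern, which is why so few parameters remain to be estimated.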

  Gwilym and I had both learned a lot about likelihood from George Barnard, and it seemed natural to use likelihood to estimate the coefficients. At that time, I was getting more and more interested in estimation using Bayes' theorem (I thought of likelihood as being a timid man's way of doing Bayes), but for samples of the size we needed, it didn't make much difference anyway.

  A problem of estimation with any kind of time series is that what happens at the beginning depends on what had happened before you started! This is, of course, unknown. However, we realized our models were reversible, and so a natural thing was to forecast what you didn't know by starting at the far end of the series, and “back forecasting” the whole series, including the few preliminary values you needed to get started. We tested this idea on a number of series and it was later validated from a mathematical point of view.
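
  The back-forecasting idea can be sketched for the simplest case. The code below assumes a first-order moving-average model, w_t = a_t - theta a_(t-1), chosen only for brevity; it is not a model named in the passage above, and the function name is hypothetical. The series is first run through the same model in reversed time to back-forecast the one value just before the data begin, and that value is then used to start the ordinary forward recursion.

```python
import numpy as np


def ma1_backcast_residuals(w, theta):
    """Residuals of an assumed MA(1) model using one back-forecast start value.

    Backward form of the model: w_t = e_t - theta * e_(t+1).
    Forward form of the model:  w_t = a_t - theta * a_(t-1).
    """
    w = np.asarray(w, dtype=float)
    n = len(w)

    # Backward pass: start beyond the far end of the series with a zero
    # shock and work back toward the first observation.
    e_next = 0.0
    for t in range(n - 1, -1, -1):
        e_next = w[t] + theta * e_next
    e_first = e_next  # backward shock at the first observation

    # Back-forecast of the value just before the data begin. Its own
    # backward shock is unknown and is replaced by its expected value, zero.
    w0_hat = -theta * e_first

    # Forward pass: earlier shocks are taken as zero, so the shock at the
    # back-forecast point equals the back-forecast itself.
    a_prev = w0_hat
    a = np.empty(n)
    for t in range(n):
        a[t] = w[t] + theta * a_prev
        a_prev = a[t]
    return w0_hat, a
```

  With theta equal to zero the back-forecast is zero and the residuals are just the data, as one would expect. For longer series the starting value matters less and less, since its influence on the residual k steps later is scaled by theta to the power k.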

  Sometimes you needed to model the interrelationships between several related time series. For example a famous series shows the supply of hogs, the price of hogs, the price of corn, the supply of corn, and the amount of farm wages over a number of decades.6 Taking into account the dependencies of these five series produced much more accurate forecasts of each one of them. These ideas were widely applicable, and as a result, George Tiao and I got some students programming multiple time series estimation at Madison. This work was further developed in Chicago under the direction of Professor Lon-Mu Liu. In addition, Gwilym developed a program for multiple time series independently at the University of Lancaster.

  At the University, Gwilym had started the Department of Systems Engineering, and he and his students had developed close associations with local industry. The students solved problems with scientists working at these industries as part of their degree requirements. Gwilym's department was very successful and brought in a good deal of money to the University. But people in other departments became jealous, and there were many difficulties. So later, in 1974, Gwilym left the University and set up his own company (Gwilym Jenkins and Partners) where his ideas were further developed.

  Gwilym chose a publisher, Holden-Day, for our time series book. They were responsible for the first edition of Time Series Analysis: Forecasting and Control that appeared in 1970. Unfortunately, Holden-Day was quite reluctant to pay royalties.

  One day Gwilym asked me to call his lawyer in San Francisco about this. I asked him how he had found the lawyer, whose name was Norman Macleod, and he said that he had reached him through the British Consulate. So I called up this lawyer. He had an extremely strong British accent and said, “Oh, Professor Box, I'm terribly glad you called me. Gwilym said you would.” I said he didn't sound to me like an American lawyer, and he said, “Of course I used to be in England, but I took the Bar Exams and I am working over here now.” He cleverly arranged for a permanent pending legal action that we could threaten to use whenever we didn't get paid. Later Holden-Day went bankrupt.

  Sadly much of the strife with Holden-Day occurred while Gwilym's health was rapidly declining. While before we had sent tapes on which we discussed our ideas while writing the book, as the illness progressed, I added tapes of the old Goon Show as well as tapes containing jazz and other music that Gwilym loved. I recently found a copy of a letter that I sent to him in the spring of 1982, shortly before he died. In it I wrote:

  Last week I was teaching a short course on time series for some Army Operation Research officers. … I always do this kind of thing conditional on everyone having a copy of our book and when they all brought them up to be autographed I told them a bit about you and what fun we had together writing it. When we did the transfer function example someone asked why we read the records at nine second intervals and I told them that, so far as I could remember, it was because Meg, who was helping us that day, said that that was the shortest interval that she could read and we said: ‘Then that's the one we want.’

  I often think of you Gwilym while I am walking Victor, our dog, up near the golf course. I remember how we would go for walks on the golf course and try to sort out ‘the jam jar model.’ And how I got very fed up one day and you saw a hill on the opposite side of the lake and you said, ‘Let's go over there on the other side of the lake and climb up that hill,’ and we did. I remember too all the walks we took from Halton Green House and I can see in my mind exactly what the house looks like--from below as you're walking on the down-below-road, and from above, when you come over from the farm, and what the driveway looks like and how we would sometimes see Meg working away in the garden. The trout pond—I remember the time we tried three different methods for catching them without success. I was sure that a sieve on the end of a pole would work but the trout had other ideas. The ducks came later—I think after the heron that we frightened with a shot gun.

  Very soon now I will be setting off for Bulgaria and returning via England so I do hope that I can come and see you…

  With Much Love,

  George

  The book appeared in subsequent editions, published first by Prentice-Hall and then by John Wiley with Gregory Reinsel, my colleague at Madison, as a co-author.7 It is now in its fourth edition and has been translated into many languages. When it was first published, the reaction of reviewers was negative. Some people said it wasn't rigorous enough; others said there was nothing new in it. However, we were not overly concerned because we knew8 that initially, original work was invariably met with hostility. For example, the first paper on response surfaces, and the paper in which the word “robustness” appeared for the first time, were both quite difficult to get published. I think new ideas upset people. For the time series book, there was much discussion, for example, about our use of differencing. All we were saying, really, was that you could be better off modeling the rates that things happened. Now, however, as a direct consequence of these ideas, cointegration and unit roots are a big business in econometrics.

  On the first page of the book, we mention five explicit applications:

  1. The forecasting of future values of a time series.

  2. The determination of how the output is dynamically related to the input for a system subject to inertia.

  3. Determining the effect of intervening events on the behavior of a time series.

  4. The representation of relationships among several time series.

  5. The design of control schemes for compensating deviations from a desired target.

  Thus, what came out of the research on the automatic optimizer was much more than was our initial intention. This confirmed what we believed: That the best way to develop theory is to study practical examples carefully.

  In addition to its impact on our ideas about control, the book has had a large influence on economics and business. Also, econometric models have a lot in common with chemical kinetic models used in describing complex reaction systems. Gwilym and I both had some experience with these. Later, several quite different theoretical kinetic models were put forward by different sets of chemical engineers and chemists to explain the production of ozone and the subsequent pollution of the atmosphere. The problem was that all the kinetic models contained very large numbers of parameters, which it was quite impossible to estimate from the available data—the same problem that arose in economic modeling. We believed that a good approach is that, rather than start with the model, you should start from the data and produce a simple dynamic-stochastic model empirically, and then try to relate this to theoretical mechanisms that could be identified. This has turned out to be a very valuable approach. At one time we had a joint project along these lines with the economists at Madison, but I am sorry to say that nothing came of it.

  When it came time for my students to pick a thesis topic, I usually discussed their interests with them and then suggested possibilities for further pursuit. Three of my students, however, came to Madison with a predetermined interest in time series. Earlier, Dean Wichern, Paul Newbold, and Larry Haugh had written good theses on the subject. As a matter of timing, however, the students who came after Time Series was published in 1970 had more familiarity with the subject. These included Bovas Abraham, Johannes Ledolter, and Greta Ljung.

  Bovas Abraham was from southern India. Before coming to Madison in 1971, he had received a Master's degree in statistics from the University of Kerala, where, for a time, he also taught, and he had also spent two years teaching secondary school in Cape Coast, Ghana. He had then received another master's degree from the University of Guelph. Bovas had received a partial research assistantship from Wisconsin, and shortly after moving to Madison, he came to talk to me about it. Before he left my office, he agreed to plot some difficult diagrams for me. Only much later did he tell me that he had had no idea how to do this. He persisted, however, and with great enterprise, he eventually discovered a computer program that could plot the graphs. After a year, Bovas passed his qualifying exams and asked me if I would supervise his dissertation.9

  I insisted that my students produce a well-written thesis, and worked hard with both native and non-native speakers of English to make them understand the importance of this. Because I traveled a great deal in the 1970s, I often read parts of a student's thesis and recorded my reactions, as well as suggestions for improving the writing, on a tape that I would then give to the student. This worked surprisingly well. Bovas and I used this technique, and we also had many meetings. When I was traveling during the week, I often met with students on the weekend. Several students would car pool and make the drive to my home south of Madison. Bovas wrote a good thesis and was able to publish a number of papers that were based at least in part on this research.10 Later, he became professor of statistics at the University of Waterloo in Ontario.

  In 2000, Bovas nominated me to receive an honorary doctorate in mathematics at his university. We had a wonderful party at his home on the evening of the convocation. Bovas' wife, Annamma, who is a first-class cook of Indian food, was dressed traditionally and looked exquisite. While we were there, they were planning the wedding of one of their two daughters. Annamma had been to the hotel where the reception was to take place to instruct the kitchen staff on Indian cooking and in particular, what rice to use and just how to cook it. When we left Waterloo, she sent us home with a jar of very spicy Indian pickles and an assortment of all the right spices for curry.

 
