by Bob Holmes
As Maier’s chicken nuggets demonstrate, building a flavor that is balanced and convincing when tasted by itself is only half the job. The context a flavor is used in—that is, the other ingredients in the product, known in the food industry as the “base”—makes a huge difference to the final result. Many fruit flavors, for example, stand out more prominently in a sweet base, because we expect fruity and sweet to go together, and the brain amplifies these congruent stimuli. Similarly, a salty base brings out the savory elements in something like a chicken soup.
Another issue is that many flavors interact physically or chemically with the base. Even something as simple as a thickening agent, for example, can slow down the release of flavor molecules in your mouth, so that a thick drink or sauce would taste blander than a thin one with the same amount of added flavor. A lot of flavor molecules dissolve more easily in fats than in water, so a high-fat food also releases its flavor more slowly and may therefore need a higher dose of flavoring to achieve the same effect. At FONA, Bob Sobel likes to demonstrate this by mixing up identical amounts of instant chocolate drink in four different kinds of milk, ranging from skim milk to half-and-half. The differences are striking. Chocolate made with skim milk gives an intense burst of chocolate flavor that vanishes almost instantly. “It comes rushing out—it’s not balanced,” says Sobel. With 2 percent milk, the initial hit is less intense, but the flavor lingers a bit, and that’s even more true of whole milk. The chocolate made from half-and-half, in contrast, has a much more muted flavor, but its richness lasts and lasts. Which is best? Try the experiment yourself at home, and see which you prefer.
Even after flavorists have built and balanced their flavor perfectly for its chosen base, the job’s not done. There’s one more big problem to solve: delivery. Sometimes you can’t just dump the finished flavor straight into the food—adding liquid flavor to, say, instant oatmeal would result in a gummy mess. And often, the flavor needs to be protected so that it survives the journey from manufacturer to mouth. Exposure to air can oxidize some flavor molecules. Others—especially the volatile top notes of a flavor—are prone to just drift away, so that the flavor loses its oomph over time. Flavor decay can also happen in protein-rich foods because the sulfur atoms within proteins gradually latch on to the flavor molecules and prevent their release in your mouth. (This binding by proteins is why the smell of campfire smoke can lurk on your [protein-rich] hair, emerging when your hot shower adds enough energy to knock some scent molecules loose again.) And occasionally, the flavor and the food simply declare war on each other, such as when garlic oil prevents bread dough from rising.
The answer to almost all of these problems lies in a strategy called encapsulation. Usually, the tool of choice for this is a machine called a spray dryer, which blasts a fine mist of liquid flavor and a protective coating such as starch into a heated chamber to yield fine particles of flavor enclosed in a dry shell of starch. Mary McKee, one of Givaudan’s flavor-delivery specialists, shows me a more sophisticated version called a fluid-bed dryer, which suspends the mixture in a strong updraft of air to keep the granules from clumping as they dry. Right now, the machine has lime-green granules bumping up and down in it like bread crumbs in a blender.
McKee, a tall, slender woman whose large eyes seem even larger because of her wraparound safety glasses, opens a port on the machine and dumps a little pile of the granules into my hand. They taste vividly of lime—partly because of the flavor, partly because the color provides a congruent visual cue, and partly because of another trick of the delivery. “When you taste a lime flavor by itself, it’s very terpeney with some top notes. We can spray dry that, and it’s fine,” she says. But in the real world, lime has acid, too, so she spray dried the lime flavor on to crystals of citric acid. Now her flavor granules deliver not just the flavor of the lime, but its citric puckeriness as well. The possibilities here are almost endless. “If we were to spray the same flavor on salt, it would taste very different,” she says. Margaritas, anyone? As another example, McKee pulls out a vial of dried oregano leaves coated with jalapeño flavor. Or you could use the same approach to flavor tea leaves. “You can basically coat anything that you can fluidize,” she says.
Another technology, which Givaudan has patented, lets them load flavor inside an insoluble capsule without spray drying, and therefore without risking damage to volatile flavorants during heating. The capsules shear easily when you rub or chew them, releasing their flavor intact. Flavors protected like this are perfect in something like the breading on chicken, says McKee, because you can fry the chicken without losing the flavor during cooking. In fact, liquid garlic flavor encapsulated in this way would deliver the same flavor impact as six times as much unencapsulated flavor—a huge cost savings to the producer.
Once a new flavor is finished, the company can move on to the last step of the product development process: testing the final product on consumers. Testing panels actually come in two different sorts, as different as the apples and oranges (among other things) that they evaluate: consumer panels and trained panels. The most straightforward are simple consumer panels drawn from the general public. These untrained panelists are just like you and me—they would struggle for words if asked to describe exactly the flavor of a particular sample. And even if they can find a word, there’s little consistency from one panelist to the next. What one calls “fragrant” in the flavor of an apple, for example, another might call “flowery,” and a third “sweet.” So flavor testers generally don’t ask consumer panels to describe flavors. Instead, they stick to simpler questions like “Do you like this?” and “Are these two samples the same or different?”
These are exactly the questions that you want to ask the untrained masses, and Big Food desperately needs to know our answers. Obviously, if you’re planning to sell something, you want to know if consumers are buying. Hence, “Do you like this?” and variants such as “Would you buy this?” Even here, though, it’s important to make sure you’re asking not just the general public, but the right segment of the public. If you’re marketing a cheap flavored coffee to be sold at convenience stores, you don’t really care what the Starbucks drinkers or the hard-core espresso geeks think of it—you want to ask the folks who actually buy their coffee at 7-Eleven.
Companies often also need to know whether they can cut costs without consumers noticing, so they care a lot about “Same or different?” questions. Unlike the “Do you like this?” question, you can’t just ask people outright, because that’s an invitation to imagine differences even where none exist—the same overeager pattern recognition that creates puppy dog shapes in clouds and an image of the Virgin Mary in a grilled-cheese sandwich. Instead, testers give their subjects three samples and ask them to say which one is different—the same triangle test that Joel Mainland gave me when I participated in his “Does this compound have a smell?” study. Sometimes, test organizers use a variant of the triangle test called a tetrad test, in which participants get four samples—two of each—and group them into like pairs. The tetrad test turns out to be much more sensitive than the triangle test, so you need to test fewer participants to be confident of the result.
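The logic of the triangle test can be checked with a little arithmetic. A panelist who cannot tell the samples apart will still pick the odd one out a third of the time by pure guessing, so the question is how many correct picks it takes before luck stops being a plausible explanation. The Python sketch below uses an exact binomial calculation. The function names and the 5 percent cutoff are illustrative assumptions, not anything the testers themselves specified.

```python
from math import comb
from typing import Optional


def p_value_at_least(correct: int, panelists: int, chance: float = 1 / 3) -> float:
    """Probability of seeing `correct` or more right answers if everyone is guessing."""
    return sum(
        comb(panelists, k) * chance**k * (1 - chance) ** (panelists - k)
        for k in range(correct, panelists + 1)
    )


def min_correct_for_significance(
    panelists: int, alpha: float = 0.05, chance: float = 1 / 3
) -> Optional[int]:
    """Smallest number of correct answers that guessing alone would produce
    with probability at most `alpha`; None if the panel is too small."""
    for correct in range(panelists + 1):
        if p_value_at_least(correct, panelists, chance) <= alpha:
            return correct
    return None


if __name__ == "__main__":
    # Hypothetical panel sizes, chosen only for illustration.
    for n in (12, 30, 60):
        print(n, "panelists ->", min_correct_for_significance(n), "correct picks needed")
```

Guessing also succeeds one time in three in the tetrad test, because four samples can be split into two pairs in only three ways, so the same arithmetic applies there. The tetrad's extra sensitivity comes from how reliably tasters perceive a real difference, which this sketch does not model.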
One winter’s day, I got the chance to be part of a consumer panel in the city where I live. I followed directions to a downtown office building, then found my way to the far end of a long, dimly lit hallway not far from the stairwell. Behind a nondescript door that might have fronted a private-eye’s office or a low-budget dentist, I found a small, rather austere waiting room containing a handful of other people. Soon the organizers ushered us into the testing room, a row of perhaps a dozen small carrels against an L-shaped wall. Behind the wall, I knew, was the kitchen area where staff prepared the samples we were to evaluate.
My carrel had side walls to shield me from seeing what the others were up to, a computer display with a mouse, a cup of water, two saltine crackers, a napkin dispenser, and a bottle of Purell Hand Sanitizer. On the back wall of the carrel was a pass-through with a hinged cover that soon opened to reveal a tray containing a numbered plastic cup (#553) with some pieces of roasted red pepper inside. Aha. I guess we’re tasting red pepper.
The computer screen lights up with a question: “How much do you like #553 overall?” It offers a nine-point scale ranging from “Dislike extremely” through “Neither like nor dislike” to “Like extremely.” I take a bite. The sample’s not bad, so I pick seven, “Like moderately.” Then the computer asks, in turn, how much I like the flavor, the appearance, and the texture of #553, and finally, whether I would consume #553 again. Then I push the tray back into the pass-through and close the cover (which opens the pass-through on the opposite side, in the kitchen). I nibble a saltine, take a sip of water, and relax until the next sample appears.
The next one, #310, is sweeter—unpleasantly so—and has a slightly bitter, solventy aftertaste. Is this artificially sweetened, I wonder? The third, #617, seems less roasted. The texture is firmer, but the flavor more insipid. #909 is firm, too, but has that bitter/solventy aftertaste again—my least favorite so far. And #480 is the hands-down winner, with the meatiest texture and the richest flavor. With that, we’re done. Sitting back and looking around the room, I see that most of the other panelists—who’ve done this sort of thing before—are already finished and heading out the door, like studio musicians clearing out as soon as the gig’s over.
In the waiting room afterward, the chief scientist explains to me that we’ve been evaluating a new high-pressure treatment designed to reduce spoilage. The treatment extends the shelf life of the peppers, but some tasters complain that it produces a bitter aftertaste. We’re testing whether that aftertaste is noticeable, and how long the peppers can be kept before the flavor starts to deteriorate. (Since we’re answering two different questions, we can’t do a simple triangle or tetrad test—hence the nine-point scale instead.) The panel as a whole got eight different samples—pressure treated or not, and held for two, four, six, or eight weeks—although to avoid taster fatigue, each individual panelist tasted only five of the eight. “Eight would have been too many samples for one person,” she says.
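The design described here, two treatments crossed with four storage times and five of the eight samples served to each panelist, can be sketched in a few lines of Python. The rotation used to choose each panelist's five samples and the shuffled serving order are assumptions made for illustration. The scientist did not say how samples were actually assigned.

```python
import random
from itertools import product

TREATMENTS = ("high-pressure treated", "untreated")
WEEKS_HELD = (2, 4, 6, 8)

# Two treatments crossed with four storage times gives the eight samples.
SAMPLES = list(product(TREATMENTS, WEEKS_HELD))
SAMPLES_PER_PANELIST = 5


def assign_samples(n_panelists: int, seed: int = 0) -> list:
    """Give each panelist five of the eight samples.

    Assumed scheme: each panelist gets a window of five consecutive samples,
    with the window's start rotating from one panelist to the next, so every
    sample is tasted equally often when the panel size is a multiple of eight.
    Serving order is shuffled within each panelist.
    """
    rng = random.Random(seed)
    plan = []
    for panelist in range(n_panelists):
        start = panelist % len(SAMPLES)
        subset = [
            SAMPLES[(start + offset) % len(SAMPLES)]
            for offset in range(SAMPLES_PER_PANELIST)
        ]
        rng.shuffle(subset)
        plan.append(subset)
    return plan


if __name__ == "__main__":
    for treatment, weeks in assign_samples(8)[0]:
        print(f"{treatment}, held {weeks} weeks")
```

A simple rotation like this balances how often each sample is tasted, but not how often any two samples are tasted by the same person. A real study might use a more carefully balanced incomplete block design.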
The results are likely to be messy. For one thing, every pepper tastes a little different, so a good treatment on a poor pepper can yield the same score as a poor treatment on a good pepper. And we weren’t given any instructions on how to use that nine-point scale, so each taster is likely to score the same pepper a little differently. Someone who roasts their own peppers at home, for example, is probably less likely to “Like extremely” these processed ones, compared with a person whose only experience of red peppers is from a jar or can. Still, given enough tasters—usually around eighty to one hundred people—and big enough product differences, the researchers should be able to find the answers they need. Just before I leave, the scientist breaks the code for me. Samples #310 and #909—the two I thought had an unpleasant aftertaste—were both high-pressure treated, while the other three weren’t. #480, my favorite, turned out to be the freshest of all the samples I tasted. If everyone felt as I did, that’s bad news for their antispoilage treatment.
These simple consumer panels tell companies a lot of what they need to know about their products—namely, whether people like them. That’s why consumer panels have become ubiquitous in product testing, whether for foods or cars or laundry detergents. But unlike cars and laundry detergents, where consumers can generally go into more depth about look and feel, when it comes to flavor, language becomes a problem. What one person calls “very bitter” might be another person’s “moderately bitter,” or maybe even “sour” or “metallic.” That’s why the people running my panel never asked us to describe the off taste of some of the peppers, only to say whether we disliked it.
To dig deeper into the flavor details, they would have needed a panel of tasters who agree on what terms like “bitter,” “soapy,” and “metallic” mean. And that takes training. Companies that want this higher level of sophistication in their flavor analyses typically convene a small group of people—usually just eight or ten—and put them through several hours of training with standard samples to specify exactly what should be described as “soapy” or “metallic” and exactly how bitter “moderately bitter” is. After the panelists have settled into a standard vocabulary, they can start testing the product.
In the case of my processed peppers, the organizers might have trained an expert panel to reliably assign standard descriptors for bitterness, sweetness, roasted flavors, and several possible descriptors for the off aftertaste that I naively called “solventy”: soap, turpentine, and nail-polish remover, among others. Then they could present the test peppers to the panel and learn exactly how the various treatments affected the flavor, which might suggest ways to tweak the process to reduce the problem. The catch, of course, is that trained panels are very specific: panelists trained on the descriptors that apply to roasted red peppers won’t have the vocabulary for apples or hamburger patties.
Participants in expert panels such as these quickly learn to speak articulately about the intricacies of flavor. The rest of us can take a page from their book and learn to be more articulate about our own flavor experiences. Most of us are pretty good at talking about colors, because we have a common vocabulary to work with. Given almost any color, non-color-blind speakers of English can quickly assign it to one of eleven basic color categories: black, white, brown, gray, red, yellow, green, blue, purple, orange, or pink. From that starting point, we can then make finer distinctions: Is the green a forest green, a kelly green, or a chartreuse? Does it have a touch of blue in it? (Curiously, while English has eleven basic color terms, many other languages have fewer. Some offer just five (black, white, red, yellow, green-blue), three (black, white, red) or even two (light, dark) terms. Imagine trying to describe the color difference between a Granny Smith apple and a Golden Delicious with just “light” and “dark.”)
Experts approach flavors in much the same way, by breaking up the flavor world into a handful of basic categories. Givaudan, for example, has developed its own whole language for flavors, which they call Sense It, that lets their customers and flavorists quickly converge on what they’re talking about. The details, as usual, are a closely guarded secret.
Over at Givaudan’s competitor FONA, on the other hand, Menzie Clarke happily lays out her own set of ten basic categories: fruity; floral; woody; spicy; sulfury (including onions and garlic as well as most meat flavors, eggs, and many off flavors); acid; green (including herbaceous flavors, but also green apples, avocados, and vegetables like beans); brown (nutty flavors, coffee, chocolate and caramel, honey, maple, and bread); terpeney (resinous flavors like pine and citrus peel); and what she calls “lactonic,” a category that includes sweet, creamy flavors and the peachy note that was in the strawberry flavor I made in Brian Mullin’s lab. Other flavorists, especially those at other companies, might have slightly different categories. Mary Maier, for example, includes “earthy” and “starchy” in her basic list of savory flavor categories.
Most of the time, though, flavorists and their clients are working within a much narrower range of possibilities—strawberry flavors, say, or chicken. One of the first tasks in any project is to build a glossary of likely descriptors that might apply to the product in question. For strawberries, for example, FONA’s basic lexicon includes fruity, floral, buttery, ripe, jammy, seedy, fresh, cooked, green, sweet, candylike, burnt, oniony, and creamy. A list like this gives tasters a ready-made vocabulary for comparing test flavors—and it’s always easier to pick the right term from a list than it is to conjure one out of thin air.
One effective way to organize a frequently used set of descriptors is to arrange them in a flavor wheel. The best example of this is the wine aroma wheel developed three decades ago by Ann Noble, a researcher at the University of California, Davis. (If you’re not familiar with it, have a look—it’s readily available online.) The wheel has three concentric rings, each with a set of descriptions. On the innermost ring, it lists twelve general categories of wine aromas: fruity, vegetative, nutty, caramelized, woody, earthy, chemical, pungent, oxidized, microbiological, floral, and spicy. Suppose you decide you smell something fruity. Then you move out to the next ring on the wheel, which offers six subcategories of fruitiness to choose among: Is it citrus, berry, tropical, tree fruit, dried fruit, or something else? If you pick tree fruit, the outermost ring offers choices that are more specific still: Is it cherry, apricot, peach, or apple? By helping you narrow down the options, the wine wheel quickly lets you arrive at a specific descriptor that fits the flavor of your wine. The approach works so well that there are now flavor wheels for beer, cheese, Scotch whisky, coffee, cigars, chocolate, honey, olive oil—the list goes on and on. (I’m waiting for the day that an ice cream shop posts an ice cream flavor wheel to help people pick which scoop to order. Do you want a berry, spice, tropical fruit, or caramel flavor? If berry, should it be a red berry or a blue berry? Is the red berry strawberry, raspberry, or blackberry?)
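A flavor wheel is, in effect, a small tree: each ring narrows the choice made on the ring inside it. The Python sketch below encodes only the portion of the wine wheel spelled out above, as a nested dictionary. The branches not described here are left empty, and "other" is a placeholder for the sixth fruity subcategory, which is not named. The helper function name is a hypothetical choice.

```python
# Only the branches described in the text are filled in; empty dicts mark
# parts of the wheel whose contents are not given here.
WINE_WHEEL = {
    "fruity": {
        "citrus": {},
        "berry": {},
        "tropical": {},
        "tree fruit": {"cherry": {}, "apricot": {}, "peach": {}, "apple": {}},
        "dried fruit": {},
        "other": {},  # placeholder for "something else"
    },
    "vegetative": {},
    "nutty": {},
    "caramelized": {},
    "woody": {},
    "earthy": {},
    "chemical": {},
    "pungent": {},
    "oxidized": {},
    "microbiological": {},
    "floral": {},
    "spicy": {},
}


def choices(path=()):
    """List the options on the next ring out, given the picks made so far."""
    ring = WINE_WHEEL
    for pick in path:
        ring = ring[pick]  # raises KeyError if the pick is not on the wheel
    return list(ring)


if __name__ == "__main__":
    print(choices())                          # innermost ring: general categories
    print(choices(("fruity",)))               # middle ring: kinds of fruitiness
    print(choices(("fruity", "tree fruit")))  # outermost ring: specific fruits
```

Walking from the center outward mirrors how a taster uses the wheel: each pick replaces an open-ended question with a short list to choose from.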
These descriptors generally break the flavor world into categories that correspond to flavors found in the natural world—that is, if flavors were paintings, almost all of them would be landscapes or still lifes, more or less faithful representations of subjects in the real world. But are there also abstract flavors that correspond to nothing in the real world? Perfumers, after all, come up with abstract fragrances all the time, but flavorists have barely even ventured into that genre. When I asked flavorists for examples of so-called fantasy flavors, almost everyone mentioned bubble gum, but they struggled to come up with many other examples. There’s blue raspberry, perhaps, and certainly Red Bull—a fantasy flavor that, I’m told, was made intentionally unbalanced, to give the impression of vigor, even agitation. In a sense, too, a generic “meat” flavor is something of a fantasy, since all real meat tastes like something—chicken, if nothing else.