
The Unicorn Project


by Gene Kim


  Over the next two days, the teams work on their portions of the Unicorn Project. Maxine spends most of her time on what she views as the riskiest part of the whole operation, which is getting all the data into the Narwhal NoSQL databases and enabling all the teams to be able to access what they need. She knows that they are now well past the point of no return, having torched the ships they knew how to sail.

  The most difficult part was not the mechanics of importing the data from twenty different business systems. Instead, it was trying to create a unified vocabulary and taxonomy that they could use, because almost every business system had different names for similar things.

  Physical stores have five different definitions of in-store sales, including from a company acquired decades ago. There are six different ways that products are catalogued. Product categories and prices don’t line up. The business rules around pricing and promotion are exercises in forensic archaeology. They pulled in business analysts from across the company to help make sense of it and make decisions about how they should be represented.

  Maxine found herself constantly switching between insisting on clarity and consistency to ensure accuracy and saying “good enough for now,” deferring decisions that would require days of consensus-building because they would impact Parts Unlimited for decades to come. Without her extensive experience working with enterprise systems, she’s sure she wouldn’t have had the judgement necessary to make these types of calls, especially given the deadlines involved.

  Everyone is focused on the big, upcoming Demo Day, where each team will show their portions of the system on the final days before Black Friday. Maggie will be leading it, and almost all the stakeholders will be there, as well as all the technology executives, ending with a final “go/no go” launch decision.

  Because of the high stakes involved, Maxine makes sure that she attends all of the daily engineering team standups, where team members quickly share progress and, more importantly, what help they need. She approves of how quickly and efficiently these meetings are run, with blockers being urgently handled by the team leads.

  On this tight timeline, every day counts. Thanksgiving is just over a week away. She listens intently as she sits in the Unicorn standup. One of the two most senior data scientists from the Promotions team is visibly flustered. “We still don’t have the fields we need in the one percent subset of the customer list from the Data Warehouse team, and we still can’t match up nearly half of the physical store order data.

  “And for our data analysis, the Narwhal database is incredibly fast, compared to what we’re used to. But because of all the joins we need to do, the query times are still orders of magnitude too slow,” he continues. “Given the deadlines, we only have one or two shots at this, and if the results are like the ones we’re getting right now, we will not be ready for the Black Friday launch. And if we use the data we have right now, the promotions are guaranteed to be a real dud. Just this morning I found a case where we would have sent offers for snow tires to people in Texas.”

  Oh, shit, thinks Maxine. This is what you get for waiting too long to invite the data scientists to the engineering meetings. She says out loud, “Okay, I’ll pull together an emergency single-topic meeting later this morning. I’ll make sure Kurt and Maggie are there, as well as the Narwhal team. Could you prepare a ten-minute briefing about these problems and some ideas on how to solve them?”

  When he nods, Maxine takes out her phone and calls Kurt.

  Two hours later, everyone is gathered in a conference room listening to the problems that the Analytics and Promotions teams are having. After fifteen minutes, Maxine is feeling genuinely daunted at the sheer scale of the problem.

  It’s no wonder the Analytics team has made so little progress—what they want to do is simply impossible with the infrastructure they’ve built. The data sets are orders of magnitude larger than what they can handle. Maxine immediately sees that the queries the data scientists are building are a complete mismatch to what they’ve built Narwhal for. Narwhal is stellar at handling API requests from all the various teams across the company, but now they’re learning that it’s spectacularly not great for what the Analytics teams need to do.

  Worse, the Unicorn teams still can’t get the data they need. It takes the Data Warehouse team four months to get twenty lines of SQL from Dev to QA to Production. And every time they do, reports break or show incorrect data. Apparently last month, a schema change somewhere broke almost every report in the company. To Maxine, it’s the same problems they had with the Phoenix Project, but instead of code, it’s for the data that the Unicorn teams need.

  Moreover, the Data Warehouse teams still haven’t reconciled the different definitions of product, inventory, and customer from the physical stores and e-commerce stores. The newly created Narwhal teams were already way ahead of them.

  Maxine drums her fingers. She cannot believe that they’ve run smack into another Phoenix-scale bureaucratic quagmire—the Data Warehouse is sitting on so many things they need.

  As people continue talking, Maxine stares at the numbers on the whiteboard. This is not going to work, she thinks. She decides that she needs to discreetly signal Kurt to step out into the hallway so she can tell him that there’s just no way that the Promotions plan can realistically work as currently envisioned. They’ll need to convince the Unicorn team to drastically scale down their plans. Or maybe the Rebellion should abandon them and find another program to work with to generate a business win.

  In order for the Unicorn team to succeed, they somehow need to be decoupled and liberated from the giant data warehouse, and maybe even Narwhal, to support the massive calculations and queries they need to do.

  “I know what you’re thinking,” Shannon says, just as Maxine is about to get Kurt’s attention. “This looks impossible, right? But I spent nearly five years on the Data Warehouse team thinking about this. Let me show you something I’ve wanted to do for years.”

  Over the course of the next thirty minutes, Shannon presents a breathtaking plan that she’s obviously been thinking about and studying deeply. She is proposing to build a Spark-like big data and compute platform, fed by an entirely new event-streaming bus, modeled closely on what the tech giants have all built to solve their data problems at scale. It would allow hundreds, even thousands, of CPU cores to be thrown at the computations, allowing analyses that currently take days or weeks to be done in minutes or hours.

  Maxine is familiar with these techniques. Their use exploded after the famous 2004 Google Map/Reduce research paper was published, which described the techniques Google used to massively parallelize the indexing of the entire internet on commodity hardware, using techniques at the core of functional programming. This led to the invention of Hadoop, Spark, Beam, and so many other exciting technologies that transformed this space, just like NoSQL revolutionized the database landscape.

  Shannon describes how this new data platform would be fed by a new event streaming technology. “Unlike Data Hub, where almost every business rule change also requires a change from the Data Hub team, this new scheme would allow a massive decoupling of services and data. It would enable developers to change things independently, without needing a centralized team to write intermediary code. And unlike the centralized Data Warehouse, the responsibility for cleaning, ingesting, analyzing, and publishing accurate data to the rest of the organization would be pushed into each business and application team, where they have the most knowledge of what the data actually means.”

  She continues, “The importance and urgency of keeping this data secure, making sure that we don’t store PII that we shouldn’t, the need to encrypt it at rest, and the risks of what could happen to Parts Unlimited if this data were stolen are paramount.” It’s obvious that Shannon is passionate about how this platform must ensure the security of all this data, not leaving it to each individual team.

  And most appealing to Maxine, it could also support an immutable event sourcing data model, which would be a massive simplification compared to the current morass of complexity built up over decades.

  It was also fast. It would have to be, because Data Hub and potentially every application in the enterprise would eventually be throwing everything into this new message bus: all customer orders, all customer activity from their CRM, everything from their e-commerce site and marketing campaign management systems, all customer activity from their stores and garages … all of it.

  When Shannon is done presenting and has answered questions from the team, Kurt looks pale. “You’re kidding me. We don’t even have approval to get Narwhal off the ground yet. And adding all of … this … would quadruple our compute and storage footprint … and potentially put even more sensitive data out in the cloud,” he says, gesturing at the whiteboard. “Oh, man, Chris is going to lose his shit. There is no way he’ll go for this.”

  Even Brent looks slightly ill. “I’ve always wanted to run something like this, but … it’s just so much new infrastructure to build at once. This seems a bit reckless, even to me.”

  Maxine studies Kurt’s expression, and then Shannon and her drawings that cover two full whiteboards. Then she laughs, momentarily enjoying Kurt and Brent’s discomfort. But she knows how they feel. Gamblers who lost everything at the casino probably had moments of reflection and prudence like this before they went all-in.

  She says, “Are we playing to win and to establish the technical supremacy we need to keep up with what the business needs, or do we just keep limping along, shackled to things built decades ago, and tell our business leadership to throw in the towel and stop having good ideas?”

  Maxine thinks Shannon’s idea is a good one, even though it seems suicidal. Maxine says, “All my intuition and experience says that our data architecture has created another bottleneck that affects every area of the company. This is a problem that’s far bigger than just developers. Anyone who needs data as part of their daily work isn’t getting what they need.”

  “Yes,” Maggie says, looking like she’s been hit by a bat. “That’s absolutely right! I’ve got twenty-five data scientists and analysts across five teams who never have the data they need. But it’s not just them—almost everyone in Marketing accesses or manipulates data. Operations is mostly about data. Sales operations and management is all about data. In fact, I’d bet half of all Parts Unlimited employees access or manipulate data every day. And for years, we’ve been handcuffed by the way everything has to go through the Data Warehouse team.

  “And frankly, we need pros like you to help,” she says, embarrassed. “We have a few data visualization platforms that we manage internally, but we’re not software people. In fact, earlier this year we managed to corrupt all our order data when the vendor told us to change the server time zone.”

  Brent groans, and Maxine is relieved that he manages to refrain from saying anything demeaning about the vendor or Maggie’s server administrators.

  Seeing Kurt’s sudden expression of rapt interest and calculation, Maxine smiles. She knows that hearing this sort of distress and suffering is exactly what motivates him into action. She says, “Let’s start small, with the most critical capabilities to enable the Unicorn team. We leverage all the ETL work we’re already doing with Narwhal, and we use fully managed and battle-tested data platform services in the cloud that could reduce a lot of the operational risk. Here’s what I’m thinking …”

  Maxine congratulates herself that over the next four hours, no one leaves the room. Or quits. Instead, they wrangle over the whiteboard and come up with an outline of a plan that everyone tentatively agrees to explore. They defer the event streaming platform, but Maxine and Shannon will lead the creation of something that can provide more bulletproof data transformations, get things under version control, build automated testing to confirm the correct shape and size of data before it’s ingested, and so many other things to prevent all these data accidents she’s seen and heard about.

  Kurt and Maggie promise to start the delicate discussion with Chris and Bill to head off a political battle with the Data Warehouse team, who might feel threatened. Which is not unreasonable, thinks Maxine. The Data Warehouse team has been the custodian of this data for decades, and now we’d be liberating it, making it available to anyone who wants it, on demand, without opening a ticket.

  Despite all these plans, everyone knows that there is a real chance of total failure. She hears Brent mutter from the whiteboard, “I love it, but there’s just no way we can get all of this done by Thanksgiving …”

  As Maxine’s teenagers would say, Brent is not wrong. But clearly, the way they’re doing data now is not working, and here’s an opportunity to show that there’s a better way. If there’s any time that deserves courage and relentless optimism, it’s now, she thinks.

  When Brent finally says, “Let’s call this Project Panther,” Maxine knows that there’s a shot of making this all work.

  On the night before Demo Day, many teams work late into the evening. The next morning, everyone is there as the Black Friday Promotion demos begin in the lunchroom. Kurt asks Maggie to kick off the session to help frame the “why” behind all of their efforts, but everyone knows that Black Friday is just days away. Everyone working on the Unicorn Project knows that it’s not an exaggeration that the survival of the company depends on their efforts.

  The Unicorn Project is now high-profile. And Maxine knows that if things don’t go well today, it will not be good for the company, and it will be very not good for Maggie, Kurt, and herself.

  Maggie begins, “As everyone knows, Black Friday is right around the corner. Our goal is for the Unicorn Project to drive real revenue, made possible by the Orca, Narwhal, Panther, and mobile app teams. Our focus is on using inventory information and personalization data to drive promotion and to get useful information into our apps, such as inventory availability. Specific outcomes we want to affect are revenue, repeat engagement in our mobile apps and e-commerce site, and campaigns that generate a positive response.”

  Maggie pauses. “And we have a special guest in the room, Bill Palmer, our VP of IT Operations, who helped create Project Inversion, which allowed us to focus so much energy on the Promotions effort. We also have a big contingent from Ops here who are helping fast-track all these initiatives. First up is Justine to present for the Orca team.”

  “I’m Justine, and I’m on the team responsible for generating the data used to create the promotions. As Maggie mentioned, our goal is to give Marketing the ability to create the best promotions based on everything we know about our customers.

  “Data is the lifeblood of the company,” she continues. “In Marketing, almost all of us access or manipulate data to guide the efforts of the company. For the first time, thanks to the Panther platform that Shannon and team created, we can finally get the data we need, trust that it’s correct, and use all sorts of statistical techniques and even things like machine learning to predict what our customers might need. This is what we use to craft offers and promotions. I have no doubt that the future of the organization will be built upon understanding our customers and providing them what they need … and we are best able to do that by understanding this data.”

  Shannon smiles as Justine goes on to outline Orca’s successes. “Over the last two weeks, our goal was to get all the queries needed to support the top priority use cases: we need to find out what the top-selling items are, which customer segments have purchased them, and vice versa. For each customer segment, we need to determine the products they buy most frequently.

  “A great promotion is one where we can sell inventory we already have, but also at the optimum price. We don’t want to unknowingly sell products lower than what customers are willing to pay. And we can only learn what that price is through experimentation,” she says.

  “We built a simple web application where everyone can generate and run these queries, build candidate promotions, and share them with each other,” she continues. “On the screen, you’ll see all the top-selling items along with their photos. This is pretty great but also boring, and it’s very difficult to quickly understand what all these SKUs actually are. We realized that the e-commerce site has images for all these products. So we asked Maxine and the Narwhal team if they could give us those links too, which they did within hours and without needing to open a ticket! By the end of day, with only ten lines of code, we were showing these images in our app, which helped everyone on the team generate more compelling offers more quickly and effectively. That’s been a crowd pleaser,” she says with a smile.

  Maxine sees Tom, her former Data Hub coding partner, join Justine at the front of the room. He says, “Once we understood what the Promotions team was trying to do, generating this app was easy. The Narwhal people gave us the API, and we just used one of the modern web frameworks to display it. Justine is absolutely right about how awesome the Narwhal database API is. And it’s blazingly fast. I’m used to queries that take minutes or hours to run on big servers. So, hats off to the Narwhal team—I’m blown away. We couldn’t have done it without them.”

  Maxine grins and sees that Brent and Dwayne also have huge smiles on their faces.

  Justine shows her last slide. “We’re working with the Marketing teams to finalize the promotion campaigns for the two highest-priority customer personas: the Meticulous Maintainers and the Catastrophic Late Maintainers. For each of those, using the Panther data and compute clusters, we’ve generated candidate-recommended products and recommended bundles, which they’re still reviewing and tweaking. Once they’re done, we’ll help get those loaded into the product and pricing databases so we can execute the campaign.”

  Unprompted, one of the senior Marketing people walks to the front of the room and says, “I want to acknowledge and thank everyone’s hard work. This is incredibly exciting and impressive. I’ve been amazed at how much this team has done in a couple of weeks. We’ve been at this for almost two years, but I’ve never been as excited as now. We’re taking all the data from the Orca team and fine-tuning the offers that we’ll be presenting throughout the Thanksgiving weekend. I think there’s millions of dollars of revenue that we can unlock!”

 
