How data science is woven into the fabric of Stitch Fix
At Stitch Fix, we’re transforming the way people find what they love. Our clients want the perfect clothes for their individual preferences—yet without the burden of search or having to keep up with current trends. Our merchandise is curated from the market and augmented with our own designs to fill in the gaps. It’s kept current and extremely vast and diverse—ensuring something for everyone. Rich data on both sides of this 'market' enables Stitch Fix to be a matchmaker, connecting clients with styles they love (and never would’ve found on their own).
Our business model enables unprecedented data science, not only in recommendation systems, but also in human computation, resource management, inventory management, algorithmic fashion design and many other areas. Experimentation and algorithm development are deeply ingrained in everything that Stitch Fix does. We’ll describe a few examples in detail as you scroll along.
So what does the data look like? In addition to the rich feedback data we get from our clients, we also receive a great deal of upfront data on both our clothing and our clients. Our buyers and designers capture dimension and style details, and our clients fill out a profile upon signup that’s calibrated to get us the most useful data with the least client effort.
Let's first walk through the filling of a shipment request to see a few of the many algorithms that play a role in that process, before zooming out to view the bigger picture.
As noted, when a client first signs up, they fill out a Style Profile (it can be updated at any time). Then, scheduling a delivery is easy: an algorithm populates a calendar from which the client selects a delivery date.
The shipment request is processed by an algorithm that assigns it to a warehouse. This algorithm calculates a cost function for each warehouse based on a combination of its location relative to the client and how well the inventories in the different warehouses match the client's needs.
This set of cost calculations is carried out for each client to produce a cost matrix.
The assignment of clients to warehouses is then a binary optimization problem.
And the global optimum includes this particular client's warehouse assignment.
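A minimal sketch of this assignment step, assuming a tiny made-up cost matrix and hypothetical per-warehouse capacities (neither comes from the description above). It solves the binary optimization by brute force, which only works at toy scale; a real system would hand the same formulation to an integer-programming solver.

```python
from itertools import product

def assign_warehouses(cost, capacity):
    """Return (best_total_cost, assignment); assignment[i] is the warehouse for client i.

    cost[i][j]: hypothetical cost of serving client i from warehouse j.
    capacity[j]: hypothetical maximum number of clients warehouse j can serve.
    """
    n_clients = len(cost)
    n_warehouses = len(capacity)
    best_total, best_assignment = float("inf"), None
    # Each candidate assigns exactly one warehouse to every client.
    for assignment in product(range(n_warehouses), repeat=n_clients):
        # Skip candidates that overload any warehouse.
        if any(assignment.count(j) > capacity[j] for j in range(n_warehouses)):
            continue
        total = sum(cost[i][j] for i, j in enumerate(assignment))
        if total < best_total:
            best_total, best_assignment = total, assignment
    return best_total, best_assignment

# Illustrative numbers only: 3 clients, 2 warehouses.
cost = [
    [1.0, 4.0],
    [2.0, 3.0],
    [1.5, 5.0],
]
capacity = [2, 2]
best_total, best_assignment = assign_warehouses(cost, capacity)
```

With these numbers the capacity limit forces one client away from the cheaper first warehouse, which is why the global optimum can differ from each client's individually cheapest choice.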
The shipment request is then routed to the Humans + Machines styling algorithm.
First, the machines run a variety of algorithms to produce rank-ordered lists of the inventory.
A filtering step removes styles from consideration that the client has received in a previous shipment or that have attributes which the client has asked to avoid.
For each of the remaining styles, the machines then try to evaluate the relative likelihood that this particular client will love that particular style. This is a difficult problem, and we approach it in many different ways—only a few of which we’ll discuss here. But in general, note that we tag each item multiple times with match scores from different algorithms, and then rank them.
In some ways, the problem is a classic collaborative filtering problem: given different clients' feedback on different styles, we must fill in the gaps in the (sparse) matrix to predict the result of sending a style to a client who has not yet received it. As such, we do use some standard collaborative filtering algorithms (e.g. those who have liked what you have liked have also liked ...).
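To make the "those who have liked what you have liked" idea concrete, here is a small sketch of user-based collaborative filtering. The client names, style names, and the +1/-1 feedback encoding are all hypothetical, and this is one textbook variant rather than a description of the algorithms actually in production.

```python
from math import sqrt

# Hypothetical feedback: 1 = loved, -1 = did not love; a missing key = never sent.
feedback = {
    "ana": {"blouse_a": 1, "jeans_b": 1, "dress_c": -1},
    "bea": {"blouse_a": 1, "jeans_b": 1, "scarf_d": 1},
    "cam": {"blouse_a": -1, "dress_c": 1, "scarf_d": -1},
}

def cosine(u, v):
    """Cosine similarity between two sparse rating dicts (missing entries count as 0)."""
    shared = set(u) & set(v)
    dot = sum(u[k] * v[k] for k in shared)
    norm_u = sqrt(sum(x * x for x in u.values()))
    norm_v = sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def predict(client, style):
    """Similarity-weighted average of other clients' feedback on `style`."""
    num = den = 0.0
    for other, ratings in feedback.items():
        if other == client or style not in ratings:
            continue
        sim = cosine(feedback[client], ratings)
        num += sim * ratings[style]
        den += abs(sim)
    return num / den if den else 0.0

# Fill one gap in the sparse matrix: would "ana" love a style she has not received?
score = predict("ana", "scarf_d")
```

A positive score suggests the style is worth ranking higher for that client; a negative one suggests the opposite.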
However, unlike most collaborative filtering problems, we have a lot of explicit data, both from clients' self-descriptions and from clothing attributes. This helps with the cold start problem and also allows for greater accuracy if we employ algorithms that consider this data.
One such approach is mixed-effects modeling, which is particularly useful because of the longitudinal nature of our problem: it lets us learn (and track) our clients' preferences over time, both individually and as a whole.
And in addition to the many explicit features available, there are some particularly pertinent latent (unstated) features of both clients and styles that we can infer from other data (structured and/or unstructured) and use to improve our performance.
For example, a new client may tell us that she wears medium-sized blouses, but where exactly would her preference fall along the spectrum of smallish mediums to largish mediums? The same question also applies to particular styles of clothing in our inventory. (Note that in this illustration we’re treating fit as unidimensional for simplicity, but in fact at Stitch Fix we treat it as multidimensional.)
With clients' fit feedback and purchase histories, we can learn where particular clients and styles fall along this spectrum. These latent features can then be used in our mixed-effects models and elsewhere.
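As a rough, one-dimensional illustration of learning such latent features, the sketch below places clients and styles on a shared size axis from fit feedback alone. The feedback encoding, the squared-error model, and the learning-rate settings are all assumptions made for this example, not the method described above.

```python
# Hypothetical fit feedback: -1 = too small, 0 = just right, +1 = too big.
feedback = [
    ("ana", "blouse_m", -1),
    ("bea", "blouse_m", +1),
    ("cam", "blouse_m", 0),
]

def fit_latent_sizes(feedback, lr=0.1, epochs=200):
    """Estimate a latent size for each client and style by gradient descent."""
    client_size = {c: 0.0 for c, _, _ in feedback}
    style_size = {s: 0.0 for _, s, _ in feedback}
    for _ in range(epochs):
        for client, style, y in feedback:
            # Assumed model: feedback is roughly (style size - client size).
            err = (style_size[style] - client_size[client]) - y
            # Gradient step on the squared error for this observation.
            style_size[style] -= lr * err
            client_size[client] += lr * err
    return client_size, style_size

client_size, style_size = fit_latent_sizes(feedback)
```

All three clients might describe themselves as "medium", yet the fitted values separate them: the client who found the blouse too small lands at the larger end of the axis, and the one who found it too big lands at the smaller end.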
Moving our problem even further beyond classical collaborative filtering, we also have a lot of photographic and textual data to consider: inventory style photos, Pinterest boards, and the vast amount of written feedback and request notes we receive from clients.
Sometimes it can be difficult to describe your style preferences in words, but you know it when you see it—so we have our machines look at photos of clothing that customers like (e.g. from Pinterest), and look for visually similar items in our inventory. We use trained neural networks to derive vector descriptions of 'pinned' images, and then compute a cosine similarity between these vectors and pre-computed vectors for each item in our inventory.
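The similarity step can be sketched as follows. The three-dimensional vectors and item names are placeholders invented for illustration; real vectors would come from a trained neural network and have many more dimensions.

```python
from math import sqrt

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hypothetical vector for one 'pinned' image.
pinned = [0.9, 0.1, 0.3]

# Hypothetical pre-computed vectors for inventory items.
inventory = {
    "floral_dress": [0.8, 0.2, 0.4],
    "plaid_shirt": [0.1, 0.9, 0.2],
    "denim_jacket": [0.5, 0.5, 0.5],
}

# Rank inventory items from most to least visually similar to the pinned image.
ranked = sorted(
    inventory,
    key=lambda name: cosine_similarity(pinned, inventory[name]),
    reverse=True,
)
```

Because the inventory vectors are computed ahead of time, only the pinned image needs to pass through the network when a client adds a new pin.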
Natural language processing is used to score items based on the client's request note and textual feedback from other clients about the same item.
All of these algorithm scores—and many others like them—are taken into account when ordering and presenting options for the human expert stylist to consider.
Once this machine ranking is complete, the shipment request gets routed to a human.
Humans are more heterogeneous than machines. The machines are all the same—just pick one. But human stylists are going to be better suited to some clients than to others. So we use algorithms to optimize this match.
To do this, we first calculate a match score between each available stylist and each client who’s requested a shipment during the current period. This match score is a complex function of the history between that client and stylist (if any), and the affinities between the client's stated and latent style preferences and those of the stylist.
Subsequently, the stylist assignment optimization problem is similar to the warehouse assignment problem described above, except that (a) it need only consider those clients awaiting shipments, and (b) we must re-run the optimization problem much more frequently to account for the varying sizes of stylists' queues as they work.
While machines are great for tasks involving rote calculations, there are other tasks that require improvising, knowledge of social norms and the ability to relate to clients. These tasks are in the purview of humans. This is where our stylists perform the type of computation that only humans can do.
To begin styling a shipment, a stylist picks up a task in a custom-built interface designed to help her quickly and deeply understand the client.
Our human computation team does a lot of testing with variations on this interface, helping us to understand how stylists make decisions.
This knowledge helps us in many ways: to improve our algorithmic styling, to improve our stylist training procedures and to continually improve the interface that stylists use to curate boxes.
Ultimately, the stylist finalizes the selections from the inventory list and writes a personal note describing how the client might accessorize the items for a particular occasion and/or how they can pair them with other clothing in their closet.
This wraps up the styling procedure, and the shipment is now ready to be processed.
Various operations research problems can be found in this processing step.
For example, given the items selected for a shipment, what is the best route for pickers to take through the warehouse to fill the box?
This problem of pick-path routing is an instance of the NP-hard Traveling Salesman Problem. In practice, we even take the problem one step further to look at optimal groupings of shipments that can be picked simultaneously.
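For a handful of items the routing problem can be solved exactly by trying every visiting order, as in the sketch below. The grid coordinates, the Manhattan (aisle-style) distance, and the requirement to return to the start are assumptions for illustration; exhaustive search grows factorially, so real pick lists call for heuristics or dedicated solvers.

```python
from itertools import permutations

def manhattan(a, b):
    """Walking distance on a grid of aisles (an assumed distance measure)."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])

def best_pick_path(start, items):
    """Return (length, route) of the shortest tour from start through all items and back."""
    best_len, best_route = float("inf"), None
    for order in permutations(items):
        route = [start, *order, start]
        length = sum(manhattan(p, q) for p, q in zip(route, route[1:]))
        if length < best_len:
            best_len, best_route = length, route
    return best_len, best_route

# Hypothetical packing-station and shelf locations.
start = (0, 0)
items = [(0, 3), (4, 3), (4, 0)]
length, route = best_pick_path(start, items)
```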
The shipment is then delivered to the client by their requested delivery date.
But this is just the beginning.
She opens the box, is hopefully delighted, keeps what she wants and sends back the rest, and then tells us what she thinks about each article of clothing. There is a symbiotic relationship between her and Stitch Fix, and she gives us very insightful feedback that we use not only to better serve her next time, but also to better serve other clients.
To recap the process of filling a single shipment request: a client creates a Style Profile and requests a shipment, we match them to a warehouse, our styling algorithms and human stylists work together to select styles, the stylist writes a note, we deliver the shipment, and the client keeps what they like, returns the rest, and provides us with feedback.
But this is just one shipment. Zooming out, we can consider the system as a whole. At this level, two other aspects of the business become clear:
(1) We must continually replenish our inventory by buying and/or designing new clothing for our clients, which provides an excellent opportunity to benefit from our rich data;
(2) We must anticipate our clients' needs in order to make sure that we have enough of the right resources in place at the right times.
Let's first look at how we anticipate our clients' needs, then we'll swing around to consider inventory management and new style development.
One of the ways we view this needs-anticipation problem (and related problems) is to consider the "state" of each client at each point in time. Are they a new client? Did they arrive via referral or on their own? Is their closet nearly full? Are they in a phase of building up their wardrobe after a life change? Or simply wanting to try something new? Depending on their state, they will likely have different shipment cadences, different desires for email contact, etc.
We keep track of every touch point we have with each client—every item we send, every piece of feedback we get, every referral, every email, etc.
With this data, we try to understand clients' states and their needs when in those different states. We can then detect changes in state and consider possible triggers. This process by itself can lead to insights that help us keep our clients happier.
And once we define and understand states, and detect and understand clients' transitions between them, we can develop state transition matrices and Markov chain models that allow us to study system-level effects.
One of the many uses of these Markov chain models is to anticipate future demand, which is important because we often need to buy inventory months before it arrives at the warehouses. We must also ensure that we have the right number of resources and human stylists available at the right times.
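A minimal sketch of how a state transition matrix can feed a demand forecast. The three states, the monthly transition probabilities, and the shipments-per-client rates are invented for illustration, and the sketch ignores new signups entirely.

```python
# Hypothetical client states and monthly transition probabilities (each row sums to 1).
states = ["new", "regular", "paused"]
P = [
    [0.0, 0.7, 0.3],  # from "new"
    [0.0, 0.8, 0.2],  # from "regular"
    [0.0, 0.1, 0.9],  # from "paused"
]
# Assumed expected shipments per client per month in each state.
shipments_per_client = [1.0, 1.0, 0.0]

def step(counts, P):
    """Advance the expected number of clients in each state by one period."""
    n = len(P)
    return [sum(counts[i] * P[i][j] for i in range(n)) for j in range(n)]

def forecast_demand(counts, P, rates, months):
    """Expected shipments in each of the next `months` periods."""
    demand = []
    for _ in range(months):
        counts = step(counts, P)
        demand.append(sum(c * r for c, r in zip(counts, rates)))
    return demand

# Start from a hypothetical client base of 100 new, 500 regular, 200 paused.
demand = forecast_demand([100.0, 500.0, 200.0], P, shipments_per_client, months=3)
```

Projecting the state counts forward in this way gives a demand curve several months out, which is the horizon that matters when inventory must be bought well before it arrives.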
Inventory depletion through customer demand must ultimately be offset by purchases of new inventory. One of the challenges is in getting the timing of purchases right, so that we maintain adequate inventory availability for stylists while minimizing the sum of ordering costs and carrying costs (the operation costs and opportunity costs of capital associated with the area under the inventory curve).
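The trade-off between ordering costs and carrying costs can be illustrated with the textbook economic order quantity (EOQ) model. This is a deliberately simplified stand-in that assumes constant demand and fixed costs; it is not the approach described here, and all numbers are made up.

```python
from math import sqrt

def total_cost(q, demand, order_cost, holding_cost):
    """Annual ordering cost plus annual carrying cost for order size q."""
    ordering = (demand / q) * order_cost  # number of orders times cost per order
    carrying = (q / 2) * holding_cost     # average inventory times cost per unit held
    return ordering + carrying

def economic_order_quantity(demand, order_cost, holding_cost):
    """Order size that minimizes total_cost under the EOQ assumptions."""
    return sqrt(2 * demand * order_cost / holding_cost)

# Hypothetical inputs: units per year, cost per order, cost to hold one unit for a year.
demand, order_cost, holding_cost = 10_000, 50.0, 4.0
q_star = economic_order_quantity(demand, order_cost, holding_cost)
min_cost = total_cost(q_star, demand, order_cost, holding_cost)
```

Ordering in larger batches lowers the ordering term but raises the area under the inventory curve, and the minimum sits where the two terms balance.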
Meeting future demand is just one of our inventory management challenges: we must also allocate inventory appropriately to different warehouses, and occasionally donate old inventory to make room for new styles. We can use algorithms to help us with these processes.
(Note that the situation is more complex than this simple illustration, since we must drill down to look at the availability of different types and styles of clothing in each of the warehouses. But we'll stick to the simple illustrations here for cleanliness.)
How much of what styles to purchase? Which items should go to which warehouse? What inventory should be donated when?
We answer these questions by using a model of the system dynamics, fitting it to historical data and using it for robust optimization given quantified uncertainties in our forecasts.
Volumetric challenges are not the only considerations at play with inventory replacement: we also want to purchase and develop new clothing in ways that continually improve our inventory, helping our stylists give greater delight to a broad client base.
And designing new styles for our Stitch Fix exclusive brands provides a great opportunity to tailor new designs for particular client segments that tend to be underserved by other brands.
We approach this opportunity with inspiration from genetic algorithms: we use recombination and mutation along with a fitness measure—the same mechanism used by mother nature in evolution by natural selection.
The first step is to think of each style as a set of attributes ("genes").
Then consider our vast set of styles this way, and consider the client feedback ("fitness") we have available for each of them.
Now consider creating new styles by recombining attributes from existing styles and possibly mutating them slightly. Note that the number of possible combinations is very large: ∏ k_i, the product over all attributes i of the number of values k_i that attribute can take.
In the next step, we deviate somewhat from a canonical genetic algorithm: instead of simply selecting based on fitness and then unleashing random recombinations and mutations as the next generation of styles, we’re somewhat more picky about what makes it into our inventory.
We first develop a model of how well a given set of attributes is likely to suit the target clients. We then use this model to highlight a variety of attribute-sets that we think have a high probability of love.
We then work with our human designers to vet and refine this collection, and ultimately to produce the next generation of styles.
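The recombine, mutate, and score loop can be sketched as follows. The attribute lists, the mutation rate, and especially the `predicted_love` function (a hand-written stand-in for a fitted model) are hypothetical; the shortlist it produces corresponds to what would be handed to human designers for vetting.

```python
import random
from math import prod

# Hypothetical attribute "genes" and their possible values.
ATTRIBUTES = {
    "sleeve": ["short", "long", "sleeveless"],
    "pattern": ["solid", "floral", "striped"],
    "neckline": ["crew", "v-neck"],
}
# Number of possible styles: the product of the value counts per attribute.
n_combinations = prod(len(values) for values in ATTRIBUTES.values())

def recombine(parent_a, parent_b, rng):
    """Take each attribute from one of the two parents at random."""
    return {attr: rng.choice([parent_a[attr], parent_b[attr]]) for attr in ATTRIBUTES}

def mutate(style, rng, rate=0.1):
    """With probability `rate`, replace an attribute with a random valid value."""
    return {
        attr: (rng.choice(ATTRIBUTES[attr]) if rng.random() < rate else value)
        for attr, value in style.items()
    }

def predicted_love(style):
    """Stand-in for a model of how well an attribute set suits the target clients."""
    weights = {"long": 0.3, "floral": 0.4, "v-neck": 0.2}  # made-up weights
    return sum(weights.get(value, 0.0) for value in style.values())

rng = random.Random(0)  # seeded for repeatability
parents = [
    {"sleeve": "long", "pattern": "solid", "neckline": "crew"},
    {"sleeve": "short", "pattern": "floral", "neckline": "v-neck"},
]
candidates = [mutate(recombine(parents[0], parents[1], rng), rng) for _ in range(20)]
# Keep only the highest-scoring attribute sets instead of releasing every offspring.
shortlist = sorted(candidates, key=predicted_love, reverse=True)[:3]
```

Scoring candidates with a model before anything is produced is the point of departure from a canonical genetic algorithm, where offspring would enter the next generation unvetted.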
These new styles get produced and are made available to the styling algorithm, then on to delighted customers, and the cycle of evolution continues.
There is indeed a lot going on in our Algorithms team.
Thus far we have touched on some of the projects in our three vertically-aligned teams: Styling Algorithms, Merch Algorithms & Client Algorithms.
The Data Platform team provides the data and compute infrastructure, along with a collection of internal SaaS products, that allow the vertically-aligned data scientists to effectively and efficiently carry out analysis, write their algorithms, and put them into production. The platform nicely encapsulates properties like data distribution, parallelization, auto-scaling, failover, etc. This allows the data scientists to focus mostly on the science aspect yet still enjoy the benefits of a scalable system. And the data platform engineers can focus on building… well, platforms. That is, they are not burdened with business logic and requirements for which they don’t have the context. That is the job of the full-stack data scientists. At the heart of this arrangement is the idea that "Engineers Shouldn't Write ETL." Our Data Platform team enables data scientists to carry algorithm development all the way from concept to production.
made by
Eric Colson, Brian Coffey, Tarek Rached and Liz Cruz
made with
Mike Bostock's D3.js and Jim Vallandingham's scrollytelling code
inspired by
Stephanie Yee and Tony Chu's Visual Introduction to Machine Learning,
Victor Powell and Lewis Lehe's Explained Visually project,
and Ilya Katsov's post on Data Mining Problems in Retail
The goal of this interactive tour has been merely to share some of the diverse applications of data science at Stitch Fix. To be sure, we had a difficult time limiting the scope to the ten stories featured above; there are vastly more already in production and even more still being framed. Yet, much of what we refer to as applied data science has long existed as a diverse body of elegant microeconomic theory. The use of these concepts outside of textbooks has been inhibited historically by the lack of data and compute resources to apply them practically; a final bastion currently restricting the concepts to mere theory has been people and culture (organizations are notoriously difficult to change). However, Stitch Fix seems to be relatively free of such impediments. Perhaps the only thing special we’ve done is endow ourselves with rich data via a unique business model and then foster an environment where data scientists can be successful. From there, curiosity, creativity and the desire to have an impact pave the way for the rest.