Simple Exploratory Data Analysis in Retail Industry

Beginner-friendly & straight to the point: how to conduct EDA (Exploratory Data Analysis)

Posted by Dwi Hadyan Harsono on Oct 27, 2021
TLDR; Know exactly which questions you want answered before you dive into EDA, and only then start. Otherwise, you'll get lost easily.

This article is divided into 3 segments:

  1. Scenario: The problem statement
  2. Result: How are we tackling the problem?
  3. Call-To-Action: What's the solution to the problem?

---------------------------

Following up on the previous article, we'll discuss how to explore retail data.

Scenario

Now, the company would like an overview of its customers' purchase behaviours & reordering.

Result

In this scenario, we don't need to build any model; a simple Exploratory Data Analysis will be sufficient. Hence we can omit the M in OSEMN. Let's begin.


  1. Obtain

    Data is given by the company

  2. Scrub

    Data is assumed clean

  3. Model

    Remember, OSEMN is an iterative process, hence the sequence of OSEMN is not strict

    Here, we omit this step completely, as it's not needed to solve the given task. (It's fine, guys! Not all solutions need a fancy machine learning model.)

    ...

  4. Explore & iNterpret

    Since we're just doing EDA, the result of Explore is our iNterpret

    Here's what we're trying to find out:

    1. Which day & at which hour do.....
      • customers purchase our products the most?
      • ...

        Saturday 12.00-16.00 & Sunday 9.00-12.00 have the most orders

        ---------------------------
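A minimal sketch of how a day & hour count like this could be produced with pandas. The toy table and the column names `order_id`, `order_dow`, and `order_hour_of_day` are assumptions for illustration, not taken from the company's data:

```python
import pandas as pd

# Toy stand-in for the orders table (one row per order).
# Column names are assumed for illustration.
orders = pd.DataFrame({
    "order_id": [1, 2, 3, 4, 5, 6, 7],
    "order_dow": ["Sat", "Sat", "Sat", "Sat", "Sun", "Sun", "Mon"],
    "order_hour_of_day": [13, 13, 13, 15, 10, 10, 8],
})

# Number of distinct orders in each (day, hour) slot.
counts = orders.groupby(["order_dow", "order_hour_of_day"])["order_id"].nunique()

# The busiest slot is the index label of the largest count.
busiest_day, busiest_hour = counts.idxmax()

# Days as rows, hours as columns: a convenient shape for a heatmap.
orders_by_slot = counts.unstack(fill_value=0)

print(orders_by_slot)
print("Busiest slot:", busiest_day, busiest_hour)
```

The `orders_by_slot` table can be handed to any heatmap plotting function to spot the busy blocks at a glance.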



      • customers reorder our products the most on average (AVG REORDERED)?
      • ...

        Highest on Sunday between 6am and 9am (nice)

        In general, on any day, it is highest between 5am and 9am

        Interpretation: a value of 0.66 means that 66% of the orders in that slot are actually reorders (returning customers)

        ---------------------------
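The AVG REORDERED number is just the mean of a 0/1 flag. Here is a small sketch, assuming a hypothetical table with one row per purchased item and a `reordered` flag (1 = bought before, 0 = first time); the column names and values are made up for illustration:

```python
import pandas as pd

# Toy line items, already joined with the order's day and hour.
# "reordered" is an assumed 0/1 flag: 1 means the customer bought it before.
items = pd.DataFrame({
    "order_dow": ["Sun", "Sun", "Sun", "Sun", "Mon", "Mon"],
    "order_hour_of_day": [7, 7, 7, 14, 7, 14],
    "reordered": [1, 1, 0, 0, 1, 0],
})

# The mean of a 0/1 flag is the share of rows where the flag is 1.
avg_reordered = items.groupby(["order_dow", "order_hour_of_day"])["reordered"].mean()

# Same numbers as a day x hour table, ready for a heatmap.
avg_reordered_table = avg_reordered.unstack()

# Ignoring the day: average reorder rate per hour.
avg_reordered_by_hour = items.groupby("order_hour_of_day")["reordered"].mean()

print(avg_reordered_table.round(2))
print(avg_reordered_by_hour)
```

Because the flag is 0 or 1, each cell reads directly as a percentage once multiplied by 100.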

    2. After how many days do customers usually come back and buy again from us? (And out of all these orders, how many are reorders (returning customers)?)
    3. ...

      Customers usually come back to us once every 7 days or every 30 days

      And out of all those orders, 58.97% are reorders (returning customers). See Kaggle for this calculation

      ---------------------------
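Both numbers can be sketched in a few lines. The column names `days_since_prior_order` and `reordered` are assumptions, and the toy values below are made up, so they will not reproduce the 58.97% figure:

```python
import pandas as pd

# Toy orders table; the first order of a customer has no previous order,
# so its gap is missing. Column name is assumed for illustration.
orders = pd.DataFrame({
    "order_id": [1, 2, 3, 4, 5, 6, 7],
    "days_since_prior_order": [None, 7, 7, 30, 7, 30, 14],
})

# How often each gap (in days) occurs, most frequent first.
gap_counts = orders["days_since_prior_order"].dropna().value_counts()
top_gaps = gap_counts.head(2).index.tolist()

# Toy line items with an assumed 0/1 "reordered" flag.
items = pd.DataFrame({"reordered": [1, 0, 1, 1, 0]})

# Overall share of reorders, as a percentage.
reorder_share_pct = items["reordered"].mean() * 100

print("Most common gaps (days):", top_gaps)
print(f"Reorder share: {reorder_share_pct:.2f}%")
```

Plotting `gap_counts.sort_index()` as a bar chart is a quick way to see where the peaks are.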

    4. How many products are there in a single order?
    5. ...

      Typically 10 products per order, with 5 being the most common order size. Note that the count drops exponentially after 10

      ---------------------------
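A rough sketch of the basket-size calculation, assuming a hypothetical line-item table with `order_id` and `product_id` columns (one row per product in an order). The toy sizes are invented, so the output differs from the numbers above:

```python
import pandas as pd

# Build toy line items: order 1 has 2 products, orders 2 and 3 have 3, order 4 has 6.
sizes = {1: 2, 2: 3, 3: 3, 4: 6}
rows = [(order_id, product_id) for order_id, n in sizes.items() for product_id in range(n)]
items = pd.DataFrame(rows, columns=["order_id", "product_id"])

# Number of products in each order.
basket_size = items.groupby("order_id")["product_id"].count()

average_size = basket_size.mean()
most_common_size = basket_size.mode().iloc[0]

# How many orders have each basket size; plot this to see the drop-off.
size_distribution = basket_size.value_counts().sort_index()

print("Average basket size:", average_size)
print("Most common basket size:", most_common_size)
print(size_distribution)
```

Looking at both the average and the most common size is useful here, since a long tail of large orders pulls the average up.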

    6. Which products do....
      • customers purchase the most?
      • ...

        Fruits (bananas, strawberries) & vegetables (spinach, onions, zucchini)

        ---------------------------



      • customers reorder the most on average (AVG REORDERED)?
      • ...

        Completely different from the qty ranking. The top items are vege wrappers, pads, energy shots, and chocolate bars. No fruits & veges at all in the top 15

        ---------------------------
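To see why the two rankings can disagree, here is a small sketch that computes both from the same table. The column names `product_name` and `reordered`, and the toy rows, are assumptions for illustration:

```python
import pandas as pd

# Toy line items with an assumed 0/1 "reordered" flag.
items = pd.DataFrame({
    "product_name": ["Banana"] * 4 + ["Energy Shot"] * 2 + ["Spinach"] * 3,
    "reordered": [1, 0, 0, 1, 1, 1, 0, 1, 0],
})

# Per product: how many times it was bought, and its average reorder rate.
summary = items.groupby("product_name")["reordered"].agg(
    qty="count",
    avg_reordered="mean",
)

by_qty = summary.sort_values("qty", ascending=False)
by_reorder = summary.sort_values("avg_reordered", ascending=False)

print(by_qty)
print(by_reorder)
```

In this toy data Banana wins on qty while Energy Shot wins on reorder rate. A product bought only a handful of times can show an extreme rate by chance, so it may be worth filtering on a minimum `qty` before trusting the second ranking.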

    7. Which aisles do....
      • customers purchase the most?
      • ...

        Fruits & Vege

        ---------------------------



      • customers reorder the most on average (AVG REORDERED)?
      • ...

        Fruits & vege might be highest in qty, but reordering-wise, milk & sparkling water are at the top (fruits are 3rd, and vegetables aren't even in the top 15)

        ---------------------------
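Aisle and dept rankings need the product lookup joined in first. A minimal sketch, assuming hypothetical `items` and `products` tables keyed by `product_id`; the helper `rank_by` is made up for this example, and the same function works for either level:

```python
import pandas as pd

# Toy line items with an assumed 0/1 "reordered" flag.
items = pd.DataFrame({
    "product_id": [1, 1, 2, 3, 3, 3, 4],
    "reordered": [1, 1, 1, 0, 1, 0, 0],
})

# Toy product lookup: which aisle and department each product belongs to.
products = pd.DataFrame({
    "product_id": [1, 2, 3, 4],
    "aisle": ["milk", "milk", "fresh fruits", "fresh vegetables"],
    "department": ["dairy eggs", "dairy eggs", "produce", "produce"],
})

# Attach aisle and department to every purchased item.
enriched = items.merge(products, on="product_id", how="left")


def rank_by(level: str) -> pd.DataFrame:
    """Qty and average reorder rate per group, highest reorder rate first."""
    return (
        enriched.groupby(level)["reordered"]
        .agg(qty="count", avg_reordered="mean")
        .sort_values("avg_reordered", ascending=False)
    )


aisle_ranking = rank_by("aisle")
dept_ranking = rank_by("department")

print(aisle_ranking)
print(dept_ranking)
```

In this toy data produce has the highest qty while dairy eggs has the highest reorder rate, which is the same kind of gap described above.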

    8. Which depts do....
      • customers purchase the most?
      • ...

        Top 3 are Produce, Dairy Eggs, and Snacks

        ---------------------------



      • customers reorder the most on average (AVG REORDERED)?
      • ...

        Top 3 are Dairy Eggs, Beverages, and Produce, quite similar to the qty ranking

        ---------------------------


Call-To-Action

Management can negotiate with the suppliers of the highest-selling products for a better discount: since these products sell a lot, the company can afford to buy in bulk (and enjoy a bulk discount).

Management can send a few people to the least-selling depts & aisles to find out why they aren't selling. (Perhaps a competitor offers better pricing, or the products are placed in a very far corner, or...?)

Final words from Dwi

You can explore the code here.

Again, this is meant to be an introductory EDA, hence it's very simple & straightforward. Read this article if you wish to do a slightly more advanced analysis.