Skip to content

Coronasafe/data-analytics-assessment

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

3 Commits
 
 

Repository files navigation

Task

SQL Assessment – Data Analytics Role

Design a single SQL script that a candidate can run end-to-end:

  1. Create & populate the schema (done).
  2. Answer a set of 5 analytical questions.

1 ️⃣ Schema & seed data

----------------------------------------------------------
--  A. Core dimension tables
----------------------------------------------------------
CREATE TABLE customers (
    customer_id      SERIAL PRIMARY KEY,
    full_name        TEXT        NOT NULL,
    country          TEXT        NOT NULL
);

CREATE TABLE products (
    product_id       SERIAL PRIMARY KEY,
    product_name     TEXT        NOT NULL,
    category         TEXT        NOT NULL,
    unit_price_usd   NUMERIC(8,2)
);

----------------------------------------------------------
--  B. Fact table with nested JSONB
--     • order_meta keeps flexible fields (payments, promo, etc.)
----------------------------------------------------------
CREATE TABLE orders (
    order_id         SERIAL PRIMARY KEY,
    customer_id      INT         REFERENCES customers(customer_id),
    order_date       DATE        NOT NULL,
    order_meta       JSONB       NOT NULL               -- nested JSONB
    /* example structure
       {
         "payment": { "method": "card", "card_type": "VISA" },
         "promo":   { "code": "NEW10", "discount_usd": 5.0 },
         "ship":    { "mode": "air",  "cost_usd": 12.5 }
       }
    */
);

----------------------------------------------------------
--  C. Bridge table (many-to-many) – quantity & price may vary per order
----------------------------------------------------------
CREATE TABLE order_lines (
    order_id         INT REFERENCES orders(order_id),
    product_id       INT REFERENCES products(product_id),
    qty              INT  NOT NULL,
    line_price_usd   NUMERIC(10,2)  NOT NULL,
    PRIMARY KEY (order_id, product_id)
);

----------------------------------------------------------
--  D. Sample data (minimal but covers edge cases)
----------------------------------------------------------
INSERT INTO customers (full_name, country) VALUES
  ('Alice Smith',   'USA'),
  ('Bala Iyer',     'India'),
  ('Chen Wei',      'Singapore');

INSERT INTO products (product_name, category, unit_price_usd) VALUES
  ('Blood Glucose Monitor',  'Medical Devices',  40.00),
  ('N95 Mask Box (20)',      'PPE',             25.00),
  ('Tele-consult Credit',    'Services',        15.00);

-- Order 1 – has promo + shipping
INSERT INTO orders (customer_id, order_date, order_meta) VALUES
  (1, '2025-05-01',
   '{
      "payment": { "method": "card", "card_type": "AMEX" },
      "promo":   { "code": "NEW10", "discount_usd": 5 },
      "ship":    { "mode": "ground", "cost_usd": 8.5 }
    }'::jsonb);

-- Order 2 – no promo, air shipping
INSERT INTO orders (customer_id, order_date, order_meta) VALUES
  (2, '2025-05-02',
   '{
      "payment": { "method": "upi", "provider": "GPay" },
      "ship":    { "mode": "air", "cost_usd": 15 }
    }'::jsonb);

-- Order lines
INSERT INTO order_lines VALUES
  (1, 1, 2, 80.00),  -- Alice bought 2 glucose monitors
  (1, 3, 1, 15.00),  -- and 1 tele-consult credit
  (2, 2, 4, 100.00); -- Bala bought 4 PPE boxes

2 ️⃣ Analytical questions the candidate must answer

Instructions: Write one SQL query for each question. Use aliases (AS), JOINs, WITH CTEs, and JSONB operators (->, ->>, #>>) where appropriate.


Q1. Total sales with shipping & promo impact

Return, per order_id,

  • gross_line_total (sum of line_price_usd)
  • promo_discount (value inside order_meta → promo → discount_usd; treat missing as 0)
  • shipping_cost (value inside order_meta → ship → cost_usd; treat missing as 0)
  • net_total = gross_line_total – promo_discount + shipping_cost

Q2. Top-selling product category (by revenue)

Using all tables, compute total revenue after discounts for each product category and return the category that generated the highest revenue.


Q3. Customers with > 1 card brand used

Extract card brand (order_meta → payment → card_type) for card payments only. Return every customer_id who has placed ≥ 2 orders with different card brands (e.g., one Visa, one Amex).


Q4. Monthly order funnel view

Using a CTE, build an intermediate table monthly_orders that:

| month | orders | distinct_customers | total_gross | total_net |

(Use the same net calculation as Q1.) Then SELECT * from that CTE for May 2025.


Q5. Country-level basket analysis (correlated sub-query or CTE)

For each country, list:

  • avg_items_per_order – average SUM(qty) per order
  • preferred_payment_method – the JSONB payment method that appears most often for that country (break ties alphabetically)

Hint: Unnest JSONB inside a sub-query or lateral join.


Deliverables

  • A single .sql file that:
    1. Creates & seeds the schema above (copy/paste allowed).
    2. Contains each answer query clearly labelled - Q1 … through - Q5 ….

Candidates should not alter seed values; instead, write queries that would generalise to larger data.


Evaluation focus

Skill Evidence in questions
Basic selects & aliases Q1, Q2
Multi-table joins Q1, Q2
CTE (WITH) usage Q4, Q5
JSONB navigation (nested) Q1, Q3, Q5
Analytical reasoning (aggregations, ranking, tie-break) Q2, Q5

Happy querying!

About

No description, website, or topics provided.

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors