# API-breaking-change/readme.md

### 1 Examples of Breaking Changes

Provide examples of what would constitute a **breaking change** to this API response for the frontends that are using these endpoints. Provide at least 3 examples.

A breaking change in an API is any change that causes client code to stop working or to behave in an unexpected way.

1) Changing the type of a field

    { "hour": 0, "temperature": "18", "condition": ["Clear", "Windy"] }

Here the type of the field `condition` changes from a string to an array. This will break JS logic that expects a string, or at the very least cause a frontend to display something like `["Clear", "Windy"]` as plain text.

2) Changing the structure of the response

    {
      "Weather": {
        "Hourly": [
          { "hour": 0, "temperature": "18", "condition": "Clear" },
          { "hour": 1, "temperature": "17", "condition": "Clear" },
          ...
          { "hour": 23, "temperature": "16", "condition": "Cloudy" }
        ],
        "Daily": [
          { "day": 0, "temperature": "18", "condition": "Clear" },
          { "day": 1, "temperature": "17", "condition": "Clear" },
          ...
          { "day": 6, "temperature": "16", "condition": "Cloudy" }
        ]
      }
    }

Providing both hourly and daily forecasts would be a nice improvement to our API, but changing the response payload like this will break client code: clients will contain code like `Weather.forEach()`, which will fail badly at runtime.

3) Removing a field

    { "time": "12:00:00", "temperature": "18", "condition": "Clear" }

Suppose we decide to represent time in our system as a full timestamp (`time`) instead of an hour (`hour`). Clients that still read the `hour` field will fail at runtime.
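As an illustration of how the changes above break consumers, here is a minimal sketch. The `render_hour` function is hypothetical client code written against the original schema; the payloads match the examples above:

```python
# Hypothetical naive client logic written against the original schema:
# [{ "hour": int, "temperature": str, "condition": str }, ...]

def render_hour(entry: dict) -> str:
    # Assumes "hour" exists and "condition" is a string.
    return f"{entry['hour']:02d}:00 {entry['temperature']}°C, {entry['condition'].lower()}"

old_entry = {"hour": 0, "temperature": "18", "condition": "Clear"}
print(render_hour(old_entry))  # works against the original schema

# 1) Type change: "condition" becomes an array -> .lower() raises AttributeError.
type_changed = {"hour": 0, "temperature": "18", "condition": ["Clear", "Windy"]}
try:
    render_hour(type_changed)
except AttributeError as e:
    print("type change broke the client:", e)

# 3) Removed/renamed field: "hour" replaced by "time" -> KeyError.
field_removed = {"time": "12:00:00", "temperature": "18", "condition": "Clear"}
try:
    render_hour(field_removed)
except KeyError as e:
    print("field removal broke the client:", e)
```

In a browser the same mistakes usually surface as `TypeError` or `undefined` rendered into the page rather than a clean exception.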



### 2 Coordinating Across Multiple Frontends

You have **multiple frontend clients**: some update immediately, and some only pick up updates every 1–2 months.
**How would you handle an API schema change across all of them safely?**

The most robust option for a breaking change is to introduce `api/v2` and continue to support `api/v1` until every client has moved to v2. In reality, you only know whether any clients are still on v1 if you have control over them (of course, with metrics you can discover when there is no more traffic through the v1 API and simply drop it). If you have a public API and want to drop support for v1, you can mark v1 as deprecated and set an end-of-support deadline in the future.
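One way to keep v1 alive cheaply while v2 becomes the source of truth is to implement v1 as an adapter over v2. A minimal sketch, with hypothetical function names, the hourly/daily payload from question 1, and an assumed sunset date; `Deprecation` and `Sunset` are real HTTP response-header conventions for announcing end of support:

```python
def get_weather_v2() -> dict:
    # New schema: hourly and daily forecasts nested under "Weather".
    return {
        "Weather": {
            "Hourly": [{"hour": 0, "temperature": "18", "condition": "Clear"}],
            "Daily": [{"day": 0, "temperature": "18", "condition": "Clear"}],
        }
    }

def get_weather_v1() -> tuple[list, dict]:
    # Legacy schema: a flat list of hourly entries, rebuilt from the v2 data
    # so there is a single source of truth. Headers announce the deprecation.
    v2 = get_weather_v2()
    headers = {
        "Deprecation": "true",                      # assumed header convention
        "Sunset": "Wed, 31 Dec 2025 23:59:59 GMT",  # hypothetical deadline
    }
    return v2["Weather"]["Hourly"], headers
```

Since v1 responses are derived from v2, any change to the underlying data only has to be made once.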

### 3 How to Catch Breaking Changes During Development

Describe how a team can **detect breaking changes early**, in your experience. Please elaborate.

In my experience, having control over both the FE and the BE API means I can have e2e tests that run on every code change and fail upon a breaking change.
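As an illustration (a hand-rolled shape check, not a real contract-testing library), such a test can assert the shape of the response, so a type change or a removed field fails CI:

```python
# Minimal response-shape check: field name -> expected Python type.
# In a real setup, the expected shape would be derived from the OpenAPI schema.
EXPECTED_SHAPE = {"hour": int, "temperature": str, "condition": str}

def check_shape(entry: dict, shape: dict) -> list[str]:
    """Return a list of contract violations for one response object."""
    errors = []
    for field, expected_type in shape.items():
        if field not in entry:
            errors.append(f"missing field: {field}")
        elif not isinstance(entry[field], expected_type):
            errors.append(f"wrong type for {field}: {type(entry[field]).__name__}")
    return errors

# A compliant response passes; a breaking change is reported.
assert check_shape({"hour": 0, "temperature": "18", "condition": "Clear"}, EXPECTED_SHAPE) == []
assert check_shape({"hour": 0, "temperature": "18", "condition": ["Clear"]}, EXPECTED_SHAPE) == [
    "wrong type for condition: list"
]
```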

A very convenient way to verify that my server complies with an API schema is to define an OpenAPI schema. You can generate server-side models and interfaces from it and use them. Additionally, I have been using schemathesis (https://schemathesis.io/), which essentially tests a server against its OpenAPI schema.

There are also consumer-driven contract-testing tools that rely on clients declaring the contracts they depend on, but I have no hands-on experience with any of them.


### 4 Policy for Releasing Changes

What internal **policy/process** was established in your previous team to manage schema changes safely?

Several times we released a v2 API alongside the legacy v1 API and supported both. We relied heavily on OpenAPI schemas and generated stubs, as well as e2e tests and schemathesis.


## 🧪 Acceptance Criteria
- Answer these four questions thoroughly – at least one paragraph each, maximum half a page.
# weather-service/solution.md
Resilient weather service


Functional requirement:
- A user will request the current day's weather forecast for a given city.
- This means not "24h from now". E.g. a request at 23:00h is still expected to return the forecast for today, although just 1 hour of the day remains.

Non-functional requirements:
- 2500 cities
- 100k users daily; let's say ~2 requests per user per day: 200k req/day; ~8500 req/hour; ~2-3 req/s -> let's be generous and support up to 10 req/s
- Assumption: 1.5s latency requirement
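A quick sanity check of the traffic arithmetic above:

```python
users_per_day = 100_000
requests_per_user = 2

req_per_day = users_per_day * requests_per_user   # 200,000 req/day
req_per_hour = req_per_day / 24                   # ~8,300 req/hour
req_per_second = req_per_day / (24 * 3600)        # ~2.3 req/s on average

print(req_per_day, round(req_per_hour), round(req_per_second, 1))
```

The 10 req/s target leaves roughly 4x headroom over the average for peak-hour traffic.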

Data persistence requirements:
- actual operational weather data - minimal: 2500 cities * 24 hours * 100 bytes (storing the JSON object + db/kv-store overhead) ~~ 5-7 MB
- city request counter - store the last X days of request counts for each city; the simplest solution is a redis sorted set with manual deletion; e.g. for the last 7 days: 1.4-1.6M entries will be ~45MB - good enough for our use case
- metrics data and user-interaction data - assumption: out of scope


Thoughts:

Does the external API have the same semantics as our API - i.e. does it return the forecast for today, regardless of the time of day of the request?
- Very important: Can we query the external API in advance? E.g. while it is still 20.09 23:00 in Sofia, can we query the API for 21.09? Assumption: yes
- About the actual weather data: Do we treat weather data as constant? E.g. if I query the external API for Sofia at 00:03h and at 11:00h, will I get the same result? Assumption: yes
- Given the above assumption: Can I query data 2-3 hours early and still have reliable data, e.g. query Sofia for 21.09 on 20.09 at 21:00h? Assumption: yes

Do we know the 2500 cities in advance? This is crucial data, and we have 2 options. If we assume we don't know them, we will occasionally receive a request from an unknown city, and a city can "die" in our system by not having any requests in the last X days.

I did some brief research. There are only 5,000-10,000 actual weather stations in the world, but there are apps that support 200,000-3M locations. Every location gets data from numerical weather prediction models interpolated to that location. But this is a geographical problem: in reality we can receive a request from anywhere in the world, and producing a weather forecast ourselves is a highly complex task that I am assuming is out of scope - that's why we have the external API. Storing the last X days of request counts will help us a lot: we can prioritize cities and also build a heat-map of our users. In reality we want the most accurate data where we have the most users, but we also want the ability to respond to any request.

Thinking about the problem in general, we heavily simplify the actual domain. Weather data is in fact highly unstable - it can change very often. I am assuming we are tackling a different problem here, so I will treat weather data as constant.

100 requests/hour = 2400 requests/day, but we have:
- 2500 cities > 2400 - we are 100 short of even one request per city per day
- more than 100 cities per hour in some time zones, fewer than 100 in others - we will try to fix this by requesting data for congested time zones earlier. We have the static data: the world has 24 time zones, we know all the "hot" cities, we know their geo locations, and we know where actual people live - our potential users (e.g. we don't want to provide info for the Sahara desert, because nobody lives there)



Attempting a solution:

Every 5 minutes we will execute the following job:

    get_city_priority():
        city_priority = (population_weight * 0.4) +
                        ((24 - hours_until_new_day) * 0.2) +
                        (user_request_frequency * 0.5)

    calculate_budget() -> int:
        // in redis we will keep another sorted set with a timestamp for each call to the external API
        return 100 - (request count for the last hour, read from redis)

    get_data_for_cities():
        available_requests = calculate_budget()
        cities_priority_queue = get_city_priority()

        while available_requests > 0:
            city = cities_priority_queue.pop()
            city_data = externalAPI.get(city)

            // update the redis data for the city, with a TTL computed from how long the forecast stays valid
            // add a timestamp to the redis budget sorted set

            available_requests -= 1

    main():
        get_data_for_cities()
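The refresh job above can be sketched as runnable Python. The external API call, the redis-backed call counter, and the weight values are stand-ins from the pseudocode; `heapq` provides the priority queue:

```python
import heapq

def city_priority(population_weight: float, hours_until_new_day: float,
                  user_request_frequency: float) -> float:
    # Same formula as the pseudocode; the coefficients are still to be tuned.
    return (population_weight * 0.4
            + (24 - hours_until_new_day) * 0.2
            + user_request_frequency * 0.5)

def calculate_budget(calls_in_last_hour: int, hourly_limit: int = 100) -> int:
    # In redis this count would come from the sorted set of call timestamps.
    return max(0, hourly_limit - calls_in_last_hour)

def get_data_for_cities(cities: dict, calls_in_last_hour: int, fetch) -> list[str]:
    """Fetch data for the highest-priority cities within the remaining budget.

    cities: name -> (population_weight, hours_until_new_day, request_frequency)
    fetch:  stand-in for the external API call (externalAPI.get).
    """
    budget = calculate_budget(calls_in_last_hour)
    # heapq is a min-heap, so push negative priorities to pop the largest first.
    heap = [(-city_priority(*features), name) for name, features in cities.items()]
    heapq.heapify(heap)

    fetched = []
    while budget > 0 and heap:
        _, city = heapq.heappop(heap)
        fetch(city)  # also: update the city's cached data and the budget set
        fetched.append(city)
        budget -= 1
    return fetched
```

With a nearly exhausted budget, only the top-priority cities are refreshed; the rest wait for the next 5-minute run.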

We will have an API which serves client requests:

    get_weather(city):
        // add the city to the redis hot_cities sorted set

        // redis get city
        if cache-hit:
            return data
        if cache-miss:
            // to fine-tune the city priority algorithm, log this as an event
            // using the geo-spatial index, find the 3 nearest cities for which
            // we have data and interpolate between them
            return interpolated_data
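The cache-miss interpolation can be sketched with inverse-distance weighting over the 3 nearest cities - a common simple choice, not mandated by anything above; the distances and temperatures are illustrative:

```python
def interpolate_temperature(neighbors: list[tuple[float, float]]) -> float:
    """Inverse-distance-weighted temperature from (distance_km, temperature) pairs."""
    # A zero distance means we are exactly at a known city: return its value.
    for distance, temperature in neighbors:
        if distance == 0:
            return temperature
    weights = [1.0 / distance for distance, _ in neighbors]
    total = sum(weights)
    return sum(w * t for w, (_, t) in zip(weights, neighbors)) / total

# 3 nearest cities with data: 10 km away at 18°C, 20 km at 17°C, 40 km at 16°C.
estimate = interpolate_temperature([(10, 18.0), (20, 17.0), (40, 16.0)])
print(round(estimate, 2))
```

Closer cities dominate the estimate, which matches the intuition that nearby weather is the best proxy.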



I will try to fine-tune the exact coefficients for calculating city_priority by running simulations; this will be fairly easy - mock user requests and count the cache misses.


For redundancy I will have 2 replicas of the API service in 2 different data centers, with 2 redis instances in a leader-replica configuration. As the constant city data isn't too big, I will opt to store it in redis as well.

In front of the 2 replicas I will use a load balancer from a cloud provider, as it is fairly reliable, replicated, and has automatic failover.

We haven't discussed the latency requirement in detail, but <=1.5 seconds should be achievable from anywhere in the world.


Data model:

    city
        name
        TZ
        population
        longitude
        latitude

geo-spatial index on longitude and latitude
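The data model can be sketched as a dataclass, with a haversine distance helper standing in for the geo-spatial index lookup (in redis this would be the `GEOADD`/`GEOSEARCH` commands; the in-memory scan below is only an illustrative stand-in):

```python
import math
from dataclasses import dataclass

@dataclass
class City:
    name: str
    tz: str          # IANA time zone, e.g. "Europe/Sofia"
    population: int
    longitude: float
    latitude: float

def haversine_km(a: City, b: City) -> float:
    """Great-circle distance in km; the basis for a nearest-cities lookup."""
    r = 6371.0  # mean Earth radius in km
    phi1, phi2 = math.radians(a.latitude), math.radians(b.latitude)
    dphi = math.radians(b.latitude - a.latitude)
    dlon = math.radians(b.longitude - a.longitude)
    h = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlon / 2) ** 2
    return 2 * r * math.asin(math.sqrt(h))

def nearest(cities: list[City], target: City, k: int = 3) -> list[City]:
    # In production this would be a GEOSEARCH on the geo-spatial index.
    return sorted(cities, key=lambda c: haversine_km(c, target))[:k]
```

Together with the interpolation sketch earlier, this covers the cache-miss path end to end.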