In software systems, you sometimes need to order requests by timestamp, or enforce a cutoff time.
Timestamps seem like a fair solution, right? Whenever we want to order something we attach a timestamp to it, and even when there are race conditions, we expect the timestamps to settle the order.
That’s not entirely accurate. Different systems read time from different clocks, and those clocks can disagree. Even a difference of a few milliseconds is enough to cause discrepancies between two systems.
This difference between the clocks of different systems is called clock-skew.
For example, consider the following setup:
A system receives multiple orders per second. Orders received before the admin closes the system at the end of the day should be processed; orders received after that should not.
The admin has an API that closes the system, and publishes the closing time.
Now consider the following sequence of events:
- System 1 receives a request from the admin to close the system. System 1’s local time is 5:05:05.500. It publishes this as the closing time.
- A few milliseconds later, System 2 receives an order. Due to clock-skew, System 2’s local time is 5:05:05.400. It sees that this is before the published closing time, and processes the order.
Looking at the two systems afterwards, you would conclude that System 2 should not have processed the order, yet there is nothing wrong with System 2’s code.
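The scenario can be reproduced with a small simulation. This is only a sketch: the `SkewedClock` class, the `ms_of_day` helper, and the 150 ms offset are made up for illustration, and each clock is modelled as the true time plus a fixed offset.

```python
from dataclasses import dataclass


@dataclass
class SkewedClock:
    """Hypothetical local clock: true time plus a fixed offset in milliseconds."""
    offset_ms: int

    def now_ms(self, true_time_ms: int) -> int:
        return true_time_ms + self.offset_ms


def ms_of_day(hours: int, minutes: int, seconds: int, millis: int) -> int:
    """Convert a time of day to milliseconds since midnight."""
    return ((hours * 60 + minutes) * 60 + seconds) * 1000 + millis


system1 = SkewedClock(offset_ms=0)      # assumed to be accurate
system2 = SkewedClock(offset_ms=-150)   # assumed to run 150 ms behind

# The admin closes the system; System 1 publishes its local time.
close_true = ms_of_day(5, 5, 5, 500)
closing_time = system1.now_ms(close_true)   # 5:05:05.500

# 50 ms later (in true time), System 2 receives an order.
order_true = close_true + 50
order_stamp = system2.now_ms(order_true)    # 5:05:05.400 on System 2's clock

# System 2 compares its own timestamp with the published closing time.
accepted = order_stamp < closing_time

print(f"order arrived after close: {order_true > close_true}")
print(f"order accepted by System 2: {accepted}")
```

Both lines print `True`: the order really did arrive after the close, and System 2 still accepted it, because the comparison mixes timestamps from two different clocks.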
In large distributed systems, clock-skew can cause much bigger problems when every node relies on its own local timestamp.
What can be done about clock-skew?
- If you’re relying on timestamps, take them all from the same source. Prefer the timestamp of the datastore where the data is saved.
- Avoid timestamps altogether, and rely on locks / transactions, or other mechanisms of resolving race conditions.
- If you’re using timestamps only to order data, rely on something simpler like indices instead. Indices have a single source of truth, whereas with timestamps there is a tendency to use each VM’s local timestamp.
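The last two suggestions can be combined in a rough sketch. The `OrderBook` class below is hypothetical and lives in a single process; in a real deployment the same role would likely be played by a database sequence or a transaction on the datastore. The point is that one component hands out indices under a lock, so the close event and every order are ordered by one source and no clock is consulted.

```python
import threading
from typing import Optional, Tuple


class OrderBook:
    """Hypothetical single source of truth: orders events by index, not by time."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._next_index = 0
        self._closing_index: Optional[int] = None

    def close(self) -> int:
        """Record the close as an event with its own index (idempotent)."""
        with self._lock:
            if self._closing_index is None:
                self._closing_index = self._next_index
                self._next_index += 1
            return self._closing_index

    def submit(self, order_id: str) -> Tuple[int, bool]:
        """Assign the next index to an order; accept it only if not yet closed."""
        with self._lock:
            index = self._next_index
            self._next_index += 1
            accepted = self._closing_index is None
            return index, accepted


book = OrderBook()
first = book.submit("order-1")   # arrives before the close
cutoff = book.close()            # the close gets the next index
late = book.submit("order-2")    # arrives after the close

print(first, cutoff, late)
```

Here `first` is `(0, True)`, `cutoff` is `1`, and `late` is `(2, False)`. Whether an order is processed depends only on whether its index was assigned before or after the close, which is decided in one place.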