Computer Time Keeping and the Network Time Protocol (NTP)

How Computers keep track of the passage of time

Most Computer Operating Systems measure the passage of time using one of the following methods:

  • Tick counting – A hardware device is configured by the Operating System to fire an interrupt at a pre-determined rate, for example 100 times per second. The Operating System processes these interrupts, called ticks, and by counting them in software it can determine how much time has passed.
  • Tickless timekeeping – A hardware counter is used to keep a count of the number of time units that have passed since the Computer booted up. The Operating System can then read the value from the counter as required.
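
The difference between the two methods can be sketched in a few lines of Python. This is only an illustrative model, not how any particular Operating System implements it: the class names, the 1 MHz counter frequency and the read_counter callback are assumptions made for the example, while the 100 ticks per second rate is the figure used above.

```python
# Illustrative model only; names and the counter frequency are assumptions.
TICK_RATE_HZ = 100            # interrupts per second (example rate from the text)
COUNTER_FREQ_HZ = 1_000_000   # assumed frequency of a free-running hardware counter


class TickCountingClock:
    """Tracks elapsed time by counting timer interrupts in software."""

    def __init__(self, tick_rate_hz=TICK_RATE_HZ):
        self.tick_rate_hz = tick_rate_hz
        self.ticks = 0

    def on_timer_interrupt(self):
        # Must run once per interrupt; a missed call means lost time.
        self.ticks += 1

    def elapsed_seconds(self):
        return self.ticks / self.tick_rate_hz


class TicklessClock:
    """Derives elapsed time by reading a hardware counter on demand."""

    def __init__(self, read_counter, counter_freq_hz=COUNTER_FREQ_HZ):
        self.read_counter = read_counter      # callable returning the raw count
        self.counter_freq_hz = counter_freq_hz
        self.start = read_counter()

    def elapsed_seconds(self):
        return (self.read_counter() - self.start) / self.counter_freq_hz
```

In the tick counting model the CPU does work on every interrupt, whereas the tickless model does no work at all until the time is actually requested.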

Timing Devices

However, not all Computers have the type of hardware counter required for tickless timekeeping. Below is a list of different Computer timing devices; the exact functionality provided by each of these devices is outside the scope of this article:

  • Time Stamp Counter (TSC) [1]
  • High Precision Event Timer (HPET) [2]
  • Programmable Interval Timer (PIT) [3]
  • CMOS Real Time Clock (RTC) [4]
  • Advanced Programmable Interrupt Controller (APIC) timers [5]
  • Advanced Configuration and Power Interface (ACPI) timer [6]

Tick counting has a significant disadvantage when compared to Tickless timekeeping: it places an additional burden on the CPU, which must process the interrupts in a timely manner to keep time accurately. In contrast, because Tickless timekeeping uses a separate hardware counter, it usually provides time with finer granularity and greater precision.

The counter used in Tickless timing must increment at a constant rate and be large enough that it does not overflow and wrap around too often. When an overflow does occur, it must happen in a way that can be detected and counted by the Operating System.
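
A minimal sketch of wraparound handling is shown below. It assumes a hypothetical 32-bit counter that is read often enough that it can wrap at most once between two reads; the function and class names are illustrative only.

```python
COUNTER_BITS = 32                      # assumed width of the hardware counter
COUNTER_MODULUS = 1 << COUNTER_BITS


def counter_delta(previous, current, modulus=COUNTER_MODULUS):
    """Counts elapsed between two reads, assuming at most one wraparound."""
    # Modular subtraction gives the right answer even if current < previous.
    return (current - previous) % modulus


class WideCounter:
    """Extends a narrow hardware counter into an unbounded software count."""

    def __init__(self, initial_raw, modulus=COUNTER_MODULUS):
        self.modulus = modulus
        self.last_raw = initial_raw
        self.total = 0

    def update(self, raw):
        # Accumulate the delta since the previous read, then remember this read.
        self.total += counter_delta(self.last_raw, raw, self.modulus)
        self.last_raw = raw
        return self.total
```

If the counter is read less often than its wrap period, a whole wrap goes unnoticed, which is why the counter must be wide enough for the polling interval in use.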

In addition to accounting for the passage of time, Operating Systems must also keep track of Wall-clock time [7], also referred to as absolute time. Wall-clock time is generally obtained early in the Computer’s boot and Operating System start-up sequence from the battery-backed Real-time clock. If no Real-time clock is available, the Computer can query a network time server to obtain the current time. The progression of time is then measured and tracked using one of the methods outlined above.

Clock Drift

The Wall-clock time within physical Computers tends to drift: the time reported by the Operating System may be ahead of or behind the true time by some margin [8] [9] [10]. This loss of timing accuracy can be attributed to a number of factors:

  • Temperature – Increases or decreases in temperature can affect the rate at which Quartz crystals oscillate, causing small variations in CPU clock frequency. A higher frequency may cause the Wall-clock time to run fast; a lower frequency may cause it to run slow
  • Dynamic CPU Frequency Scaling – The adjustment of the CPU’s clock frequency in order to either conserve power or reduce the heat generated by the CPU [11]. As with fluctuations in temperature, changes to the CPU’s clock frequency need to be accounted for to keep track of the time accurately
  • CMOS RTC Resolution – Typically the Wall-clock time supplied by the Real-time clock on boot is only accurate to the nearest second, so up to a second of error is present from the outset
  • Lost Ticks – Failure to process or acknowledge an interrupt generated by a timing device due to high system load or other factors
  • Clock Frequency Measurement – It is not always possible to determine the exact frequency of a timing device directly in software; this is true for the APIC Timer and Time Stamp Counter. In such situations approximations of the current frequency must be made using lower resolution timing devices which can lead to a loss of timing accuracy
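
To give a sense of scale, a frequency error is usually expressed in parts per million (ppm), and a small error accumulates into a noticeable drift. The sketch below is simple arithmetic; the 50 ppm figure is an assumed value chosen for illustration, not a measurement of any particular device.

```python
SECONDS_PER_DAY = 86_400


def frequency_error_ppm(nominal_hz, actual_hz):
    """Relative frequency error in parts per million."""
    return (actual_hz - nominal_hz) / nominal_hz * 1e6


def drift_seconds(error_ppm, elapsed_seconds=SECONDS_PER_DAY):
    """Accumulated clock error after elapsed_seconds at a constant error_ppm."""
    return error_ppm * 1e-6 * elapsed_seconds


# An oscillator running 50 ppm fast (assumed value) gains about 4.32 s per day.
print(drift_seconds(50))
```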

Clock Drift in Virtual Machines

  • Clock drift is often much worse within virtual machines. This is mainly due to competition for, and scheduling of access to, the underlying hardware resources provided by the physical host [12]
  • The introduction of the hypervisor adds a layer of abstraction and prevents direct access to the physical timing devices within the host. Most hypervisors employ techniques to mitigate this; however, timekeeping inaccuracies may still occur, especially when the physical host is under high CPU load
  • To guard against this, it is critical that every Virtual Machine (VM) is configured to obtain regular time updates from a group of accurate Network Time Protocol (NTP) servers [13]

The Advantages of Accurate and Synchronised Time

Keeping accurate time between computer systems is essential for a multitude of reasons including but not limited to:

  • Log file analysis following a software glitch, hardware failure or network intrusion event. Accurately time-stamped logs, if still present, make it easier to determine the order in which devices or systems failed or were compromised
  • The timely execution of scheduled tasks such as backup operations or data synchronisation events
  • The time stamping and processing of transactions

Network Time Protocol

The Network Time Protocol and its associated client and server software provide a method of synchronising the clocks used in computer systems to a reference time source. NTP was originally designed by David L. Mills in 1985 (originally specified in RFC 958) [14]. The most recent revision of NTP is version 4 (RFC 5905) [15], which is backwards compatible with version 3 (RFC 1305) [16]. NTP superseded the Time Protocol (RFC 868) [17] [18] and the ICMP Timestamp message (RFC 792) [19].

NTP messages containing timestamps are exchanged between the client and server using the User Datagram Protocol (UDP) [20] as the transport mechanism on port 123. NTP can keep clocks accurate to better than a millisecond on Local Area Networks (LANs) and to within a few milliseconds on Wide Area Networks (WANs).
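
The message exchange can be sketched as follows. This is a minimal illustration rather than a usable NTP client: it builds a 48-byte version 4 client request, decodes the transmit timestamp from a reply, and applies the clock offset and round-trip delay formulas given in RFC 5905 to the four timestamps of one exchange (t1 client send, t2 server receive, t3 server send, t4 client receive). The function names are illustrative, and the host passed to query_transmit_time is whatever server the reader chooses.

```python
import socket
import struct

NTP_PORT = 123
# NTP timestamps count seconds from 1 January 1900; Unix time from 1 January 1970.
NTP_UNIX_EPOCH_DELTA = 2_208_988_800


def build_client_request(version=4):
    """48-byte request: leap indicator 0, given version, mode 3 (client)."""
    first_byte = (0 << 6) | (version << 3) | 3
    return bytes([first_byte]) + bytes(47)


def ntp_timestamp_to_unix(seconds, fraction):
    """Convert a 32.32 fixed-point NTP timestamp to Unix seconds."""
    return seconds - NTP_UNIX_EPOCH_DELTA + fraction / 2**32


def parse_transmit_timestamp(packet):
    """The transmit timestamp occupies bytes 40 to 47 of the NTP header."""
    seconds, fraction = struct.unpack("!II", packet[40:48])
    return ntp_timestamp_to_unix(seconds, fraction)


def offset_and_delay(t1, t2, t3, t4):
    """Clock offset and round-trip delay from one request/response exchange."""
    offset = ((t2 - t1) + (t3 - t4)) / 2
    delay = (t4 - t1) - (t3 - t2)
    return offset, delay


def query_transmit_time(host, timeout=2.0):
    """Send one request over UDP and return the server's transmit time."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sock.sendto(build_client_request(), (host, NTP_PORT))
        reply, _ = sock.recvfrom(512)
    return parse_transmit_timestamp(reply)
```

Because the delay term subtracts the time the server spent processing the request, the offset estimate is only distorted when the outbound and return network paths take different amounts of time.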

NTP uses an algorithm called the intersection algorithm [21] to construct a list of candidate peers that could be used as time synchronisation sources. It then computes a confidence interval for each and drops peers (false tickers) that are deemed to be unreliable time sources. The techniques used in the intersection algorithm were adapted from an earlier algorithm devised by Keith Marzullo [22] [23]. The algorithms used in NTP are able to mitigate the effects of variable network latency.
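
The idea behind interval-based source selection can be illustrated with a sketch of Marzullo's original algorithm. Note that this is the earlier algorithm, not the refined intersection algorithm that NTP itself uses. Each source is assumed to report an interval (lowest plausible time, highest plausible time), and the function finds the sub-interval agreed on by the largest number of sources. The function name and the sample intervals are illustrative.

```python
def marzullo(intervals):
    """Return (count, low, high): the range consistent with the most sources.

    intervals is a list of (low, high) pairs, one per time source.
    """
    edges = []
    for low, high in intervals:
        edges.append((low, -1))    # -1 marks the start of an interval
        edges.append((high, +1))   # +1 marks the end of an interval
    # Sorting tuples puts a start before an end at the same value,
    # so intervals that merely touch are treated as overlapping.
    edges.sort()

    best = count = 0
    best_low = best_high = None
    for index, (value, kind) in enumerate(edges):
        count -= kind              # a start raises the overlap, an end lowers it
        if count > best:
            best = count
            best_low = value
            best_high = edges[index + 1][0]
    return best, best_low, best_high


# Three sources that all agree somewhere between 11 and 12.
print(marzullo([(8, 12), (11, 13), (10, 12)]))   # (3, 11, 12)
# The third source disagrees with the other two and is left out.
print(marzullo([(8, 12), (11, 13), (14, 15)]))   # (2, 11, 12)
```

A source whose interval does not intersect the majority, as in the second call, is the kind of peer that would be treated as a false ticker.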

NTP Implementations

Under Linux the client and server NTP implementation is called ntpd and runs as a daemon. It is available for installation in many different Linux distributions [24]. In Microsoft Windows Operating Systems the NTP Client runs as a service called W32Time [25].

Time Sources

In NTP, time sources are arranged in a hierarchical structure. Each tier of the hierarchy is known as a stratum, and each stratum is assigned a number, starting at zero for the uppermost tier.

  • Stratum 0 – This tier contains the extremely precise reference clocks, which are typically Caesium or Rubidium atomic clocks, GPS clocks or other radio-based clocks. These clocks are directly connected to a computer. They generate a pulse per second, which the computer detects and uses to mark the start of the next second
  • Stratum 1 – This tier contains computers that are directly connected to the reference clocks. Consequently the system clocks within these computers are synchronised to within a few microseconds of the stratum 0 devices. Stratum 1 time servers may peer with other stratum 1 time servers for validation and redundancy
  • Stratum 2 – Time servers in this tier communicate and synchronise with Stratum 1 time servers over a network link. Stratum 2 time servers should query at least three Stratum 1 servers for redundancy and reliability. Ideally this communication and synchronisation should occur over diverse internet connections. In addition, Stratum 2 time servers should peer with at least two other Stratum 2 time servers that query different Stratum 1 time servers
  • Stratum 3 – This tier may contain computers that are synchronised to Stratum 2 time servers. Alternatively they can also act as time servers providing time for Stratum 4 computers. The same peering rules used for Stratum 2 time servers should be applied. This level of fan out may only be required in larger enterprise environments to handle the volume of requests
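
The numbering follows a simple rule: a machine's stratum is one more than the stratum of the source it synchronises to. The sketch below models this with a hypothetical topology; the host names and the stratum_of function are invented for illustration. NTP uses stratum 16 to mark a clock that is not synchronised, which the sketch returns for hosts with no path to a reference clock.

```python
MAX_STRATUM = 16  # in NTP, stratum 16 marks an unsynchronised clock


def stratum_of(host, upstream, reference_clocks):
    """Count the hops from host to a reference clock.

    upstream maps each host to the source it synchronises to;
    reference_clocks is the set of stratum 0 devices.
    """
    hops = 0
    seen = set()
    node = host
    while node not in reference_clocks:
        if node in seen or node not in upstream:
            return MAX_STRATUM      # loop, or no path to a reference clock
        seen.add(node)
        node = upstream[node]
        hops += 1
    return min(hops, MAX_STRATUM)


# Hypothetical topology: a GPS clock, then three tiers of servers.
upstream = {"s1-a": "gps-a", "s2-a": "s1-a", "s3-a": "s2-a", "orphan": "missing"}
reference_clocks = {"gps-a"}
print(stratum_of("s3-a", upstream, reference_clocks))    # 3
print(stratum_of("orphan", upstream, reference_clocks))  # 16
```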

The diagram below depicts a robust NTP topology with a significant amount of redundancy. This amount of redundancy probably isn’t required for most NTP deployments. The three Stratum 0 reference clocks are connected to three separate computers. The three Stratum 1 computers peer with each other and exchange time with the Stratum 2 computers over a network link. The Stratum 2 computers also peer with each other.

In the event of a hardware failure or connectivity issues between any one Stratum 2 computer and the Stratum 1 computers, the Stratum 2 computer could potentially contact one of its Stratum 2 peers to obtain the time. The Stratum 2 computers provide the time to the Stratum 3 computers.

Figure: An example of a robust NTP Topology

Coordinated Universal Time

NTP will ensure that a given computer’s clock is synchronised to Coordinated Universal Time (UTC) [26] [27]. UTC is an official standard [28] for the computation of time; as such, it should not be thought of as a Time Zone.

The UTC time standard is widely used throughout the world. The time in a given country, region or territory can be calculated by adding or subtracting an offset of a certain number of hours and minutes [29]. For example, subtracting 5 hours from UTC gives the local time in New York City during standard time (4 hours while daylight saving time is in effect).
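
As a small illustration, the offset arithmetic can be carried out with Python's standard datetime module. The utc_to_local helper is a name chosen for this example, and the fixed offsets ignore daylight saving time, which a real application would need to handle with a time zone database.

```python
from datetime import datetime, timedelta, timezone


def utc_to_local(utc_time, offset):
    """Apply a fixed UTC offset (a timedelta) to a timezone-aware UTC time."""
    return utc_time.astimezone(timezone(offset))


noon_utc = datetime(2015, 1, 15, 12, 0, tzinfo=timezone.utc)

# New York in winter: UTC minus 5 hours.
print(utc_to_local(noon_utc, timedelta(hours=-5)).isoformat())
# 2015-01-15T07:00:00-05:00

# An offset that includes minutes: UTC plus 5 hours 30 minutes.
print(utc_to_local(noon_utc, timedelta(hours=5, minutes=30)).isoformat())
# 2015-01-15T17:30:00+05:30
```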

Two components are combined in order to determine UTC, namely Universal Time (UT1) and International Atomic Time (TAI).

Universal Time (UT1)

UT1, also referred to as Astronomical Time, is linked to the rotation of the Earth. It is used to determine the actual length of a day on Earth [30]. UT1 and the length of a day are subject to small variations, which can be attributed to a number of factors including:

  • Zonal Tides – The displacement of the Earth’s surface caused by the gravity of the Moon and Sun (smaller than 2.5 ms)
  • Oceanic Tides – The rise and fall of sea levels caused by the combined effects of gravitational forces exerted by the Moon, Sun, and rotation of the Earth (smaller than 0.03 ms)
  • Atmospheric Circulation
  • Internal Effects – Related to the movement of the Earth’s liquid core
  • Angular momentum – The transfer of angular momentum due to the Moon’s orbital motion

International Atomic Time (TAI)

TAI is derived from a few hundred extremely precise atomic clocks housed in time laboratories around the world [31]. The atomic clocks used in such laboratories may gain or lose only one second over the course of 20 to 300 million years, depending on the type of clock used.

One second is defined by the International System of Units (SI) as the duration of exactly 9,192,631,770 oscillations of the radiation associated with a specific transition of the Caesium-133 atom (the transition between the two hyperfine levels of its ground state). The atomic clocks used in the time laboratories are specifically designed to detect and count these oscillations.
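
The figures above can be put into perspective with some simple arithmetic. The sketch below computes the duration of a single Caesium oscillation and the fractional error implied by losing one second over 20 million and 300 million years. The use of a 365.25 day year is an assumption made for the calculation.

```python
CAESIUM_FREQUENCY_HZ = 9_192_631_770
SECONDS_PER_YEAR = 365.25 * 86_400   # assumes a 365.25 day year


def oscillation_period_seconds():
    """Duration of one oscillation counted by a Caesium clock."""
    return 1 / CAESIUM_FREQUENCY_HZ


def fractional_error(seconds_lost, over_years):
    """Error per second implied by losing seconds_lost over over_years."""
    return seconds_lost / (over_years * SECONDS_PER_YEAR)


print(oscillation_period_seconds())    # roughly 1.09e-10 s, about 0.11 ns
print(fractional_error(1, 20e6))       # roughly 1.6e-15
print(fractional_error(1, 300e6))      # roughly 1.1e-16
```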

The time laboratories provide time data from their atomic clocks to the Bureau International des Poids et Mesures (BIPM) [32]. The Time Department within BIPM then combines this time data to form TAI.

Leap Seconds

The pace of TAI is regularly compared to UT1. To compensate for the variations and gradual slowing of the Earth’s rotation, Leap seconds [33] are inserted as required to keep UTC within 0.9 seconds of UT1.

At the time of writing, the most recent Leap second was added on 30 June 2015. At this point TAI was exactly 36 seconds ahead of UTC. The computation of UTC since 1972 has required the addition of 26 leap seconds; the other 10 seconds were added at the start of 1972 to compensate for an initial discrepancy in timing. At present Leap seconds are added on either 30 June or 31 December as required [34].
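
The arithmetic above can be checked with a short sketch. The list contains the dates of the 26 leap seconds inserted from 1972 up to and including 30 June 2015; each leap second is added at the end of the listed day, so it counts from the following day onwards. The function name is illustrative, and the result is only meaningful for dates from 1 January 1972 until the next leap second after those listed.

```python
from datetime import date

INITIAL_TAI_MINUS_UTC = 10   # seconds of offset in place at the start of 1972

JUNE_YEARS = [1972, 1981, 1982, 1983, 1985, 1992, 1993, 1994, 1997, 2012, 2015]
DECEMBER_YEARS = [1972, 1973, 1974, 1975, 1976, 1977, 1978, 1979,
                  1987, 1989, 1990, 1995, 1998, 2005, 2008]

# Days at the end of which a leap second was inserted, up to 30 June 2015.
LEAP_SECOND_DATES = sorted(
    [date(year, 6, 30) for year in JUNE_YEARS]
    + [date(year, 12, 31) for year in DECEMBER_YEARS]
)


def tai_minus_utc(day):
    """Seconds by which TAI is ahead of UTC at the start of the given day."""
    leap_count = sum(1 for leap_day in LEAP_SECOND_DATES if leap_day < day)
    return INITIAL_TAI_MINUS_UTC + leap_count


print(len(LEAP_SECOND_DATES))           # 26
print(tai_minus_utc(date(2015, 7, 1)))  # 36
```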

References
[1] Wikipedia: Time Stamp Counter
[2] Wikipedia: High Precision Event Timer
[3] Wikipedia: Programmable Interval Timer
[4] Wikipedia: Real Time Clock
[5] Wikipedia: Advanced Programmable Interrupt Controller
[6] Wikipedia: Advanced Configuration & Power Interface
[7] Wikipedia: Wall Clock Time
[8] Wikipedia: Clock Drift
[9] Journal of Computer Networks and Communications: Internal Clock Drift Estimation in Computer Clusters
[10] NTP.org FAQs: Clock Quality
[11] Wikipedia: Dynamic Frequency Scaling
[12] VMware: Timekeeping in VMs
[13] Wikipedia: Network Time Protocol
[14] The Internet Engineering Task Force (IETF): Network Time Protocol (NTP) – RFC958 – 1985
[15] The Internet Engineering Task Force (IETF): Network Time Protocol Version 4 – Protocol and Algorithms Specification – RFC5905 – 2010
[16] The Internet Engineering Task Force (IETF): Network Time Protocol Version 3 – Specification, Implementation and Analysis – RFC1305 – 1992
[17] The Internet Engineering Task Force (IETF): Time Protocol – RFC868 – 1983
[18] Wikipedia: Time Protocol
[19] The Internet Engineering Task Force (IETF): Internet Control Message Protocol – RFC792 – 1981
[20] The Internet Engineering Task Force (IETF): User Datagram Protocol – RFC768 – 1980
[21] Wikipedia: Intersection Algorithm
[22] Maintaining the Time in a Distributed System – Keith Marzullo & Susan Owicki – 1983
[23] Wikipedia: Keith Marzullo
[24] NTP.org: NTP Project Information Page
[25] Microsoft: Windows Time Service Tools and Settings
[26] Wikipedia: Coordinated Universal Time (UTC)
[27] Time and Date: About UTC
[28] ITU: ITU-R TF.460-6 – Standard-frequency and time-signal emissions
[29] Wikipedia: List of UTC Time Offsets
[30] International Earth Rotation and Reference Systems Service (IERS): Universal Time (UT1) and Length of Day (LOD)
[31] Wikipedia: International Atomic Time
[32] Bureau International des Poids et Mesures (BIPM): Work Programme – Time
[33] Wikipedia: Leap Second
[34] National Institute of Standards and Technology (NIST): Leap Seconds FAQs