By Andreja Jarc.
There are a number of methods to ensure highly available and reliable time and frequency synchronization in your networks. One principle, for example, is to use well-designed and robust equipment. However, even with quality equipment we recommend redundancy, which enables a synchronization system to continue operating seamlessly even when some of its components have failed. Redundancy in NTP networks can be established in several ways. We can distinguish two major approaches according to which subsystem is responsible for providing redundancy: the NTP servers or the sync clients themselves.
In this article, let me address how redundancy for your NTP network can be achieved from the server perspective. Step by step, we’ll work through a few scenarios of what may go wrong in an NTP setup and call for a server redundancy configuration:
- Power supply failure
- Reference clock failure (signal loss or malicious interference)
- Network unavailability
- Inadequate server performance
- Physical damage.
1. Power supply redundancy. If the primary power supply unit of an NTP server breaks down, the second built-in power module can seamlessly take over. It is also possible to combine AC and DC modules with different voltage ranges (100-240 V AC/DC or 20-72 V DC, respectively). In case of overvoltage in the primary electrical circuit, for instance, a DC module fed by an external battery takes over the power supply.
2. Reference clock redundancy. An NTP server can optionally be synchronized by several reference time sources in case the primary reference source fails or is maliciously manipulated. The most common reference clocks on the market are equipped with GNSS receivers, but for redundancy one can select separate GPS and GLONASS clock modules to be independent of any single source. Moreover, redundancy can be achieved with long-wave radio signals: DCF77/PZF in central Europe, MSF in the UK and WWVB in the USA. Meinberg also offers a so-called MRS (Multi Reference Source) system in which, apart from the receivers mentioned, a reference clock can additionally be synchronized with IRIG time codes, PPS, 10 MHz, PTP and external NTP time signals in a configurable priority order. Figure 1 shows an example of a Meinberg LANTIME M900 with redundant power and reference clock modules.
Figure 1: Meinberg LANTIME M900 with a Redundant Power Supply configuration (yellow) and a redundant GPS reference clock module (green).
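To illustrate the idea of a configurable priority order among reference sources, here is a minimal Python sketch. It is not Meinberg's actual MRS implementation: the `RefSource` class, the `usable` health flag and the `select_reference` function are hypothetical names introduced only for this example, and a real system would also weigh signal quality and holdover behaviour, which are not modelled here.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class RefSource:
    name: str       # label only, e.g. "GPS", "PTP", "NTP"
    priority: int   # assumed convention: lower number = preferred
    usable: bool    # hypothetical health flag (signal present and plausible)


def select_reference(sources: Sequence[RefSource]) -> Optional[RefSource]:
    """Return the usable source with the best priority, or None if none is usable."""
    usable = [s for s in sources if s.usable]
    return min(usable, key=lambda s: s.priority, default=None)


# Example: the GPS receiver has lost its signal, so the next source takes over.
sources = [
    RefSource("GPS", priority=1, usable=False),
    RefSource("PTP", priority=2, usable=True),
    RefSource("NTP", priority=3, usable=True),
]
active = select_reference(sources)
```

In this example `active` is the PTP source, because it is the highest-priority source that is still usable.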
3. High-availability network connection. This type of redundancy keeps an NTP server available even if the LAN port responsible for NTP traffic fails, or if the network infrastructure this port is connected to fails. This is achieved by bonding two LAN ports together as a pair (as suggested in Figure 2). If a particular network port fails for some reason, its bonded partner takes over the NTP traffic and carries on seamlessly. A separate post with detailed bonding configuration guidelines will follow soon.
Figure 2: A possible bonding configuration on a Meinberg LANTIME M600 for highly available LAN ports.
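The failover behaviour of a bonded port pair can be sketched in a few lines of Python. This assumes a simple active-backup policy in which the first port of the pair is used whenever its link is up. The port names `lan0`/`lan1` and the `active_port` function are hypothetical and do not reflect the LANTIME configuration interface.

```python
from typing import Dict, Optional, Sequence


def active_port(pair: Sequence[str], link_up: Dict[str, bool]) -> Optional[str]:
    """Return the first port of the bonded pair whose link is up, or None."""
    for port in pair:
        if link_up.get(port, False):
            return port
    return None


bond0 = ("lan0", "lan1")  # hypothetical port names

# Both links healthy: the first port carries the NTP traffic.
before = active_port(bond0, {"lan0": True, "lan1": True})

# lan0 fails: its bonded partner takes over.
after = active_port(bond0, {"lan0": False, "lan1": True})
```

Clients keep using the same IP address throughout, since the address belongs to the bond rather than to an individual port.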
4. Cluster mode. This Meinberg LANTIME-specific mode of operation selects the current MASTER among several servers in a cluster according to the best performance parameters. Cluster servers constantly exchange information about their current state (e.g. priority, sync status, accuracy, availability, etc.), and based on the quality of these parameters they vote on which of them becomes the current master. All the others switch into “slave” mode and keep exchanging information until one of them offers better parameters and thus becomes the next master. All NTP servers in the cluster share a common cluster IP address, to which sync clients send their time queries (see Figure 3). Sync clients therefore only need to know this single cluster IP, no matter how many servers are in the cluster.
Needless to say, the Cluster and Bonding features can be combined as well, but a detailed configuration setup will be addressed in a separate article. Please don’t hesitate to contact us if you would like help implementing this combination.
Figure 3: Several NTP servers operating in cluster mode. Only one of them (the Cluster Master) replies to NTP requests sent to the Cluster IP address.
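The master selection in a cluster can be pictured with the following Python sketch. It is only an illustration under stated assumptions: the actual LANTIME voting algorithm is not described here, so the ranking used below (reachable and synchronized servers only, then configured priority, then estimated accuracy) and the names `ClusterMember` and `elect_master` are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class ClusterMember:
    name: str
    reachable: bool      # availability as seen by the other members
    synchronized: bool   # sync status of the reference clock
    priority: int        # assumed convention: lower number = preferred
    accuracy_ns: float   # estimated accuracy, smaller is better


def elect_master(members: Sequence[ClusterMember]) -> Optional[ClusterMember]:
    """Pick the best eligible member; the server name breaks exact ties."""
    candidates = [m for m in members if m.reachable and m.synchronized]
    return min(
        candidates,
        key=lambda m: (m.priority, m.accuracy_ns, m.name),
        default=None,
    )


members = [
    ClusterMember("ntp-a", True, True, priority=1, accuracy_ns=50.0),
    ClusterMember("ntp-b", True, True, priority=1, accuracy_ns=30.0),
    ClusterMember("ntp-c", True, False, priority=0, accuracy_ns=10.0),  # not synchronized
]
master = elect_master(members)
```

Here `ntp-b` becomes master: `ntp-c` is excluded because it is not synchronized, and `ntp-b` reports better accuracy than `ntp-a` at equal priority. Only the elected master would answer requests sent to the cluster IP.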
5. Fully redundant system. For the most reliable sync system (and to significantly reduce the risk of an unforeseen failure mechanism), one can consider a fully redundant dual NTP server setup in a mirrored configuration, with redundant reference clocks, power supply units and LAN interfaces, preferably, though not necessarily, at two physically separate locations.
Figure 4: Fully redundant NTP server system setup.
This concludes our overview of the different NTP server redundancy setups. Do not miss our upcoming posts on NTP client redundancy, bonding configuration settings and how to set up Cluster mode.
For more information on NTP Time Servers visit our website at: www.meinbergglobal.com.
Juan Carlos Pulido says
Hello,
Our company has a Lantime M1000 and we want to buy a new Lantime M1000 to form a cluster, but we have some doubts. In a cluster of 2 Lantimes there is a master and a slave, but:
– do both reply to the NTP request?
– or is the request processed by both, with only the master replying?
Thank you very much.
Andreja Jarc says
Hello Juan Carlos,
within a cluster only the cluster master is allowed to respond to NTP requests; the other cluster servers remain passive. However, the other cluster servers constantly monitor the master's quality parameters. If one of the servers sees that it has better quality parameters than the current master, it will take over the role of master at some point.
Regards,
Andreja
Krishnan says
Hi, I have a query in redundancy.
Our company is about to install redundant GPS for our redundant network system (two separate lines are available to each PC and NTP device). We have come across two ways to achieve this redundancy:
1. Redundant antenna and cable connected to one SNTP server, a Lantime M900 (the SNTP server handles the redundancy here)
2. Two independent Lantime M300 servers, each with its own antenna, cable etc. (both M300s will sync the network)
Can you please explain the significance of these two options?
If I install two independent Lantime M300s, will there be any time difference between the two networks (even micro/nanoseconds)?
Andreja Jarc says
Hello Krishnan,
thank you for your question.
In the first case, with an M900 system, you have a configuration with redundant GPS receivers. A switching card between the two receivers constantly monitors both incoming signals and measures the difference between them. Only one clock is active; the other is passive and used as a backup. In case of a failure or service degradation of the active clock, the switching unit automatically switches to the second clock. Switching between the two reference clocks would normally have no impact on NTP quality, so the NTP accuracy of the LANTIME should stay the same.
In the second example, with two standalone M300 systems, both are generating NTP at different ends of the network. Your NTP clients can query time from both of them, but in this case the NTP performance depends on the current network traffic, on queuing noise in the switches and routers on the link between server and client, and on path delay asymmetry. So yes, in this case the difference between the two NTP signals can be in the millisecond range (depending on the current situation in the network).
Best regards,
Andreja Jarc
Jacques Henry says
Hello Andreja,
Concerning your answer on the second example with two standalone M300 systems, I have a similar setup with two standalone M1000s at two different locations, both acting as NTP stratum 1 servers. So cluster mode is out of the question for me, as I won’t have the same IP network at the two locations. So I’m in a fully redundant system scenario.
1. Can we make the M1000s synchronize with each other as peers?
2. Or is it unnecessary/impossible as the clients will use both NTP servers?
Thanks!
Andreja Jarc says
Dear Jacques,
thank you for your interest in our blog. Let me try to answer your questions.
The peering concept in NTP was introduced at a time when stable and accurate reference clocks (e.g. GPS or radio clocks) were not yet available. Nowadays, however, NTP servers have a reliable time reference (a GPS clock), so the time on this hardware clock is much more accurate than the time from other devices in the network.
In your case you have 2 M1000 units, each with its own GPS clock. Therefore, if possible, configure your clients to poll their time from both servers in parallel. That way you let the ntpd running on the clients decide automatically which server to sync to. If one of the servers becomes unreachable, ntpd will automatically switch to the remaining IP on the list.
Regards,
Andreja
Jacques Henry says
Thank you very much Andreja.
Regards.
Jacques Henry says
Hello,
Is there a way to use the 162kHz clock signal from the Allouis longwave transmitter in France?
https://en.wikipedia.org/wiki/Allouis_longwave_transmitter
The signal seems technically close to the DCF77.
Thanks in advance!
Regards
Andreja Jarc says
Hello Jacques,
Our receivers can additionally receive longwave signals from
– MSF (England), 60kHz
– WWVB (USA), 60kHz
but not from the Allouis transmitter.
Paul Roberts says
Is there an argument for putting multiple stratum 1 servers on different continents so each server has a different “view” of the GPS satellites? If 2 or more servers are in the same country/continent is there a risk that they might all lock onto the same satellites? I don’t know if this is a valid argument or not when considering where to locate multiple stratum 1 devices?
Douglas Arnold says
Both the accuracy and the reliability of GPS as a source of time are very good, and not dependent on location unless you are near the North or South Pole. It is good practice to locate stratum one servers far enough apart that a GPS jamming source will not affect more than one server at a time. One kilometer of distance is enough unless you are near a war zone.
In contrast, the errors and reliability issues associated with networks are much more serious. The more hops NTP packets have to traverse, the more inaccurate the time will be, and the greater the chance that packets will be lost along the way.
Filip says
Hi
When you have two NTP appliances at two different locations but they are considered the same logical unit, would the recommended setup for these two be master/slave, or running them as separate NTP servers with clients polling both?
Douglas Arnold says
Given that the largest source of error in ntp time transfer is due to queuing in the network, two physically separated servers should not be treated as the same logical server. A client will want to distinguish the time coming from the two separated servers. Defining two ports as the same logical ntp server is more sensible when the two ports are connected to the same switch, and therefore packets to and from these ports experience the same, or at least similar queuing.