Discussion:
systemd, ntp, kernel and hwclock
Daniel Pocock
2017-02-27 17:20:02 UTC
Hi all,

I've observed a system that had a wildly incorrect hardware clock when
it was first unboxed. I ran ntpdate to sync the kernel clock, but after
a shutdown and startup it had a wacky time again.

I came across the discussion about how the hardware clock is no longer
set at shutdown[1].

The system has ntpd running.

Looking at the output of
adjtimex --print | grep status

the bit corresponding to 64 / STA_UNSYNC is 0
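
As a rough illustration of the check above, the status word can be tested for the 64 (STA_UNSYNC) bit with plain shell arithmetic. The function names below are made up for this sketch, and the helper takes the numeric status as an argument instead of assuming any particular adjtimex output layout.

```shell
#!/bin/sh
# Value of the STA_UNSYNC bit, as quoted in the post above.
STA_UNSYNC=64

# Hypothetical helper: succeeds (exit 0) if STA_UNSYNC is set in the
# numeric status word passed as $1, fails (exit 1) otherwise.
status_unsynced() {
    [ $(( $1 & $STA_UNSYNC )) -ne 0 ]
}

# Hypothetical helper: print a one-line verdict for a status word.
# The wording follows the thread: the kernel only updates the RTC
# periodically while STA_UNSYNC is clear.
describe_status() {
    if status_unsynced "$1"; then
        echo "unsynchronised: kernel will not update the RTC"
    else
        echo "synchronised: kernel 11-minute RTC update is enabled"
    fi
}
```

For example, `describe_status 64` reports the unsynchronised case, while a status with that bit clear reports the synchronised one.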

There is a time and date page on the wiki[2] and in the manual[3], but
neither of them appears to have up-to-date information about how this
works with systemd or how to troubleshoot issues like this.

Monitoring it with:

hwclock -r ; date

shows that the hardware clock is running slowly, losing maybe 1s per
hour. I would have expected that if the kernel is syncing to the
hardware clock every 11 minutes then I wouldn't see such changes.
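
To put the observed figure in perspective, here is a back-of-the-envelope sketch (function names invented for illustration, integer arithmetic only). It converts "about 1 s per hour" into parts per million and estimates how far the RTC could wander between two kernel updates, assuming those updates really happen every 11 minutes (660 s).

```shell
#!/bin/sh
# Drift in parts per million, given seconds lost ($1) over an
# interval in seconds ($2). Integer division, so the result is rounded down.
drift_ppm() {
    lost_s=$1
    interval_s=$2
    echo $(( $lost_s * 1000000 / $interval_s ))
}

# Worst-case divergence in milliseconds that a clock drifting at $1 ppm
# accumulates over one 11-minute (660 s) update interval.
divergence_ms() {
    ppm=$1
    echo $(( $ppm * 660 / 1000 ))
}
```

With 1 s lost per 3600 s, `drift_ppm 1 3600` gives 277 ppm and `divergence_ms 277` gives 182 ms. An offset that keeps growing well beyond that would be consistent with the periodic update not taking place.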

Can anybody make any suggestions or add anything to the wiki?

Regards,

Daniel

1. https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=755722
2. https://wiki.debian.org/DateTime
3. https://www.debian.org/doc/manuals/system-administrator/ch-sysadmin-time.html
Santiago Vila
2017-02-27 18:20:02 UTC
Post by Daniel Pocock
Can anybody make any suggestions or add anything to the wiki?
My old Mac Mini had a crazy clock and ntp was not enough to sanitize it.
I fixed it by using adjtimex in addition to ntp.

As an example, my clock was off by 2890 parts per million, so I used
this in /etc/default/adjtimex:

TICK=10028
FREQ=5898240
# 28 * 100 + 5898240 / 65536 = 2890 ppm

This used to work very well, but OTOH I had my computer always on, so
I'm not sure if the cases are similar.
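
The arithmetic in the comment above can be spelled out as a small sketch. It assumes a nominal tick of 10000 microseconds (USER_HZ of 100), so each TICK step is 100 ppm, and FREQ counted in units of 1/65536 ppm. The function names are hypothetical.

```shell
#!/bin/sh
# Assumption: nominal tick length in microseconds (USER_HZ = 100).
NOMINAL_TICK=10000

# Total correction in ppm for a given TICK ($1) and FREQ ($2).
adjtimex_ppm() {
    tick=$1
    freq=$2
    echo $(( ($tick - $NOMINAL_TICK) * 100 + $freq / 65536 ))
}

# Inverse direction: split a whole-ppm correction ($1) into the coarse
# TICK part (100 ppm steps) and the fine FREQ remainder.
ppm_to_tick_freq() {
    ppm=$1
    echo "TICK=$(( $NOMINAL_TICK + $ppm / 100 )) FREQ=$(( ($ppm % 100) * 65536 ))"
}
```

`adjtimex_ppm 10028 5898240` reproduces the 2890 ppm from the post, and `ppm_to_tick_freq 2890` gives back `TICK=10028 FREQ=5898240`.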

Thanks.
Russ Allbery
2017-02-27 19:40:01 UTC
Post by Daniel Pocock
I've observed a system that had a wildly incorrect hardware clock (when
it was first unboxed), I ran ntpdate to sync the kernel clock but after
a shutdown and startup again it had a wacky time again.
I came across the discussion about how the hardware clock is no longer
set at shutdown[1]
The system has ntpd running
Looking at the output of
adjtimex --print | grep status
the bit corresponding to 64 / STA_UNSYNC is 0
There is a time and date page on the wiki[2] and in the manual[3],
neither of them appears to have up to date information about the way it
works with systemd or how to troubleshoot issues like this.
My understanding from reading a bit about this just now is that the short
version is "install ntpd if you want this to happen."

My impression is that ntpdate has been obsolete for years and upstream has
been slowly trying to kill it. ntpd is the upstream-supported daemon, and
it periodically asks the kernel to set the hardware clock. (And it
supports various command-line options to make it act like ntpdate if you
really want.)

The much simpler systemd-timesyncd doesn't set the hardware clock for
reasons that one may or may not agree with (I honestly haven't researched
it in any depth), but you can just run ntpd instead if you care.

Alternately, if you really want to use a clock setting mechanism that
doesn't ask the kernel to sync the hardware clock but you still want to
set the hardware clock, you can add your own shutdown init script / unit
to run hwclock --systohc (or even a cron job if you want).
--
Russ Allbery (***@debian.org) <http://www.eyrie.org/~eagle/>
Ben Hutchings
2017-02-27 20:50:02 UTC
Post by Russ Allbery
Post by Daniel Pocock
I've observed a system that had a wildly incorrect hardware clock (when
it was first unboxed), I ran ntpdate to sync the kernel clock but after
a shutdown and startup again it had a wacky time again.
I came across the discussion about how the hardware clock is no longer
set at shutdown[1]
The system has ntpd running
Looking at the output of
   adjtimex --print | grep status
the bit corresponding to 64 / STA_UNSYNC is 0
There is a time and date page on the wiki[2] and in the manual[3],
neither of them appears to have up to date information about the way it
works with systemd or how to troubleshoot issues like this.
My understanding from reading a bit about this just now is that the short
version is "install ntpd if you want this to happen."
My impression is that ntpdate has been obsolete for years and upstream has
been slowly trying to kill it.  ntpd is the upstream-supported daemon, and
it periodically asks the kernel to set the hardware clock.
The kernel actually does the periodic setting automatically, so long as
the NTP server reports that it's synchronised (by clearing STA_UNSYNC
in timex::status).

(The kernel will only set one RTC device, which is specified in the
build config. On systems that have multiple RTCs and only one of them
works (e.g. the one in the SoC doesn't have battery power but the one
in the PMIC does) this may not work properly. It may be fixable by
disabling the broken RTC in the device tree.)
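
On a system with several RTCs it may help to see which devices the kernel has registered. The sketch below assumes the usual sysfs layout, with one rtcN directory per device under /sys/class/rtc and `name` and `hctosys` attributes inside it. The function name and the optional directory argument are only for illustration. Note that `hctosys` indicates which RTC supplied the time at boot, which is not necessarily the device the kernel writes to.

```shell
#!/bin/sh
# Hypothetical helper: list RTC devices found under a sysfs-style
# directory ($1, defaulting to /sys/class/rtc), one per line.
list_rtcs() {
    root=${1:-/sys/class/rtc}
    for dev in "$root"/rtc*; do
        # Skip the unexpanded glob when no device exists.
        [ -d "$dev" ] || continue
        name=unknown
        hctosys=unknown
        [ -r "$dev/name" ] && name=$(cat "$dev/name")
        [ -r "$dev/hctosys" ] && hctosys=$(cat "$dev/hctosys")
        printf '%s name=%s hctosys=%s\n' "${dev##*/}" "$name" "$hctosys"
    done
}
```

Running `list_rtcs` with no argument inspects the live system. Passing a directory makes it easy to try out on a copy or a mock tree.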
Post by Russ Allbery
(And it
supports various command-line options to make it act like ntpdate if you
really want.)
The much simpler systemd-timesyncd doesn't set the hardware clock for
reasons that one may or may not agree with (I honestly haven't researched
it in any depth),
It looks like it does iff the RTC is set to UTC:

/*
 * An unset STA_UNSYNC will enable the kernel's 11-minute mode,
 * which syncs the system time periodically to the RTC.
 *
 * In case the RTC runs in local time, never touch the RTC,
 * we have no way to properly handle daylight saving changes and
 * mobile devices moving between time zones.
 */
if (m->rtc_local_time)
        tmx.status |= STA_UNSYNC;
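
Whether that branch applies on a given machine depends on how the RTC is configured. As a hedged sketch, assuming the mode is recorded on the third line of /etc/adjtime as UTC or LOCAL (the hwclock convention) and that a missing file means UTC, it could be checked like this. The function name is made up.

```shell
#!/bin/sh
# Hypothetical helper: print LOCAL if the adjtime file ($1, defaulting
# to /etc/adjtime) marks the RTC as local time, UTC otherwise.
rtc_mode() {
    file=${1:-/etc/adjtime}
    mode=
    # The third line of the file holds the UTC/LOCAL marker.
    [ -r "$file" ] && mode=$(sed -n '3p' "$file")
    case $mode in
        LOCAL) echo LOCAL ;;
        *)     echo UTC ;;
    esac
}
```

If `rtc_mode` prints LOCAL, the code quoted above would leave STA_UNSYNC set, so the kernel would not be asked to update the RTC.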
Post by Russ Allbery
but you can just run ntpd instead if you care.
But ntpd is also known to have a large amount of code written without
as much regard for security as one would hope. It seems like an
unnecessary risk for most systems.

Ben.
Post by Russ Allbery
Alternately, if you really want to use a clock setting mechanism that
doesn't ask the kernel to sync the hardware clock but you still want to
set the hardware clock, you can add your own shutdown init script / unit
to run hwclock --systohc (or even a cron job if you want).
--
Ben Hutchings
This sentence contradicts itself - no actually it doesn't.
Russ Allbery
2017-02-28 00:30:02 UTC
Post by Ben Hutchings
Post by Russ Allbery
The much simpler systemd-timesyncd doesn't set the hardware clock for
reasons that one may or may not agree with (I honestly haven't
researched it in any depth),
/*
 * An unset STA_UNSYNC will enable the kernel's 11-minute mode,
 * which syncs the system time periodically to the RTC.
 *
 * In case the RTC runs in local time, never touch the RTC,
 * we have no way to properly handle daylight saving changes and
 * mobile devices moving between time zones.
 */
if (m->rtc_local_time)
        tmx.status |= STA_UNSYNC;
Oh! Okay, then yes, it shouldn't matter whether it persists at shutdown
or not, since it will be setting it periodically anyway.
Post by Ben Hutchings
Post by Russ Allbery
but you can just run ntpd instead if you care.
But ntpd is also known to have a large amount of code written without
as much regard for security as one would hope. It seems like an
unnecessary risk for most systems.
Indeed, I've personally switched to systemd-timesyncd on my systems, which
works fine for me. (I think there are other lightweight clients if people
want something different.)
--
Russ Allbery (***@debian.org) <http://www.eyrie.org/~eagle/>
Daniel Pocock
2017-02-28 09:40:01 UTC
Post by Ben Hutchings
Post by Russ Allbery
Post by Daniel Pocock
I've observed a system that had a wildly incorrect hardware
clock (when it was first unboxed), I ran ntpdate to sync the
kernel clock but after a shutdown and startup again it had a
wacky time again.
I came across the discussion about how the hardware clock is no
longer set at shutdown[1]
The system has ntpd running
Looking at the output of
adjtimex --print | grep status
the bit corresponding to 64 / STA_UNSYNC is 0
There is a time and date page on the wiki[2] and in the manual[3],
neither of them appears to have up to date information about the way
it works with systemd or how to troubleshoot issues like this.
My understanding from reading a bit about this just now is that
the short version is "install ntpd if you want this to happen."
My impression is that ntpdate has been obsolete for years and
upstream has been slowly trying to kill it. ntpd is the
upstream-supported daemon, and it periodically asks the kernel to
set the hardware clock.
The kernel actually does the periodic setting automatically, so
long as the NTP server reports that it's synchronised (by clearing
STA_UNSYNC in timex::status).
(The kernel will only set one RTC device, which is specified in
the build config. On systems that have multiple RTCs and only one
of them works (e.g. the one in the SoC doesn't have battery power
but the one in the PMIC does) this may not work properly. It may
be fixable by disabling the broken RTC in the device tree.)
It would seem reasonable for ntpdate to clear that flag, so I opened a
bug report[1] for ntpdate.
Post by Ben Hutchings
Post by Russ Allbery
(And it supports various command-line options to make it act like
ntpdate if you really want.)
The much simpler systemd-timesyncd doesn't set the hardware clock
for reasons that one may or may not agree with (I honestly
haven't researched it in any depth),
/*
 * An unset STA_UNSYNC will enable the kernel's 11-minute mode,
 * which syncs the system time periodically to the RTC.
 *
 * In case the RTC runs in local time, never touch the RTC,
 * we have no way to properly handle daylight saving changes and
 * mobile devices moving between time zones.
 */
if (m->rtc_local_time)
        tmx.status |= STA_UNSYNC;
Post by Russ Allbery
but you can just run ntpd instead if you care.
But ntpd is also known to have a large amount of code written
without as much regard for security as one would hope. It seems
like an unnecessary risk for most systems.
Thanks for that security tip. I'm tempted to get rid of some ntpd
instances now; however, a few more questions come to mind before I rush in:

- for a site with several machines, should they all be querying
pool.ntp.org servers directly or can any other local ntp daemon be
relied on?

- this discussion also reminded me of the clock drift issues[2] for
Xen virtual machines / guests / domU systems. I don't know if such
problems still exist with modern hypervisors and kernels, but I
encountered them in the past, and once I started running ntpd on
each VM they all appeared to behave. Does this apply to LXC, KVM or
any other environment? What are best practices for this today? Does
systemd-timesyncd solve everything, and do people still need to
manually tweak any sysctl like /proc/sys/xen/independent_wallclock?
Maybe some of this could go in the wiki too if it is still necessary.

Regards,

Daniel

1. http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=856343
2. http://serverfault.com/questions/245401/xen-hvm-guest-has-severe-clock-drift
Adam Borowski
2017-02-28 16:10:01 UTC
Post by Daniel Pocock
Post by Ben Hutchings
But ntpd is also known to have a large amount of code written
without as much regard for security as one would hope. It seems
like an unnecessary risk for most systems.
Thanks for that security tip, I'm tempted to get rid of some ntpd
instances now
You'd be interested in NTPsec (https://www.ntpsec.org/) then, which is a
project to review and sanitize ntpd without downsides prevalent in most
replacements (such as same-week accuracy or no managing clock drift).

Sadly, it's not a part of stretch or even unstable yet:
https://bugs.debian.org/819806
Post by Daniel Pocock
- for a site with several machines, should they all be querying
pool.ntp.org servers directly or can any other local ntp daemon be
relied on?
Using a local daemon means:
* less burden on public servers or the network
* if there's a problem, your machines will be consistent at least between
them, which is usually a bigger concern than being globally accurate
--
⢀⣴⠾⠻⢶⣦⠀ Meow!
⣾⠁⢠⠒⠀⣿⡁
⢿⡄⠘⠷⠚⠋⠀ Collisions shmolisions, let's see them find a collision or second
⠈⠳⣄⠀⠀⠀⠀ preimage for double rot13!
Bjørn Mork
2017-02-28 17:10:02 UTC
Post by Adam Borowski
Post by Daniel Pocock
Post by Ben Hutchings
But ntpd is also known to have a large amount of code written
without as much regard for security as one would hope. It seems
like an unnecessary risk for most systems.
Thanks for that security tip, I'm tempted to get rid of some ntpd
instances now
You'd be interested in NTPsec (https://www.ntpsec.org/) then, which is a
project to review and sanitize ntpd without downsides prevalent in most
replacements (such as same-week accuracy or no managing clock drift).
https://bugs.debian.org/819806
I don't think there are enough people caring about ntp in Debian (or the
world) to maintain two code bases. And the fork is still young and not
"obviously better" or "clearly the one true path forward".

See also https://lwn.net/Articles/713901/ for more background
information.

IMHO, it's very unfortunate that this fork was created, and I cannot see
anything good coming out of it. It's just wasting developer resources
which could have been used to improve ntp.


Bjørn
Carsten Leonhardt
2017-02-28 19:30:01 UTC
Post by Daniel Pocock
Post by Ben Hutchings
But ntpd is also known to have a large amount of code written
without as much regard for security as one would hope. It seems
like an unnecessary risk for most systems.
Thanks for that security tip, I'm tempted to get rid of some ntpd
Have a look at openntpd, that's coded with security in mind.

- Carsten
Vincent Lefevre
2017-03-07 12:30:01 UTC
Post by Carsten Leonhardt
Post by Daniel Pocock
Post by Ben Hutchings
But ntpd is also known to have a large amount of code written
without as much regard for security as one would hope. It seems
like an unnecessary risk for most systems.
Thanks for that security tip, I'm tempted to get rid of some ntpd
Have a look at openntpd, that's coded with security in mind.
But this doesn't apply to the Debian version, as documented. And it
is buggy. I had to remove it from my machine because it caused more
problems than it solved.
--
Vincent Lefèvre <***@vinc17.net> - Web: <https://www.vinc17.net/>
100% accessible validated (X)HTML - Blog: <https://www.vinc17.net/blog/>
Work: CR INRIA - computer arithmetic / AriC project (LIP, ENS-Lyon)
Daniel Pocock
2017-02-27 21:00:02 UTC
Post by Russ Allbery
Post by Daniel Pocock
I've observed a system that had a wildly incorrect hardware clock (when
it was first unboxed), I ran ntpdate to sync the kernel clock but after
a shutdown and startup again it had a wacky time again.
I came across the discussion about how the hardware clock is no longer
set at shutdown[1]
The system has ntpd running
Looking at the output of
adjtimex --print | grep status
the bit corresponding to 64 / STA_UNSYNC is 0
There is a time and date page on the wiki[2] and in the manual[3],
neither of them appears to have up to date information about the way it
works with systemd or how to troubleshoot issues like this.
My understanding from reading a bit about this just now is that the short
version is "install ntpd if you want this to happen."
My impression is that ntpdate has been obsolete for years and upstream has
been slowly trying to kill it. ntpd is the upstream-supported daemon, and
it periodically asks the kernel to set the hardware clock. (And it
supports various command-line options to make it act like ntpdate if you
really want.)
The much simpler systemd-timesyncd doesn't set the hardware clock for
reasons that one may or may not agree with (I honestly haven't researched
it in any depth), but you can just run ntpd instead if you care.
Alternately, if you really want to use a clock setting mechanism that
doesn't ask the kernel to sync the hardware clock but you still want to
set the hardware clock, you can add your own shutdown init script / unit
to run hwclock --systohc (or even a cron job if you want).
ntpd is definitely running now; it is a default configuration and it was
already on the box long before I observed the issue today.

However, at the time when I ran ntpdate, ntp was not running. I had
brought up the network manually due to an interface renaming issue on
the first boot. Maybe when somebody runs ntpdate in a scenario like
that the kernel is not sending the new date/time to the hardware clock.
I had simply assumed that it would be persisted at shutdown but maybe
ntpdate could be patched to do whatever ntpd does to encourage the
kernel to persist it.

Regards,

Daniel
Russ Allbery
2017-02-28 00:30:02 UTC
Post by Daniel Pocock
However, at the time when I ran ntpdate, ntp was not running. I had
brought up the network manually due to an interface renaming issue on
the first boot. Maybe when somebody runs ntpdate in a scenario like
that the kernel is not sending the new date/time to the hardware clock.
Right, ntpdate for some reason doesn't set the flag to do this.
Post by Daniel Pocock
I had simply assumed that it would be persisted at shutdown but maybe
ntpdate could be patched to do whatever ntpd does to encourage the
kernel to persist it.
sysvinit, I believe, used to always persist the system clock to the
hardware clock during shutdown. systemd doesn't do that, for reasons that
I've not thought about in any depth. So that's a change, which is
understandably surprising.

If you get in the habit of using ntpd instead of ntpdate to do the
one-time clock syncs, that might fix the problem (alas, I forget the set
of command line flags that do the same thing as ntpdate).
--
Russ Allbery (***@debian.org) <http://www.eyrie.org/~eagle/>
Ben Hutchings
2017-02-28 01:00:01 UTC
Post by Russ Allbery
However, at the time when I ran ntpdate, ntp was not running.  I had
brought up the network manually due to an interface renaming issue on
the first boot.  Maybe when somebody runs ntpdate in a scenario like
that the kernel is not sending the new date/time to the hardware clock.
Right, ntpdate for some reason doesn't set the flag to do this.
[...]

There is a very good reason, which is that without continuous
adjustment the system clock cannot be assumed more stable than the RTC.

Ben.
--
Ben Hutchings
Never attribute to conspiracy what can adequately be explained by
stupidity.
Russ Allbery
2017-02-28 03:50:01 UTC
Post by Ben Hutchings
Post by Russ Allbery
However, at the time when I ran ntpdate, ntp was not running.  I had
brought up the network manually due to an interface renaming issue on
the first boot.  Maybe when somebody runs ntpdate in a scenario like
that the kernel is not sending the new date/time to the hardware clock.
Right, ntpdate for some reason doesn't set the flag to do this.
[...]
There is a very good reason, which is that without continuous
adjustment the system clock cannot be assumed more stable than the RTC.
If you've literally just synced the system clock to a remote NTP server,
why could you not assume it was more accurate than the RTC?
--
Russ Allbery (***@debian.org) <http://www.eyrie.org/~eagle/>
Ben Hutchings
2017-02-28 05:30:02 UTC
Post by Russ Allbery
Post by Ben Hutchings
Post by Russ Allbery
However, at the time when I ran ntpdate, ntp was not running.  I had
brought up the network manually due to an interface renaming issue on
the first boot.  Maybe when somebody runs ntpdate in a scenario like
that the kernel is not sending the new date/time to the hardware clock.
Right, ntpdate for some reason doesn't set the flag to do this.
[...]
There is a very good reason, which is that without continuous
adjustment the system clock cannot be assumed more stable than the RTC.
If you've literally just synced the system clock to a remote NTP server,
why could you not assume it was more accurate than the RTC?
For that instant, sure, and ntpdate could follow up the one-shot system
clock synch with a one-shot RTC synch. But the kernel doesn't provide
a simple API for that, and it's easy enough to add "hwclock --systohc"
to a script right after "ntpdate ...".
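
A minimal sketch of such a script might look like the following. The function name and the NTPDATE/HWCLOCK override variables are inventions for this example (they only exist to allow a dry run). The real commands are the ones named above, and both normally need root.

```shell
#!/bin/sh
# Hypothetical wrapper: one-shot sync of the system clock from an NTP
# server ($1), followed by a one-shot copy of the system time to the RTC.
# NTPDATE and HWCLOCK default to the real tools and can be overridden.
sync_once() {
    server=$1
    # Only write the RTC if the one-shot sync succeeded, so a failed
    # query never pushes an unverified system time into the RTC.
    ${NTPDATE:-ntpdate} "$server" && ${HWCLOCK:-hwclock} --systohc
}
```

As root, `sync_once pool.ntp.org` would then do both steps. Setting `HWCLOCK=echo` shows what would be run without touching the hardware clock.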

Ben.
--
Ben Hutchings
Never attribute to conspiracy what can adequately be explained by
stupidity.
Kurt Roeckx
2017-02-28 20:10:01 UTC
Post by Ben Hutchings
Post by Russ Allbery
Post by Ben Hutchings
Post by Russ Allbery
However, at the time when I ran ntpdate, ntp was not running.  I had
brought up the network manually due to an interface renaming issue on
the first boot.  Maybe when somebody runs ntpdate in a scenario like
that the kernel is not sending the new date/time to the hardware clock.
Right, ntpdate for some reason doesn't set the flag to do this.
[...]
There is a very good reason, which is that without continuous
adjustment the system clock cannot be assumed more stable than the RTC.
If you've literally just synced the system clock to a remote NTP server,
why could you not assume it was more accurate than the RTC?
For that instant, sure, and ntpdate could follow up the one-shot system
clock synch with a one-shot RTC synch. But the kernel doesn't provide
a simple API for that, and it's easy enough to add "hwclock --systohc"
to a script right after "ntpdate ...".
If anything, having ntpdate call hwclock might make sense.

Having ntpdate clear the unsynced flag doesn't make sense since it
would start writing a time to the RTC each 11 minutes, and as Ben
said you have no idea which of the 2 clocks is the most correct
one.

I can also understand that systemd doesn't set the clock for just
the same reason. Either the clock is synched and it's written, or
it's not synched, it's unknown which one is the most correct, and
it's not written.


Kurt
Russ Allbery
2017-02-28 21:10:01 UTC
Post by Kurt Roeckx
Having ntpdate clear the unsynced flag doesn't make sense since it would
start writing a time to the RTC each 11 minutes, and as Ben said you
have no idea which of the 2 clocks is the most correct one.
Oh, I thought it was a one-shot thing, but it turns on syncing behavior
from that point forward. Thanks, that was the piece that I was missing.
Post by Kurt Roeckx
I can also understand that systemd doesn't set the clock for just the
same reason. Either the clock is synched and it's written, or it's not
synched, it's unknown which one is the most correct, and it's not
written.
Yeah, it now makes perfect sense to me.
--
Russ Allbery (***@debian.org) <http://www.eyrie.org/~eagle/>
Roger Lynn
2017-03-04 21:30:02 UTC
Post by Ben Hutchings
Post by Russ Allbery
Right, ntpdate for some reason doesn't set the flag to do this.
There is a very good reason, which is that without continuous
adjustment the system clock cannot be assumed more stable than the RTC.
This doesn't make sense to me. Most users are probably not aware that there
is a separate hardware RTC. Why would one assume that the clock the user is
not aware of is better than the clock the user can see and is presumably
happy with?

Roger
Ben Hutchings
2017-03-05 04:10:02 UTC
Post by Roger Lynn
Post by Ben Hutchings
Post by Russ Allbery
Right, ntpdate for some reason doesn't set the flag to do this.
There is a very good reason, which is that without continuous
adjustment the system clock cannot be assumed more stable than the RTC.
This doesn't make sense to me. Most users are probably not aware that there
is a separate hardware RTC.
Most users don't know what ntpdate is, either.
Post by Roger Lynn
Why would one assume that the clock the user is not aware of is better than
the clock the user can see and is presumably happy with?
*I* would assume that when a user sets the system clock through a high-
level UI, such as GNOME provides, that is the most accurate source of
information and the RTC should also be set. But I would not assume
that the system clock *remains* very accurate after that point, which
is what the flag in question is supposed to indicate.

I would also expect that users running command-line tools to set the
time, such as ntpdate, have enough technical understanding to
distinguish the system clock and RTC.

Ben.
--
Ben Hutchings
All the simple programs have been written, and all the good names
taken.
Vincent Lefevre
2017-03-07 12:40:01 UTC
Post by Ben Hutchings
I would also expect that users running command-line tools to set the
time, such as ntpdate, have enough technical understanding to
distinguish the system clock and RTC.
And what's worse is that by default, ntpdate is run automatically from
/etc/network/if-up.d, so the date could become incorrect without the
user having any control over it.

https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=844520
--
Vincent Lefèvre <***@vinc17.net> - Web: <https://www.vinc17.net/>
100% accessible validated (X)HTML - Blog: <https://www.vinc17.net/blog/>
Work: CR INRIA - computer arithmetic / AriC project (LIP, ENS-Lyon)