Discussion:
utmp in trixie
Michael Stone
2025-04-01 19:30:01 UTC
/run/utmp is no longer provided in trixie, which means that the
mechanisms used to show active sessions in unix for several decades no
longer work. There's a replacement mechanism provided by systemd, but
it's not 1:1. I propose that for trixie *both* mechanisms are active, so
a person can choose between them (and compare the output, to better
identify gaps between the historic utmp mechanism and the new and
improved systemd facility). I've been told that the reason this can't be
done is that utmp isn't y2038 compliant, but it seems to me that we
won't be supporting trixie in y2038, so who cares? Are there any factors
to consider that I've missed?

Mike Stone
Craig Small
2025-04-02 07:10:01 UTC
Hi,
Post by Michael Stone
/run/utmp is no longer provided in trixie, which means that the
mechanisms used to show active sessions in unix for several decades no
longer work. There's a replacement mechanism provided by systemd, but
it's not 1:1.
I'm the procps maintainer for both upstream and Debian. procps provides some
of these tools that used to, or still can, use utmp.

The Debian package and upstream with the --with-systemd configure option
will both use and prefer the information provided by systemd over utmp
(if both happen to be available).

As to it being not 1:1, that's partly correct, but the longer answer is,
well, longer.
* There are things that have just gone missing and aren't related to utmp,
  such as idle time.
* If you compile procps without systemd and run it on a host with no utmp,
  you see 0 users.
* There is also the fact you might not see *term users. This is because for
  systemd they are the same session, so you see it once. The issue with utmp
  is there are no rules, so say with gnome-terminal you'd see the user but
  with kterm you won't (or vice-versa, I forget who does what). That is
  because, in a way, the right answer is it's a single user.
* There might be some remote users that get missed, basically not visible
  via sd_session_get_remote_host() but present in ut_addr_v6 in the utmp
  struct. I'm not sure how common this is or why/if it happens; I suspect
  it is some pam_session brokenness/corner case.
Post by Michael Stone
I propose that for trixie *both* mechanisms are active, so
a person can choose between them (and compare the output, to better
identify gaps between the historic utmp mechanism and the new and
improved systemd facility). I've been told that the reason this can't be
done is that utmp isn't y2038 compliant, but it seems to me that we
won't be supporting trixie in y2038, so who cares? Are there any factors
to consider that I've missed?
Yes, there is definitely a Y2038 issue; there are also issues with utmp not
being handled consistently and some security issues around who can do what
to the file.
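
For reference, the 2038 limit comes from storing seconds since the epoch in a signed 32-bit field. A minimal Python sketch of that arithmetic follows; the variable names are illustrative and not taken from any utmp implementation.

```python
from datetime import datetime, timezone

# Largest value a signed 32-bit seconds counter can hold.
INT32_MAX = 2**31 - 1

# The last instant such a counter can represent, expressed in UTC.
last_moment = datetime.fromtimestamp(INT32_MAX, tz=timezone.utc)
print(last_moment.isoformat())  # 2038-01-19T03:14:07+00:00
```

One second later the counter no longer fits in the field, which is the practical meaning of "not y2038 compliant" here.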

For me and procps utils in Debian, we don't use utmp and don't need it.
Having utmp there won't change the tools' outputs.

If the project at large says they want the file to hang around, that's ok
by me but it won't give ps/w/etc any more details.

- Craig
Michael Stone
2025-04-02 12:50:01 UTC
Post by Craig Small
Yes, there is definitely a Y2038 issue
*for trixie*?
Post by Craig Small
there are also issues with utmp not
being
handled consistently and some security issues around who can do what to the
file.
Stuff that has been true for literally decades. If someone can turn off utmp
if they want to, does this matter?
Bill Allombert
2025-04-02 21:40:01 UTC
/run/utmp is no longer provided in trixie, which means that the mechanisms
used to show active sessions in unix for several decades no longer work.
There's a replacement mechanism provided by systemd, but it's not 1:1. I
propose that for trixie *both* mechanisms are active, so a person can choose
between them (and compare the output, to better identify gaps between the
historic utmp mechanism and the new and improved systemd facility). I've
been told that the reason this can't be done is that utmp isn't y2038
compliant, but it seems to me that we won't be supporting trixie in y2038,
so who cares? Are there any factors to consider that I've missed?
Does that break the usual unix commands like 'who'? If yes this is
dangerous. It is common to use them before deciding whether a host
can be shut down.

Cheers,
--
Bill. <***@debian.org>

Imagine a large red swirl here.
Marco d'Itri
2025-04-02 22:50:01 UTC
Post by Bill Allombert
Does that break the usual unix commands like 'who'? If yes this is
who(1) specifically, yes.
See https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1079575 .
Maybe the coreutils maintainer is already working to backport this
simple patch in time for trixie, or else you could help him.
Post by Bill Allombert
dangerous. It is common to use them before deciding whether a host
can be shut down.
You may use w(1) for the time being.
--
ciao,
Marco
Craig Small
2025-04-03 11:50:02 UTC
Post by Marco d'Itri
Post by Bill Allombert
Does that break the usual unix commands like 'who'? If yes this is
who(1) specifically, yes.
See https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1079575 .
Maybe the coreutils maintainer is already working to backport this
simple patch in time for trixie, or else you could help him.
Post by Bill Allombert
dangerous. It is common to use them before deciding whether a host
can be shut down.
You may use w(1) for the time being.
w uses the systemd method by default now which is why it reports users.

I thought there was upstream coreutils support for systemd in gnulib[1],
which is what who uses, so in theory it's a configure change.

I must admit I don't fully follow the coreutils source so may have missed
something.

- Craig

[1] https://github.com/coreutils/gnulib/blob/bb506f75625b47d7844af2b6dc4b8192d4dea676/lib/readutmp.c#L981
Michael Stone
2025-04-03 13:40:01 UTC
Post by Marco d'Itri
Post by Bill Allombert
Does that break the usual unix commands like 'who'? If yes this is
who(1) specifically, yes.
See https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1079575 .
Maybe the coreutils maintainer is already working to backport this
simple patch in time for trixie, or else you could help him.
The issue isn't making a change, the issue is what change is the right
thing to do. IMO, dropping utmp without any kind of a transition or
deprecation period is the wrong thing to do. Hence this thread.
Marco d'Itri
2025-04-03 18:00:01 UTC
Post by Michael Stone
The issue isn't making a change, the issue is what change is the right
thing to do. IMO, dropping utmp without any kind of a transition or
deprecation period is the wrong thing to do. Hence this thread.
I think it's a bit late now to disagree with the plan implemented last
year by multiple maintainers.
--
ciao,
Marco
Michael Stone
2025-04-03 21:50:01 UTC
Post by Marco d'Itri
Post by Michael Stone
The issue isn't making a change, the issue is what change is the
right thing to do. IMO, dropping utmp without any kind of a
transition or deprecation period is the wrong thing to do. Hence
this thread.
I think it's a bit late now to disagree with the plan implemented last
year by multiple maintainers.
Except, of course, for the primary consumer of utmp...

I'm the one who gets the complaints that who isn't working right, and
there isn't a solution to that problem, since the systemd facility
doesn't provide the same information. I'd argue that a lot of people
didn't realize how screwed up things were going to be, because the
change didn't impact normal use until after a reboot. So people have
slowly been finding out over time that a decades-old interface is no
longer available, and the answer "well, we decided to drop it" falls a
little flat since there seems to be no actual reason to not just support
both mechanisms.
Chris Hofstaedtler
2025-04-03 22:50:01 UTC
Post by Michael Stone
Post by Marco d'Itri
Post by Michael Stone
The issue isn't making a change, the issue is what change is the
right thing to do. IMO, dropping utmp without any kind of a
transition or deprecation period is the wrong thing to do. Hence
this thread.
I think it's a bit late now to disagree with the plan implemented
last year by multiple maintainers.
Except, of course, for the primary consumer of utmp...
I'm the one who gets the complaints that who isn't working right, and
there isn't a solution to that problem, since the systemd facility
doesn't provide the same information.
What I got from the coreutils bug is that coreutils upstream
has the relevant code and it's supposed to work.

Maybe it needs some additional work to fully function
with current systemd versions. IIRC procps also only recently added
some new code to deal with new systemd behaviours.

But the thing that needs looking at is why who in Debian behaves
like it does, and if it doesn't work right, fix that in who or
wherever else it needs fixing in the stack.

Reintroducing utmp will just hide the problem, and we'll be in the
same situation when releasing forky, or forky+1, ...

Chris


PS: I never understood why there's both w and who, and why they are
different implementations.
Craig Small
2025-04-04 00:10:01 UTC
Post by Chris Hofstaedtler
Maybe it needs some additional work to fully function
with current systemd versions. IIRC procps also only recently added
some new code to deal with new systemd behaviours.
That's right, we were counting sessions, not user sessions. There's an
explicit check for it now.

There's also w terminal mode that displays terminals not users; utmp
kinda-sorta did this inconsistently. w drives it from the process's tty
field so should capture them all.

Post by Chris Hofstaedtler
PS: I never understood why there's both w and who, and why they are
different implementations.
Ancient historical reasons that pre-date my involvement (< 1997). See also:
why so many *kills

- Craig
Craig Small
2025-04-04 00:20:01 UTC
SESSION  UID USER    SEAT  LEADER CLASS   TTY  IDLE SINCE
      3 1000 michael seat0 2116   user    tty1 no   -
      4 1000 michael -     2125   manager -    no   -

2 sessions listed.

michael  seat0        2025-04-02 16:38
michael  tty1         2025-04-02 16:38
Should 'who' report what loginctl list-sessions or list-users reports?
We (procps, w, etc) use the latter by checking the session class for
"user*". Without that filter, the output looked a bit weird.
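
A rough illustration of that class filter, applied to the loginctl output quoted above. This is only a sketch in Python: the function name is made up for the example, it is not how procps does it, and splitting on whitespace is a simplification that assumes no column is ever empty.

```python
from fnmatch import fnmatch

# Sample rows copied from the `loginctl list-sessions` output in this message.
SAMPLE = """\
SESSION UID USER SEAT LEADER CLASS TTY IDLE SINCE
3 1000 michael seat0 2116 user tty1 no -
4 1000 michael - 2125 manager - no -
"""

def user_sessions(text, pattern="user*"):
    """Return rows (as dicts) whose CLASS column matches the glob pattern."""
    lines = [line for line in text.splitlines() if line.strip()]
    header = lines[0].split()
    rows = [dict(zip(header, line.split())) for line in lines[1:]]
    return [row for row in rows if fnmatch(row["CLASS"], pattern)]

for row in user_sessions(SAMPLE):
    print(row["USER"], row["TTY"], row["SESSION"])
```

With the sample above only session 3 survives the filter; the "manager" session is dropped, which is the line that made the unfiltered output look odd.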

It's probably best both who and w agree.


- Craig
G. Branden Robinson
2025-04-04 00:50:01 UTC
Post by Chris Hofstaedtler
PS: I never understood why there's both w and who, and why they are
different implementations.
As I understand it, it's the same reason as why more(1) vs. pg(1) and
why lp(1) vs. lpd(1)--the AT&T/USG/USL vs. BSD schism.

who(1) goes all the way back to "First Edition" Unix (1971).
https://minnie.tuhs.org/cgi-bin/utree.pl?file=V1/man/man1/who.1

w(1) came along much later, in 3BSD (late 1979).
https://minnie.tuhs.org/cgi-bin/utree.pl?file=3BSD/usr/man/man1/w.1

About that time, the BSD/USG Unix split occurred. AT&T's Unix System
III (1980) did not incorporate the Berkeley CSRG's w(1); instead they
aped it with a now-forgotten administrative utility called whodo(1).
https://minnie.tuhs.org/cgi-bin/utree.pl?file=SysIII/usr/src/man/man1/whodo.1m

System III became System V (1982), which continued to eschew w(1) along
with numerous other Berkeley CSRG innovations for much of the duration
of its commercial relevance. In places where ecumenicism was more
important, as with Research Unix (the AT&T CSRC at Murray Hill, New
Jersey, where Unix was born), some "vendor" Unices like Sequent's DYNIX
(with its two "universes" either of which any given user [or process?]
could be assigned), and eventually Unix System V Release 4.2 (1992)--I
think--and Solaris, the two utilities came to coexist. Supporting both
was a Simple Matter of Programming, especially once kernel development
settled down.

...and, as I understand it, the kernel was why there were different
utilities in the first place.

AT&T USG/USL and Berkeley had divergent concerns and development
priorities in the Unix kernel itself. Back in those days, utilities
like ps(1)--and I think who(1) and w(1) as well--had to be setuid root
so that they could read kernel memory. This was in the days before
/proc. It took a while for the idea of a pseudo-file system to expose
kernel memory structures to user space in a stable and safe way to come
to fruition.

Since there was no API for obtaining process or user session
information, the different development organizations, with their
divergent kernels, changed whatever they needed to in the kernel to
achieve other objectives, like implementing virtual memory with demand
paging, or supporting symmetric multiprocessing.

Regards,
Branden
Michael Stone
2025-04-04 01:30:01 UTC
But the thing that needs looking at is why who in Debian behaves like
it does, and if it doesn't work right, fix that in who or wherever
else it needs fixing in the stack.
Because I haven't turned it on, because I'm really unhappy about the
complete lack of a transition plan.
Reintroducing utmp will just hide the problem, and we'll be in the
same situation when releasing forky, or forky+1, ...
No we won't, if we publish a deprecation note, make both formats
available, and make it possible to turn off utmp. Ideally who would have
a flag that lets a user choose an implementation and compare the output,
to make it easier to identify discrepancies. It could use utmp if
available and query systemd if not, if no specific option is chosen.
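
The selection rule proposed here can be sketched in a few lines of Python. Everything in it is hypothetical: who(1) has no such flag today, and the function name, the option values and the default path are assumptions made purely for illustration.

```python
import os

def choose_backend(option=None, utmp_path="/run/utmp"):
    """Pick a session source: an explicit choice wins, else utmp if present."""
    if option in ("utmp", "systemd"):
        return option
    if option is not None:
        raise ValueError(f"unknown backend: {option!r}")
    # No explicit option: use utmp if the file is available, systemd if not.
    return "utmp" if os.path.exists(utmp_path) else "systemd"

# An explicit choice is honoured regardless of what exists on disk.
print(choose_backend("systemd"))
# Hypothetical path standing in for a system without a utmp file.
print(choose_backend(utmp_path="/nonexistent/utmp"))
```

Running the tool once with each explicit choice and diffing the results would be the comparison step described above.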
PS: I never understood why there's both w and who, and why they are
different implementations.
who is a posix standard utility for listing logged-in users, w is a
mashup of uptime and finger that gives more of a system overview (from
the perspective of what you'd want to know about a system 40 years ago).
Henrik Ahlgren
2025-04-04 09:10:01 UTC
Post by Michael Stone
I'm the one who gets the complaints that who isn't working right, and
there isn't a solution to that problem, since the systemd facility
doesn't provide the same information. I'd argue that a lot of people
didn't realize how screwed up things were going to be, because the
change didn't impact normal use until after a reboot.
I think if a fix is not possible, who(1) should at least terminate with
an error code (and possibly display an error message) rather than
failing silently as it currently does. Of course, some forms like "who
-b" still work.
Marc Haber
2025-04-04 13:50:01 UTC
Post by Michael Stone
Post by Marco d'Itri
Post by Michael Stone
The issue isn't making a change, the issue is what change is the
right thing to do. IMO, dropping utmp without any kind of a
transition or deprecation period is the wrong thing to do. Hence
this thread.
I think it's a bit late now to disagree with the plan implemented last
year by multiple maintainers.
Except, of course, for the primary consumer of utmp...
I'm the one who gets the complaints that who isn't working right, and
there isn't a solution to that problem, since the systemd facility
doesn't provide the same information. I'd argue that a lot of people
didn't realize how screwed up things were going to be, because the
change didn't impact normal use until after a reboot. So people have
slowly been finding out over time that a decades-old interface is no
longer available, and the answer "well, we decided to drop it" falls a
little flat since there seems to be no actual reason to not just support
both mechanisms.
Can your package handle the classic on-disk format when it is compiled
with 64 bit time_t? I remember there was some discussion about that
back then.
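
To make the question concrete: on x86-64 glibc the classic on-disk record is 384 bytes and keeps a signed 32-bit seconds field even though time_t itself is 64 bits wide. A sketch in Python under that assumption follows; the format string and the field names are a reading of glibc's struct utmp, may not hold on other architectures, and parse_record is a hypothetical helper.

```python
import struct

# Assumed layout of one glibc `struct utmp` record on x86-64 Linux:
# type, pad, pid, line, id, user, host, exit status (2 shorts), session,
# tv_sec, tv_usec, addr_v6 (4 ints), unused.
UTMP_FORMAT = "<h2xi32s4s32s256shhiii4i20s"
RECORD = struct.Struct(UTMP_FORMAT)

def parse_record(raw):
    """Decode one record into the fields a who(1)-style listing needs."""
    fields = RECORD.unpack(raw)
    ut_type, pid, line, _id, user, host = fields[:6]
    tv_sec = fields[9]  # signed 32-bit seconds since the epoch

    def text(value):
        # Fixed-width fields are NUL-padded; keep only the part before the pad.
        return value.split(b"\0", 1)[0].decode(errors="replace")

    return {"type": ut_type, "pid": pid, "line": text(line),
            "user": text(user), "host": text(host), "tv_sec": tv_sec}

print(RECORD.size)  # 384
```

Because tv_sec is packed as a signed 32-bit integer, a reader built with 64-bit time_t can still decode existing records, but no writer can store a login time past January 2038 in this format.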

Greetings
Marc
--
----------------------------------------------------------------------------
Marc Haber | " Questions are the | Mailadresse im Header
Rhein-Neckar, DE | Beginning of Wisdom " |
Nordisch by Nature | Lt. Worf, TNG "Rightful Heir" | Fon: *49 6224 1600402
Michael Stone
2025-04-04 15:40:01 UTC
Post by Marc Haber
Can your package handle the classic on-disk format when it is compiled
with 64 bit time_t? I remember there was some discussion about that
back then.
If you touch /run/utmp you'll see it is still working fine with any
packages that haven't ripped out support.
Antonio Terceiro
2025-04-03 13:00:02 UTC
/run/utmp is no longer provided in trixie, which means that the mechanisms
used to show active sessions in unix for several decades no longer work.
There's a replacement mechanism provided by systemd, but it's not 1:1. I
propose that for trixie *both* mechanisms are active, so a person can choose
between them (and compare the output, to better identify gaps between the
historic utmp mechanism and the new and improved systemd facility). I've
been told that the reason this can't be done is that utmp isn't y2038
compliant, but it seems to me that we won't be supporting trixie in y2038,
so who cares? Are there any factors to consider that I've missed?
I never cared about /run/utmp in itself, but I got used to last(1).
FWIW, a new implementation of last is now provided by wtmpdb.
Michael Stone
2025-04-04 20:20:01 UTC
Post by Antonio Terceiro
I never cared about /run/utmp in itself, but I got used to last(1).
FWIW, a new implementation of last is now provided by wtmpdb.
+1
great, it looks like that
* wtmpdb(8)
could be a well alternative to who(1), as the discussed alternatives
wtmp goes with last, utmp goes with who; they have different semantics
Andrew Bower
2025-04-04 21:30:02 UTC
Hi Dirk,
Post by Antonio Terceiro
I never cared about /run/utmp in itself, but I got used to last(1).
FWIW, a new implementation of last is now provided by wtmpdb.
+1
great, it looks like that
* wtmpdb(8)
[...]
The program arguments are not fully compatible with Unix equivalent
last(1). I.e. it seems not to be possible to just filter out all
current still active sessions, which should be provided by `last -p`
in the Unix world.
Presumably you used 'last -p now' for this? It looks like this would be
satisfied by a richer range of accepted time specifications by wtmpdb.
Do you want to raise a bug? Worth adding if there is anything else
defective (it looks to me that crashed sessions get included, which
seems unhelpful).

Andrew

Holger Levsen
2025-04-03 23:10:01 UTC
package: releasenotes
/run/utmp is no longer provided in trixie, which means that the mechanisms
used to show active sessions in unix for several decades no longer work.
this should be filed as bug in the BTS, at the very least against the
releasenotes, so doing that now.
--
cheers,
Holger

⢀⣎⠟⠻⢶⣊⠀
⣟⠁⢠⠒⠀⣿⡁ holger@(debian|reproducible-builds|layer-acht).org
⢿⡄⠘⠷⠚⠋⠀ OpenPGP: B8BF54137B09D35CF026FE9D 091AB856069AAA1C
⠈⠳⣄

Do you believe in climate change? I don‘t. I also don‘t ‚believe‘ in 2 plus 2
being 4. The idea of ‚believing‘ in climate change was the beginning of framing
science as ideology. Don‘t fall for it.
Richard Lewis
2025-04-04 19:30:01 UTC
Post by Holger Levsen
package: releasenotes
/run/utmp is no longer provided in trixie, which means that the mechanisms
used to show active sessions in unix for several decades no longer work.
this should be filed as bug in the BTS, at the very least against the
releasenotes, so doing that now.
Is this different to https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1083102

and, more importantly, is
https://salsa.debian.org/ddp-team/release-notes/-/merge_requests/214
missing anything from this thread?