Musings about Usernames in adduser and Debian


Marc Haber

2024-11-21 17:50:01 UTC

[writing this with my adduser hat on. I am also in touch with the
maintainers of src:shadow and base-passwd]

Hi,

recently, I have "taken over" the wiki page about UserAccounts and have
put in some history and general thoughts about what Debian thinks about
user names and name restrictions.

https://wiki.debian.org/UserAccounts

I fear that I have opened an especially nasty can of worms by beginning
to do sanity checks in adduser and being pointed towards user name
encoding in that process. Can you help me to bring some sense into this
mess?

I would like to hear your comments. Feel free to directly apply
corrections to the wiki page. I am especially interested in having clear
terminology regarding Unicode code points, UTF-8, character strings and
byte strings. It is vitally important to be consistent here to avoid
making the mess even worse.
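
To make the terminology concrete, here is a small Python sketch of the difference between a character string (a sequence of Unicode code points) and the byte string that results from encoding it as UTF-8. The example name is made up for illustration, and nothing here reflects how adduser itself (which is written in Perl) handles names.

```python
import unicodedata

# A user name as a character string: a sequence of Unicode code points.
# \u00fc is the precomposed "u with diaeresis".
name = "j\u00fcrgen"

# The same name as a byte string, i.e. what ends up on disk if the
# name is stored UTF-8 encoded.
encoded = name.encode("utf-8")

# The code points, written in the usual U+XXXX notation.
code_points = [f"U+{ord(ch):04X}" for ch in name]

# The same visible name in decomposed form (NFD): "u" followed by a
# combining diaeresis. It looks identical but is a different string.
decomposed = unicodedata.normalize("NFD", name)

print(len(name), len(encoded))                            # code points vs. bytes
print(code_points)
print(len(decomposed), len(decomposed.encode("utf-8")))
print(name == decomposed)                                 # False: not equal as strings
```

The point of the last two lines is that "the same" name can exist as two different code point sequences, and therefore as two different byte strings, unless a normalization form is agreed on.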

For adduser's next release, I would like to discuss the following
things:

(1)
Should Debian allow UTF-8 user names in the first place or should we
restrict names for regular users to some set close to US-ASCII as well?
(I think yes, we should)

(2)
If the answer to (1) is "allow UTF-8", should we also do that for system
users? (I think no, we should not)

(2a)
Which UTF-8 subset / code point classes should we allow and which should
we reject? (I don't have an opinion about that)

(3)
I think that 32 characters/bytes (it's the same if we don't allow UTF-8)
is a good limitation for a system user name. But, should we increase
that for regular user names? (I think yes)

(4)
If we decide to relax some of our current requirements, where are the
boundaries between a "normal" user name, one that requires
--allow-bad-names and finally one that requires --allow-all-names?
Wouldn't it be offensive to speakers of some languages to require
--allow-bad-names for their special characters to be allowed in a user
name? (no opinion here that would not break backwards compatibility)

(5)
Is it right to say "the user name in /etc/passwd is UTF-8 encoded" or
would it be better to say "the user name in /etc/passwd can be UTF-8
encoded"?

(6)
Does it still make sense to give non-UTF-8-locales special handling
(which one?), or can adduser safely assume that any non-ascii locale is
UTF-8? Or must I check for locale and reject UTF-8 user names on
non-UTF-8 locales? (I hope that we can safely assume UTF-8)

(7)
Do the general restrictions for both kinds of user names make sense?
Going forward with this would mean rejecting user names that we used to
accept. (I think we should come close to systemd's ideas)

(8)
I think that our current way to restrict system account names is fine.
Any objections/additions here?

(9)
Should some of this language be in Policy instead of some random wiki
page? Policy is quite short about user names (chapter 9.2) (I think yes)

(10)
What should adduser do regarding subuids? Since I was ignorant about
that concept until a few hours ago, all accounts created by adduser do
have subuids, regardless of being system account or not, while useradd
does not give system accounts subuids.
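
As a rough illustration of questions (3) and (6), here is a Python sketch of three checks: a length limit counted in bytes rather than characters, a test whether a byte string is valid UTF-8 at all, and a test whether an encoding name (for example, one reported by the locale) refers to UTF-8. All function names are hypothetical, and applying the 32 limit to bytes is an assumption made for the example, not a decision anyone has taken.

```python
import codecs

# Limit from question (3). Counting it in bytes is an assumption of
# this sketch; the two counts only agree for pure ASCII names.
MAX_NAME_BYTES = 32


def fits_limit(name: str, limit: int = MAX_NAME_BYTES) -> bool:
    """Return True if the UTF-8 encoding of name is at most limit bytes."""
    return len(name.encode("utf-8")) <= limit


def is_valid_utf8(raw: bytes) -> bool:
    """Return True if raw decodes cleanly as UTF-8."""
    try:
        raw.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


def is_utf8_encoding(encoding_name: str) -> bool:
    """Return True if encoding_name is a known alias of UTF-8."""
    try:
        return codecs.lookup(encoding_name).name == "utf-8"
    except LookupError:
        return False


# 17 characters, but 34 bytes once encoded: over a 32-byte limit.
print(fits_limit("\u00fc" * 17))
# A Latin-1 encoded name is not valid UTF-8.
print(is_valid_utf8(b"j\xfcrgen"))
# Encoding names as a locale might report them.
print(is_utf8_encoding("UTF-8"), is_utf8_encoding("ISO-8859-1"))
```

A name of 17 two-byte characters passes a 32-character limit but fails a 32-byte one, which is why the "characters/bytes" distinction in (3) stops being harmless as soon as the answer to (1) is "allow UTF-8".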

Greetings
Marc

P.S.: The teams and individuals working on src:shadow, base-passwd and
adduser would appreciate your help in coding and packaging. You can get
in touch with all involved parties via
pkg-shadow-***@lists.alioth.debian.org

Richard Lewis

2024-11-21 22:10:01 UTC