Discussion:
Associating .texi files to the media type text/prs.texi?
Charles Plessy
2025-02-03 22:10:01 UTC
Hello everybody,

I am preparing the update of `/etc/mime.types` for the Trixie release.

A new media type, text/prs.texi, declares association with the file
extension `texi`. On the other hand, the association between the
Texinfo format and `.texi` files has never been declared to the IANA.
As of today, Debian systems associate Texinfo files with the media type
`application/x-texinfo`.

I am very strongly in favor of sticking as strictly as possible to the
IANA declarations for the content of `/etc/mime.types`. I confirmed for
myself that IANA declarations are very easy nowadays when I registered
`application/vnd.debian.binary-package` and `text/vnd.debian.copyright`
a dozen years ago. The declaration form is here:
<https://www.iana.org/form/media-types>.

The IANA declarations are not specific to UNIX and the IANA does not
specify the contents of `/etc/mime.types`. This file is also not the only
way on Debian systems to determine the media type of a file. Similar
and more powerful alternatives are provided by the `file` and
`shared-mime-info` packages for instance.

One serious limitation with `/etc/mime.types` is that its contents must
be adjusted to the fact that most tools that parse it are not able to
handle the case where two media types are associated with the same
extension. And with the growth and ageing of computing, this is
increasingly a problem. So when this case arises I need to make a
choice and keep only one association. This is documented in the README
file of the `media-types` package.
(<https://salsa.debian.org/debian/media-types#removal-of-duplicated-file-extensions>)
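
To make the duplicate-extension problem concrete, here is a minimal Python sketch that scans text in the `mime.types` layout (a media type followed by whitespace-separated extensions, `#` starting a comment) and reports every extension claimed by more than one type. The `SAMPLE` excerpt and the function name `duplicated_extensions` are illustrative assumptions, not part of the `media-types` package.

```python
from collections import defaultdict

# Hypothetical excerpt in the mime.types layout, not the real Debian file.
SAMPLE = """\
# media type             extensions
application/x-texinfo    texinfo texi
text/prs.texi            texi
text/plain               txt asc
"""

def duplicated_extensions(text):
    """Map each extension claimed by several media types to those types."""
    claims = defaultdict(list)
    for line in text.splitlines():
        # Drop comments, then split into the media type and its extensions.
        fields = line.split("#", 1)[0].split()
        if not fields:
            continue
        media_type, extensions = fields[0], fields[1:]
        for ext in extensions:
            if media_type not in claims[ext]:
                claims[ext].append(media_type)
    return {ext: types for ext, types in claims.items() if len(types) > 1}

print(duplicated_extensions(SAMPLE))
```

A parser that keeps a single type per extension would silently drop one of the two entries reported for `texi`, which is why only one association can be shipped.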

In principle I want to give precedence to declared types over undeclared
ones. As I wrote above, applications using `/etc/mime.types` are using
the simplest and most limited way to determine file media types, and
errors can happen. I wonder if having Texinfo files presented as
`text/prs.texi` from time to time would be such a bad thing, especially
since they are all `text/*` after all. Web browsers would still offer
users the possibility to open the file with a text editor. And the
situation could be easily reverted by somebody declaring `text/texinfo`
to the IANA.

So please let me know if you think that something would really break if
I associated `text/prs.texi` with `.texi` files.

Have a nice day,

Charles
--
Charles Plessy Nagahama, Yomitan, Okinawa, Japan
Debian Med packaging team http://www.debian.org/devel/debian-med
Tooting from work, https://fediscience.org/@charles_plessy
Tooting from home, https://framapiaf.org/@charles_plessy
Simon Josefsson
2025-02-03 23:20:01 UTC
> And the situation could be easily reverted by somebody declaring
> `text/texinfo` to the IANA.
I did so now.

/Simon
Simon Josefsson
2025-04-23 08:50:01 UTC
Post by Simon Josefsson
>> And the situation could be easily reverted by somebody declaring
>> `text/texinfo` to the IANA.
> I did so now.
There was a bunch of discussion back and forth with IANA and eventually
application/texinfo was registered:

https://www.iana.org/assignments/media-types/media-types.xhtml#application
https://www.iana.org/assignments/media-types/application/texinfo

There were some charset concerns about text/texinfo (that I never
managed to understand myself, so I can't confirm if they weren't just
imaginary), and some people thought application/x-texinfo was more
widespread than text/x-texinfo.

I didn't understand from your first e-mail what your thinking around GNU
Texinfo format for the mime.types registry is? As far as I can tell
there is a proper application/x-texinfo entry already? I tried reading
about the text/prs.texi format on

https://www.iana.org/assignments/media-types/text/prs.texi

but the links aren't working. Is there any software that supports that
format in Debian?

My preference would be to have application/texinfo associated with
*.texi and *.texinfo. If there is some preference mechanism in place, I
would prefer that application/texinfo is before text/prs.texi.

Of course, I see no reason to remove support for application/x-texinfo,
it should from now on just be an alias for application/texinfo.
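
The alias idea can be sketched in Python as a small lookup that maps a legacy name to its registered counterpart. The `ALIASES` table and the function name `canonical_media_type` are hypothetical; the only mapping taken from this thread is `application/x-texinfo` to `application/texinfo`.

```python
# Assumed alias table: legacy name -> registered name.
ALIASES = {
    "application/x-texinfo": "application/texinfo",
}

def canonical_media_type(name):
    """Return the registered name for a known alias, else the name itself."""
    # Media type names are case-insensitive, so normalize before the lookup.
    key = name.strip().lower()
    return ALIASES.get(key, key)

print(canonical_media_type("Application/X-Texinfo"))
print(canonical_media_type("text/prs.texi"))
```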

/Simon
Jakub Wilk
2025-04-23 09:50:01 UTC
Post by Simon Josefsson
> https://www.iana.org/assignments/media-types/application/texinfo
The "Published specification" link is:
https://www.gnu.org/software/texinfo/manual/texinfo/texinfo.html#Info-Format-Specification

This points to the chapter about the Info format, which is very
different than Texinfo. I recommend removing the anchor from that URL.
--
Jakub Wilk
Simon Josefsson
2025-04-23 10:30:01 UTC
Post by Jakub Wilk
> Post by Simon Josefsson
>> https://www.iana.org/assignments/media-types/application/texinfo
> https://www.gnu.org/software/texinfo/manual/texinfo/texinfo.html#Info-Format-Specification
> This points to the chapter about the Info format, which is very
> different than Texinfo. I recommend removing the anchor from that URL.
Oops, thank you! I've asked IANA to implement your suggestion.

/Simon
Charles Plessy
2025-04-25 23:10:01 UTC
Post by Simon Josefsson
> There was a bunch of discussion back and forth with IANA and eventually
Thanks a lot!
Post by Simon Josefsson
> I didn't understand from your first e-mail what your thinking around GNU
> Texinfo format for the mime.types registry is? As far as I can tell
> there is a proper application/x-texinfo entry already? I tried reading
> about the text/prs.texi format on
> https://www.iana.org/assignments/media-types/text/prs.texi
> but the links aren't working. Is there any software that supports that
> format in Debian?
In /etc/mime.types, I try to remove unofficial types, and give preference
to the ones declared to the IANA for the association with file extensions.

In the case of the .texi extension, it was recently declared by text/prs.texi,
and if I follow my own policy, then I would need to de-associate it from
text/x-texinfo regardless of the relevance of text/prs.texi. Duplicate entries in
/etc/mime.types are not supported by some software, including web browsers.
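
The precedence rule described here can be sketched in Python as follows. This is only an illustration under stated assumptions: the function name `choose_media_type` is hypothetical, the candidate list is treated as an ordered preference, and the set of registered names is supplied by the caller rather than fetched from the IANA registry. It is not the tooling used to build the `media-types` package.

```python
def choose_media_type(candidates, registered):
    """Pick one media type for an extension.

    The first candidate that is registered wins; if none is registered,
    fall back to the first candidate.
    """
    if not candidates:
        raise ValueError("at least one candidate media type is required")
    for media_type in candidates:
        if media_type in registered:
            return media_type
    return candidates[0]

# Situation before application/texinfo was registered.
before = choose_media_type(
    ["application/x-texinfo", "text/prs.texi"],
    registered={"text/prs.texi"},
)

# Situation afterwards, with application/texinfo listed first.
after = choose_media_type(
    ["application/texinfo", "text/prs.texi", "application/x-texinfo"],
    registered={"text/prs.texi", "application/texinfo"},
)

print(before, after)
```

When two registered types compete, the order of the candidate list decides, so any preference between them has to be stated explicitly.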

So what I did was refrain from adding text/prs.texi for Trixie and wait
for application/texinfo to be registered. Now I can update the package (maybe
not for Trixie) so that application/texinfo takes over from text/x-texinfo.

I hope this explains it better.

Thanks again!

Charles
--
Charles Plessy Nagahama, Yomitan, Okinawa, Japan
Debian Med packaging team http://www.debian.org/devel/debian-med
Tooting from home https://framapiaf.org/@charles_plessy
- You do not have my permission to use this email to train an AI -
Simon Josefsson
2025-04-26 18:30:01 UTC
Thanks for explaining!
> Duplicate entries in /etc/mime.types are not supported by some
> software including web browsers.
This was the part I was missing for my understanding. That seems like a
bug. Is progress on fixing that tracked anywhere? Is /etc/mime.types
still the state of the art for MIME mappings, or should applications
better use some other method?

/Simon
Charles Plessy
2025-04-27 01:00:01 UTC
> Is /etc/mime.types still the state of the art for MIME mappings, or should
> applications better use some other method?
Hi Simon,

using /etc/mime.types is not the state of the art for detecting the media type
of a file. It is quite ancient and tools that use it tend to rely on the fact
that it is not going to evolve. For this reason the following limitations are
unlikely to be addressed:

- there is an implicit promise that there will be no file extension duplicates.
- case-sensitivity of file extensions is not specified.
- alternatives to file extensions for media type detection are not provided.

By the way, the media-types package that distributes /etc/mime.types (and
nothing else) is Priority: standard, and therefore software that uses the file
unconditionally must depend on it.

There are two main alternatives:

- the file command and its libmagic library, which can probe a file's contents
for magic numbers and more complex patterns.

- the xdg-mime command, which queries the shared-mime-info database for file
extension and magic number matches.
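
For readers who want to try the two alternatives from a program, here is a minimal Python sketch that shells out to `file --brief --mime-type` or `xdg-mime query filetype`. The function name `probe_media_type` and the `COMMANDS` table are hypothetical, and the sketch assumes the tools are on `PATH`; it returns `None` when a tool is missing, the path is not a regular file, or the answer does not look like a media type.

```python
import os
import shutil
import subprocess

# Command lines for the two tools; the file path is appended at call time.
COMMANDS = {
    "file": ["file", "--brief", "--mime-type", "--"],
    "xdg-mime": ["xdg-mime", "query", "filetype"],
}

def probe_media_type(path, tool="file"):
    """Return the media type reported by `tool`, or None if unavailable."""
    command = COMMANDS[tool]
    if shutil.which(command[0]) is None or not os.path.isfile(path):
        return None
    result = subprocess.run(
        command + [path], capture_output=True, text=True, check=False
    )
    answer = result.stdout.strip()
    # Treat a failed run or an answer without "/" as "could not determine".
    if result.returncode != 0 or "/" not in answer:
        return None
    return answer
```

Calling `probe_media_type("manual.texi")` and `probe_media_type("manual.texi", tool="xdg-mime")` on the same file is a quick way to compare what the two databases report; the results depend on the installed versions.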

What /etc/mime.types provides that the two alternatives do not is the
exhaustive list of IANA-approved media types including those that are not
declaring an extension. I do not know how useful it is and would be pleased
to hear about applications. (Because maintaining that list is tedious).

In conclusion, my advice is not to use /etc/mime.types unless the alternatives
have serious drawbacks.

Have a nice day,
--
Charles Plessy Nagahama, Yomitan, Okinawa, Japan
Debian Med packaging team http://www.debian.org/devel/debian-med
Tooting from home https://framapiaf.org/@charles_plessy
- You do not have my permission to use this email to train an AI -