Discussion:
Supporting alternative zlib implementations
Mark Brown
2024-09-24 14:10:01 UTC
A recurring question with the zlib package in Debian is interest in the
various alternative zlib implementations that are out there. There was
a long period where upstream zlib development seemed very stalled, and
during that period people who wanted improvements started forking their
own projects. The main ones I'm aware of are:

zlib-ng: https://github.com/zlib-ng/zlib-ng
chromium: https://chromium.googlesource.com/chromium/src/third_party/zlib

zlib-ng seems pretty healthy, the chromium fork is less generally active
but is used by Chrome/ChromeOS which is a big userbase.

The main thing people seem excited about is performance work for modern
platforms though both projects have been doing other work on the code.
Unfortunately it looks like there is little interest in bringing these
forks together in spite of zlib's upstream development having picked up
a bit again.

Fedora did a transition to zlib-ng relatively recently, in version 40:

https://fedoraproject.org/wiki/Changes/ZlibNGTransition
https://packages.fedoraproject.org/pkgs/zlib/zlib/
https://packages.fedoraproject.org/pkgs/zlib-ng/zlib-ng/

In the past I've pushed back on doing anything here since zlib is
essential and it seemed better to be consistent over the ecosystem than
to use a more niche implementation, and some of the early optimisation
efforts had not worked well on CPUs other than their immediate targets.
However given the user feedback and looking at the Fedora experience I
think it might be time to reevaluate that.

Obviously it's far too late to do anything with the default for trixie,
we might want to evaluate doing something after the release but for now
it's too late.

There's been some ongoing discussion (which sadly I wasn't looped into
most of) of zlib-ng in WNPP:

https://bugs.debian.org/1002056

with some packaging done, but not AIUI building the zlib compatible ABI
for zlib-ng yet which would allow it to be used as a replacement.
Adding support for the compatible ABI, allowing it to be an alternative
for standard zlib, seems to me like an obvious step we could take. It
would need a lot of care given that zlib is essential, but it would let
people get zlib-ng if they wanted, and if there are problems it can be
held in unstable (or experimental) to avoid impact on trixie. This
would allow people to kick the tires.

Does anyone have thoughts on this?
Guillem Jover
2024-09-24 15:50:01 UTC
Hi!
Post by Mark Brown
In the past I've pushed back on doing anything here since zlib is
essential and it seemed better to be consistent over the ecosystem than
to use a more niche implementation, and some of the early optimisation
efforts had not worked well on CPUs other than their immediate targets.
However given the user feedback and looking at the Fedora experience I
think it might be time to reevaluate that.
Great! I'm happy to hear that.
Post by Mark Brown
Obviously it's far too late to do anything with the default for trixie,
we might want to evaluate doing something after the release but for now
it's too late.
Personally I don't think it's too late, there should be several months
until the freeze, and I think if we wanted to switch we could perhaps
do a staged transition and see how it goes and only do the final
replacement if everything seems fine.
Post by Mark Brown
There's been some ongoing discussion (which sadly I wasn't looped into
most of) of zlib-ng in WNPP:
https://bugs.debian.org/1002056
with some packaging done, but not AIUI building the zlib compatible ABI
for zlib-ng yet which would allow it to be used as a replacement.
Adding support for the compatible ABI allowing it to be an alternative
for standard zlib seems to me like an obvious step we could take, it
would need a lot of care given that zlib is essential but would let
people get zlib-ng if they wanted, and if there are problems it can be
held in unstable (or experimental) to avoid impact on trixie. This
would allow people to kick the tires.
Sorry, I've been meaning to bring this up again to your attention,
given that as you mention zlib-ng has seen steady development and
buy in from the community at large. But at the same time, I've been
both a bit reluctant to upload anything to avoid the impression of
some kind of attempt at a hostile takeover, and to bring this up to
you as from your earlier push back I thought that would require some
(perhaps) exceeding changed circumstances. But given your mail, I'm
happy to work on this again and start with say uploading some initial
stuff into experimental for example, after this thread settles a bit?

(I'll start by refreshing the packaging first though.)
Post by Mark Brown
Does anyone have thoughts on this?
Personally, I think fully migrating from zlib to zlib-ng would sound
great (even for trixie), but I guess we can take it slow if you do not
feel confident or have concerns over this.

Also if you'd prefer to take over the zlib-ng ITP, as a continuation
of zlib, that'd seem fine with me too.

Thanks,
Guillem
Fay Stegerman
2024-09-24 22:50:01 UTC
* Guillem Jover <***@debian.org> [2024-09-24 17:45]:
[...]
Post by Guillem Jover
Personally, I think fully migrating from zlib to zlib-ng would sound
great (even for trixie), but I guess we can take it slow if you do not
feel confident or have concerns over this.
As using an alternative zlib implementation could impact Reproducible Builds
[1], I would recommend taking that into consideration before deciding on this
kind of change.

- Fay

[1] https://lists.reproducible-builds.org/pipermail/rb-general/2024-September/003526.html
Guillem Jover
2024-09-25 00:00:01 UTC
Hi!
Post by Fay Stegerman
Post by Guillem Jover
Personally, I think fully migrating from zlib to zlib-ng would sound
great (even for trixie), but I guess we can take it slow if you do not
feel confident or have concerns over this.
As using an alternative zlib implementation could impact Reproducible Builds
[1], I would recommend taking that into consideration before deciding on this
kind of change.
Ah, this is related to something I wanted to mention too and forgot.

I don't think the specific case you mention is in itself a concern for
Debian, because we only guarantee reproducibility given the same inputs,
which includes the set of packages and their versions that were used
when building the binaries. So if there was a switch, those would end up
being recorded as well, and used when reproducing the outputs. And this
could also happen with a newer version of zlib itself.
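
To illustrate the "recorded as part of the inputs" idea outside of Debian's own tooling, here is a minimal Python sketch that notes which zlib a process is actually using. The function name zlib_build_record and the shape of the record are hypothetical; only zlib.ZLIB_VERSION and zlib.ZLIB_RUNTIME_VERSION come from the Python standard library. Whether a zlib-ng compat build identifies itself in the runtime version string is an assumption about how that build is configured.

```python
import json
import platform
import zlib


def zlib_build_record() -> dict:
    """Describe the zlib in use so it can be stored next to build outputs."""
    return {
        # Version of the zlib headers Python was compiled against.
        "zlib_compiled": zlib.ZLIB_VERSION,
        # Version of the library actually loaded at run time.
        "zlib_runtime": zlib.ZLIB_RUNTIME_VERSION,
        # Architecture, since hardware differences came up in this thread.
        "machine": platform.machine(),
    }


if __name__ == "__main__":
    print(json.dumps(zlib_build_record(), indent=2, sort_keys=True))
```

Storing such a record alongside an artifact would at least make a library switch visible when a later rebuild produces different bytes.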

The problem though is, that because the compressed stream is going to
change, that can make certain test suites fail if we perform this
switch, which I think would be the main fallout that we'd see from
this and would need manual fixing, although I assume Fedora has probably
handled most of these already. For example when I added explicit
zlib-ng support to dpkg, I had to fix its test suite to parametrize
sizes for test artifacts.
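
As a rough illustration of that kind of fix (this is not dpkg's actual test code, and all names below are made up for the example), a test can check the round trip and a loose size bound instead of hard-coding the exact compressed size:

```python
import zlib

# Hypothetical stand-in for a test artifact.
PAYLOAD = b"Debian package payload\n" * 2000


def check_roundtrip(data: bytes, level: int = 6) -> int:
    """Compress and decompress data, returning the compressed size."""
    compressed = zlib.compress(data, level)
    # This must hold for any conforming deflate implementation.
    assert zlib.decompress(compressed) == data
    return len(compressed)


def test_compression_is_implementation_agnostic() -> None:
    size = check_roundtrip(PAYLOAD)
    # Loose bound rather than an exact byte count, which would break
    # whenever the compressor (or its version) changes.
    assert 0 < size < len(PAYLOAD)


if __name__ == "__main__":
    test_compression_is_implementation_agnostic()
    print("ok")
```

A test written this way should keep passing whether the underlying library is zlib, zlib-ng, or a newer zlib release.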

I think it would be pretty easy to at least see the extent of this
fallout by performing a mass rebuild for packages build-depending
on zlib1g-dev with a zlib-ng version.

Thanks,
Guillem
Mike Hommey
2024-09-25 00:40:01 UTC
Post by Guillem Jover
Hi!
Post by Fay Stegerman
Post by Guillem Jover
Personally, I think fully migrating from zlib to zlib-ng would sound
great (even for trixie), but I guess we can take it slow if you do not
feel confident or have concerns over this.
As using an alternative zlib implementation could impact Reproducible Builds
[1], I would recommend taking that into consideration before deciding on this
kind of change.
Ah, this is related to something I wanted to mention too and forgot.
I don't think the specific case you mention is in itself a concern for
Debian, because we only guarantee reproducibility given the same inputs,
which includes the set of packages and their versions that were used
when building the binaries. So if there was a switch, those would end up
being recorded as well, and used when reproducing the outputs. And this
could also happen with a newer version of zlib itself.
The problem though is, that because the compressed stream is going to
change, that can make certain test suites fail if we perform this
switch, which I think would be the main fallout that we'd see from
this and would need manual fixing, although I assume Fedora has probably
handled most of these already. For example when I added explicit
zlib-ng support to dpkg, I had to fix its test suite to parametrize
sizes for test artifacts.
As someone who recently tested a rust port of zlib-ng, another factor to
take into account is that while zlib-ng is faster, it also looks like it
compresses less at the same compression level.
Using higher compression levels with the same compression rate is also
usually faster too, but without touching the compression levels, you
end up with something that compresses in less time, but also with a
bigger output.
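
For anyone wanting to check this trade-off on their own data, a minimal sketch along these lines could be run once with each library backing Python's zlib module. The measure helper and the synthetic sample are assumptions made for the example; real corpora will behave differently, and single timings like this are noisy.

```python
import time
import zlib


def measure(data: bytes, level: int) -> tuple[int, float]:
    """Return (compressed size in bytes, elapsed seconds) for one level."""
    start = time.perf_counter()
    compressed = zlib.compress(data, level)
    elapsed = time.perf_counter() - start
    return len(compressed), elapsed


if __name__ == "__main__":
    # Synthetic, mildly repetitive sample standing in for real input.
    sample = b"".join(
        b"line %d of a build log\n" % (i % 997) for i in range(200_000)
    )
    for level in (1, 6, 9):
        size, seconds = measure(sample, level)
        ratio = size / len(sample)
        print(f"level {level}: {size} bytes ({ratio:.1%}), "
              f"{seconds * 1000:.1f} ms")
```

Comparing the two runs level by level shows both effects at once: time per level, and the output size each implementation reaches at that level.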

Mike
Simon Richter
2024-09-25 02:30:02 UTC
Hi,
Post by Guillem Jover
So if there was a switch, those would end up
being recorded as well, and used when reproducing the outputs. And this
could also happen with a newer version of zlib itself.
I have a POWER9 box, which includes a NX-GZIP coprocessor, which is
currently not used for anything, but it makes even -9 compression very
cheap (9 GB/s). If I were to use it (probably through the kernel
subsystem), I should probably record it somewhere.

Thinking about it, I'd expect POWER to generate different output than
x86 when building an arch:all package with the result of some floating
point computation, at least when -ffast-math is active (x86 being wrong).

Simon
Mark Brown
2024-09-25 09:00:01 UTC
Post by Guillem Jover
The problem though is, that because the compressed stream is going to
change, that can make certain test suites fail if we perform this
switch, which I think would be the main fallout that we'd see from
this and would need manual fixing, although I assume Fedora has probably
handled most of these already. For example when I added explicit
zlib-ng support to dpkg, I had to fix its test suite to parametrize
sizes for test artifacts.
I guess this is also a risk for zlib upgrades, seems a bit fragile.
Fay Stegerman
2024-09-25 23:40:01 UTC
Post by Guillem Jover
Hi!
Post by Fay Stegerman
Post by Guillem Jover
Personally, I think fully migrating from zlib to zlib-ng would sound
great (even for trixie), but I guess we can take it slow if you do not
feel confident or have concerns over this.
As using an alternative zlib implementation could impact Reproducible Builds
[1], I would recommend taking that into consideration before deciding on this
kind of change.
Ah, this is related to something I wanted to mention too and forgot.
I don't think the specific case you mention is in itself a concern for
Debian, because we only guarantee reproducibility given the same inputs,
which includes the set of packages and their versions that were used
when building the binaries. So if there was a switch, those would end up
being recorded as well, and used when reproducing the outputs. And this
could also happen with a newer version of zlib itself.
The problem though is, that because the compressed stream is going to
change, that can make certain test suites fail if we perform this
switch, which I think would be the main fallout that we'd see from
this and would need manual fixing, although I assume Fedora has probably
handled most of these already. For example when I added explicit
zlib-ng support to dpkg, I had to fix its test suite to parametrize
sizes for test artifacts.
Whilst it indeed may not affect the reproducibility guarantees for Debian
packages themselves, it does affect being able to use a Debian system for
Reproducible Builds of other software for which the reference artefacts were
built with regular zlib and thus can no longer be reproduced on Debian if that
uses a different zlib implementation (so far I've only encountered the reverse,
which seems relatively rare -- for now).

For example, ZIP files or Android APKs built on a Debian system will have a
different compressed stream, like the test files you mention. Which will likely
break Reproducible Builds tooling like apksigcopier [1] and
reproducible-apk-tools [2].

AFAIK all rebuilders (including my own [3]) for Android APKs use Debian base
systems, so this could cause quite a bit of breakage for Reproducible Builds
within that ecosystem, which is something I would like to avoid (or at least
have a decent workaround for -- e.g. being able to easily choose between
multiple zlib implementations during runtime in my Python tooling would be
great).

As you point out, we've been lucky that zlib has remained backwards-compatible
for a long time (even though it doesn't provide any guarantees of that AFAIK).
Which also makes me wonder how much more likely zlib-ng might be to produce
different compressed streams between different versions or using different
hardware (configurations).

There might also be issues with reproducibility of Debian packages themselves if
e.g. zlib-ng output can differ on different hardware (e.g. number of cores) even
with an otherwise identical build environment. At the very least I think it
would be good to know how all this could be affected (and how likely things are
to remain as stable as zlib has been so far) before making a decision to switch.
Post by Guillem Jover
I think it would be pretty easy to at least see the extent of this
fallout by performing a mass rebuild for packages build-depending
on zlib1g-dev with a zlib-ng version.
- Fay

[1] https://tracker.debian.org/pkg/apksigcopier
[2] https://github.com/obfusk/reproducible-apk-tools
[3] https://github.com/obfusk/rbtlog
Sebastian Andrzej Siewior
2024-10-03 20:10:01 UTC
Post by Fay Stegerman
For example, ZIP files or Android APKs built on a Debian system will have a
different compressed stream, like the test files you mention. Which will likely
break Reproducible Builds tooling like apksigcopier [1] and
reproducible-apk-tools [2].
wouldn't it work to compare the decompressed stream? Is an identical ZIP
file a requirement?
Post by Fay Stegerman
There might also be issues with reproducibility of Debian packages themselves if
e.g. zlib-ng output can differ on different hardware (e.g. number of cores) even
with an otherwise identical build environment. At the very least I think it
would be good to know how all this could be affected (and how likely things are
to remain as stable as zlib has been so far) before making a decision to switch.
I don't know at this time. Maybe we could throw it into exp first and
evaluate the situation.
Post by Fay Stegerman
- Fay
Sebastian
Fay Stegerman
2024-10-03 22:20:01 UTC
Post by Sebastian Andrzej Siewior
Post by Fay Stegerman
For example, ZIP files or Android APKs built on a Debian system will have a
different compressed stream, like the test files you mention. Which will likely
break Reproducible Builds tooling like apksigcopier [1] and
reproducible-apk-tools [2].
wouldn't it work to compare the decompressed stream? Is an identical ZIP
file a requirement?
By definition a Reproducible Build means a bit-by-bit identical APK, including
the signature (which is why I built a tool to extract an existing signature and
use it as a build input instead of the private key). Which means you need
identical compressed data for Reproducible Builds.

Having identical uncompressed data gets you pretty close to the goals of RB, but
unpacking and/or skipping over signatures is very very hard to get right and
simply cannot provide the same guarantees as having two bitwise identical files.

And it's impossible to create an APK you can actually install if it's not
bit-by-bit identical as the signature would not be valid otherwise. So yes,
unfortunately an identical ZIP file is a requirement and comparing the
decompressed stream not an option, which is why this kind of change is not
something we can just consider an implementation detail or work around.
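
A small Python sketch of the distinction, using two compression levels of the same library as a stand-in for two different implementations (an assumption made purely for illustration; the payload is a placeholder, not real APK content):

```python
import hashlib
import zlib

# Placeholder for a file stored inside a ZIP/APK.
data = b"classes.dex contents stand-in\n" * 1000

# Two valid encodings of the same data; different levels stand in for
# two different deflate implementations.
stream_a = zlib.compress(data, 1)
stream_b = zlib.compress(data, 9)

# Both streams decode to the same payload ...
same_payload = zlib.decompress(stream_a) == zlib.decompress(stream_b)
# ... but the bytes that a signature or checksum covers are not the same.
same_bytes = stream_a == stream_b

digest_a = hashlib.sha256(stream_a).hexdigest()
digest_b = hashlib.sha256(stream_b).hexdigest()

print("decompressed data identical:", same_payload)
print("compressed bytes identical: ", same_bytes)
print("sha256 of stream A:", digest_a)
print("sha256 of stream B:", digest_b)
```

Anything that hashes or signs the compressed bytes sees two different files here, even though the content is the same.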

I wrote more about the very messy situation Fedora's switch to zlib-ng already
created for Android Reproducible Builds [1]. Which likely would have broken a
lot more reproducible Android apps already if Fedora's OpenJDK packages linked
against the system zlib like Debian's OpenJDK packages do (instead of using an
embedded copy of regular zlib).

- Fay

[1] https://lists.reproducible-builds.org/pipermail/rb-general/2024-September/003547.html
Konstantin Demin
2024-10-04 04:30:01 UTC
One minor point: zlib-ng doesn't seem to be fully backward compatible.
E.g. Angie (nginx's fork with enhancements) is unable to perform gzip
compression [1] if built against zlib-ng.
It's highly likely that nginx is affected too.

[1] https://t.me/angie_support/4205
Post by Fay Stegerman
Post by Sebastian Andrzej Siewior
Post by Fay Stegerman
For example, ZIP files or Android APKs built on a Debian system will have a
different compressed stream, like the test files you mention. Which will likely
break Reproducible Builds tooling like apksigcopier [1] and
reproducible-apk-tools [2].
wouldn't it work to compare the decompressed stream? Is an identical ZIP
file a requirement?
By definition a Reproducible Build means a bit-by-bit identical APK, including
the signature (which is why I built a tool to extract an existing signature and
use it as a build input instead of the private key). Which means you need
identical compressed data for Reproducible Builds.
Having identical uncompressed data gets you pretty close to the goals of RB, but
unpacking and/or skipping over signatures is very very hard to get right and
simply cannot provide the same guarantees as having two bitwise identical files.
And it's impossible to create an APK you can actually install if it's not
bit-by-bit identical as the signature would not be valid otherwise. So yes,
unfortunately an identical ZIP file is a requirement and comparing the
decompressed stream not an option, which is why this kind of change is not
something we can just consider an implementation detail or work around.
I wrote more about the very messy situation Fedora's switch to zlib-ng already
created for Android Reproducible Builds [1]. Which likely would have broken a
lot more reproducible Android apps already if Fedora's OpenJDK packages linked
against the system zlib like Debian's OpenJDK packages do (instead of using an
embedded copy of regular zlib).
- Fay
[1] https://lists.reproducible-builds.org/pipermail/rb-general/2024-September/003547.html
--
SY,
Konstantin Demin
Mark Brown
2024-09-25 08:50:02 UTC
Post by Guillem Jover
Post by Mark Brown
Obviously it's far too late to do anything with the default for trixie,
we might want to evaluate doing something after the release but for now
it's too late.
Personally I don't think it's too late, there should be several months
until the freeze, and I think if we wanted to switch we could perhaps
do a staged transition and see how it goes and only do the final
replacement if everything seems fine.
We do OTOH package more software than most distros on more architectures
so we got a lot more exposure for testing coverage, and the revert would
involve switching the entire implementation which complicates things a
bit compared to a risky patch within a package. I'm not totally
opposed, and if everything goes smoothly we could definitely implement
it within the timeframe, but it feels like an impactful change to
introduce now not having considered it sooner.
Post by Guillem Jover
(perhaps) exceeding changed circumstances. But given your mail, I'm
happy to work on this again and start with say uploading some initial
stuff into experimental for example, after this thread settles a bit?
(I'll start by refreshing the packaging first though.)
Sure.
Post by Guillem Jover
Post by Mark Brown
Does anyone have thoughts on this?
Personally, I think fully migrating from zlib to zlib-ng would sound
great (even for trixie), but I guess we can take it slow if you do not
feel confident or have concerns over this.
Also if you'd prefer to take over the zlib-ng ITP, as a continuation
of zlib, that'd seem fine with me too.
I'm fine with you carrying on with it (actually there is some slight
non-technical complication for me with doing it myself), or we could
also consider a packaging team. I think there was some other interest
in helping out but ICBW. If you're packaging it I'm also more confident
in letting you worry about how risky it is to transition and deal with
any fallout! :P
Guillem Jover
2024-11-22 11:40:01 UTC
Hi!

[ I'll try to summarize the current discussion and status, what might
be blockers, and a potential incremental way forward. ]
Post by Mark Brown
Post by Guillem Jover
Post by Mark Brown
Obviously it's far too late to do anything with the default for trixie,
we might want to evaluate doing something after the release but for now
it's too late.
Personally I don't think it's too late, there should be several months
until the freeze, and I think if we wanted to switch we could perhaps
do a staged transition and see how it goes and only do the final
replacement if everything seems fine.
We do OTOH package more software than most distros on more architectures
so we got a lot more exposure for testing coverage, and the revert would
involve switching the entire implementation which complicates things a
bit compared to a risky patch within a package. I'm not totally
opposed, and if everything goes smoothly we could definitely implement
it within the timeframe, but it feels like an impactful change to
introduce now not having considered it sooner.
True, also two months have passed since (that's on me!). At this time,
I'm now not sure whether it is feasible to consider such a switch, even
if there was agreement to do it. As it is, I think there are too many
unknowns!
Post by Mark Brown
Post by Guillem Jover
(perhaps) exceeding changed circumstances. But given your mail, I'm
happy to work on this again and start with say uploading some initial
stuff into experimental for example, after this thread settles a bit?
(I'll start by refreshing the packaging first though.)
Sure.
I did that, and the current WIP zlib-ng packaging provides now two
builds, one with the new native zng_* API and another (tentatively)
with the compat API/ABI one in libz-dev and libz1 binary packages.

I've tentatively chosen those package names for the compat libraries
to avoid having to go through NEW multiple times (with the assumption
that we'd either go ahead with the switch or the packages could then
simply be dropped). I think this should initially only be uploaded to
experimental, to avoid getting packages built with either zlib or
zlib-ng. But depending on the outcome of this discussion, I think other
(probably better) options would be to perhaps name the compat packages
something like libz-ng-compat*, or drop them completely?

WIP package at <https://git.hadrons.org/cgit/wip/debian/pkgs/zlib-ng.git/>.
Post by Mark Brown
Post by Guillem Jover
Post by Mark Brown
Does anyone have thoughts on this?
Personally, I think fully migrating from zlib to zlib-ng would sound
great (even for trixie), but I guess we can take it slow if you do not
feel confident or have concerns over this.
Also if you'd prefer to take over the zlib-ng ITP, as a continuation
of zlib, that'd seem fine with me too.
I'm fine with you carrying on with it (actually there is some slight
non-technical complication for me with doing it myself), or we could
also consider a packaging team. I think there was some other interest
in helping out but ICBW. If you're packaging it I'm also more confident
in letting you worry about how risky it is to transition and deal with
any fallout! :P
Ok, so after the feedback on this thread, and Sebastian asking how we
can proceed, here are the concerns brought up on this thread, along with my
own and things I think we need to check or consider:

* There were concerns (from Fay) about whether given same input the
output changes per arch or hw setup, we'd need to check this; I'd
expect this not to be the case for different arches, but it might
be an issue with number of cores for example, but if either is true
this would be a serious blocker.
* I've had concerns both about providing the zlib compat API and the
native zlib-ng API in sid, and then getting a mess of packages
linking against (true) zlib and against (native) zlib-ng, or
packages relying on specific behaviors from either and breaking
when switching from (true) zlib to zlib-ng-compat or vice versa,
for example.
* There were concerns (from Fay) about the output stream changing due
to a potential implementation switch and that affecting external
reproducibility. Personally I think while I can see how this is
annoying for the involved parties, it's part of the "you need
the same tools to generate the same output" premise that we also
assume in Debian. Keeping both implementations around indefinitely
would, I think, make this less of an issue, with the
potential drawbacks mentioned in the previous point.
* There was a concern (from Konstantin) about at least one known
upstream (Angie) misbehaving with zlib-ng generated streams.
* There were concerns (from Mark) that even though projects like
Fedora have done such a switch, we have way more packages and
architectures, so we might see more fallout that has not already
been handled.
* There was a concern (from Mike) about whether the performance gain
at the cost of stream size makes sense, given that the compression
level could be reduced instead to similar effect (?). I'm not sure
how these compare, so it would be interesting to analyze this,
because perhaps that's a less traumatic way to look at it (but that
might require redefining compression level semantics globally in
zlib, or patching users, neither of which looks like a very enticing option).
My perception from when I tested it is that the speed up was
significant enough and the size increase not so much, but… In any
case switching to zlib-ng upstream would also imply other benefits,
like (supposedly) a more responsive upstream with more frequent
releases, the new native API, and an implementation other
distributions are switching to.
* Some upstreams have started to use the zlib-ng native API, so
regardless of whether we plan a switch or not, I guess packaging
zlib-ng (w/ or w/o the compat API) might still make sense.
* To consider a switch we'd need to do a mass rebuild of the
archive. Ideally running autopkgtests and similar to exercise the
packages?
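
For the first point in the list above (whether the same input yields the same stream across architectures or hardware setups), one possible way to gather data is a small script run on each machine, with the printed reports then diffed. This is only a sketch: make_corpus and stream_digests are hypothetical helpers, the corpus is synthetic, and it only exercises whatever library backs Python's zlib module on that machine.

```python
import hashlib
import json
import platform
import zlib


def make_corpus() -> bytes:
    """Build a deterministic sample input without relying on random seeds."""
    chunks = [
        hashlib.sha256(str(i).encode()).hexdigest().encode() * 4
        for i in range(2000)
    ]
    return b"\n".join(chunks)


def stream_digests(data: bytes) -> dict:
    """Map each compression level to the SHA-256 of its compressed stream."""
    return {
        str(level): hashlib.sha256(zlib.compress(data, level)).hexdigest()
        for level in range(1, 10)
    }


if __name__ == "__main__":
    report = {
        "machine": platform.machine(),
        "zlib_runtime": zlib.ZLIB_RUNTIME_VERSION,
        "digests": stream_digests(make_corpus()),
    }
    print(json.dumps(report, indent=2, sort_keys=True))
```

If the "digests" sections differ between two machines reporting the same "zlib_runtime", that would point at the hardware-dependent output that the first item worries about.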


After having written the above, and if Mark agrees, I think I'd opt for
uploading zlib-ng to experimental, with the compat packages renamed to
libz-ng-compat* or similar (even if that implies later on another trip
through NEW if we want to perform a full switch), because that might
make it easier to move them to sid as a way less disruptive change,
even if we decide not to switch the default zlib implementation.

OTOH and unfortunately I don't think I'm currently prepared to drive any
of the mass archive rebuilds and testing that I think might be required,
or the analysis mentioned above.

Thanks,
Guillem

Charles Plessy
2024-09-25 00:40:01 UTC
Post by Mark Brown
zlib-ng: https://github.com/zlib-ng/zlib-ng
Hi Mark, just out of curiosity, would the carbon footprint of Debian be
lower or higher after replacing zlib with zlib-ng?

Have a nice day,

Charles
--
Charles Plessy Nagahama, Yomitan, Okinawa, Japan
Debian Med packaging team http://www.debian.org/devel/debian-med
Tooting from home https://framapiaf.org/@charles_plessy
- You do not have my permission to use this email to train an AI -
Mark Brown
2024-09-25 08:50:01 UTC
Post by Charles Plessy
Post by Mark Brown
zlib-ng: https://github.com/zlib-ng/zlib-ng
Hi Mark, just out of curiosity, would the carbon footprint of Debian be
lower or higher after replacing zlib with zlib-ng?
You could probably calculate it either way depending on how you want to
make up the numbers; running faster will take less time but larger
outputs might take more storage.