Discussion:
Complete and unified documentation for new maintainers
Fabio Fantoni
2025-01-11 12:50:01 UTC
There has been a lot of talk about attracting and helping new
maintainers, and some improvements have been made here and there: the
documentation of gbp (the most used tool) has been improved, salsa and
salsa-ci have been improved, there is discussion about DEP-18, about
accepting DEP-14, and so on.

Having most packages in the same place (salsa), managed with the same
methods and tools (gbp), would be useful, obviously without preventing
anyone from using or continuing with something else. But if we want to
aim for this, we should first have complete, simple and quick
documentation written mainly with that goal in mind.

Today I tried to see what a new person who wants to start maintaining
packages would actually do, researching from their point of view with
simple internet searches. Unfortunately I found that the relevant
material is fragmented: it does not point towards anything unified, and
it is not simple or quick to follow either.

Search Google for "Debian create new package" and the first result is:
https://wiki.debian.org/HowToPackageForDebian

It points to various pages, but the most probable starting point seems
to be https://wiki.debian.org/Packaging/Intro

For pointing to git and gbp, https://wiki.debian.org/PackagingWithGit
seems more useful. It also mentions DEP-14, but it tells you to write
the first package outside of git and import it afterwards. Indeed, it
is not simple and quick to create the initial package starting directly
from git, nor to use gbp and DEP-14 from the start, nor to create the
repository on salsa immediately. I remember that for the last packages
I created from scratch, some time ago, I had to go through many steps
and workarounds, and only afterwards convert the branches to the DEP-14
names.

What would be the best, easiest and fastest procedure (especially for
newcomers) to create a new package from scratch, aiming to use git,
salsa, salsa-ci, gbp and DEP-14 from the beginning?

Once such a procedure is found, I think it would be useful to have it
well documented in one unified place that newcomers can be pointed to
(and that mentors can use and point to as well).
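For illustration only, here is a rough sketch of what such a
from-scratch procedure might look like, assuming gbp and dh-make are
installed; the package name foo and version 1.0 are placeholders, and
the exact options should be checked against gbp-import-orig(1) and
dh_make(8):

```shell
# Hypothetical sketch (not an official procedure): start a new package
# in git with DEP-14 branch names from the beginning.
mkdir foo && cd foo
git init

# Initial import of the upstream tarball; this creates the
# debian/latest, upstream/latest and pristine-tar branches.
gbp import-orig ../foo_1.0.orig.tar.gz \
    --debian-branch=debian/latest \
    --upstream-branch=upstream/latest \
    --pristine-tar

# Create the initial debian/ skeleton and commit it.
dh_make --single --packagename foo_1.0 --file ../foo_1.0.orig.tar.gz
git add debian && git commit -m "Initial Debian packaging"

# Record the DEP-14 branch layout so later gbp calls need no options.
cat > debian/gbp.conf <<'EOF'
[DEFAULT]
debian-branch = debian/latest
upstream-branch = upstream/latest
pristine-tar = True
EOF
git add debian/gbp.conf && git commit -m "Add gbp.conf with DEP-14 branches"
```

Whether something this short can actually work end to end is exactly
the question above.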


There are people with enough experience with git, good intuition, and
the ability to learn easily and quickly, but as I have noticed there
are also people who have little or no experience with git and find it
hard to learn new things.

An example is a new maintainer whom I helped a lot to package his
software for Debian. At the beginning he tried on his own; he arrived
on mentors, but despite some attempts, for a long time he continued
only with unofficial packages that he created in a simple and quick
way.

I explained the packaging itself to him, and then, given how hard it
was for him to learn everything needed for git and package management,
I thought it better not to complicate things further with separate
packaging on salsa, but to do it on a branch of the upstream repository
on GitHub (with gbp import-ref). He has now managed to prepare the
latest versions by himself, or almost.
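As a sketch of that approach (packaging on a branch of the upstream
repository), assuming upstream tags its releases; the URL, the tag v1.2
and the version are placeholders, and the exact options should be
checked against gbp-import-ref(1):

```shell
# Hypothetical sketch: keep the Debian packaging on a branch inside the
# upstream repository itself, instead of a separate packaging repo.
git clone https://github.com/example/foo && cd foo
git checkout -b debian/latest

# Merge the upstream release tagged v1.2 into the packaging branch,
# recording 1.2 as the upstream version.
gbp import-ref --upstream-tree=v1.2 --upstream-version=1.2 \
    --debian-branch=debian/latest
```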

In some cases like that it may be better to host the packaging outside
of salsa, but for the majority of cases I think it would be useful to
have detailed, simple, quick and unified documentation aimed at using
git, salsa and salsa-ci from the beginning. Using salsa-ci from the
start can provide some first tests in a simpler and faster way than
setting up local environments for clean builds on sid and running all
the various necessary checks; those can come shortly afterwards, so as
not to overload new people with too many tools and procedures. They
will already have their hands full doing the packaging itself
completely and following the policies, and then there are parts, such
as the complete and correct creation of debian/copyright, that in some
cases can take more time than all the rest of the packaging (but that
is another matter).


I don't know if I've managed to explain well what I mean, but from what
I've seen over the years, most of the people I've watched trying to
approach packaging have had difficulty finding documentation and help
(even on mentors, although that seems to have improved recently). I
haven't had time to follow people closely; I've mostly sent occasional
messages and tried to follow a few people specifically (unfortunately
without enough time for that either). What I seem to have noticed (I
could be wrong) is that most people "run away" at the beginning or
after a while, because of the difficulty of finding all the necessary
information and/or the amount of time spent on basic packaging while
following the policies and producing a complete and correct
d/copyright. Others seem to have left because of too much time spent
waiting for someone to review the package, or looking for sponsors who
could upload it (but that is also outside the point of this topic).
Andrey Rakhmatullin
2025-01-11 13:10:01 UTC
I don't know if I've managed to explain well what I mean, but from what I've
seen over the years, most of the people I've seen trying to approach
packaging have had difficulty finding documentation and help (even on
mentors, although it seems to have improved recently)
Yeah, not sure if that's your point, but I think everyone agrees that
we need a good new-packager document, and while there were some
attempts in the past (see links on
https://mentors.debian.net/intro-maintainers/ ) there is still AFAIK no
comprehensive and modern one. debmake-doc is the only one that tries to
be modern; not sure how successful it is at that.
--
WBR, wRAR
gregor herrmann
2025-01-12 00:20:01 UTC
Post by Andrey Rakhmatullin
Yeah, not sure if that's your point but I think everyone agrees that we
need a good new packager document and while there were some attempts in
the past (see links on https://mentors.debian.net/intro-maintainers/ )
there is still AFAIK no comprehensive and modern one. debmake-doc is the
only one that tries to be modern, not sure how successful is it in that.
My favourite packaging tutorial is still Lucas'
https://tracker.debian.org/pkg/packaging-tutorial
https://www.debian.org/doc/manuals/packaging-tutorial/packaging-tutorial.en.pdf


Cheers,
gregor
--
.''`. https://info.comodo.priv.at -- Debian Developer https://www.debian.org
: :' : OpenPGP fingerprint D1E1 316E 93A7 60A8 104D 85FA BB3A 6801 8649 AA06
`. `' Member VIBE!AT & SPI Inc. -- Supporter Free Software Foundation Europe
`-
Julien Plissonneau Duquène
2025-01-11 14:50:01 UTC
Post by Fabio Fantoni
What would be the best, easiest and fastest procedure (especially for
newcomers) to create a new package from scratch, aiming to use git,
salsa, salsa-ci, gbp and DEP14 from the beginning?
It Depends™. As in, it really depends on what you are packaging.
Different technologies have different Debian-specific tools, different
teams have different workflows, there are exceptions, and there are
exceptions to the exceptions.

There are indeed things that are common to all packaging, but IMO as
well as IME trying to document a generic, theoretical trunk that is
rarely used as-is is confusing, overwhelming and not really helpful to
newcomers. That's material for reference manuals, but newcomers should
probably first be guided through a reasonably simple step-by-step
tutorial and then practical cases and examples covering the nominal
packaging practices as adopted by the different clans.

The tutorial could for example show how to properly package a Gnome
game. I randomly picked [1] which already looks like a reasonable pick
for that purpose, but there might be better ones. It might help to craft
a packaging "reference implementation" out of a carefully chosen
project, and publish a frozen demo git repository that could be cloned
and used along the tutorial.

Practices that are specific to teams or maintainers should be documented
by them, as they are in the best place to know what is current and what
is not, and this is not always trivial from an outsider's point of view
(or there might be lingering disagreements over what should be
considered "best" practices). For example I know that the Java packaging
documentation needs to be improved (and the tooling as well) and I'm
planning to work on that later this year.

Cheers,


[1]: https://salsa.debian.org/gnome-team/gnome-2048
--
Julien Plissonneau Duquène
Ahmad Khalifa
2025-01-11 15:40:02 UTC
Write on Google "Debian create new package" and first result: https://
wiki.debian.org/HowToPackageForDebian
It points to various parts but mainly the more probable start point
seems https://wiki.debian.org/Packaging/Intro
To point to git and gbp seems more useful https://wiki.debian.org/
PackagingWithGit Here wrote also about DEP14, tell writing first package
out of git and after import, in fact it is not simple and fast to create
the initial package starting immediately from git and neither to use
immediately gbp and also DEP14, to create immediately on salsa etc... I
remember that the last packages created new some time ago I had to do
many steps, workarounds and only after convert the branches to the DEP14
names.
I just went through this learning process, and IMO the first thing to
learn is how to "package" (debuild, debhelper, lintian, sbuild,
devscripts); then you learn how to "publish" it (dput, gbp, DEP-14).
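As an illustration of that "package first" stage, a minimal
build-and-check loop with those tools might look like this (a sketch;
foo_1.0-1 is a placeholder):

```shell
# Inside the unpacked source tree that already has a debian/ directory:
debuild -us -uc                       # build unsigned source + binary packages
lintian ../foo_1.0-1_amd64.changes    # check the result for common problems

# Then rebuild in a clean sid chroot to catch missing build dependencies.
sbuild -d unstable ../foo_1.0-1.dsc
```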

Packaging/Intro wiki is still an excellent first read if only to get the
terminology in use (upstream tarball, source package, dsc, ...).
It is easiest to first create the first version of a package, outside of Git.
It's an advanced wiki page, not a starting step for newcomers.
--
Regards,
Ahmad
Andrey Rakhmatullin
2025-01-11 16:40:01 UTC
Write on Google "Debian create new package" and first result: https://
wiki.debian.org/HowToPackageForDebian
It points to various parts but mainly the more probable start point
seems https://wiki.debian.org/Packaging/Intro
To point to git and gbp seems more useful https://wiki.debian.org/
PackagingWithGit Here wrote also about DEP14, tell writing first package
out of git and after import, in fact it is not simple and fast to create
the initial package starting immediately from git and neither to use
immediately gbp and also DEP14, to create immediately on salsa etc... I
remember that the last packages created new some time ago I had to do
many steps, workarounds and only after convert the branches to the DEP14
names.
I just went through this learning process, and IMO, first thing to learn is
how to "package" (debuild, debhelper, lintian, sbuild, devscripts), then you
learn how to "publish" it (dput, gbp, DEP-14).
The gbp part is debatable.
For some people it's easier when they use a repo from the beginning and
even wrap some tools into other, git-related, tools. For them it's a part
of "packaging". But see below.
It is easiest to first create the first version of a package, outside of Git.
Yup. It's unfortunate.
--
WBR, wRAR
Fabio Fantoni
2025-01-11 17:20:02 UTC
Post by Ahmad Khalifa
https:// wiki.debian.org/HowToPackageForDebian
It points to various parts but mainly the more probable start point
seems https://wiki.debian.org/Packaging/Intro
To point to git and gbp seems more useful https://wiki.debian.org/
PackagingWithGit Here wrote also about DEP14, tell writing first
package out of git and after import, in fact it is not simple and
fast to create the initial package starting immediately from git and
neither to use immediately gbp and also DEP14, to create immediately
on salsa etc... I remember that the last packages created new some
time ago I had to do many steps, workarounds and only after convert
the branches to the DEP14 names.
I just went through this learning process, and IMO, first thing to
learn is how to "package" (debuild, debhelper, lintian, sbuild,
devscripts), then you learn how to "publish" it (dput, gbp, DEP-14).
Packaging/Intro wiki is still an excellent first read if only to get
the terminology in use (upstream tarball, source package, dsc, ...).
It is easiest to first create the first version of a package, outside of Git.
It's an advanced wiki, not a starting step for newcomers.
At a basic level I think almost everyone starts by trying some basic
packaging outside the official repositories, so it may be good to aim
first at creating the package, but ideally with documentation that is
both complete and up to date, and also as simple and quick as possible
for very basic cases.

The problem, it seems to me, is that there is not enough of a push
towards git, gbp, salsa and salsa-ci for those who want to package for
Debian.

There are also people who just want to package their software quickly
and reasonably well for Debian as well as for other distros. For Debian
they find themselves having to invest days in learning and then wait
maybe months to MAYBE have the package included, while for other
distros they manage to prepare it in a few hours at most and get it
included much sooner.

There are those who want to start contributing with small things, or
just start trying, and who find themselves in difficulty, "wasting"
time without seeing results, and often giving up.

Doing it in git from the beginning can be very useful for reviews,
seeing changes, and so on. When I looked at some packages on mentors
there was a significant difference between reviewing diffs of uploads
downloaded from mentors and reviewing packages kept in git (even if not
on salsa). With git it is faster to review, to comment on specific
lines, and possibly to help fix or improve things directly.

Another difference for maintainers can be managing Debian patches. In
many cases they are not needed, but where they are needed there can
also be a good difference in the time saved by using gbp-pq rather than
managing them manually with quilt. Despite having used gbp for years, I
kept managing patches manually with quilt out of habit, and only
recently switched to gbp-pq, noticing that I could have saved a lot of
time in some cases.
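For reference, the gbp-pq round trip mentioned here is roughly (see
gbp-pq(1); the commit message is a placeholder):

```shell
# Turn debian/patches/ into commits on a temporary patch-queue branch.
gbp pq import

# Edit the source; each commit becomes one patch.
git commit -a -m "Fix some upstream issue"

# Write the commits back as debian/patches/ plus series, and switch
# back to the packaging branch.
gbp pq export
```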

Also, regarding testing the changes you make, there is a big difference
in time (and, I suppose, much less difficulty or discouragement in many
cases) between preparing ideal environments and/or doing everything
manually and locally, versus using salsa-ci. Having local environments
and tools is useful in most cases, but they are quite a few things and
can take a long time to set up, while starting to see something
concrete in a very simple and quick way can, I suppose, reduce
premature abandonment. Then again I could be wrong, and most cases of
abandonment may have other causes, like some cases I saw years ago that
seemed to be due to a lack of reviewers and sponsors on mentors.


Starting to use certain things from the beginning can encourage their
use (unless you need something different that is specific to the team
and/or the packages), but if you start by spending a lot of time with
certain tools and procedures, changing later meets more aversion or
discouragement, perhaps without people imagining the advantages it
could have.


So in general it is fine to start mainly with the packaging files
themselves (both a quick and simple version and all the advanced
details, in an ideal documentation), but shortly afterwards, if you
want to start contributing and/or packaging something new, it is useful
to have something more targeted (again, both a quick and simple version
and the advanced, detailed material).

An example: a person has some basic knowledge of packaging and now
wants to contribute a new package to the official repositories. They
should find information that helps them understand whether the package
would fit into a team, and how to get there; otherwise, some
information on how to start in a simple and quick but also sound way.
It could perhaps be (I need to retry with updated docs and the latest
versions to check): see "here" to start creating the initial packaging,
perhaps downloading the upstream software, running dh_make and making
the first manual changes to the debian files; then prepare it in git in
a simple and quick way with gbp and upload it to salsa, to make it
easier and faster to have it reviewed and to keep track of changes; and
also use salsa-ci for immediate tests (in some cases it is still
necessary to use local environments, and in all cases the package will
need to be tested locally). That way they can also spot things to fix
or improve without waiting for a review, and improve the package to
make the review faster (which, besides saving time, lets reviewers
concentrate on the more advanced parts not caught by the tests, rather
than wasting a lot of time listing many basic things to fix).
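Regarding the salsa-ci step: as far as I know, enabling the standard
Salsa CI pipeline is mostly a matter of adding one file and pointing
the project's CI settings at it. A sketch (the include URL is the one
documented by the salsa-ci-team pipeline project):

```shell
# Enable the standard Salsa CI pipeline for a packaging repository.
mkdir -p debian
cat > debian/salsa-ci.yml <<'EOF'
include:
  - https://salsa.debian.org/salsa-ci-team/pipeline/raw/master/recipes/debian.yml
EOF
# Then set the project's CI/CD configuration file (in the Salsa web UI)
# to debian/salsa-ci.yml so a pipeline runs on every push.
```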

I don't know if I have explained myself well, but I tried. It is also
possible that I have the wrong idea about what might be useful to
foster new maintainers (and more help from reviewers/sponsors).
M. Zhou
2025-01-11 16:30:02 UTC
Post by Fabio Fantoni
Today trying to see how a new person who wants to start maintaining new
packages would do and trying to do research thinking from his point of
view and from simple searches on the internet I found unfortunately that
these parts are fragmented and do not help at all to aim for something
unified but not even simple and fast enough.
And those fragments also change as time goes by, such as the sbuild
schroot -> unshare change. These changes are not necessarily well
documented in every introductory material for newcomers.

Even if somebody in the Debian community had enough time to overhaul
everything and create new documentation, we would end up in the
situation described in the XKCD "standards" comic (xkcd.com/927/): we
would just get yet another document, another fragment, as time goes by.

LLMs are good companions as long as the strong ones are used. To help
newcomers learn, it would be better for Debian to allocate some LLM API
credits to them, instead of hoping for someone to work on the
documentation and falling into the XKCD-927 infinite loop.

Considering the price, I believe LLM API calls for helping all DDs plus
newcomers will be cheaper than hiring a real person to overhaul those
documents and keep them up to date. This is a feasible way to partly
solve the issue without endlessly waiting for the HERO to appear.

Debian should consider allocating some budget like several hundred USD
per month for the LLM API calls for all members and new-comers' usage.

DebGPT could be hooked in somewhere within the Debian development
process, such as sbuild/ratt for build log analysis, etc. It is cheap
enough, and people will eventually figure out the useful aspects of it.

Opinions against this post will include something about hallucination.
In the case where an LLM writes something that does not compile at all,
or uses a non-existent API, a human is intelligent enough to easily
notice the build failure or lintian error and tell whether it is a
hallucination or not. I personally believe LLMs, at the current stage,
are useful as long as they are used and interpreted properly.


BTW, I was in the middle of evaluating LLMs against the nm-templates. I
procrastinated a lot on finishing the evaluation, but the first several
questions were answered perfectly.
https://salsa.debian.org/lumin/ai-noises/-/tree/main/nm-templates?ref_type=heads
If anybody is interested in seeing the LLM evaluation against the
nm-templates, please let me know; your message will be significantly
useful in helping me conquer my procrastination.
Philipp Kern
2025-01-11 18:10:01 UTC
Post by M. Zhou
Opinion against this post will include something about hallucination.
In the case LLM write something that does not compile at all, or write
some non-existent API, a human is intelligent enough to easily notice
that build failure or lintian error and tell whether it is hallucination
or not. I personally believe LLMs, at the current stage, is useful
as long as used and interpreted properly.
LLMs ingest documentation. If we stop writing or overhauling
documentation, what is the LLM going to suggest? Even hallucination
aside, how can it be the least bit accurate if it is not fed up-to-date
information about how things work?
Post by M. Zhou
BTW, I was in the middle of evaluation LLMs for the nm-template. I did lots
of procrastinations towards finishing the evaluation, but the first
several questions were answered perfectly.
https://salsa.debian.org/lumin/ai-noises/-/tree/main/nm-templates?ref_type=heads
If anybody is interested in seeing the LLM evaluation against nm-templates,
please let me know and your message will be significantly useful for me
to conquer my procrastination on it.
If anyone is attempting to answer these using LLMs, I'd expect them to
be excluded from the process. It's one thing to generate documentation
by reviewing LLM output for accuracy and potentially publishing that to
be helpful. It's another thing to try to lie your way through the
process in order to gain the project's trust.

(In job interviews candidates already regularly use LLMs in the
background to answer the questions. There I think it's still noticeable
when people claim knowledge that they do not have. In offline
communication all bets are off.)

Kind regards
Philipp Kern
Andrey Rakhmatullin
2025-01-11 18:40:01 UTC
Post by M. Zhou
Opinion against this post will include something about hallucination.
In the case LLM write something that does not compile at all, or write
some non-existent API, a human is intelligent enough to easily notice
that build failure or lintian error and tell whether it is hallucination
or not.
No, I don't think a human is intelligent enough for that. Based on my
limited experience with people trying to learn how to use tools I'm the
upstream for, I fully expect that human to go to -mentors and ask why
the tool or option proposed by the LLM doesn't exist (without
mentioning where they got it), or even why the d/control they wrote
results in a dpkg-source error about an unrecognized field.
It's possible that the probability of this is lower for people learning
packaging than for people learning a tool/language/framework in
general, but remember that a noticeable share of the people learning
packaging are people who want to package their pet project.
--
WBR, wRAR
Otto Kekäläinen
2025-01-11 20:50:01 UTC
Hi!

(cross-posting to mentors as they have most experience on what is
wrong with our current docs)

...
Post by M. Zhou
Even if somebody in Debian community has enough time to overhaul everything
and create a new documentation, it will become the situation described
in XKCD meme "standards": xkcd.com/927/ -- we just got yet another document
as a fragment as time goes by.
There is no need to start new duplicate parallel efforts. Simply
contribute to the existing ones.

Anyone can edit the current wiki pages if they want to improve them
(although most people I spoke to don't like using MoinMoin, but that
is another topic). If you would like to improve a man page, most
packages in Debian accept Merge Requests on Salsa to improve the man
pages. For example, git-buildpackage has now 29 accepted merge
requests [1], dh-make 16 [2] and debmake 14 [3]. Contributions to
these tool man pages are reviewed by the tool maintainers and thus
likely end up in man pages that are actually correct. Sending MRs to
the tool maintainers likely also helps them stay motivated to continue
to maintain the tools, and the MR contents help the maintainers gain
insight into the "user point of view".

You can easily also contribute to the Debian Developers Reference,
which has already accepted 26 merge requests [4]. The same goes for
the Guide for Debian Maintainers, which has already accepted 26 MRs as
well [5]. The Debian New Maintainers' Guide is abandoned and
deprecated; it should have a clearer banner stating that. The Debian
FAQ [6] has already accepted 13 MRs (8 of them in the past 6 months),
so nothing should be blocking you from contributing improvements
there. The debian.org website has accepted a whopping 716 MRs [7],
contributing there should be easy and fruitful as there are active
reviewers who will guide/help that the update is as good as possible.
mentors.debian.net also has a very active team, with 205 accepted MRs
[8].

I have at least one MR in each of the above, so I speak from
experience. The process is smooth, most recipients give first review
feedback within a week, and after some polishing the submissions are
likely to be accepted.

I warmly recommend that others who have the time to engage in long
discussions on debian-devel@ divert some of that energy into updating
some of the documentation.



[1] https://salsa.debian.org/agx/git-buildpackage/-/merge_requests?scope=all&state=merged
[2] https://salsa.debian.org/debian/dh-make/-/merge_requests?scope=all&state=merged
[3] https://salsa.debian.org/debian/debmake/-/merge_requests?scope=all&state=merged
[4] https://salsa.debian.org/debian/developers-reference/-/merge_requests?scope=all&state=merged
[5] https://salsa.debian.org/debian/debmake-doc/-/merge_requests?scope=all&state=merged
[6] https://salsa.debian.org/ddp-team/debian-faq/-/merge_requests?scope=all&state=merged
[7] https://salsa.debian.org/webmaster-team/webwml/-/merge_requests?scope=all&state=merged
[8] https://salsa.debian.org/mentors.debian.net-team/debexpo/-/merge_requests?scope=all&state=merged
Ahmad Khalifa
2025-01-12 16:00:02 UTC
Post by Otto Kekäläinen
There is no need to start new duplicate parallel efforts. Simply
contribute to the existing ones.
+1, please. Too many docs already :)
Post by Otto Kekäläinen
Anyone can edit the current wiki pages if they want to improve them
(although most people I spoke to don't like using MoinMoin, but that
is another topic). If you would like to improve a man page, most
packages in Debian accept Merge Requests on Salsa to improve the man
pages. For example, git-buildpackage has now 29 accepted merge
requests [1], dh-make 16 [2] and debmake 14 [3]. Contributions to
these tool man pages are reviewed by the tool maintainers and thus
likely end up in man pages that are actually correct. Sending MRs to
the tool maintainers likely also helps them stay motivated to continue
to maintain the tools, and the MR contents helps the maintainers get
insight of the "user point of view".
Understandable of course, but email slows things down a bit.
MoinMoin doesn't have SSO support, but if anyone's interested in OAuth2
and writes Python, it's typically very straightforward:
https://moinmo.in/EasyToDo/implement%20oauth

It would be super nice if it connected to Salsa or Gitlab.com accounts.
--
Regards,
Ahmad
Serafeim (Serafi) Zanikolas
2025-01-12 21:00:02 UTC
[..]
Post by Ahmad Khalifa
Understandable of course, but the email slows things down a bit.
MoinMoin doesn't have SSO support, but if anyone's interested in OAuth2
https://moinmo.in/EasyToDo/implement%20oauth
It would be super nice if connected to Salsa or Gitlab.com account
what would be truly amazing, imho, would be the whole wiki on git. that'd allow
for mass updates, and for reusing one's code (salsa) workflows for documentation

thanks,
serafi
Jonas Smedegaard
2025-01-12 23:50:01 UTC
Hi Serafeim, and others,

Quoting Serafeim (Serafi) Zanikolas (2025-01-12 21:54:58)
Post by Serafeim (Serafi) Zanikolas
[..]
Post by Ahmad Khalifa
Understandable of course, but the email slows things down a bit.
MoinMoin doesn't have SSO support, but if anyone's interested in OAuth2
https://moinmo.in/EasyToDo/implement%20oauth
It would be super nice if connected to Salsa or Gitlab.com account
what would be truly amazing, imho, would be the whole wiki on git. that'd allow
for mass-updates, and reusing one's code (salsa) workflows for documentation
For those not only wishing that others would make that happen, but
considering scratching that particular itch themselves: I am happy to
chat with you in more detail about what that involves, and about how it
is quite an exciting challenge. So exciting, in fact, that Debian has
had several attempts at such a migration over the years, and none so
far has succeeded.

I don't say this to scare you off, but to make you aware that it is not
because Debianites love the arcane MoinMoin wiki; it is because
migrating away from it is genuinely challenging.

I originally packaged the MoinMoin wiki engine many years ago, before
Ubuntu existed. Then git was invented, and Ikiwiki. Some have tried to
migrate to Ikiwiki, some have tried to migrate to MediaWiki.
Personally, I would today migrate to zola, but you (i.e. those wanting
to roll up your sleeves and actually do this migration task) no doubt
have your own opinion on what the goal should be and why it is ideal.
Regardless of the goal, some of the preparations to get there are
likely similar to those of previous attempts, on which I and others
might have input. If you think you have a novel approach then I would
love to learn about it, as I still run a few legacy wikis that I would
like to migrate.

But let's not hijack this thread about welcoming newcomers any further;
I just wanted to chime in and encourage anyone wanting to do the wiki
modernization to get in touch with me to discuss it.

Kind regards,

- Jonas
--
* Jonas Smedegaard - idealist & Internet-arkitekt
* Tlf.: +45 40843136 Website: http://dr.jones.dk/
* Sponsorship: https://ko-fi.com/drjones

[x] quote me freely [ ] ask before reusing [ ] keep private
Serafeim (Serafi) Zanikolas
2025-01-13 21:10:02 UTC
hi Jonas,
Post by Jonas Smedegaard
Hi Serafeim, and others,
Quoting Serafeim (Serafi) Zanikolas (2025-01-12 21:54:58)
Post by Serafeim (Serafi) Zanikolas
what would be truly amazing, imho, would be the whole wiki on git. that'd allow
for mass-updates, and reusing one's code (salsa) workflows for documentation
For those not only wishing that others made that happen, but consider
scratching that particular itch themselves, I am happy to chat with you
more detailed about what that involves, and how it is a quite exciting
challenge - so exciting, in fact, that Debian has had several attempts
at doing a migration over the years, but none so far have succeeded.
[..]
Post by Jonas Smedegaard
But let's not hijack any further this thread about welcoming newcomers,
let's create a new thread then :)
Post by Jonas Smedegaard
I just wanted to chime in to encourage anyone wanting to do the wiki
modernization to get in touch with me to discuss further.
thank you for the offer, but why not have the follow-up on a publicly
archived list? happy to switch to -www, if -devel is not ideal

I think that a retrospective/postmortem of past attempts would be very
valuable. I offer to write one if you'd be okay with sharing all of the
context with me (you could review it before it gets published).

also, I'd think that nailing down the requirements for a new platform and for
the content to be migrated (e.g. drop any pages that are >X years old) would be
an important prerequisite for any technical work. has this been done in previous
migration attempts? to start with, do we have "rough consensus" that a
git-backed wiki would be preferable?

thanks,
serafi
Jonas Smedegaard
2025-01-13 21:50:01 UTC
Quoting Serafeim (Serafi) Zanikolas (2025-01-13 22:06:01)
Post by Serafeim (Serafi) Zanikolas
Post by Jonas Smedegaard
Quoting Serafeim (Serafi) Zanikolas (2025-01-12 21:54:58)
Post by Serafeim (Serafi) Zanikolas
what would be truly amazing, imho, would be the whole wiki on git.
that'd allow for mass-updates, and reusing one's code (salsa)
workflows for documentation
For those not only wishing that others made that happen, but
consider scratching that particular itch themselves, I am happy to
chat with you more detailed about what that involves, and how it is
a quite exciting challenge - so exciting, in fact, that Debian has
had several attempts at doing a migration over the years, but none
so far have succeeded.
[..]
Post by Jonas Smedegaard
But let's not hijack any further this thread about welcoming
newcomers,
let's create a new thread then :)
Post by Jonas Smedegaard
I just wanted to chime in to encourage anyone wanting to do the wiki
modernization to get in touch with me to discuss further.
thank you for the offer but why not have the follow up in a publicly
archived list? happy to switch to -www, if -devel is not ideal
Then let's move to the tinker team mailinglist:
https://alioth-lists.debian.net/cgi-bin/mailman/listinfo/blend-tinker-devel
Post by Serafeim (Serafi) Zanikolas
I think that a retrospective/postmortem of past attempts would be very valuable.
I offer to write one if you'd be okay to share all of the context with
me (you could review it before it gets published).
I don't mind discussing in public, but do mind doing it in this large
forum.
Post by Serafeim (Serafi) Zanikolas
also, I'd think that nailing down the requirements for a new platform
and for the content to be migrated (e.g. drop any pages that are >X
years old) would be an important prerequisite for any technical work.
has this been done in previous migration attempts? to start with, do
we have "rough consensus" that a git-backed wiki would be preferable?
No, I am unaware of any consensus on any changes, just have a vague
feeling that MoinMoin is somewhat as appreciated in Debian as CDBS :-)

Anyone interested in discussing practicalities of migrating away from
MoinMoin for the Debian wiki, please join the tinker mailinglist at
https://alioth-lists.debian.net/cgi-bin/mailman/listinfo/blend-tinker-devel

- Jonas
--
* Jonas Smedegaard - idealist & Internet-arkitekt
* Tlf.: +45 40843136 Website: http://dr.jones.dk/
* Sponsorship: https://ko-fi.com/drjones

[x] quote me freely [ ] ask before reusing [ ] keep private
Jonathan Dowland
2025-01-13 22:20:02 UTC
Post by Jonas Smedegaard
Anyone interested in discussing practicalities of migrating away from
MoinMoin for the Debian wiki, please join the tinker mailinglist at
https://alioth-lists.debian.net/cgi-bin/mailman/listinfo/blend-tinker-devel
What is "Tinker blend development discussion"? I've never heard of it.


Thanks,
--
Please do not CC me for listmail.

👱🏻 Jonathan Dowland
✎ ***@debian.org
🔗 https://jmtd.net
Jonas Smedegaard
2025-01-13 23:30:01 UTC
Quoting Jonathan Dowland (2025-01-13 23:15:35)
Post by Jonathan Dowland
Post by Jonas Smedegaard
Anyone interested in discussing practicalities of migrating away from
MoinMoin for the Debian wiki, please join the tinker mailinglist at
https://alioth-lists.debian.net/cgi-bin/mailman/listinfo/blend-tinker-devel
What is "Tinker blend development discussion"? I've never heard of it.
People interested in lightweight systems, like the OSHW-certified
ARM-based laptop and server systems from Olimex.

There is no "MoinMoin" in that team's name, and very little is going on in
that team, but among those few people there has also been an interest in
hosting MoinMoin, since it is neither bloated nor PHP-based, and
consequently some interest in migrating away from it more recently.

- Jonas
--
* Jonas Smedegaard - idealist & Internet-arkitekt
* Tlf.: +45 40843136 Website: http://dr.jones.dk/
* Sponsorship: https://ko-fi.com/drjones

[x] quote me freely [ ] ask before reusing [ ] keep private
Ahmad Khalifa
2025-01-13 22:30:01 UTC
Post by Jonas Smedegaard
Quoting Serafeim (Serafi) Zanikolas (2025-01-13 22:06:01)
Post by Serafeim (Serafi) Zanikolas
also, I'd think that nailing down the requirements for a new platform
and for the content to be migrated (e.g. drop any pages that are >X
years old) would be an important prerequisite for any technical work.
has this been done in previous migration attempts? to start with, do
we have "rough consensus" that a git-backed wiki would be preferable?
Wikis have their own version control and they're meant for a much wider
audience. I think general documentation definitely belongs on a wiki,
not git. Edit, fix typo, done in 30 seconds :)
Post by Jonas Smedegaard
No, I am unaware of any consensus on any changes, just have a vague
feeling that MoinMoin is somewhat as appreciated in Debian as CDBS :-)
Anyone interested in discussing practicalities of migrating away from
MoinMoin for the Debian wiki, please join the tinker mailinglist at
https://alioth-lists.debian.net/cgi-bin/mailman/listinfo/blend-tinker-devel
I'd be interested in knowing more as well. I can't imagine there are
many features in MoinMoin that can't be migrated to MediaWiki and
friends, so I want to find out what I'm missing.
--
Regards,
Ahmad
Jeremy Stanley
2025-01-13 22:50:01 UTC
On 2025-01-13 22:27:21 +0000 (+0000), Ahmad Khalifa wrote:
[...]
I can't imagine there are many features in MoinMoin that can't be
migrated to MediaWiki and friends, so I want to find out what I'm
missing.
Another free/libre open source software community I'm involved in
migrated a fairly large corpus of content (10k+ pages) from MoinMoin
to MediaWiki, mainly for access to more robust spam and abuse
handling features. The effort was managed by a community member who
was also a Wikipedia sysadmin on the WMF staff, and required a fair
amount of cleanup after the fact, so not for the faint of heart.
Granted, that was about a decade ago now; it's possible the tools
for doing it have gotten smoother in the meantime.

Having acted as a site admin and moderator on both, I can't say
there's anything I miss from MoinMoin other than the fact that it's
implemented in Python instead of PHP, which made modifying it or
writing custom plugins for MoinMoin myself a little easier and less
dicey.
--
Jeremy Stanley
nick black
2025-01-14 01:20:01 UTC
Post by Ahmad Khalifa
Wikis have their own version control and they're meant for a much wider
audience. I think general documentation definitely belongs on a wiki, not
git. Edit, fix typo, done in 30 seconds :)
there are of course wiki-git bridges, at least for MediaWiki:

https://www.mediawiki.org/wiki/Git-remote-mediawiki
https://github.com/Eccenux/wiki-to-git
https://github.com/Git-Mediawiki/Git-Mediawiki

there's also the (unmaintained) FUSE implementation (not
particularly relevant here, but illustrative of the ecosystem's
breadth):

https://wikipediafs.sourceforge.net/

fwiw, i've maintained several public-facing MediaWiki
installations, my largest (dankwiki[0]) having run on the same
install base since 2008. it's been largely a pleasure; i doubt i
spend more than five hours annually on its administration,
almost entirely for updates or adding new plugins. upstream has
been friendly and helpful the two times i've engaged with them
on IRC. there's a plugin for just about anything one might want
to do, from transcluding Bugzilla queries to inline Youtube
video to integrating with donation services.

in addition, anyone with Wikipedia editing experience can
immediately apply it to a MediaWiki.

the only unpleasant aspects have been PHP (very rare, but
sometimes i need go change properties of my PHP installation)
and esoteric plugins falling out of sync with the main distribution.
it also requires a mysql backend, and default search
capabilities are of the garbage variety (though this has
improved in recent years, to the point where i no longer
consider SphinxSearch a mandatory coinstall, and indeed no
longer use it myself).

development is healthy and ongoing, and comfortably backed by
the Wikimedia Foundation.

but i have no familiarity with Debian requirements, especially
surrounding authentication.

--nick

[0] https://nick-black.com
--
nick black -=- https://nick-black.com
to make an apple pie from scratch,
you need first invent a universe.
Jonathan Dowland
2025-01-14 08:30:01 UTC
Post by Ahmad Khalifa
Wikis have their own version control and they're meant for a much
wider audience. I think general documentation definitely belongs on a
wiki, not git. Edit, fix typo, done in 30 seconds :)
The two git-backed wikis I am familiar with (IkiWiki and GitLab) have a
full, traditional web UI for interacting with the wiki, and the Git
back-end is effectively invisible from that perspective; but power
users can easily create a full backup of the entire wiki history,
perform large-scale or complex machine-assisted edits on a local copy,
and use advanced merge-conflict resolution tools.

Having said all that, I believe any effort to revamp the Debian Wiki
should start with determining the requirements and goals, rather than
start picking solutions.
--
Please do not CC me for listmail.

👱🏻 Jonathan Dowland
✎ ***@debian.org
🔗 https://jmtd.net
Jonathan Dowland
2025-01-15 15:30:02 UTC
Agreed 100%, the openness for editing is really what makes wikis
shine. It can be hard to accept, with many of us firmly set in Git's
pre-approval style PR/MR workflows, to allow anyone with an account
to just change anything, but that empowerment is what makes wikis work
and really blossom.
Please note that git ≠ GitHub or GitLab, and the PR/MR workflows are
just one way git is used. I can see the advantages of accessing wiki
content with git, but I also agree with you that wikis need openness
of editing (in fact I think that's a core principle of what makes a
wiki, a wiki)

The two are not incompatible (even if MR/PR-style approvals are). For
example anyone can go and edit ikiwiki.info, either on the web or by
cloning the git repository and pushing their commits back.
As one of the maintainers of the MediaWiki package in Debian[1] and a
wholehearted wiki enthusiast (longtime Wikipedia admin, etc.), I have
a lot of thoughts on this that I'll save for later, but I want to
cross-link the ongoing discussion at
<https://salsa.debian.org/debian/grow-your-ideas/-/issues/2> that goes
over a lot of the same topics.
Thank you.
--
Please do not CC me for listmail.

👱🏻 Jonathan Dowland
✎ ***@debian.org
🔗 https://jmtd.net
Jeremy Stanley
2025-01-13 22:30:01 UTC
On 2025-01-13 22:43:59 +0100 (+0100), Jonas Smedegaard wrote:
[...]
Post by Jonas Smedegaard
Anyone interested in discussing practicalities of migrating away from
MoinMoin for the Debian wiki, please join the tinker mailinglist at
https://alioth-lists.debian.net/cgi-bin/mailman/listinfo/blend-tinker-devel
Out of curiosity, what does the Tinker Blend have to do with Debian
Wiki management? It's described as "a Debian Pure Blend which aims
to provide fully configured installations for various forms of
tinkering/hacking on electronics and other hardware devices."
--
Jeremy Stanley
Steve McIntyre
2025-01-14 15:00:01 UTC
Post by Jonas Smedegaard
Quoting Serafeim (Serafi) Zanikolas (2025-01-13 22:06:01)
Post by Serafeim (Serafi) Zanikolas
thank you for the offer but why not have the follow up in a publicly
archived list? happy to switch to -www, if -devel is not ideal
https://alioth-lists.debian.net/cgi-bin/mailman/listinfo/blend-tinker-devel
Erm, WTF? How about keeping this on a main project list. And maybe
even include the current people working on web and wiki admin?

Unimpressed...
--
Steve McIntyre, Cambridge, UK. ***@einval.com
Can't keep my eyes from the circling sky,
Tongue-tied & twisted, Just an earth-bound misfit, I...
Jonas Smedegaard
2025-01-14 15:10:01 UTC
Quoting Steve McIntyre (2025-01-14 15:16:49)
Post by Steve McIntyre
Post by Jonas Smedegaard
Quoting Serafeim (Serafi) Zanikolas (2025-01-13 22:06:01)
Post by Serafeim (Serafi) Zanikolas
thank you for the offer but why not have the follow up in a publicly
archived list? happy to switch to -www, if -devel is not ideal
https://alioth-lists.debian.net/cgi-bin/mailman/listinfo/blend-tinker-devel
Erm, WTF? How about keeping this on a main project list. And maybe
even include the current people working on web and wiki admin?
Unimpressed...
WTF what? Should we stop work in smaller groups and discuss everything
here in one giant mailinglist for every detail of the project, or what
in particular makes this detail so F'ing global?

Offended...

- Jonas
--
* Jonas Smedegaard - idealist & Internet-arkitekt
* Tlf.: +45 40843136 Website: http://dr.jones.dk/
* Sponsorship: https://ko-fi.com/drjones

[x] quote me freely [ ] ask before reusing [ ] keep private
Steve McIntyre
2025-01-14 16:30:01 UTC
Post by Jonas Smedegaard
Quoting Steve McIntyre (2025-01-14 15:16:49)
Post by Steve McIntyre
Post by Jonas Smedegaard
Quoting Serafeim (Serafi) Zanikolas (2025-01-13 22:06:01)
Post by Serafeim (Serafi) Zanikolas
thank you for the offer but why not have the follow up in a publicly
archived list? happy to switch to -www, if -devel is not ideal
https://alioth-lists.debian.net/cgi-bin/mailman/listinfo/blend-tinker-devel
Erm, WTF? How about keeping this on a main project list. And maybe
even include the current people working on web and wiki admin?
Unimpressed...
WTF what? Should we stop work in smaller groups and discuss everything
here in one giant mailinglist for every detail of the project, or what
in particular makes this detail so F'ing global?
What on earth makes the blend-tinker-devel list a reasonable place to
discuss the Debian wiki and maybe working on replacing its software?
Serafeim suggested debian-www, which seems like a much more reasonable
place to me at least.
Post by Jonas Smedegaard
Offended...
What, that people questioned this? That the people in the team
*running the service you're talking about* might care about
discussions about it?
--
Steve McIntyre, Cambridge, UK. ***@einval.com
Can't keep my eyes from the circling sky,
Tongue-tied & twisted, Just an earth-bound misfit, I...
Jonas Smedegaard
2025-01-14 17:10:02 UTC
Quoting Steve McIntyre (2025-01-14 17:27:20)
Post by Steve McIntyre
Post by Jonas Smedegaard
Quoting Steve McIntyre (2025-01-14 15:16:49)
Post by Steve McIntyre
Post by Jonas Smedegaard
Quoting Serafeim (Serafi) Zanikolas (2025-01-13 22:06:01)
Post by Serafeim (Serafi) Zanikolas
thank you for the offer but why not have the follow up in a publicly
archived list? happy to switch to -www, if -devel is not ideal
https://alioth-lists.debian.net/cgi-bin/mailman/listinfo/blend-tinker-devel
Erm, WTF? How about keeping this on a main project list. And maybe
even include the current people working on web and wiki admin?
Unimpressed...
WTF what? Should we stop work in smaller groups and discuss everything
here in one giant mailinglist for every detail of the project, or what
in particular makes this detail so F'ing global?
What on earth makes the blend-tinker-devel list a reasonable place to
discuss the Debian wiki and maybe working on replacing its software?
Serafeim suggested debian-www, which seems like a much more reasonable
place to me at least.
Post by Jonas Smedegaard
Offended...
What, that people questioned this? That the people in the team
*running the service you're talking about* might care about
discussions about it?
No, but that my proposal for discussion place was so out of this earth
that you felt the need to curse over it.

Obviously I am so fucking stupid that I didn't know that the wiki team
used same mailinglist as the www team.

I stand fucking corrected.

- Jonas
--
* Jonas Smedegaard - idealist & Internet-arkitekt
* Tlf.: +45 40843136 Website: http://dr.jones.dk/
* Sponsorship: https://ko-fi.com/drjones

[x] quote me freely [ ] ask before reusing [ ] keep private
Andrej Shadura
2025-01-15 10:30:01 UTC
Hello,
Post by Jonas Smedegaard
Post by Jonas Smedegaard
https://alioth-lists.debian.net/cgi-bin/mailman/listinfo/blend-tinker-devel
No, but that my proposal for discussion place was so out of this earth
that you felt the need to curse over it.
I hate to be the person to break the news to you, but some random unrelated mailing list, of which very few relevant people are members, seems like a very wrong suggestion indeed.
--
Cheers,
Andrej
Jonas Smedegaard
2025-01-15 10:50:01 UTC
Quoting Andrej Shadura (2025-01-15 11:24:31)
Post by Jan Dittberner
Hello,
Post by Jonas Smedegaard
Post by Jonas Smedegaard
https://alioth-lists.debian.net/cgi-bin/mailman/listinfo/blend-tinker-devel
No, but that my proposal for discussion place was so out of this earth
that you felt the need to curse over it.
I hate to be the person to break the news to you, but some random unrelated mailing list of which very few relevant people are members, seems like a very wrong suggestion indeed.
Then it might please you to know that you didn't break it to me,
others fucking did.

But thanks for stepping on it. Anyone else while we are at it?

- Jonas
--
* Jonas Smedegaard - idealist & Internet-arkitekt
* Tlf.: +45 40843136 Website: http://dr.jones.dk/
* Sponsorship: https://ko-fi.com/drjones

[x] quote me freely [ ] ask before reusing [ ] keep private
Chris Hofstaedtler
2025-01-15 10:50:02 UTC
Post by Jonas Smedegaard
Quoting Andrej Shadura (2025-01-15 11:24:31)
Post by Jan Dittberner
Hello,
Post by Jonas Smedegaard
Post by Jonas Smedegaard
https://alioth-lists.debian.net/cgi-bin/mailman/listinfo/blend-tinker-devel
No, but that my proposal for discussion place was so out of this earth
that you felt the need to curse over it.
I hate to be the person to break the news to you, but some random unrelated mailing list of which very few relevant people are members, seems like a very wrong suggestion indeed.
Then it might please you to know that you didn't break it to me,
others fucking did.
But thanks for stepping on it. Anyone else while we are at it?
Actually, yes. It's really unclear to me what you are trying to
achieve here, with the continuation of this communication style.

Chris
Jonas Smedegaard
2025-01-15 11:20:02 UTC
Quoting Chris Hofstaedtler (2025-01-15 11:44:19)
Post by Chris Hofstaedtler
Post by Jonas Smedegaard
Quoting Andrej Shadura (2025-01-15 11:24:31)
Post by Jan Dittberner
Hello,
Post by Jonas Smedegaard
Post by Jonas Smedegaard
https://alioth-lists.debian.net/cgi-bin/mailman/listinfo/blend-tinker-devel
No, but that my proposal for discussion place was so out of this earth
that you felt the need to curse over it.
I hate to be the person to break the news to you, but some random unrelated mailing list of which very few relevant people are members, seems like a very wrong suggestion indeed.
Then it might please you to know that you didn't break it to me,
others fucking did.
But thanks for stepping on it. Anyone else while we are at it?
Actually, yes. It's really unclear to me what you are trying to
achieve here, with the continuation of this communication style.
Nothing. My posts here contribute nothing - they are not moving the
conversation forward. Speaking just for myself, that is...

- Jonas
--
* Jonas Smedegaard - idealist & Internet-arkitekt
* Tlf.: +45 40843136 Website: http://dr.jones.dk/
* Sponsorship: https://ko-fi.com/drjones

[x] quote me freely [ ] ask before reusing [ ] keep private
Jonathan Dowland
2025-01-13 10:10:01 UTC
Post by Serafeim (Serafi) Zanikolas
what would be truly amazing, imho, would be the whole wiki on git.
that'd allow for mass-updates, and reusing one's code (salsa)
workflows for documentation
Back before we adopted MoinMoin (the previous wiki tech was kwiki iirc)
a couple of us did a brief exploration of using IkiWiki for Debian; it
didn't go anywhere (on my part I didn't put enough into it) but one of
the advantages of it would have been that IkiWiki supports using git as
back-end storage.

If there's anyone interested in the Debian Wiki at FOSDEM I would be
interested in having a chat. I haven't personally done any work on it
for many years but I've remained interested in it all along. My
particular passion (back at the kwiki→MoinMoin transition) was ensuring
that wiki content was DFSG compatible; that's still something I would
like to see improved.


Best wishes
--
Please do not CC me for listmail.

👱🏻 Jonathan Dowland
✎ ***@debian.org
🔗 https://jmtd.net
Jonathan Dowland
2025-01-13 10:20:02 UTC
Post by Ahmad Khalifa
Understandable of course, but the email slows things down a bit.
MoinMoin doesn't have SSO support, but if anyone's interested in OAuth2
https://moinmo.in/EasyToDo/implement%20oauth
I'm *fairly* sure that a more pressing issue is that MoinMoin requires
Python 2, which has been abandoned.

At least the stable and deployed versions: 2.0.0a1 is the first release
from the 2.x branch which is a rewrite for Python 3, released 2024-03-24.
--
Please do not CC me for listmail.

👱🏻 Jonathan Dowland
✎ ***@debian.org
🔗 https://jmtd.net
Jan Dittberner
2025-01-13 10:40:01 UTC
Post by Jonathan Dowland
Post by Ahmad Khalifa
Understandable of course, but the email slows things down a bit.
MoinMoin doesn't have SSO support, but if anyone's interested in OAuth2
https://moinmo.in/EasyToDo/implement%20oauth
I'm *fairly* sure that a more pressing issue is that MoinMoin requires
Python 2, which has been abandoned.
At least the stable and deployed versions: 2.0.0a1 is the first release from
the 2.x branch which is a rewrite for Python 3, released 2024-03-24.
Hello,

I take care of the administrative side of https://wiki.cacert.org/ which is
also based on MoinMoin. We did a test installation and content copy of the
CAcert wiki based on the 2.x branch after FrOSCon 2024. A lot of our
content had blocking issues (we use many macros to include shared content on
multiple pages).

We (another CAcert member, not myself) were in contact with MoinMoin people,
but progress on the issues is quite slow. I think the MoinMoin people would
appreciate help to get the 2.x-branch ready. Unfortunately I don't have
enough time to help there.


Best regards
Jan
--
Jan Dittberner - Debian Developer
GPG-key: 4096R/0xA73E0055558FB8DD 2009-05-10
B2FF 1D95 CE8F 7A22 DF4C F09B A73E 0055 558F B8DD
https://portfolio.debian.net/ - https://people.debian.org/~jandd/
Steve McIntyre
2025-01-14 16:50:02 UTC
Post by Jonathan Dowland
I'm *fairly* sure that a more pressing issue is that MoinMoin requires
Python 2, which has been abandoned.
At least the stable and deployed versions: 2.0.0a1 is the first release
*nod*

I think we have moin2 packages just about ready for use, but I have
worries: it's not simply a drop-in replacement, and the first few
attempts I've made at migrating some test wikis didn't go very
well. And then lots of ENOTIME :-(

"We" are a small team running the wiki. Like a number of places in
Debian infrastructure, ongoing admin is a time commitment that it
seems very few people are ready to make.
--
Can't keep my eyes from the circling sky,
Tongue-tied & twisted, Just an earth-bound misfit, I...
Otto Kekäläinen
2025-01-12 03:20:01 UTC
...
Post by M. Zhou
Debian should consider allocating some budget like several hundred USD
per month for the LLM API calls for all members and new-comers' usage.
I don't think Debian should as an organization pay for LLMs. On the
contrary I would expect LLM providers to offer API keys for free to
Debian Developers just like we have other perks listed at
https://wiki.debian.org/MemberBenefits.

Considering how much open source software has been utilized in
building and training LLMs, it would actually make a lot of sense for
those companies to step up and partner with Debian, just like many web
hosting companies have, as they likewise have built their
businesses on top of open source software. Currently, I don't see any
AI companies at https://www.debian.org/partners/.

If anyone has contacts at OpenAI, Anthropic, xAI, DeepSeek, 01 AI,
Zhipu AI, Meta, Mistral, Nexus, Alibaba, AI21 Labs, Cohere etc, please
tell them about the opportunity to sponsor Debian :)


Also, in my view LLMs are still far from being able to do Debian
development, they can't even write good git commit messages explaining
*why* a particular change is made. They may in some cases, however, be
useful assistants doing proof-reading and simple tasks. Debian
contributors should be exploring sensible ways to use LLMs that
fit their personal workflows, so that human-originated Merge
Requests / patches get written faster and have a higher quality. But
no thanks to LLM written stuff that wasn't closely co-authored and
reviewed by a human. The risk of getting garbage is far too high.
Colin Watson
2025-01-12 17:00:01 UTC
Post by Otto Kekäläinen
I don't think Debian should as an organization pay for LLMs. On the
contrary I would expect LLM providers to offer API keys for free to
Debian Developers just like we have other perks listed at
https://wiki.debian.org/MemberBenefits.
I'd go further and say that the ecological and social costs of most of
the contraptions provided by those providers mean that using them in
any way is unethical, and I personally refuse to do so. I recommend
that other people do the same.

(I have less fixed views on locally-trained models, but I see no very
compelling need to find more things to spend energy on even if the costs
are lower.)
--
Colin Watson (he/him) [***@debian.org]
Andrew M.A. Cater
2025-01-12 18:10:01 UTC
Post by Colin Watson
Post by Otto Kekäläinen
I don't think Debian should as an organization pay for LLMs. On the
contrary I would expect LLM providers to offer API keys for free to
Debian Developers just like we have other perks listed at
https://wiki.debian.org/MemberBenefits.
You can label me as extremely reactionary on this: I think LLM providers
shouldn't be providing LLMs to Debian unless they are of proven benefit
to the whole Project. I don't think they are and I don't trust the motives
of those who suggest them as a cure-all. At the very best, LLMs are at
the level of a Google Translate: at worst, they're wholly inaccurate and
untrustworthy.

I chose Google Translate as a deliberate example here: it's provided
by one of the most influential agents on the Internet with the largest
corpus of text to train on. I wouldn't trust it against the opinion
of a good native speaker to translate to/from English, say, though I might
rely on it to give some vague sense of an unknown text in a language I don't
understand or read well.
Post by Colin Watson
I'd go further and say that the ecological and social costs of most of
the contraptions provided by those providers means that using them in
any way is unethical, and I personally refuse to do so. I recommend
that other people do the same.
I think the ecological costs have a bearing here: currently, it seems
improper to use systems that appear to imperil energy grids for entire
countries. Watching other people find and fix bugs, even in code they
have written or know well, I can't trust systems built on modern
Markov chains to do better, no matter how much input you give them, and
that's without crediting LLMs as able to create good novel code.
Post by Colin Watson
(I have less fixed views on locally-trained models, but I see no very
compelling need to find more things to spend energy on even if the costs
are lower.)
Absolutely: even if the costs to the Debian project for LLMs were to be
nil at present, there are better things for the project to spend time on.

With every good wish, as ever,
Post by Colin Watson
--
Ángel
2025-01-13 01:40:01 UTC
Post by Andrew M.A. Cater
Watching other people find and fix bugs, even in code they
have written or know well, I can't trust systems built on modern
Markov chains to do better, no matter how much input you give them, and
that's without crediting LLMs as able to create good novel code..
This is something I have thought about before, and which I find lacking in
most (all?) instances of the "let's program with an LLM" topic.

When a human¹ programs something, I expect there is a logical process
through which they arrive at the decision to write a set of lines of
code. This doesn't mean those lines will be the right ones, or
bug-free. Just that it makes sense.

For example, a program that does chdir("/"); at the beginning suggests
it may run as a daemon, as this allows it not to block filesystems
from unmounting.
If it has a number of calls to getuid(), setuid(), setresuid()... it
might switch to a different user.
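
A minimal sketch of those idioms (hypothetical code, not taken from any
real program) might look like:

```python
import os

def daemon_style_setup(unprivileged_uid=None):
    # chdir("/") early so the process never keeps a mounted
    # filesystem busy and blocks it from being unmounted
    os.chdir("/")
    if unprivileged_uid is not None:
        # drop privileges by switching to a less-privileged user
        os.setuid(unprivileged_uid)

daemon_style_setup()
print(os.getcwd())
```

From calls like these a reader can infer intent: "this probably runs as
a daemon", "this drops privileges after setup".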

However, if the code was generated by a LLM, all bets are off, since
the lines could make no sense at all for this specific program.


It wouldn't be that strange if an LLM asked to generate a control file
for a perl module were to suggest a line such as
Depends: libc6 (>= 2.34)
just because there are lots of packages with that dependency.²

A person could make a similar mistake of including unnecessary
dependencies when copying their work from an unrelated package, if not
properly cleaned up. But how do you fix those things if the mentor is an LLM?



¹ somewhat competent as a programmer
² hopefully, a LLM wouldn't be trained on the *output* of the
templates, though.
M. Zhou
2025-01-12 18:50:01 UTC
Post by Colin Watson
(I have less fixed views on locally-trained models, but I see no very
compelling need to find more things to spend energy on even if the costs
are lower.)
Locally-trained models are not practical at the current stage. State-of-the-art
models can only be trained by the best-capitalized companies, which have GPU
clusters. Training and deploying smaller models (e.g. 1 billion parameters) can
lead to a very wrong impression of, and conclusions about, those models.

Based on the comments, what I saw is that using LLMs as an organization is too
radical for Debian. In that sense leaving this new technology to individuals' personal
evaluation and usage is more reasonable.

So what I was describing is simply a choice between the two:
1. A contributor who needs help can leverage an LLM for its immediate response
and help, even if it is only correct 30% of the time. This requires the
contributor to have the knowledge and skill to properly use this new technology.
2. A contributor who needs help has to wait for a real human for an indefinite
time period, but the correctness is above 99%.

The existing voices chose the second one. I want to mention that "waiting for a
real human for help on XXX for an indefinite time" was a bad experience when I
was a newcomer. The community not agreeing on using that new technology to
address such a pain point seems understandable to me.
Philipp Kern
2025-01-12 21:40:02 UTC
Post by M. Zhou
1. A contributor who needs help can leverage LLM for its immediate response and
help even if it only correct, for 30% of the time. It requires the contributor
to have knowledge and skill to properly use this new technology.
2. A contributor who needs help has to wait for a real human for indefinite time
period, but the correctness is above 99$.
The existing voice chose the second one. I want to mention that "waiting for a real
human for help on XXX for indefinite time" was a bad experience when I was a new comer.
The community not agreeing on using that new technology to aid such pain point, 
seems understandable to me.
No-one is stopped from using any of the free offers. I don't think we
need our own chat bot. Of course that means, in turn, that we give up on
feeding it domain-specific knowledge and our own prompt. But that's...
probably fine?

If those LLMs support that, one could still produce a guide on how to
feed more interesting data into it - or provide a LoRA. It's not like
inference requires a GPU.

But then again saying things like "oh, look, I could easily answer the
NM templates with this" is the context you want to put this work in.

Kind regards
Philipp Kern
M. Zhou
2025-01-12 22:10:02 UTC
Post by Philipp Kern
No-one is stopped from using any of the free offers. I don't think we
need our own chat bot. Of course that means, in turn, that we give up on
feeding it domain-specific knowledge and our own prompt. But that's...
probably fine?
One long-term goal of the Debian deep learning team is to host an LLM with
the team's AMD GPUs and expose it to the members. That said, the necessary
packages to run that kind of service are still missing from our archive.
It is a good way to use the existing GPUs anyway.

Even if we get no commercial sponsorship of API calls, we will eventually
experiment with and evaluate one using the team's infrastructure. We are still
working towards that.
Post by Philipp Kern
If those LLMs support that, one could still produce a guide on how to
feed more interesting data into it - or provide a LoRA. It's not like
inference requires a GPU.
First, DebGPT is designed to conveniently put any particular information,
whether Debian-specific or not, into the context of the LLM. I have also implemented
a map-reduce algorithm to let the LLM deal with extremely long contexts,
such as a whole ratt build-log directory.
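The map-reduce idea described above can be sketched as follows. This is a minimal illustration, not DebGPT's actual implementation; the `ask_llm` helper is a hypothetical stand-in for a real LLM API call:

```python
# Sketch of map-reduce summarization over a context that exceeds the
# model's window, e.g. a directory of build logs concatenated into text.
# `ask_llm` is a placeholder; a real implementation would call an LLM API.

def ask_llm(prompt: str) -> str:
    # Placeholder so the sketch runs standalone: echo a prefix of the prompt.
    return prompt[:200]

def chunk(text: str, size: int = 4000):
    """Split text into pieces small enough to fit the context window."""
    return [text[i:i + size] for i in range(0, len(text), size)]

def mapreduce_summarize(text: str) -> str:
    # Map step: summarize each chunk independently.
    partial = [ask_llm("Summarize the errors in this log:\n" + c)
               for c in chunk(text)]
    # Reduce step: combine the partial summaries into one final answer.
    return ask_llm("Combine these summaries into one report:\n"
                   + "\n".join(partial))
```

In practice the reduce step may itself need to recurse when there are many chunks; the sketch keeps a single level for clarity.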

LoRA is only sound when you have a clear definition of the task you want
the LLM to handle. If we do not know what the user wants, then forget
about LoRA and just carefully provide the context to the LLM. DebGPT is
technically on the right track in terms of feasibility and efficiency.

RAG may help. I have already implemented the vector database and the
retrieval modules in DebGPT, but the frontend part for RAG is still
under development.
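The retrieval step of RAG can be sketched in a few lines. Real systems, including a vector database like the one mentioned above, use learned embeddings; the word-count vectors here are a dependency-free simplification, and the document strings are invented examples:

```python
# Minimal sketch of RAG retrieval: rank documents by cosine similarity
# to the query, then prepend the top matches to the LLM prompt.
from collections import Counter
import math

def vec(text):
    # Bag-of-words vector; a real system would use an embedding model.
    return Counter(text.lower().split())

def cosine(a, b):
    num = sum(a[w] * b[w] for w in set(a) & set(b))
    den = (math.sqrt(sum(v * v for v in a.values()))
           * math.sqrt(sum(v * v for v in b.values())))
    return num / den if den else 0.0

def retrieve(query, docs, k=2):
    """Return the k documents most similar to the query."""
    q = vec(query)
    return sorted(docs, key=lambda d: cosine(q, vec(d)), reverse=True)[:k]

docs = ["sbuild now defaults to the unshare backend",
        "gbp imports upstream tarballs into git",
        "lintian checks packages for policy violations"]
print(retrieve("how does sbuild unshare work", docs, k=1))
# → ['sbuild now defaults to the unshare backend']
```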
Post by Philipp Kern
But then again saying things like "oh, look, I could easily answer the
NM templates with this" is the context you want to put this work in.
My intention has always been to explore potential ways to make LLMs
useful to whatever extent possible. To support my idea, I wrote DebGPT, and
I tend to only claim things that are *already implemented* and *reproducible*
in DebGPT.

For instance, I've added automatic answering of the nm-templates to
DebGPT, and the following script can quickly give all the answers.
The answers are pretty good at first glance. I'll postpone the full
evaluation until I have written the code for all nm-templates.

I simply dislike saying nonsense that cannot be implemented in DebGPT.
But please do not limit your imagination to my readily available
demo examples or the use cases I have claimed.


(you need to use the latest git version of DebGPT)
```
# nm_assigned.txt
debgpt -f nm:nm_assigned -a 'pretend to be ***@debian.org and answer the question. Give concrete examples, and links as evidence supporting them are preferred.' -o nm-assigned-selfintro.txt

# nm_pp1.txt
for Q in PH0 PH1 PH2 PH3 PH4 PH5 PH6 PH7 PHa; do
debgpt -HQf nm:pp1.${Q} -a 'Be concise and answer in just several sentences.' -o nm-pp1-${Q}-brief.txt;
debgpt -HQf nm:pp1.${Q} -a 'Be precise and answer with details explained.' -o nm-pp1-${Q}-detail.txt;
done

# nm_pp1_extras.txt
for Q in PH0 PH8 PH9 PHb; do
debgpt -HQf nm:pp1e.${Q} -a 'Be concise and answer in just several sentences.' -o nm-pp1e-${Q}-brief.txt;
debgpt -HQf nm:pp1e.${Q} -a 'Be precise and answer with details explained.' -o nm-pp1e-${Q}-detail.txt;
done
```

[1] DebGPT: https://salsa.debian.org/deeplearning-team/debgpt
Helmut Grohne
2025-01-15 13:10:01 UTC
Reply
Permalink
Hi Mo,

before going into criticizing things, I would like to thank you for your
continued work in the AI space. In particular, your way of classifying
models into different degrees of freedom demonstrates how much you care
about Debian's values. I see that your focus is on enabling users and
that's amazing!
Post by M. Zhou
One long-term goal of the Debian Deep Learning Team is to host an LLM on
the team's AMD GPUs and expose it to the members. That said, the necessary
packages to run that kind of service are still missing from our archive.
It is a good way to use the existing GPUs anyway.
Disregarding the ecological and political aspects that have been
discussed elsewhere at length, let me also note that successfully using
a LLM is not trivial. Whether you get something useful very much depends
on what you ask. Typically, you should expect that something between 30%
and 70% of answers are factually wrong in some regard. As a result,
asking questions where the answer is not verifiable is not very useful.
Using the answers without verifying them poses a risk to us as a
community that myths are propagated and become harder to falsify. We are
talking about AI dementia already and can observe it happening. How do
you see us mitigating this problem?

That said, it can be useful to use a LLM if your premise is that you
verify the answer. For instance if you are searching for a library
function that does something particular, searching the documentation can
be time consuming. Describing your function to a LLM and then looking up
the documentation of the presented suggestions has a significant chance
of turning up something useful (in addition to the rubbish included).
Likewise, searching for a particular statement in a mailing list thread
can be a daunting task where a LLM may be able to interpret your vague
memory sufficiently well that it can provide a few candidates.

This way of using LLMs is effectively limiting it to behave as a search
engine with a different query language. I suspect that most of us use
search engines all day without thinking much about them being driven by
corporate actors running big clusters. Is using LLMs in this way that
much different from using a search engine?

Ecological and economical reasons aside, would anyone see an issue with
providing an AI-driven search engine to Debian's documentation, mailing
lists, bugs and the wiki?
Post by M. Zhou
Even if we get no commercial sponsorship of API calls, we will eventually
experiment and evaluate one with the team's infrastructure. We are still
working towards that.
I am actually looking forward to this as I trust you to do it in a
responsible way given your earlier work.
Post by M. Zhou
First, DebGPT is designed to conveniently put any particular information,
whether Debian-specific or not, into the context of the LLM. I have also implemented
a map-reduce algorithm to let the LLM deal with extremely long contexts,
such as a whole ratt build-log directory.
Indeed. I can imagine that an LLM may also be able to suggest particular
lines in a build log that hint at the cause of a failure, whereas we
now have a set of patterns (e.g. "Waiting for unfinished jobs") and hope
that one of them locates the problem.
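The pattern-based approach described here can be sketched as a simple scan of the log for known failure signatures. The pattern list below is illustrative only, not the actual rule set used by sbuild or ratt:

```python
# Sketch of pattern-based build-log triage: scan each line against a
# list of known failure signatures and report the matching lines.
import re

# Illustrative failure signatures; real tools maintain larger lists.
PATTERNS = [
    r"Waiting for unfinished jobs",   # make parallel-build failure marker
    r"\berror:",                      # generic compiler error
    r"undefined reference to",        # linker failure
]

def locate_failures(log: str):
    """Return (line number, line) pairs that match a known pattern."""
    hits = []
    for lineno, line in enumerate(log.splitlines(), 1):
        if any(re.search(pat, line) for pat in PATTERNS):
            hits.append((lineno, line.strip()))
    return hits
```

An LLM could complement, rather than replace, such a scan: the patterns locate candidate lines cheaply, and the model is asked to explain only those.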
Post by M. Zhou
My intention has always been to explore potential ways to make LLMs
useful to whatever extent possible. To support my idea, I wrote DebGPT, and
I tend to only claim things that are *already implemented* and *reproducible*
in DebGPT.
I have not experimented with DebGPT yet. Maybe I should. Unless already
done so, let me suggest that you engineer the default prompt in such a
way that your LLM references its information sources whenever possible.
Also including some text that asks the user to verify the answer using
external sources may help (in an opt-out way).
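The source-citing prompt and opt-out verification reminder suggested above could look something like this. The wording is purely illustrative and is not DebGPT's actual prompt:

```python
# Sketch of a default prompt that asks the model to cite sources, plus
# an opt-out reminder appended to every answer. Wording is hypothetical.

SYSTEM_PROMPT = (
    "When answering, cite the information source (URL, man page, or "
    "document section) for every factual claim whenever possible. "
    "If no source is known, say so explicitly."
)

DISCLAIMER = (
    "Note: LLM answers may be wrong. Please verify them against "
    "external sources before relying on them."
)

def wrap_answer(answer: str, opt_out: bool = False) -> str:
    """Append the verification reminder unless the user opted out."""
    return answer if opt_out else answer + "\n\n" + DISCLAIMER
```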

Practically speaking, the use of neural networks is not something we can
stop even if we wanted to. In ten years, all of us will be using neural
networks on a daily basis. Even today, they're difficult to avoid (e.g.
when dealing with customer service of any corporation). The best we can
do here is making the use of them as free as possible and that's what I
see Mo is doing.

Helmut
Simon Richter
2025-01-13 06:20:01 UTC
Reply
Permalink
Hi,
Post by M. Zhou
1. A contributor who needs help can leverage an LLM for its immediate response and
help, even if it is only correct 30% of the time. This requires the contributor
to have the knowledge and skill to use this new technology properly.
The "skill required" aspect is an important one. I'm willing to review
packages and provide feedback so learning can occur.

LLMs interrupt the learning process here, because they provide a
solution that makes me assume that the contributor already has a
specific mental model, so my feedback is designed to fine tune this.

When the contributor is using LLM generated code, my feedback is no
longer helpful for learning -- at best, it can be used to iteratively
arrive at a correct solution, but playing a game of telephone is a very
inefficient way to do it, and does not prepare the new contributor to
work unsupervised -- so LLMs basically disrupt our intake pipeline in
the same way they are doing in commercial software development.

Simon
Philip Hands
2025-01-14 08:00:01 UTC
Reply
Permalink
Post by M. Zhou
Post by Colin Watson
(I have less fixed views on locally-trained models, but I see no very
compelling need to find more things to spend energy on even if the costs
are lower.)
Locally-trained models are not practical at the current stage. State-of-the-art
models can only be trained by the richest companies, which have the GPU clusters.
Training and deploying smaller models, such as 1-billion-parameter ones, can lead
to a very wrong impression of and conclusions about those models.
Isn't the corollary of that statement that all the useful models
available have been trained on material that we have no clues about
regarding the status of the copyright/licensing?

Even without considering the case where the training data belongs to
some litigious corporation, the thing that concerns me is that these
models have presumably sucked up every scrap of e.g. GPL code on the
net.

Having done that, they produce answers that are somehow informed by that
data, without any indication of how they arrived at the answer, and
certainly not a notice that the authors who produced the training data
intended there to be restrictions on the use of their creativity (that
an ethical person would want to honour).

I'd really like to know how it is possible for one to use an LLM to make
a contribution to a permissively licensed project (e.g. Expat) without
in effect stealing the code from one's own tribe of Copyleft authors.

Can one even play with an LLM without somehow contaminating one's brain?

Cheers, Phil.

P.S. AFAIK the likes of OpenAI declare that the output of the model
belongs to the prompter, but that strikes me as self-serving nonsense
that the courts will eventually rule on. I'd love to be proved wrong on
that though, because I'd quite like to play with LLMs, if only to do
things like generating potentially lethal cooking recipes to try out ;-)
--
Philip Hands -- https://hands.com/~phil
Andrey Rakhmatullin
2025-01-14 08:10:01 UTC
Reply
Permalink
Post by Philip Hands
I'd really like to know how it is possible for one to use an LLM to make
a contribution to a permissively licensed project (e.g. Expat) without
in effect stealing the code from one's own tribe of Copyleft authors.
Can one even play with an LLM without somehow contaminating one's brain?
Is this the same as reading some GPL code directly, instead of using an
LLM that has read it?
--
WBR, wRAR
Holger Levsen
2025-01-13 08:40:01 UTC
Reply
Permalink
Post by M. Zhou
Post by Fabio Fantoni
Today, trying to see what a new person who wants to start maintaining new
packages would do, and trying to research from their point of
view with simple searches on the internet, I unfortunately found that
these parts are fragmented and do not help at all to aim for something
unified; they are not even simple and fast enough.
And those fragments also change as time goes by, such as the sbuild
schroot -> unshare change. They are not necessarily well documented in
every introductory material for newcomers.
Even if somebody in the Debian community has enough time to overhaul everything
and create new documentation, it will become the situation described
in the XKCD "standards" comic: xkcd.com/927/ -- we just get yet another document
as a fragment as time goes by.
LLMs are good companions as long as the strong ones are used. In order to
help newcomers learn, it is better for Debian to allocate some LLM API
credits to them, instead of hoping for someone to work on the documentation
and falling into the XKCD-927 infinite loop.
Considering the price, LLM API calls for helping all DDs plus newcomers
will, I believe, be cheaper than hiring a real person to overhaul that
documentation and keep it up to date. This is a feasible way to partly
solve the issue without endlessly waiting for the HERO to appear.
Debian should consider allocating some budget, like several hundred USD
per month, for LLM API calls for all members' and newcomers' usage.
DebGPT can be hooked somewhere into the Debian development process,
such as sbuild/ratt for build-log analysis, etc. It is cheap enough,
and people will eventually figure out the useful aspects of it.
Opinions against this post will include something about hallucination.
In the case where the LLM writes something that does not compile at all, or writes
some non-existent API, a human is intelligent enough to easily notice
the build failure or lintian error and tell whether it is a hallucination
or not. I personally believe LLMs, at the current stage, are useful
as long as they are used and interpreted properly.
BTW, I was in the middle of evaluating LLMs on the nm-templates. I
procrastinated a lot on finishing the evaluation, but the first
several questions were answered perfectly.
https://salsa.debian.org/lumin/ai-noises/-/tree/main/nm-templates?ref_type=heads
If anybody is interested in seeing the LLM evaluation against the nm-templates,
please let me know; your message will be significantly useful in helping me
conquer my procrastination on it.
--
cheers,
Holger

⢀⣎⠟⠻⢶⣊⠀
⣟⠁⢠⠒⠀⣿⡁ holger@(debian|reproducible-builds|layer-acht).org
⢿⡄⠘⠷⠚⠋⠀ OpenPGP: B8BF54137B09D35CF026FE9D 091AB856069AAA1C
⠈⠳⣄

"We should give all billionaires worldwide an ultimatum: if you have not
solved the climate crisis within one year, you will be expropriated!"
(@nicosemsrott)