Discussion:
Debian Monthly [debian-devel]: AI News Report 2024/10
DebGPT
2024-11-09 09:30:01 UTC
This is an experiment: letting an LLM go through all 369 emails from
debian-devel in October. The command for producing the news report
is included below. Use debgpt's git HEAD if you want to try it.

----------------------------------------------------------------------------
THE FOLLOWING CONTAINS AI-GENERATED CONTENT
----------------------------------------------------------------------------

(pth25) ~/Git/debgpt ❯❯❯❯ debgpt -Hx ldo:debian-devel/2024/10 -a 'gather
interesting and important information from there and write a news
report. You may expand on important matter. Important matters include
but are not limited to technical discussions, incidents, critical bugs,
and heated discussions.'
[01:07:31] OpenAIFrontend> Starting conversation                frontend.py:99
           d9fb92a7-19ab-4dca-8fc5-4c642a5fadbc
[01:07:32] Got 369 threads from                                   reader.py:470
           https://lists.debian.org/debian-devel/2024/10/threads.html
Working... ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 100% 0:00:00
MapReduce: Got 369 chunks from 'ldo:debian-devel/2024/10'
MapReduce: reducing 369 intermediate results
MapReduce: reducing 9 intermediate results
LLM[2]>
       Debian Community News: Key Developments and Technical Discussions

The Debian development community has been buzzing with activity, marked by
significant technical discussions, critical bug reports, and proposals aimed
at enhancing the Debian ecosystem. This report summarizes the most important
developments from recent communications within the community.

                           New Package Introductions

One of the most notable advancements is the introduction of the libqt6pas
package, which provides bindings for Qt6 in Pascal. This package is expected
to streamline GUI development for Pascal applications, building on the
success of its predecessor, libqtpas, which catered to Qt5. Additionally, a
proposal for the esphome package has been submitted, designed to facilitate
firmware creation for ESP32 and ESP8266 microcontrollers, thereby bolstering
the integration of smart home technologies.

                        Critical Bugs and Urgent Issues

The community is currently grappling with several critical bugs that could
impact the stability and performance of the Debian system:

 • dpkg-maintscript-helper Warnings: A contentious discussion has emerged
   regarding the firmware-sof-signed package upgrade process. Developers are
   urged to investigate the warnings generated during the usrmerge
   transition, highlighting the complexities of package management that can
   affect user experiences.
 • Hanging fakeroot Package: A critical bug in the fakeroot package has been
   identified, causing it to hang and consume excessive CPU resources.
   Developer Chris Hofstaedtler has stressed the urgency of backporting fixes
   to prevent delays in software updates for the stable release, indicating a
   pressing need for immediate action.
 • Lintian Dependency Management: Reports of installation issues with the
   Lintian package have surfaced, underscoring ongoing challenges with
   dependency management in Debian's unstable distribution, Sid. These issues
   are critical as they affect the ability of developers to maintain package
   integrity.

                  Technical Discussions on Package Management

The community has also engaged in important discussions regarding package
management practices:

 • Signify Package Renaming: A proposal to rename the signify package to
   signify-mail has been put forth to eliminate confusion with the OpenBSD
   version. This highlights the importance of clear naming conventions in
   package management.
 • Future of i386 Architecture Support: Concerns have been raised about the
   future of i386 architecture support, prompting suggestions for a more
   streamlined package removal process. This could alleviate the burden on
   developers and ensure more efficient package management.

                             Security Enhancements

In a move to bolster security, rebuilds for arm64 packages are planned to
enable Pointer Authentication (PAC) and Branch Target Identification (BTI)
support. These enhancements are crucial for maintaining a secure environment
within the Debian ecosystem, reflecting the community's commitment to
safeguarding user data and software integrity.

                      Compatibility Concerns with zlib-ng

Discussions have also centered around the compatibility of zlib-ng with
existing software systems. Developers have raised critical concerns about
backward compatibility, particularly its implications for widely-used
applications such as nginx. Ensuring compatibility is vital for maintaining
the seamless operation of software within the Debian environment.

                        New Module Proposals for Tryton

In a positive development, Mathias Behrle has submitted several proposals
for new Tryton modules aimed at enhancing the platform's functionality.
These proposals include modules for managing invoices, shipment costs, and
document processing, showcasing a commitment to expanding Tryton's
capabilities within the Debian ecosystem.

                    Community Engagement and Collaboration

The ongoing dialogues within the Debian community underscore a collaborative
spirit as developers tackle technical challenges and work towards improving
user experiences. The collective efforts to address critical bugs, enhance
security, and expand software offerings reflect the community's dedication
to maintaining the integrity and reliability of the Debian project.

As these developments continue to unfold, stakeholders are encouraged to
engage in discussions and contribute to the ongoing enhancement of the
Debian ecosystem. The active participation of the community is essential for
navigating the complexities of package management and ensuring the continued
success of Debian as a leading free software distribution.
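
To give a rough idea of what the map/reduce step logged above does, here is
a minimal, hypothetical sketch in Python. This is not DebGPT's actual code;
the model name, group size, and prompts are assumptions:

  from openai import OpenAI

  client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

  def ask(prompt: str, model: str = "gpt-4o-mini") -> str:
      r = client.chat.completions.create(
          model=model, messages=[{"role": "user", "content": prompt}])
      return r.choices[0].message.content

  def summarize(threads: list[str]) -> str:
      # map: produce one intermediate summary per thread
      partial = [ask("Summarize this debian-devel thread:\n\n" + t)
                 for t in threads]
      # reduce: fold the intermediate summaries until one report remains
      while len(partial) > 1:
          groups = [partial[i:i + 40] for i in range(0, len(partial), 40)]
          partial = [ask("Merge these summaries into one news report:\n\n"
                         + "\n\n".join(g)) for g in groups]
      return partial[0]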
Hector Oron
2024-11-09 13:20:01 UTC
Hello Lumin,
Post by DebGPT
This is an experiment: letting an LLM go through all 369 emails from
debian-devel in October. The command for producing the news report
is included below. Use debgpt's git HEAD if you want to try it.
This is the first time I have seen this kind of email. I thought a while
ago that this would be a really cool use of AI: producing summaries of
mailing lists, since I struggle to read everything.

I just want to thank you for putting this together and, at least from
my side, this is very much appreciated.

Regards
--
Héctor Orón -.. . -... .. .- -. -.. . ...- . .-.. --- .--. . .-.
PICCA Frederic-Emmanuel
2024-11-09 13:40:01 UTC
Is it via ChatGPT or a self-hosted LLM?

Could we imagine having a Debian-hosted computer with an AMD GPU dedicated to this use case?

We should provide these summary letters for most of our mailing lists :)

cheers

Fred
Jeremy Stanley
2024-11-09 13:50:01 UTC
Post by PICCA Frederic-Emmanuel
Is it via ChatGPT or a self-hosted LLM?
[...]

It's DebGPT: https://salsa.debian.org/deeplearning-team/debgpt
--
Jeremy Stanley
Mo Zhou
2024-11-09 16:00:01 UTC
The LLM I used to produce that exact news report was gpt-4o-mini,
from OpenAI. ChatGPT is the name of OpenAI's LLM web interface, and
its underlying LLM model name can change. It took roughly 3
minutes to perform the bulk API calls.

That said, I basically implemented support for all commonly seen
LLM inference services:

(4 commercial ones)
  openai, anthropic, google, xai,
(4 self-hosted)
  llamafile, ollama, vllm, zmq (built-in but kind of outdated.)

Other services missing from the list are also supported, as long
as they provide a compatibility mode for the OpenAI API.
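
As an illustration (not DebGPT's actual code), any backend that offers an
OpenAI-compatible API can be driven with the stock openai Python client.
A minimal sketch, assuming a local ollama instance on its default port and
a model that has already been pulled:

  from openai import OpenAI

  # ollama exposes an OpenAI-compatible endpoint under /v1; the api_key
  # is required by the client but ignored by ollama.
  client = OpenAI(base_url="http://localhost:11434/v1", api_key="unused")
  reply = client.chat.completions.create(
      model="llama3.1:8b",  # assumption: use whatever model you pulled
      messages=[{"role": "user",
                 "content": "Summarize this thread in three sentences: ..."}],
  )
  print(reply.choices[0].message.content)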

For a particular use case like summarizing a mailing list, a self-hosted
service will be much slower to respond to the bulk API calls unless it is
hosted on a GPU cluster :-)

Small LLMs are not necessarily smart enough. The Open LLM Leaderboard[3]
is a good reference for figuring out the best open-access LLM for
self-hosting.

In terms of "Debian hosted computer with AMD GPU for LLM inference" --
that is exactly one of the long term goals of debian deep learning
team (debian-***@l.d.o). Team members are working to prepare the ROCm
packages and the ROCm version of pytorch.

I find ollama[1] and llamafile[2] quite handy to use locally, with a spare
GPU, if you do not mind using software from outside of the Debian archive.

[1] https://github.com/ollama/ollama
[2] https://github.com/Mozilla-Ocho/llamafile
[3] https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard
Mo Zhou
2024-11-09 21:10:01 UTC
I just realized that the news report could be more useful if it cites
the information source. Here we go:

debgpt -Hx ldo:debian-devel/2024/10 -a 'write a news report based on the
provided information. Cover as many topics as possible. You may expand a
little bit on important matter. include links to the report.' --no-render

BTW, which mailing lists should I cover with those monthly reports?
I have currently tried it on debian-ai, debian-devel, and debian-science.

------------------------------------------------------------------
THE FOLLOWING CONTENTS ARE GENERATED BY AI
------------------------------------------------------------------

### Recent Developments in the Debian Community: A Snapshot

**Date:** October 31, 2024

The Debian development community has seen a flurry of activity in recent
weeks, marked by new package proposals, discussions concerning
architecture support, and enhancements in package management practices.
Here’s a comprehensive overview of the latest developments.

#### New Package Proposals for Tryton

Developer Mathias Behrle has put forth several promising new modules for
the Tryton application platform, which is gaining traction in business
and healthcare sectors. The proposed modules include:

- **tryton-modules-account-invoice-watermark**: This module introduces
the ability to add watermarks to invoices, enhancing document authenticity.
- **tryton-modules-marketing-campaign**: Aimed at improving marketing
efforts, this module facilitates the collection and analysis of data
related to campaigns.
- **tryton-modules-account-stock-shipment-cost**: This module allows for
the allocation of shipment costs based on invoices, streamlining
financial processes.
- **tryton-modules-document-incoming-invoice**: This innovative module
automates the creation of supplier invoices from incoming documents,
improving efficiency.

These proposals signify a strong commitment to bolstering the
functionalities of the Tryton platform. For more details, visit [Tryton
Downloads](https://downloads.tryton.org/7.0).

#### Lomiri Applications Packaging

In a parallel effort, Mike Gabriel has commenced the packaging of
several applications designed for the Lomiri operating environment,
which is essential to the Ubuntu Touch mobile OS. Key applications being
packaged include:

- **Lomiri Address Book App**: A user-friendly application for managing
contacts.
- **Lomiri Dialer App**: An application providing essential telephony
features.
- **Lomiri Messaging App**: A messaging platform facilitating seamless
communication.

These applications aim to significantly enhance the user experience
within the Lomiri environment. Further information can be found in the
[Debian Development Mailing
List](https://lists.debian.org/debian-devel/2024/10/msg00368.html).

#### Discussions on s390x Architecture Support

A significant topic of discussion has been the future of the s390x
architecture in Debian. Developer Chris Hofstaedtler has raised concerns
regarding the dwindling number of active maintainers and a growing
number of packages that do not support s390x. The community is currently
evaluating the viability of continuing support for this architecture,
especially in light of IBM's involvement and commitment to maintaining
s390x. For more insights, see the ongoing discussion
[here](https://lists.debian.org/debian-devel/2024/10/msg00295.html).

#### Challenges with Lintian and Package Uploads

The community continues to address challenges associated with Lintian,
the Debian package checker. Recent issues concerning license compliance
in the `warzone2100` package have sparked discussions on how to manage
Lintian overrides more effectively. Developers are actively exploring
solutions to streamline the package upload process and minimize
friction. More information is available
[here](https://lists.debian.org/debian-devel/2024/10/msg00203.html).

#### Upcoming Changes and Improvements

In a bid to enhance package management practices, the Debian community
is focusing on refining the Non-Maintainer Uploads (NMUs) process.
Developers are advocating for standardized workflows that would improve
collaboration and efficiency, ultimately reducing confusion and workload
for maintainers. For further details on NMUs, check out the discussion
[here](https://lists.debian.org/debian-devel/2024/10/msg00276.html).

#### Technical Updates and Backlogs

Colin Watson has confirmed that the Debian upload server is currently
facing delays due to a backlog caused by a mass rebuild for Pointer
Authentication (PAC) and Branch Target Identification (BTI) support.
Developers are advised to anticipate longer processing times for uploads
during this period. For updates on the upload delays, refer to the
discussion
[here](https://lists.debian.org/debian-devel/2024/10/msg00365.html).

### Conclusion

These developments highlight the vibrant and dynamic nature of the
Debian community, characterized by ongoing efforts to enhance software
offerings, improve user experiences, and address architectural support
challenges. The collaborative spirit remains robust as developers work
together to navigate these changes and continue to uphold the principles
of software freedom.

For more detailed reports and discussions, please refer to the [Debian
Development mailing list archives](https://lists.debian.org/debian-devel/).
Serafeim (Serafi) Zanikolas
2024-11-09 21:20:01 UTC
Post by Mo Zhou
I just realized that the news report could be more useful if it cites
the information source.
Thanks! The links are really key, given the known hallucination issues of LLMs.
Steve McIntyre
2024-11-09 22:30:01 UTC
Post by Mo Zhou
I just realized that the news report could be more useful if it cites
the information source.
debgpt -Hx ldo:debian-devel/2024/10 -a 'write a news report based on the
provided information. Cover as many topics as possible. You may expand a
little bit on important matter. include links to the report.' --no-render
BTW, which mailing lists should I cover with those monthly reports?
I have currently tried it on debian-ai, debian-devel, and debian-science.
Please, no further. We don't need hallucinated summaries on our
lists. If you want to publish them, publish them somewhere separately
IMHO.
--
Steve McIntyre, Cambridge, UK. ***@einval.com
Can't keep my eyes from the circling sky,
Tongue-tied & twisted, Just an earth-bound misfit, I...
G. Branden Robinson
2024-11-09 22:30:01 UTC
Post by Steve McIntyre
Please, no further. We don't need hallucinated summaries on our
lists. If you want to publish them, publish them somewhere separately
IMHO.
Oh, good--since it's not a CoC violation to express an unflattering
opinion of this experiment, did anyone notice how deadly dull the prose
style is? It's like the LLM had been trained solely on corporate press
releases.

I miss the Joeys.

Regards,
Branden
Mo Zhou
2024-11-09 22:50:01 UTC
The tone can change: http://paste.debian.net/1335055/
LLMs are being improved rapidly over time.

I guess it's due to some potential safety issues that LLMs use a dull
corporate tone by default. Those models have presumably been trained on
different tones, and will use them as long as we instruct them to.
G. Branden Robinson
2024-11-09 23:20:01 UTC
Post by Mo Zhou
The tone can change: http://paste.debian.net/1335055/
I recognize a different style there...
Post by Mo Zhou
LLMs are being improved rapidly over time.
...but I'm not sure I would call the new example an improvement. With
all those exclamation marks it seems more like advertising or a similar
hyperventilating form.
Post by Mo Zhou
I guess it's due to some potential safety issues
Safety? As in human life and limb? If we're already entrusting AI-
generated text with the instruction of humans in potentially hazardous
situations, that could be enough to make even the most wild-eyed anarchist
cry out for the heavy boot of government regulation.

I think Steve's point about hallucination should be taken seriously.
In any communication the construction of meaning is effectively a
negotiated process between the speaker and listener.[1] Humans have
evolved our capacity for this over tens of thousands of years, such that
one model of human language facility is a "mental organ".[2]

From what I've seen of LLM language output, it seems to reside in an
analogue to the "uncanny valley"; it closely resembles human language,
but the metaphorical point at which meaning is constructed shifts from
the site where discourses carried out by humans place it. (That such a
point is susceptible to shifting at all may be one of the theses of
Derrida; all I can say for sure is that if it was, his English
translators seemed bent on proving the point by example.)

Chomsky's famous example of "colorless green ideas sleep furiously" has
the virtue of advertising (to most humans) that it is nonsense. The
construct of a network of Markov chains in a large number of dimensions
isn't necessarily so obvious.
Post by Mo Zhou
Those models have presumably been trained on different tones, and will
use them as long as we instruct them to.
Has it got sliders for "Hunter S. Thompson" or "Christopher Hitchens"?

;-)

Regards,
Branden

[1] or reader/writer, in which case the "negotiation" is more static
than dynamic

[2] My favorite piece of supporting evidence is the spontaneous
development of sophisticated grammatical features by children in the
case of Nicaraguan Sign Language. This is one of the coolest
natural experiments ever to have taken place. Judy Shepherd-Kegl
should be awarded a prize for applying an ethic of non-interference
when so many others would have assumed a didactic one.
Charles Plessy
2024-11-09 23:50:01 UTC
Thanks a lot Mo for this exciting experiment!

And having two ex-DPLs pressing the big red stop button is not
necessarily a bad sign in an ageing project. Often you will see ideas
rejected in a very dismissive if not insulting way (for example
source-only uploads or HTTPS URLs in /etc/apt/sources.list), and a
couple of years later they are mainstream!

What do people do when there is a long thread on debian-devel,
debian-project, debian-private or the like? Well, what I do is that I
check the first ~6 messages and then cherry-pick 3 or 4 answers deeper
in the thread from, for instance Andreas, Russ, Stefano, and maybe also
Steve and Branden to broaden my perspectives, and the rest I do not
read. It means: people with little reputation have little chance to
contribute original points of view to Debian unless they jump on the
keyboard as fast as they can, which does not increase the quality of the
discussion and biases against people who are in non-mainstream timezones
and who are slower to write in English.

Our mailing lists were a ground-breaking technological advance in the
past that would open Debian to the whole world, but now are they not
working exactly against that?

I see a big transformative potential for our future discussions: even if
a crowd is shouting circular arguments around, we can use AI to reassure
participants that original points of view have a good chance of being
part of a summary. Making the effort to contribute is rewarded. This
can change Debian considerably. So please, more DebGPT summaries!

Have a nice week-end,

Charles
Mo Zhou
2024-11-10 00:30:01 UTC
Hi Charles,
Post by Charles Plessy
Thanks a lot Mo for this exciting experiment!
And having two ex-DPLs pressing the big red stop button is not
necessarily a bad sign in an ageing project. Often you will see ideas
rejected in a very dismissive if not insulting way (for example
source-only uploads or HTTPS URLs in /etc/apt/sources.list), and a
couple of years later they are mainstream!
Those new technologies are sure to make more of an impact in the future.
We need to find a proper place where people are comfortable with LLM
noise. Currently it seems to me that posting AI-generated news on the main
mailing lists leads to a "pollution effect" -- the LLM will grab its own
report from the last month if I don't mention it...
Post by Charles Plessy
What do people do when there is a long thread on debian-devel,
debian-project, debian-private or the like? Well, what I do is that I
check the first ~6 messages and then cherry-pick 3 or 4 answers deeper
in the thread from
I do more or less the same, or even worse -- I keep them marked as unread
and watch the unread mail in my inbox grow to 10k+. Those LLM tools, while
noisy, can still provide some help in keeping me updated a little
bit -- to know what happened, what is being discussed and what the
conclusions are.

The LLM reports are not quite accurate. But I still see value there --
as long as we ask it to provide URL citations to the information sources,
I can still catch up on things with minimal reading.
Post by Charles Plessy
I see a big transformative potential for our future discussions: even if
a crowd is shouting circular arguments around, we can use AI to reassure
participants that original points of view have a good chance of being
part of a summary. Making the effort to contribute is rewarded. This
can change Debian considerably. So please, more DebGPT summaries!
DebGPT is now more of a general terminal LLM tool, which I develop
and use very frequently. Summarizing mailing lists is just one of its
not-yet-announced new features. Efforts to explore ways of making
LLMs useful will continue.

Even though they hallucinate a lot, LLMs could still correctly teach me
how to use urwid (which I could never understand despite going through
its tutorial many times...).
That led to the `debgpt config` TUI configuration wizard.

Any suggestions for a place where I can safely direct this generated
content, with an audience that is comfortable with LLM noise? As noted
by the ex-DPLs, it should happen on a dedicated experimental ground.
Steve McIntyre
2024-11-15 14:40:02 UTC
Post by Mo Zhou
Even though they hallucinate a lot, LLMs could still correctly teach me
how to use urwid (which I could never understand despite going through
its tutorial many times...).
That led to the `debgpt config` TUI configuration wizard.
Any suggestions for a place where I can safely direct this generated
content, with an audience that is comfortable with LLM noise? As noted
by the ex-DPLs, it should happen on a dedicated experimental ground.
A few suggestions:

* Maybe the debian-ai list?

* Publish via your own space on people.debian.org for now and tell
people to look there.

* Set up ai-summary.debian.net and post there, alongside suitable
disclaimer text.
--
Steve McIntyre, Cambridge, UK. ***@einval.com
Can't keep my eyes from the circling sky,
Tongue-tied & twisted, Just an earth-bound misfit, I...
Holger Levsen
2024-11-10 08:30:02 UTC
Post by Charles Plessy
Our mailing lists were a ground-breaking technological advance in the
past that would open Debian to the whole world, but now are they not
working exactly against that?
first: citation needed.
second: summaries written by applied statistics systems will not help.
(also citation needed I guess.)

-> please post these summaries to a dedicated applied statistics summary
mailing list, but please don't spam the original lists with this bot content.

btw, the signature below was chosen by an 'artificial intelligence'
system called fortune. it's so amazing and wise!!1
--
cheers,
Holger

⢀⣎⠟⠻⢶⣊⠀
⣟⠁⢠⠒⠀⣿⡁ holger@(debian|reproducible-builds|layer-acht).org
⢿⡄⠘⠷⠚⠋⠀ OpenPGP: B8BF54137B09D35CF026FE9D 091AB856069AAA1C
⠈⠳⣄

Smart things make us dumb.
Tiago Bortoletto Vaz
2024-11-10 15:30:01 UTC
Hi,
Post by Holger Levsen
Post by Charles Plessy
Our mailing lists were a ground-breaking technological advance in the
past that would open Debian to the whole world, but now are they not
working exactly against that?
first: citation needed.
second: summaries written by applied statistics systems will not help.
(also citation needed I guess.)
I'm glad I know Holger in person, so I can smile instead of feeling furious
about this kind of input :-)
Post by Holger Levsen
-> please post these summaries to a dedicated applied statistics summary
mailing list, but please don't spam the original lists with this bot content.
Hmm... An opt-in llm-generated-monthly-digest mailinglist/feed/whatever... Not
really a fan of this kind of content (yet?), but I believe that something like
this could be seen as a fair balance between avoiding annoyance and providing a
space for experimentation on the subject within Debian.
Post by Holger Levsen
btw, the signature below was chosen by an 'artificial intelligence'
system called fortune. it's so amazing and wise!!1
See how this whole thing has its fun side. Let's just relax and play,
responsibly(-ish). That said, please no LLM feeds in planet.d.o :p

+thanks lumin for all the effort and thought you have put into this.

Bests,
--
Tiago
Sean Whitton
2024-11-14 08:50:01 UTC
Hello,
Post by Holger Levsen
Post by Charles Plessy
Our mailing lists were a ground-breaking technological advance in the
past that would open Debian to the whole world, but now are they not
working exactly against that?
first: citation needed.
second: summaries written by applied statistics systems will not help.
(also citation needed I guess.)
-> please post these summaries to a dedicated applied statistics summary
mailing list, but please don't spam the original lists with this bot content.
I would prefer this, and with a disclaimer at the top of every post saying:

- that it's not pre-vetted by project representatives

- briefly describing the issue of hallucination, and that readers are
strongly advised to follow up on links before relying on any
information.

-- outside readers may not be as suspicious of LLMs as many of us
are, so it is good for us to remind them.

--------

If I am being honest, I would prefer these were not generated and posted
at all. I'm sorry that I feel I have to be discouraging, as I know you
want to use these tools in a way that is compatible with Debian's values.

But doesn't it seem like we're trying to apply a technical solution,
with many well-known shortcomings, to a social problem? That problem
is, we should be more disciplined about writing less.

One of our conduct documents even has this as one of its points: that we
should continually bear in mind the many demands on everyone's time.
--
Sean Whitton
Richard Lewis
2024-11-10 11:30:01 UTC
Post by Mo Zhou
The tone can change: http://paste.debian.net/1335055/
LLMs are being improved rapidly over time.
I guess it's due to some potential safety issues that LLMs use a dull
corporate tone by default.
I think it's slightly misdiagnosed here. To me, it comes across as
"tedious" to read because the tone is more like a university essay, with
everything being overly emphatic, and adjectives and adverbs everywhere.

(everything is a "flurry of activity" or "gaining traction" or
"bolstering" or "a bid to enhance" ... and everything is a "significant
devlopment".. it's all just too much.)

It's perhaps the training data that causes this -- this is how a lot of
people-who-don't-write write, because schools and universities teach them
to write essays but not to read/edit -- and the aim ends up being to
convince the reader that they are giving an opinion, which is not what
you want here.

So it might help to tell it to be more factual and neutral in
tone. Maybe you could even try telling it to adopt the style used on the
debian lists themselves? (lwn.net is also a good reference point)
G. Branden Robinson
2024-11-10 16:50:01 UTC
Post by Richard Lewis
Post by Mo Zhou
The tone can change: http://paste.debian.net/1335055/
LLMs are being improved rapidly over time.
I guess it's due to some potential safety issues that LLMs use a dull
corporate tone by default.
I think it's slightly misdiagnosed here. To me, it comes across as
"tedious" to read because the tone is more like a university essay, with
everything being overly emphatic, and adjectives and adverbs everywhere.
This is a better description than mine. I carelessly attributed the
impact the generated content had on me as a reader to properties of the
prose style, which wasn't quite right. I'm so accustomed to that style
being simultaneously purple and lackadaisical that I expressed the
boredom it aroused in me through long, miserable experience with
corporate communications that do not reward the effort of reading.
Post by Richard Lewis
(everything is a "flurry of activity" or "gaining traction" or
"bolstering" or "a bid to enhance" ... and everything is a "significant
devlopment".. it's all just too much.)
Yes. Such phrases rapidly weary the reader when the factual payoff for
what they promise is almost always meager, and thus has a perversely
opposite effect from what was apparently intended. The style touts a
firm's business doings as matters of high adventure when they better
resemble a long session in a sensory deprivation tank. Most comm majors
seem to think they're writing for the J. Peterman catalog.

As is often the case in matters of style, William Strunk and George
Orwell knew whereof they spoke.

Regards,
Branden
Mo Zhou
2024-11-15 17:30:02 UTC
Hi folks,
Post by DebGPT
This is an experiment: letting an LLM go through all 369 emails from
debian-devel in October.
I received lots of feedback from the experiments, ranging from positive
to negative. It wasn't discouraging to see negative feedback, since that
is usually what happens when people see something that has not appeared
in the past. However, I see the core value of this experiment as the
opportunity for the community and me to learn from it.

Speaking of AI summaries in the context of the Debian community, I think
they bring more harm than benefit if broadcast to a general audience --
unless the industry can reliably address the hallucination issues.
While an AI summary might be helpful to some extent, it in fact requires
a certain level of expertise from the user in order to properly tell
its good parts from its bad parts, and to make use of the good parts.
However, assuming a general public audience with such expertise (or at
least tolerance) is not practical -- there might be people who consume the
bad parts along with the good. We simply cannot bear the cost
of forcing people to learn to tell the truth from the hallucinations.

Whenever the community has to face more interaction with AIs, this
experiment can serve as a cautionary example.


Apart from that, I'll continue to explore ways to make such new
technology useful, since I'm interested in it. I can leak some of
my plans in this regard:

1. Let an LLM answer the NM templates (maybe with the Debian Policy or the
Debian Developer's Reference in context) and see what percentage of the
questions can be answered correctly. Even if I don't do it, maybe new DD
applicants will.

2. Continue adding features to DebGPT and make a major release.

Since DebGPT was first announced, I have added a lot of new stuff to
this tool -- and it has gradually become my daily terminal LLM tool (I
cannot find a better one on PyPI).

One of the interesting features is editing a file in place, automatically
running git add, generating a commit message, and committing. This has
been very reliable and useful for simple tasks like adding documentation
and type annotations in DebGPT's Python code. It really saves a lot of time.

A real example of the described fully automated pipeline:
https://salsa.debian.org/deeplearning-team/debgpt/-/commit/4735b38141eafd6aa9b0863fc73296aa41562aed
All I did was type the instruction in natural language; the
implementation was then automatically committed in git.
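
A rough sketch of the idea (not DebGPT's actual implementation; the prompts
and the gpt-4o-mini model are assumptions):

  import subprocess
  from openai import OpenAI

  client = OpenAI()  # assumes OPENAI_API_KEY is set

  def llm(prompt: str) -> str:
      r = client.chat.completions.create(
          model="gpt-4o-mini",
          messages=[{"role": "user", "content": prompt}])
      return r.choices[0].message.content

  def edit_and_commit(path: str, instruction: str) -> None:
      # 1. ask the LLM to rewrite the file according to the instruction
      old = open(path).read()
      new = llm(instruction + "\n\nReturn only the full edited file.\n\n" + old)
      open(path, "w").write(new)
      # 2. stage the change and ask the LLM for a commit message
      subprocess.run(["git", "add", path], check=True)
      diff = subprocess.run(["git", "diff", "--cached"],
                            capture_output=True, text=True).stdout
      msg = llm("Write a one-line git commit message for this diff:\n" + diff)
      # 3. commit with the generated message
      subprocess.run(["git", "commit", "-m", msg.strip()], check=True)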

For me that's a fun part.
Charles Plessy
2024-11-16 02:40:01 UTC
Hi Mo,

thanks again for your posts,

I was just thinking that the debian-mentors list could be a good target
for summarisation too: it is high traffic, email subject lines are
focused on what to upload, but discussions are focused on
problem-solving, so some interesting tips & tricks are easy to miss
for people who do not read everything.

(Maybe you could also ask in the prompt to avoid reporting people's names
for the moment?)

Have a nice week-end,

Charles
Joerg Jaspert
2024-11-16 13:30:01 UTC
Reply
Permalink
Post by Mo Zhou
1. Let an LLM answer the NM templates (maybe with the Debian Policy or the
Debian Developer's Reference in context) and see what percentage of the
questions can be answered correctly. Even if I don't do it, maybe new DD
applicants will.
And those who actually do this should, if we catch them, NOT ever end up
as a DD. So I hope no one is that stupid.
--
bye, Joerg