Discussion:
Package statistics by downloads
Erik Schulz
2025-04-23 08:10:01 UTC
I'm interested in package popularity. I'm aware of popcon
(https://popcon.debian.org/), but I'm more interested in actual
downloads.

Do the Debian mirrors track unique downloads (e.g. by hashed IP
address), and if not, why not? I can understand the privacy argument,
but package downloads arguably aren't particularly revealing, and the
data could be aggregated daily, thus limiting exposure.

Boyuan Yang pointed out on the debian-www list that the "repository
mirrors" often use third-party CDNs. I assume they use DNS-based
load balancing.
The request logs may be geographically biased, but they might still
add interesting data. Another bias would be people using a VPN: they'd
only be counted once per exit node (so some IPs would appear to
download an extreme number of packages).

Parsing the request logs could be fairly trivial:
1. reduce to unique pairs every 24h: (ip, package)
2. sum by package
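
A minimal sketch of the two steps above, assuming the logs have already been parsed into (day, ip, path) tuples and that requested paths follow the usual pool layout, where a file is named `package_version_arch.deb`. The function names, the salt, and the sample records are hypothetical; the real mirror or CDN log format is not something I know.

```python
import hashlib
from collections import Counter
from typing import Iterable, Optional, Tuple

Record = Tuple[str, str, str]  # (day, ip, requested path), an assumed shape


def package_from_path(path: str) -> Optional[str]:
    """Return the package name for a pool path ending in .deb, else None."""
    filename = path.rsplit("/", 1)[-1]
    if not filename.endswith(".deb") or "_" not in filename:
        return None
    # Pool filenames look like package_version_arch.deb.
    return filename.split("_", 1)[0]


def hash_ip(ip: str, salt: str) -> str:
    """Hash the address so the raw IP is not kept (salt is an assumption)."""
    return hashlib.sha256((salt + ip).encode("utf-8")).hexdigest()


def count_unique_downloads(records: Iterable[Record],
                           salt: str = "example-salt") -> Counter:
    """Step 1: reduce to unique (day, hashed ip, package) triples.
    Step 2: sum the surviving triples by package."""
    seen = set()
    for day, ip, path in records:
        package = package_from_path(path)
        if package is not None:
            seen.add((day, hash_ip(ip, salt), package))
    return Counter(package for _day, _ip, package in seen)


if __name__ == "__main__":
    hello = "/debian/pool/main/h/hello/hello_2.10-3_amd64.deb"
    sample = [
        ("2025-04-23", "192.0.2.1", hello),
        ("2025-04-23", "192.0.2.1", hello),  # repeat on the same day
        ("2025-04-23", "192.0.2.2", hello),
        ("2025-04-24", "192.0.2.1", hello),  # same IP, next day
    ]
    print(count_unique_downloads(sample))
```

Holding every triple in a set is fine for a sketch; real log volumes would more likely need per-day batching or an approximate distinct counter.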
Philipp Kern
2025-04-23 09:20:01 UTC
Post by Erik Schulz
I'm interested in package popularity. I'm aware of popcon
(https://popcon.debian.org/), but I'm more interested in actual
downloads.
What would this be useful for? You only described technical details, not
why we would want to do this.

Kind regards
Philipp Kern
