Planet Debian

Subscribe to Planet Debian feed
Planet Debian - https://planet.debian.org/
Updated: 2 hours 40 min ago

Dirk Eddelbuettel: #38: Faster Feedback Systems

19 June, 2022 - 22:46

Engineers build systems. Good engineers always stress and focus efficiency of these systems.

Two recent examples of engineering thinking follow. One was in a video / podcast interview with Martin Thompson (who is a noted high-performance code expert) I came across recently. The overall focus of the hour-long interview is on ‘managing software complexity’. Around minute twenty-two, the conversation turns to feedback loops and systems, and a strong preference for simple and fast systems for more immediate feedback. An important topic indeed.

The second example connects to this and permeates many tweets and other writings by Erik Bernhardsson. He had an earlier 2017 post on ‘Optimizing for iteration speed’, as well as a 17 May 2022 tweet on minimizing feedback loop size, another 28 Mar 2022 tweet reply on shorter feedback loops, then a 14 Feb 2022 post on problems with slow feedback loops, as well as a 13 Jan 2022 post on a priority for tighter feedback loops, and lastly a 23 Jul 2021 post on fast feedback cycles. You get the idea: Erik really digs faster feedback loops. Nobody likes to wait: immediatecy wins each time.

A few years ago, I had touched on this topic with two posts on how to make (R) package compilation (and hence installation) faster. One idea (which I still use whenever I must compile) was in post #11 on caching compilation. Another idea was in post #13: make it faster by not doing it, in this case via binary installation which skip the need for compilation (and which is what I aim for with, say, CI dependencies). Several subsequent posts can be found by scrolling down the r^4 blog section: we stressed the use of the amazing Rutter PPA ‘c2d4u’ for CRAN binaries (often via Rocker containers, the (post #28) promise of RSPM, and the (post #29) awesomeness of bspm. And then in the more recent post #34 from last December we got back to a topic which ties all these things together: Dependencies. We quoted Mies van der Rohe: Less is more. Especially when it comes to dependencies as these elongate the feedback loop and thereby delay feedback.

Our most recent post #37 on r2u connects these dots. Access to a complete set of CRAN binaries with full-dependency resolution accelerates use and installation. This of course also covers testing and continuous integration. Why wait minutes to recompile the same packages over and over when you can install the full Tidyverse in 18 seconds or the brms package and all it needs in 13 seconds as shown in the two gifs also on the r2u documentation site.

You can even power up the example setup of the second gif via this gitpod link giving you a full Ubuntu 22.04 session in your browser to try this: so go forth and install something from CRAN with ease! The benefit of a system such our r2u CRAN binaries is clear: faster feedback loops. This holds whether you work with few or many dependencies, tiny or tidy. Faster matters, and feedback can be had sooner.

And with the title of this post we now get a rallying cry to advocate for faster feedback systems: “FFS”.

This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. Please report excessive re-aggregation in third-party for-profit settings.

John Goerzen: Pipes, deadlocks, and strace annoyingly fixing them

19 June, 2022 - 10:46

This is a complex tale I will attempt to make simple(ish). I’ve (re)learned more than I cared to about the details of pipes, signals, and certain system calls – and the solution is still elusive.

For some time now, I have been using NNCP to back up my files. These backups are sent to my backup system, which effectively does this to process them (each ZFS send is piped to a shell script that winds up running this):

gpg -q -d | zstdcat -T0 | zfs receive -u -o readonly=on "$STORE/$DEST"

This processes tens of thousands of zfs sends per week. Recently, having written Filespooler, I switched to sending the backups using Filespooler over NNCP. Now fspl (the Filespooler executable) opens the file for each stream and then connects it to what amounts to this pipeline:

bash -c 'gpg -q -d 2>/dev/null | zstdcat -T0' | zfs receive -u -o readonly=on "$STORE/$DEST"

Actually, to be more precise, it spins up the bash part of it, reads a few bytes from it, and then connects it to the zfs receive.

And this works well — almost always. In something like 1/1000 of the cases, it deadlocks, and I still don’t know why. But I can talk about the journey of trying to figure it out (and maybe some of you will have some ideas).

Filespooler is written in Rust, and uses Rust’s Command system. Effectively what happens is this:

  1. The fspl process has a File handle, which after forking but before invoking bash, it dup2’s to stdin.
  2. The connection between bash and zfs receive is a standard Unix pipe.

I cannot get the problem to duplicate when I run the entire thing under strace -f. So I am left trying to peek at it from the outside. What happens if I try to attach to each component with strace -p?

  • bash is blocking in wait4(), which is expected.
  • gpg is blocking in write().
  • If I attach to zstdcat with strace -p, then all of a sudden the deadlock is cleared and everything resumes and completes normally.
  • Attaching to zfs receive with strace -p causes no output at all from strace for a few seconds, then zfs just writes “cannot receive incremental stream: incomplete stream” and exits with error code 1.

So the plot thickens! Why would connecting to zstdcat and zfs receive cause them to actually change behavior? strace works by using the ptrace system call, and ptrace in a number of cases requires sending SIGSTOP to a process. In a complicated set of circumstances, a system call may return EINTR when a SIGSTOP is received, with the idea that the system call should be retried. I can’t see, from either zstdcat or zfs, if this is happening, though.

So I thought, “how about having Filespooler manually copy data from bash to zfs receive in a read/write loop instead of having them connected directly via a pipe?” That is, there would be two pipes going there: one where Filespooler reads from the bash command, and one where it writes to zfs. If nothing else, I could instrument it with debugging.

And so I did, and I found that when it deadlocked, it was deadlocking on write — but with no discernible pattern as to where or when. So I went back to directly connected.

In analyzing straces, I found a Rust bug which I reported in which it is failing to close the read end of a pipe in the parent post-fork. However, having implemented a workaround for this, it doesn’t prevent the deadlock so this is orthogonal to the issue at hand.

Among the two strange things here are things returning to normal when I attach strace to zstdcat, and things crashing when I attach strace to zfs. I decided to investigate the latter.

It turns out that the ZFS code that is reading from stdin during zfs receive is in the kernel module, not userland. Here is the part that is triggering the “imcomplete stream” error:

                int err = zfs_file_read(fp, (char *)buf + done,
                    len - done, &resid);
                if (resid == len - done) {
                        /*
                         * Note: ECKSUM or ZFS_ERR_STREAM_TRUNCATED indicates
                         * that the receive was interrupted and can
                         * potentially be resumed.
                         */
                        err = SET_ERROR(ZFS_ERR_STREAM_TRUNCATED);
                }

resid is an output parameter with the number of bytes remaining from a short read, so in this case, if the read produced zero bytes, then it sets that error. What’s zfs_file_read then?

It boils down to a thin wrapper around kernel_read(). This winds up calling __kernel_read(), which calls read_iter on the pipe, which is pipe_read(). That’s where I don’t have the knowledge to get into the weeds right now.

So it seems likely to me that the problem has something to do with zfs receive. But, what, and why does it only not work in this one very specific situation, and only so rarely? And why does attaching strace to zstdcat make it all work again? I’m indeed puzzled!

Bastian Venthur: blag is now available in Debian

18 June, 2022 - 23:00

Last year, I wrote my own blog-aware static site generator in Python. I called it “blag” – named after the blag of the webcomic xkcd. Now I finally got around packaging- and uploading blag to Debian. It passed the NEW queue and is now part of the distribution. That means if you’re using Debian, you can install it via:

sudo aptitude install blag

Ubuntu will probably follow soon. For every other system, blag is also available on PyPI:

pip install blag

To get started, you can

mkdir blog && cd blog
blag quickstart                        # fill out some info
nvim content/hello-world.md            # write some content
blag build                             # build the website

Blag is aware of articles and pages: the difference is that articles are part of the blog and will be added to the atom feed, the archive and aggregated in the tag pages. Pages are just rendered out to HTML. Articles and pages can be freely mixed in the content directory, what differentiates an article from a page is the existence of the dade metadata element:

title: My first article
description: Short description of the article
date: 2022-06-18 23:00
tags: blogging, markdown


## Hello World!

Lorem ipsum.

[...]

blag also comes with a dev-server that rebuilds the website automatically on every change detected, you can start it using:

blag serve

The default theme looks quite ugly, and you probably want to create your own styling to make it more beautiful. The process is not very difficult if you’re familiar with jinja templating. Help on that can be found in the “Templating” section of the online documentation, the offline version in the blag-doc package, or the man page, respectively.

Speaking of the blag-doc package: packaging it was surprisingly tricky, and it also took me a lot of trial and error to realize that dh_sphinxdocs alone does not automatically put the generated html output into the appropriate package, you rather have to list them in the package.docs-file (i.e. blag-doc.docs) so dh_installdocs can install them properly.

Antoine Beaupré: Matrix notes

17 June, 2022 - 22:34

I have some concerns about Matrix (the protocol, not the movie that came out recently, although I do have concerns about that as well). I've been watching the project for a long time, and it seems more a promising alternative to many protocols like IRC, XMPP, and Signal.

This review may sound a bit negative, because it focuses on those concerns. I am the operator of an IRC network and people keep asking me to bridge it with Matrix. I have myself considered just giving up on IRC and converting to Matrix. This space is a living document exploring my research of that problem space. The TL;DR: is that no, I'm not setting up a bridge just yet, and I'm still on IRC.

This article was written over the course of the last three months, but I have been watching the Matrix project for years (my logs seem to say 2016 at least). The article is rather long. It will likely take you half an hour to read, so copy this over to your ebook reader, your tablet, or dead trees, and lean back and relax as I show you around the Matrix. Or, alternatively, just jump to a section that interest you, most likely the conclusion.

Introduction to Matrix

Matrix is an "open standard for interoperable, decentralised, real-time communication over IP. It can be used to power Instant Messaging, VoIP/WebRTC signalling, Internet of Things communication - or anywhere you need a standard HTTP API for publishing and subscribing to data whilst tracking the conversation history".

It's also (when compared with XMPP) "an eventually consistent global JSON database with an HTTP API and pubsub semantics - whilst XMPP can be thought of as a message passing protocol."

According to their FAQ, the project started in 2014, has about 20,000 servers, and millions of users. Matrix works over HTTPS but over a special port: 8448.

Security and privacy

I have some concerns about the security promises of Matrix. It's advertised as a "secure" with "E2E [end-to-end] encryption", but how does it actually work?

Data retention defaults

One of my main concerns with Matrix is data retention, which is a key part of security in a threat model where (for example) an hostile state actor wants to surveil your communications and can seize your devices.

On IRC, servers don't actually keep messages all that long: they pass them along to other servers and clients as fast as they can, only keep them in memory, and move on to the next message. There are no concerns about data retention on messages (and their metadata) other than the network layer. (I'm ignoring the issues with user registration, which is a separate, if valid, concern.) Obviously, an hostile server could log everything passing through it, but IRC federations are normally tightly controlled. So, if you trust your IRC operators, you should be fairly safe. Obviously, clients can (and often do, even if OTR is configured!) log all messages, but this is generally not the default. Irssi, for example, does not log by default. IRC bouncers are more likely to log to disk, of course, to be able to do what they do.

Compare this to Matrix: when you send a message to a Matrix homeserver, that server first stores it in its internal SQL database. Then it will transmit that message to all clients connected to that server and room, and to all other servers that have clients connected to that room. Those remote servers, in turn, will keep a copy of that message and all its metadata in their own database, by default forever. On encrypted rooms those messages are encrypted, but not their metadata.

There is a mechanism to expire entries in Synapse, but it is not enabled by default. So one should generally assume that a message sent on Matrix is never expired.

GDPR in the federation

But even if that setting was enabled by default, how do you control it? This is a fundamental problem of the federation: if any user is allowed to join a room (which is the default), those user's servers will log all content and metadata from that room. That includes private, one-on-one conversations, since those are essentially rooms as well.

In the context of the GDPR, this is really tricky: who is the responsible party (known as the "data controller") here? It's basically any yahoo who fires up a home server and joins a room.

In a federated network, one has to wonder whether GDPR enforcement is even possible at all. But in Matrix in particular, if you want to enforce your right to be forgotten in a given room, you would have to:

  1. enumerate all the users that ever joined the room while you were there
  2. discover all their home servers
  3. start a GDPR procedure against all those servers

I recognize this is a hard problem to solve while still keeping an open ecosystem. But I believe that Matrix should have much stricter defaults towards data retention than right now. Message expiry should be enforced by default, for example. (Note that there are also redaction policies that could be used to implement part of the GDPR automatically, see the privacy policy discussion below on that.)

Also keep in mind that, in the brave new peer-to-peer world that Matrix is heading towards, the boundary between server and client is likely to be fuzzier, which would make applying the GDPR even more difficult.

In fact, maybe Synapse should be designed so that there's no configurable flag to turn off data retention. A bit like how most system loggers in UNIX (e.g. syslog) come with a log retention system that typically rotate logs after a few weeks or month. Historically, this was designed to keep hard drives from filling up, but it also has the added benefit of limiting the amount of personal information kept on disk in this modern day. (Arguably, syslog doesn't rotate logs on its own, but, say, Debian GNU/Linux, as an installed system, does have log retention policies well defined for installed packages, and those can be discussed. And "no expiry" is definitely a bug.

Matrix.org privacy policy

When I first looked at Matrix, five years ago, Element.io was called Riot.im and had a rather dubious privacy policy:

We currently use cookies to support our use of Google Analytics on the Website and Service. Google Analytics collects information about how you use the Website and Service.

[...]

This helps us to provide you with a good experience when you browse our Website and use our Service and also allows us to improve our Website and our Service.

When I asked Matrix people about why they were using Google Analytics, they explained this was for development purposes and they were aiming for velocity at the time, not privacy (paraphrasing here).

They also included a "free to snitch" clause:

If we are or believe that we are under a duty to disclose or share your personal data, we will do so in order to comply with any legal obligation, the instructions or requests of a governmental authority or regulator, including those outside of the UK.

Those are really broad terms, above and beyond what is typically expected legally.

Like the current retention policies, such user tracking and ... "liberal" collaboration practices with the state set a bad precedent for other home servers.

Thankfully, since the above policy was published (2017), the GDPR was "implemented" (2018) and it seems like both the Element.io privacy policy and the Matrix.org privacy policy have been somewhat improved since.

Notable points of the new privacy policies:

  • 2.3.1.1: the "federation" section actually outlines that "Federated homeservers and Matrix clients which respect the Matrix protocol are expected to honour these controls and redaction/erasure requests, but other federated homeservers are outside of the span of control of Element, and we cannot guarantee how this data will be processed"
  • 2.6: users under the age of 16 should not use the matrix.org service
  • 2.10: Upcloud, Mythic Beast, Amazon, and CloudFlare possibly have access to your data (it's nice to at least mention this in the privacy policy: many providers don't even bother admitting to this kind of delegation)
  • Element 2.2.1: mentions many more third parties (Twilio, Stripe, Quaderno, LinkedIn, Twitter, Google, Outplay, PipeDrive, HubSpot, Posthog, Sentry, and Matomo (phew!) used when you are paying Matrix.org for hosting

I'm not super happy with all the trackers they have on the Element platform, but then again you don't have to use that service. Your favorite homeserver (assuming you are not on Matrix.org) probably has their own Element deployment, hopefully without all that garbage.

Overall, this is all a huge improvement over the previous privacy policy, so hats off to the Matrix people for figuring out a reasonable policy in such a tricky context. I particularly like this bit:

We will forget your copy of your data upon your request. We will also forward your request to be forgotten onto federated homeservers. However - these homeservers are outside our span of control, so we cannot guarantee they will forget your data.

It's great they implemented those mechanisms and, after all, if there's an hostile party in there, nothing can prevent them from using screenshots to just exfiltrate your data away from the client side anyways, even with services typically seen as more secure, like Signal.

As an aside, I also appreciate that Matrix.org has a fairly decent code of conduct, based on the TODO CoC which checks all the boxes in the geekfeminism wiki.

Metadata handling

Overall, privacy protections in Matrix mostly concern message contents, not metadata. In other words, who's talking with who, when and from where is not well protected. Compared to a tool like Signal, which goes through great lengths to anonymize that data with features like private contact discovery, disappearing messages, sealed senders, and private groups, Matrix is definitely behind.

This is a known issue (opened in 2019) in Synapse, but this is not just an implementation issue, it's a flaw in the protocol itself. Home servers keep join/leave of all rooms, which gives clear text information about who is talking to. Synapse logs may also contain privately identifiable information that home server admins might not be aware of in the first place. Those log rotation policies are separate from the server-level retention policy, which may be confusing for a novice sysadmin.

Combine this with the federation: even if you trust your home server to do the right thing, the second you join a public room with third-party home servers, those ideas kind of get thrown out because those servers can do whatever they want with that information. Again, a problem that is hard to solve in any federation.

To be fair, IRC doesn't have a great story here either: any client knows not only who's talking to who in a room, but also typically their client IP address. Servers can (and often do) obfuscate this, but often that obfuscation is trivial to reverse. Some servers do provide "cloaks" (sometimes automatically), but that's kind of a "slap-on" solution that actually moves the problem elsewhere: now the server knows a little more about the user.

Overall, I would worry much more about a Matrix home server seizure than a IRC or Signal server seizure. Signal does get subpoenas, and they can only give out a tiny bit of information about their users: their phone number, and their registration, and last connection date. Matrix carries a lot more information in its database.

Amplification attacks on URL previews

I (still!) run an Icecast server and sometimes share links to it on IRC which, obviously, also ends up on (more than one!) Matrix home servers because some people connect to IRC using Matrix. This, in turn, means that Matrix will connect to that URL to generate a link preview.

I feel this outlines a security issue, especially because those sockets would be kept open seemingly forever. I tried to warn the Matrix security team but somehow, I don't think this issue was taken very seriously. Here's the disclosure timeline:

  • January 18: contacted Matrix security
  • January 19: response: already reported as a bug
  • January 20: response: can't reproduce
  • January 31: timeout added, considered solved
  • January 31: I respond that I believe the security issue is underestimated, ask for clearance to disclose
  • February 1: response: asking for two weeks delay after the next release (1.53.0) including another patch, presumably in two weeks' time
  • February 22: Matrix 1.53.0 released
  • April 14: I notice the release, ask for clearance again
  • April 14: response: referred to the public disclosure

There are a couple of problems here:

  1. the bug was publicly disclosed in September 2020, and not considered a security issue until I notified them, and even then, I had to insist

  2. no clear disclosure policy timeline was proposed or seems established in the project (there is a security disclosure policy but it doesn't include any predefined timeline)

  3. I wasn't informed of the disclosure

  4. the actual solution is a size limit (10MB, already implemented), a time limit (30 seconds, implemented in PR 11784), and a content type allow list (HTML, "media" or JSON, implemented in PR 11936), and I'm not sure it's adequate

  5. (pure vanity:) I did not make it to their Hall of fame

I'm not sure those solutions are adequate because they all seem to assume a single home server will pull that one URL for a little while then stop. But in a federated network, many (possibly thousands) home servers may be connected in a single room at once. If an attacker drops a link into such a room, all those servers would connect to that link all at once. This is an amplification attack: a small amount of traffic will generate a lot more traffic to a single target. It doesn't matter there are size or time limits: the amplification is what matters here.

It should also be noted that clients that generate link previews have more amplification because they are more numerous than servers. And of course, the default Matrix client (Element) does generate link previews as well.

That said, this is possibly not a problem specific to Matrix: any federated service that generates link previews may suffer from this.

I'm honestly not sure what the solution is here. Maybe moderation? Maybe link previews are just evil? All I know is there was this weird bug in my Icecast server and I tried to ring the bell about it, and it feels it was swept under the rug. Somehow I feel this is bound to blow up again in the future, even with the current mitigation.

Moderation

In Matrix like elsewhere, Moderation is a hard problem. There is a detailed moderation guide and much of this problem space is actively worked on in Matrix right now. A fundamental problem with moderating a federated space is that a user banned from a room can rejoin the room from another server. This is why spam is such a problem in Email, and why IRC networks have stopped federating ages ago (see the IRC history for that fascinating story).

The mjolnir bot

The mjolnir moderation bot is designed to help with some of those things. It can kick and ban users, redact all of a user's message (as opposed to one by one), all of this across multiple rooms. It can also subscribe to a federated block list published by matrix.org to block known abusers (users or servers). Bans are pretty flexible and can operate at the user, room, or server level.

Matrix people suggest making the bot admin of your channels, because you can't take back admin from a user once given.

The command-line tool

There's also a new command line tool designed to do things like:

  • System notify users (all users/users from a list, specific user)
  • delete sessions/devices not seen for X days
  • purge the remote media cache
  • select rooms with various criteria (external/local/empty/created by/encrypted/cleartext)
  • purge history of theses rooms
  • shutdown rooms

This tool and Mjolnir are based on the admin API built into Synapse.

Rate limiting

Synapse has pretty good built-in rate-limiting which blocks repeated login, registration, joining, or messaging attempts. It may also end up throttling servers on the federation based on those settings.

Fundamental federation problems

Because users joining a room may come from another server, room moderators are at the mercy of the registration and moderation policies of those servers. Matrix is like IRC's +R mode ("only registered users can join") by default, except that anyone can register their own homeserver, which makes this limited.

Server admins can block IP addresses and home servers, but those tools are not currently available to room admins. So it would be nice to have room admins have that capability, just like IRC channel admins can block users based on their IP address.

Matrix has the concept of guest accounts, but it is not used very much, and virtually no client supports it. This contrasts with the way IRC works: by default, anyone can join an IRC network even without authentication. Some channels require registration, but in general you are free to join and look around (until you get blocked, of course).

I have heard anecdotal evidence that "moderating bridges is hell", and I can imagine why. Moderation is already hard enough on one federation, when you bridge a room with another network, you inherit all the problems from that network but without the entire abuse control tools from the original network's API...

Room admins

Matrix, in particular, has the problem that room administrators (which have the power to redact messages, ban users, and promote other users) are bound to their Matrix ID which is, in turn, bound to their home servers. This implies that a home server administrators could (1) impersonate a given user and (2) use that to hijack the room. So in practice, the home server is the trust anchor for rooms, not the user themselves.

That said, if server B administrator hijack user joe on server B, they will hijack that room on that specific server. This will not (necessarily) affect users on the other servers, as servers could refuse parts of the updates or ban the compromised account (or server).

It does seem like a major flaw that room credentials are bound to Matrix identifiers, as opposed to the E2E encryption credentials. In an encrypted room even with fully verified members, a compromised or hostile home server can still take over the room by impersonating an admin. That admin (or even a newly minted user) can then send events or listen on the conversations.

This is even more frustrating when you consider that Matrix events are actually signed and therefore have some authentication attached to them, acting like some sort of Merkle tree (as it contains a link to previous events). That signature, however, is made from the homeserver PKI keys, not the client's E2E keys, which makes E2E feel like it has been "bolted on" later.

Availability

While Matrix has a strong advantage over Signal in that it's decentralized (so anyone can run their own homeserver,), I couldn't find an easy way to run a "multi-primary" setup, or even a "redundant" setup (even if with a single primary backend), short of going full-on "replicate PostgreSQL and Redis data", which is not typically for the faint of heart.

How this works in IRC

On IRC, it's quite easy to setup redundant nodes. All you need is:

  1. a new machine (with it's own public address with an open port)

  2. a shared secret (or certificate) between that machine and an existing one on the network

  3. a connect {} block on both servers

That's it: the node will join the network and people can connect to it as usual and share the same user/namespace as the rest of the network. The servers take care of synchronizing state: you do not need about replicating a database server.

(Now, experienced IRC people will know there's a catch here: IRC doesn't have authentication built in, and relies on "services" which are basically bots that authenticate users (I'm simplifying, don't nitpick). If that service goes down, the network still works, but then people can't authenticate, and they can start doing nasty things like steal people's identity if they get knocked offline. But still: basic functionality still works: you can talk in rooms and with users that are on the reachable network.)

User identities

Matrix is more complicated. Each "home server" has its own identity namespace: a specific user (say @anarcat:matrix.org) is bound to that specific home server. If that server goes down, that user is completely disconnected. They could register a new account elsewhere and reconnect, but then they basically lose all their configuration: contacts, joined channels are all lost.

(Also notice how the Matrix IDs don't look like a typical user address like an email in XMPP. They at least did their homework and got the allocation for the scheme.)

Rooms

Users talk to each other in "rooms", even in one-to-one communications. (Rooms are also used for other things like "spaces", they're basically used for everything, think "everything is a file" kind of tool.) For rooms, home servers act more like IRC nodes in that they keep a local state of the chat room and synchronize it with other servers. Users can keep talking inside a room if the server that originally hosts the room goes down. Rooms can have a local, server-specific "alias" so that, say, #room:matrix.org is also visible as #room:example.com on the example.com home server. Both addresses refer to the same room underlying room.

(Finding this in the Element settings is not obvious though, because that "alias" are actually called a "local address" there. So to create such an alias (in Element), you need to go in the room settings' "General" section, "Show more" in "Local address", then add the alias name (e.g. foo), and then that room will be available on your example.com homeserver as #foo:example.com.)

So a room doesn't belong to a server, it belongs to the federation, and anyone can join the room from any serer (if the room is public, or if invited otherwise). You can create a room on server A and when a user from server B joins, the room will be replicated on server B as well. If server A fails, server B will keep relaying traffic to connected users and servers.

A room is therefore not fundamentally addressed with the above alias, instead ,it has a internal Matrix ID, which basically a random string. It has a server name attached to it, but that was made just to avoid collisions. That can get a little confusing. For example, the #fractal:gnome.org room is an alias on the gnome.org server, but the room ID is !hwiGbsdSTZIwSRfybq:matrix.org. That's because the room was created on matrix.org, but the preferred branding is gnome.org now.

As an aside, rooms, by default, live forever, even after the last user quits. There's an admin API to delete rooms and a tombstone event to redirect to another one, but neither have a GUI yet. The latter is part of MSC1501 ("Room version upgrades") which allows a room admin to close a room, with a message and a pointer to another room.

Spaces

Discovering rooms can be tricky: there is a per-server room directory, but Matrix.org people are trying to deprecate it in favor of "Spaces". Room directories were ripe for abuse: anyone can create a room, so anyone can show up in there. It's possible to restrict who can add aliases, but anyways directories were seen as too limited.

In contrast, a "Space" is basically a room that's an index of other rooms (including other spaces), so existing moderation and administration mechanism that work in rooms can (somewhat) work in spaces as well. This enables a room directory that works across federation, regardless on which server they were originally created.

New users can be added to a space or room automatically in Synapse. (Existing users can be told about the space with a server notice.) This gives admins a way to pre-populate a list of rooms on a server, which is useful to build clusters of related home servers, providing some sort of redundancy, at the room -- not user -- level.

Home servers

So while you can workaround a home server going down at the room level, there's no such thing at the home server level, for user identities. So if you want those identities to be stable in the long term, you need to think about high availability. One limitation is that the domain name (e.g. matrix.example.com) must never change in the future, as renaming home servers is not supported.

The documentation used to say you could "run a hot spare" but that has been removed. Last I heard, it was not possible to run a high-availability setup where multiple, separate locations could replace each other automatically. You can have high performance setups where the load gets distributed among workers, but those are based on a shared database (Redis and PostgreSQL) backend.

So my guess is it would be possible to create a "warm" spare server of a matrix home server with regular PostgreSQL replication, but that is not documented in the Synapse manual. This sort of setup would also not be useful to deal with networking issues or denial of service attacks, as you will not be able to spread the load over multiple network locations easily. Redis and PostgreSQL heroes are welcome to provide their multi-primary solution in the comments. In the meantime, I'll just point out this is a solution that's handled somewhat more gracefully in IRC, by having the possibility of delegating the authentication layer.

Delegations

If you do not want to run a Matrix server yourself, it's possible to delegate the entire thing to another server. There's a server discovery API which uses the .well-known pattern (or SRV records, but that's "not recommended" and a bit confusing) to delegate that service to another server. Be warned that the server still needs to be explicitly configured for your domain. You can't just put:

{ "m.server": "matrix.org:443" }

... on https://example.com/.well-known/matrix/server and start using @you:example.com as a Matrix ID. That's because Matrix doesn't support "virtual hosting" and you'd still be connecting to rooms and people with your matrix.org identity, not example.com as you would normally expect. This is also why you cannot rename your home server.

The server discovery API is what allows servers to find each other. Clients, on the other hand, use the client-server discovery API: this is what allows a given client to find your home server when you type your Matrix ID on login.

Performance

The high availability discussion brushed over the performance of Matrix itself, but let's now dig into that.

Horizontal scalability

There were serious scalability issues of the main Matrix server, Synapse, in the past. So the Matrix team has been working hard to improve its design. Since Synapse 1.22 the home server can horizontally to multiple workers (see this blog post for details) which can make it easier to scale large servers.

Other implementations

There are other promising home servers implementations from a performance standpoint (dendrite, Golang, entered beta in late 2020; conduit, Rust, beta; others), but none of those are feature-complete so there's a trade-off to be made there. Synapse is also adding a lot of feature fast, so it's an open question whether the others will ever catch up. (I have heard that Dendrite might actually surpass Synapse in features within a few years, which would put Synapse in a more "LTS" situation.)

Latency

Matrix can feel slow sometimes. For example, joining the "Matrix HQ" room in Element (from matrix.debian.social) takes a few minutes and then fails. That is because the home server has to sync the entire room state when you join the room. There was promising work on this announced in the lengthy 2021 retrospective, and some of that work landed (partial sync) in the 1.53 release already. Other improvements coming include sliding sync, lazy loading over federation, and fast room joins. So that's actually something that could be fixed in the fairly short term.

But in general, communication in Matrix doesn't feel as "snappy" as on IRC or even Signal. It's hard to quantify this without instrumenting a full latency test bed (for example the tools I used in the terminal emulators latency tests), but even just typing in a web browser feels slower than typing in a xterm or Emacs for me.

Even in conversations, I "feel" people don't immediately respond as fast. In fact, this could be an interesting double-blind experiment to make: have people guess whether they are talking to a person on Matrix, XMPP, or IRC, for example. My theory would be that people could notice that Matrix users are slower, if only because of the TCP round-trip time each message has to take.

Transport

Some courageous person actually made some tests of various messaging platforms on a congested network. His evaluation was basically:

  • Briar: uses Tor, so unusable except locally
  • Matrix: "struggled to send and receive messages", joining a room takes forever as it has to sync all history, "took 20-30 seconds for my messages to be sent and another 20 seconds for further responses"
  • XMPP: "worked in real-time, full encryption, with nearly zero lag"

So that was interesting. I suspect IRC would have also fared better, but that's just a feeling.

Other improvements to the transport layer include support for websocket and the CoAP proxy work from 2019 (targeting 100bps links), but both seem stalled at the time of writing. The Matrix people have also announced the pinecone p2p overlay network which aims at solving large, internet-scale routing problems. See also this talk at FOSDEM 2022.

Usability Onboarding and workflow

The workflow for joining a room, when you use Element web, is not great:

  1. click on a link in a web browser
  2. land on (say) https://matrix.to/#/#matrix-dev:matrix.org
  3. offers "Element", yeah that's sounds great, let's click "Continue"
  4. land on https://app.element.io/#/room%2F%23matrix-dev%3Amatrix.org and then you need to register, aaargh

As you might have guessed by now, there is a specification to solve this, but web browsers need to adopt it as well, so that's far from actually being solved. At least browsers generally know about the matrix: scheme, it's just not exactly clear what they should do with it, especially when the handler is just another web page (e.g. Element web).

In general, when compared with tools like Signal or WhatsApp, Matrix doesn't fare so well in terms of user discovery. I probably have some of my normal contacts that have a Matrix account as well, but there's really no way to know. It's kind of creepy when Signal tells you "this person is on Signal!" but it's also pretty cool that it works, and they actually implemented it pretty well.

Registration is also less obvious: in Signal, the app confirms your phone number automatically. It's friction-less and quick. In Matrix, you need to learn about home servers, pick one, register (with a password! aargh!), and then setup encryption keys (not default), etc. It's a lot more friction.

And look, I understand: giving away your phone number is a huge trade-off. I don't like it either. But it solves a real problem and makes encryption accessible to a ton more people. Matrix does have "identity servers" that can serve that purpose, but I don't feel confident sharing my phone number there. It doesn't help that the identity servers don't have private contact discovery: giving them your phone number is a more serious security compromise than with Signal.

There's a catch-22 here too: because no one feels like giving away their phone numbers, no one does, and everyone assumes that stuff doesn't work anyways. Like it or not, Signal forcing people to divulge their phone number actually gives them critical mass that means actually a lot of my relatives are on Signal and I don't have to install crap like WhatsApp to talk with them.

5 minute clients evaluation

Throughout all my tests I evaluated a handful of Matrix clients, mostly from Flathub because almost none of them are packaged in Debian.

Right now I'm using Element, the flagship client from Matrix.org, in a web browser window, with the PopUp Window extension. This makes it look almost like a native app, and opens links in my main browser window (instead of a new tab in that separate window), which is nice. But I'm tired of buying memory to feed my web browser, so this indirection has to stop. Furthermore, I'm often getting completely logged off from Element, which means re-logging in, recovering my security keys, and reconfiguring my settings. That is extremely annoying.

Coming from Irssi, Element is really "GUI-y" (pronounced "gooey"). Lots of clickety happening. To mark conversations as read, in particular, I need to click-click-click on all the tabs that have some activity. There's no "jump to latest message" or "mark all as read" functionality as far as I could tell. In Irssi the former is built-in (alt-a) and I made a custom /READ command for the latter:

/ALIAS READ script exec \$_->activity(0) for Irssi::windows

And yes, that's a Perl script in my IRC client. I am not aware of any Matrix client that does stuff like that, except maybe Weechat, if we can call it a Matrix client, or Irssi itself, now that it has a Matrix plugin (!).

As for other clients, I have looked through the Matrix Client Matrix (confusing right?) to try to figure out which one to try, and, even after selecting Linux as a filter, the chart is just too wide to figure out anything. So I tried those, kind of randomly:

  • Fractal
  • Mirage
  • Nheko
  • Quaternion

Unfortunately, I lost my notes on those, I don't actually remember which one did what. I still have a session open with Mirage, so I guess that means it's the one I preferred, but I remember they were also all very GUI-y.

Maybe I need to look at weechat-matrix or gomuks. At least Weechat is scriptable so I could continue playing the power-user. Right now my strategy with messaging (and that includes microblogging like Twitter or Mastodon) is that everything goes through my IRC client, so Weechat could actually fit well in there. Going with gomuks, on the other hand, would mean running it in parallel with Irssi or ... ditching IRC, which is a leap I'm not quite ready to take just yet.

Oh, and basically none of those clients (except Nheko and Element) support VoIP, which is still kind of a second-class citizen in Matrix. It does not support large multimedia rooms, for example: Jitsi was used for FOSDEM instead of the native videoconferencing system.

Bots

This falls a little aside the "usability" section, but I didn't know where to put this... There's a few Matrix bots out there, and you are likely going to be able to replace your existing bots with Matrix bots. It's true that IRC has a long and impressive history with lots of various bots doing various things, but given how young Matrix is, there's still a good variety:

  • maubot: generic bot with tons of usual plugins like sed, dice, karma, xkcd, echo, rss, reminder, translate, react, exec, gitlab/github webhook receivers, weather, etc
  • opsdroid: framework to implement "chat ops" in Matrix, connects with Matrix, GitHub, GitLab, Shell commands, Slack, etc
  • matrix-nio: another framework, used to build lots more bots like:
    • hemppa: generic bot with various functionality like weather, RSS feeds, calendars, cron jobs, OpenStreetmaps lookups, URL title snarfing, wolfram alpha, astronomy pic of the day, Mastodon bridge, room bridging, oh dear
    • devops: ping, curl, etc
    • podbot: play podcast episodes from AntennaPod
    • cody: Python, Ruby, Javascript REPL
    • eno: generic bot, "personal assistant"
  • mjolnir: moderation bot
  • hookshot: bridge with GitLab/GitHub
  • matrix-monitor-bot: latency monitor

One thing I haven't found an equivalent for is Debian's MeetBot. There's an archive bot but it doesn't have topics or a meeting chair, or HTML logs.

Working on Matrix

As a developer, I find Matrix kind of intimidating. The specification is huge. The official specification itself looks somewhat digestable: it's only 6 APIs so that looks, at first, kind of reasonable. But whenever you start asking complicated questions about Matrix, you quickly fall into the Matrix Spec Change specification (which, yes, is a separate specification). And there are literally hundreds of MSCs flying around. It's hard to tell what's been adopted and what hasn't, and even harder to figure out if your specific client has implemented it.

(One trendy answer to this problem is to "rewrite it in rust": Matrix are working on implementing a lot of those specifications in a matrix-rust-sdk that's designed to take the implementation details away from users.)

Just taking the latest weekly Matrix report, you find that three new MSCs proposed, just last week! There's even a graph that shows the number of MSCs is progressing steadily, at 600+ proposals total, with the majority (300+) "new". I would guess the "merged" ones are at about 150.

That's a lot of text which includes stuff like 3D worlds which, frankly, I don't think you should be working on when you have such important security and usability problems. (The internet as a whole, arguably, doesn't fare much better. RFC600 is a really obscure discussion about "INTERFACING AN ILLINOIS PLASMA TERMINAL TO THE ARPANET". Maybe that's how many MSCs will end up as well, left forgotten in the pits of history.)

And that's the thing: maybe the Matrix people have a different objective than I have. They want to connect everything to everything, and make Matrix a generic transport for all sorts of applications, including virtual reality, collaborative editors, and so on.

I just want secure, simple messaging. Possibly with good file transfers, and video calls. That it works with existing stuff is good, and it should be federated to remove the "Signal point of failure". So I'm a bit worried with the direction all those MSCs are taking, especially when you consider that clients other than Element are still struggling to keep up with basic features like end-to-end encryption or room discovery, never mind voice or spaces...

Conclusion

Overall, Matrix is somehow in the space XMPP was a few years ago. It has a ton of features, pretty good clients, and a large community. It seems to have gained some of the momentum that XMPP has lost. It may have the most potential to replace Signal if something bad would happen to it (like, I don't know, getting banned or going nuts with cryptocurrency)...

But it's really not there yet, and I don't see Matrix trying to get there either, which is a bit worrisome.

Looking back at history

I'm also worried that we are repeating the errors of the past. The history of federated services is really fascinating:. IRC, FTP, HTTP, and SMTP were all created in the early days of the internet, and are all still around (except, arguably, FTP, which was removed from major browsers recently). All of them had to face serious challenges in growing their federation.

IRC had numerous conflicts and forks, both at the technical level but also at the political level. The history of IRC is really something that anyone working on a federated system should study in detail, because they are bound to make the same mistakes if they are not familiar with it. The "short" version is:

  • 1988: Finish researcher publishes first IRC source code
  • 1989: 40 servers worldwide, mostly universities
  • 1990: EFnet ("eris-free network") fork which blocks the "open relay", named Eris - followers of Eris form the A-net, which promptly dissolves itself, with only EFnet remaining
  • 1992: Undernet fork, which offered authentication ("services"), routing improvements and timestamp-based channel synchronisation
  • 1994: DALnet fork, from Undernet, again on a technical disagreement
  • 1995: Freenode founded
  • 1996: IRCnet forks from EFnet, following a flame war of historical proportion, splitting the network between Europe and the Americas
  • 1997: Quakenet founded
  • 1999: (XMPP founded)
  • 2001: 6 million users, OFTC founded
  • 2002: DALnet peaks at 136,000 users
  • 2003: IRC as a whole peaks at 10 million users, EFnet peaks at 141,000 users
  • 2004: (Facebook founded), Undernet peaks at 159,000 users
  • 2005: Quakenet peaks at 242,000 users, IRCnet peaks at 136,000 (Youtube founded)
  • 2006: (Twitter founded)
  • 2009: (WhatsApp, Pinterest founded)
  • 2010: (TextSecure AKA Signal, Instagram founded)
  • 2011: (Snapchat founded)
  • ~2013: Freenode peaks at ~100,000 users
  • 2016: IRCv3 standardisation effort started (TikTok founded)
  • 2021: Freenode self-destructs, Libera chat founded
  • 2022: Libera peaks at 50,000 users, OFTC peaks at 30,000 users

(The numbers were taken from the Wikipedia page and Netsplit.de. Note that I also include other networks launch in parenthesis for context.)

Pretty dramatic, don't you think? Eventually, somehow, IRC became irrelevant for most people: few people are even aware of it now. With less than a million users active, it's smaller than Mastodon, XMPP, or Matrix at this point.1 If I were to venture a guess, I'd say that infighting, lack of a standardization body, and a somewhat annoying protocol meant the network could not grow. It's also possible that the decentralised yet centralised structure of IRC networks limited their reliability and growth.

But large social media companies have also taken over the space: observe how IRC numbers peak around the time the wave of large social media companies emerge, especially Facebook (2.9B users!!) and Twitter (400M users).

Where the federated services are in history

Right now, Matrix, and Mastodon (and email!) are at the "pre-EFnet" stage: anyone can join the federation. Mastodon has started working on a global block list of fascist servers which is interesting, but it's still an open federation. Right now, Matrix is totally open, but matrix.org publishes a (federated) block list of hostile servers (#matrix-org-coc-bl:matrix.org, yes, of course it's a room).

Interestingly, Email is also in that stage, where there are block lists of spammers, and it's a race between those blockers and spammers. Large email providers, obviously, are getting closer to the EFnet stage: you could consider they only accept email from themselves or between themselves. It's getting increasingly hard to deliver mail to Outlook and Gmail for example, partly because of bias against small providers, but also because they are including more and more machine-learning tools to sort through email and those systems are, fundamentally, unknowable. It's not quite the same as splitting the federation the way EFnet did, but the effect is similar.

HTTP has somehow managed to live in a parallel universe, as it's technically still completely federated: anyone can start a web server if they have a public IP address and anyone can connect to it. The catch, of course, is how you find the darn thing. Which is how Google became one of the most powerful corporations on earth, and how they became the gatekeepers of human knowledge online.

I have only briefly mentioned XMPP here, and my XMPP fans will undoubtedly comment on that, but I think it's somewhere in the middle of all of this. It was co-opted by Facebook and Google, and both corporations have abandoned it to its fate. I remember fondly the days where I could do instant messaging with my contacts who had a Gmail account. Those days are gone, and I don't talk to anyone over Jabber anymore, unfortunately. And this is a threat that Matrix still has to face.

It's also the threat Email is currently facing. On the one hand corporations like Facebook want to completely destroy it and have mostly succeeded: many people just have an email account to register on things and talk to their friends over Instagram or (lately) TikTok (which, I know, is not Facebook, but they started that fire).

On the other hand, you have corporations like Microsoft and Google who are still using and providing email services — because, frankly, you still do need email for stuff, just like fax is still around — but they are more and more isolated in their own silo. At this point, it's only a matter of time they reach critical mass and just decide that the risk of allowing external mail coming in is not worth the cost. They'll simply flip the switch and work on an allow-list principle. Then we'll have closed the loop and email will be dead, just like IRC is "dead" now.

I wonder which path Matrix will take. Could it liberate us from these vicious cycles?

  1. According to Wikipedia, there are currently about 500 distinct IRC networks operating, on about 1,000 servers, serving over 250,000 users. In contrast, Mastodon seems to be around 5 million users, Matrix.org claimed at FOSDEM 2021 to have about 28 million globally visible accounts, and Signal lays claim to over 40 million souls. XMPP claims to have "millions" of users on the xmpp.org homepage but the FAQ says they don't actually know. On the proprietary silo side of the fence, this page says

    • Facebook: 2.9 billion users
    • WhatsApp: 2B
    • Instagram: 1.4B
    • TikTok: 1B
    • Snapchat: 500M
    • Pinterest: 480M
    • Twitter: 397M

    Notable omission from that list: Youtube, with its mind-boggling 2.6 billion users...

    Those are not the kind of numbers you just "need to convince a brother or sister" to grow the network...

Dima Kogan: Ricoh GR IIIx 802.11 reverse engineering

17 June, 2022 - 12:04

I just got a fancy new camera: Ricoh GR IIIx. It's pretty great, and I strongly recommend it to anyone that wants a truly pocketable camera with fantastic image quality and full manual controls. One annoyance is the connectivity. It does have both Bluetooth and 802.11, but the only official method of using them is some dinky closed phone app. This is silly. I just did some reverse-engineering, and I now have a functional shell script to download the last few images via 802.11. This is more convenient than plugging in a wire or pulling out the memory card. Fortunately, Ricoh didn't bend over backwards to make the reversing difficult, so to figure it out I didn't even need to download the phone app, and sniff the traffic.

When you turn on the 802.11 on the camera, it says stuff about essid and password, so clearly the camera runs its own access point. Not ideal, but it's good-enough. I connected, and ran nmap to find hosts and open ports: only port 80 on 192.168.0.1 is open. Pointing curl at it yields some error, so I need to figure out the valid endpoints. I downloaded the firmware binary, and tried to figure out what's in it:

dima@shorty:/tmp$ binwalk fwdc243b.bin

DECIMAL       HEXADECIMAL     DESCRIPTION
--------------------------------------------------------------------------------
3036150       0x2E53F6        Cisco IOS microcode, for "8"
3164652       0x3049EC        Certificate in DER format (x509 v3), header length: 4, sequence length: 5412
5472143       0x537F8F        Copyright string: "Copyright ("
6128763       0x5D847B        PARity archive data - file number 90
10711634      0xA37252        gzip compressed data, maximum compression, from Unix, last modified: 2022-02-15 05:47:23
13959724      0xD5022C        MySQL ISAM compressed data file Version 11
24829873      0x17ADFB1       MySQL MISAM compressed data file Version 4
24917663      0x17C369F       MySQL MISAM compressed data file Version 4
24918526      0x17C39FE       MySQL MISAM compressed data file Version 4
24921612      0x17C460C       MySQL MISAM compressed data file Version 4
24948153      0x17CADB9       MySQL MISAM compressed data file Version 4
25221672      0x180DA28       MySQL MISAM compressed data file Version 4
25784158      0x1896F5E       Cisco IOS microcode, for "\"
26173589      0x18F6095       MySQL MISAM compressed data file Version 4
28297588      0x1AFC974       MySQL ISAM compressed data file Version 6
28988307      0x1BA5393       MySQL ISAM compressed data file Version 3
28990184      0x1BA5AE8       MySQL MISAM index file Version 3
29118867      0x1BC5193       MySQL MISAM index file Version 3
29449193      0x1C15BE9       JPEG image data, JFIF standard 1.01
29522133      0x1C278D5       JPEG image data, JFIF standard 1.08
29522412      0x1C279EC       Copyright string: "Copyright ("
29632931      0x1C429A3       JPEG image data, JFIF standard 1.01
29724094      0x1C58DBE       JPEG image data, JFIF standard 1.01

The gzip chunk looks like what I want:

dima@shorty:/tmp$ tail -c+10711635 fwdc243b.bin> /tmp/tst.gz


dima@shorty:/tmp$ < /tmp/tst.gz gunzip | file -

/dev/stdin: ASCII cpio archive (SVR4 with no CRC)


dima@shorty:/tmp$ < /tmp/tst.gz gunzip > tst.cpio

OK, we have some .cpio thing. It's plain-text. I grep around it in, looking for GET and POST and such, and I see various URI-looking things at /v1/..... Grepping for that I see

dima@shorty:/tmp$ strings tst.cpio | grep /v1/

GET /v1/debug/revisions
GET /v1/ping
GET /v1/photos
GET /v1/props
PUT /v1/params/device
PUT /v1/params/lens
PUT /v1/params/camera
GET /v1/liveview
GET /v1/transfers
POST /v1/device/finish
POST /v1/device/wlan/finish
POST /v1/lens/focus
POST /v1/camera/shoot
POST /v1/camera/shoot/compose
POST /v1/camera/shoot/cancel
GET /v1/photos/{}/{}
GET /v1/photos/{}/{}/info
PUT /v1/photos/{}/{}/transfer
/v1/photos/<string>/<string>
/v1/photos/<string>/<string>/info
/v1/photos/<string>/<string>/transfer
/v1/device/finish
/v1/device/wlan/finish
/v1/lens/focus
/v1/camera/shoot
/v1/camera/shoot/compose
/v1/camera/shoot/cancel
/v1/changes
/v1/changes message received.
/v1/changes issue event.
/v1/changes new websocket connection.
/v1/changes websocket connection closed. reason({})
/v1/transfers, transferState({}), afterIndex({}), limit({})

Jackpot. I pointed curl at most of these, and they do interesting things. Generally they all spit out JSON. /v1/liveview sends out a sequence of JPEG images. The thing I care about is /v1/photos/DIRECTORY/FILE and /v1/photos/DIRECTORY/FILE/info. The result is a script I just wrote to connect to the camera, download N images, and connect back to the original access point:

https://github.com/dkogan/ricoh-download

Kinda crude, but works for now. I'll improve it with time.

After I did this I found an old thread from 2015 where somebody was using an apparently-compatible camera, and wrote a fancier tool:

https://www.pentaxforums.com/forums/184-pentax-k-s1-k-s2/295501-k-s2-wifi-laptop-2.html

Dirk Eddelbuettel: RcppArmadillo 0.11.2.0.0 on CRAN: New Upstream

16 June, 2022 - 07:11

Armadillo is a powerful and expressive C++ template library for linear algebra and scientific computing. It aims towards a good balance between speed and ease of use, has a syntax deliberately close to Matlab, and is useful for algorithm development directly in C++, or quick conversion of research code into production environments. RcppArmadillo integrates this library with the R environment and language–and is widely used by (currently) 991 other packages on CRAN, downloaded over 25 million times (per the partial logs from the cloud mirrors of CRAN), and the CSDA paper (preprint / vignette) by Conrad and myself has been cited 476 times according to Google Scholar.

This release brings a second upstream fix by Conrad in the release series 11.*. We once again tested this very rigorously via a complete reverse-depedency check (for which results are always logged here). It so happens that CRAN then had a spurious error when re-checking on upload, and it took a fews days to square this as everybody remains busy – but the release prepared on June 10 is now on CRAN.

The full set of changes (since the last CRAN release 0.11.1.1.0) follows.

Changes in RcppArmadillo version 0.11.2.0.0 (2022-06-10)
  • Upgraded to Armadillo release 11.2 (Classic Roast)

    • faster handling of sparse submatrix column views by norm(), accu(), nonzeros()

    • extended randu() and randn() to allow specification of distribution parameters

    • internal refactoring, leading to faster compilation times

Courtesy of my CRANberries, there is a diffstat report relative to previous release. More detailed information is on the RcppArmadillo page. Questions, comments etc should go to the rcpp-devel mailing list off the R-Forge page.

If you like this or other open-source work I do, you can sponsor me at GitHub.

This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. Please report excessive re-aggregation in third-party for-profit settings.

Dirk Eddelbuettel: AsioHeaders 1.22.1-1 on CRAN

16 June, 2022 - 06:45

An updated version of the AsioHeaders package arrived at CRAN yesterday (in one of those pleasant fully-automated uploads and transitions). Asio provides a cross-platform C++ library for network and low-level I/O programming. It is also included in Boost – but requires linking when used as part of Boost. This standalone version of Asio is a header-only C++ library which can be used without linking (just like our BH package with parts of Boost).

This release brings a new upstream version, following a two-year period without updated. This was tickled by OpenSSL 3.0 header changes as seen in a package using both AsioHeaders and OpenSSL.

Changes in version 1.22.1-1 (2022-06-14)
  • Upgraded to Asio 1.22.1 (Dirk in #7 fixing #6).

Thanks to my CRANberries, there is also a diffstat report relative to the previous release.

Comments and suggestions about AsioHeaders are welcome via the issue tracker at the GitHub GitHub repo.

If you like this or other open-source work I do, you can now sponsor me at GitHub.

This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. Please report excessive re-aggregation in third-party for-profit settings.

Edward Betts: Find link needs a rewrite, the visual editor broke it

16 June, 2022 - 05:22

Find link is a tool that I wrote for adding links between articles in Wikipedia. Given an article title, find link will find other articles that include the entered article title but no link to the article. There is the option to edit the found articles and add the missing link.

For example, you might want to find missing links to the gig economy article.

I originally wrote the tool in 2008 when the MediaWiki software didn't have a rich-text editor. Wikipedia articles were edited by writing wiki markup in MediaWiki syntax. Since then MediaWiki has evolved and now has rich-text editing via the visual editor. Users don't need to know how to write wiki markup to modify an article.

Within MediaWiki there is a user preference to disable the visual editor and stick with editing via the original wiki markup.

Find link edits articles by taking the article text, adding the missing link, and sending the user to the changes view of the modified article on Wikipedia, if they're happy with the change they hit save. This only works with the original editor, it doesn't work with the visual editor.

English Wikipedia has had the visual editor enabled by default since 2016. For somebody to use find link they need to disable the visual editor in their Wikipedia preferences first.

Fixing this bug means quite a significant change to how the tool works.

My plan is to rewrite find link to save edits directly without needing to send the user to Wikipedia article edit change view page to make the edits. Users will authenticate with their Wikipedia account via OAuth and give permission for find link to edit articles on their behalf.

Some of my other tools use OAuth for editing OpenStreetMap and Wikidata, so I'm confident about using it to edit Wikipedia.

The source code for find link is on GitHub.

I'll post updates here as I make progress on the rewrite.

John Goerzen: Really Enjoyed Jason Scott’s BBS Documentary

14 June, 2022 - 07:13

Like many young programmers of my age, before I could use the Internet, there were BBSs. I eventually ran one, though in my small town there were few callers.

Some time back, I downloaded a copy of Jason Scott’s BBS Documentary. You might know Jason Scott from textfiles.com and his work at the Internet Archive.

The documentary was released in 2005 and spans 8 episodes on 3 DVDs. I’d watched parts of it before, but recently watched the whole series.

It’s really well done, and it’s not just about the technology. Yes, that figures in, but it’s about the people. At times, it was nostalgic to see people talking about things I clearly remembered. Often, I saw long-forgotten pioneers interviewed. And sometimes, such as with the ANSI art scene, I learned a lot about something I was aware of but never really got into back then.

BBSs and the ARPANet (predecessor to the Internet) grew up alongside each other. One was funded by governments and universities; the other, by hobbyists working with inexpensive equipment, sometimes of their own design.

You can download the DVD images (with tons of extras) or watch just the episodes on Youtube following the links on the author’s website.

The thing about BBSs is that they never actually died. Now I’m looking forward to watching the Back to the BBS documentary series about modern BBSs as well.

Ben Hutchings: Debian LTS work, May 2022

14 June, 2022 - 01:30

In May I was assigned 11 hours of work by Freexian's Debian LTS initiative and carried over 13 hours from April. I worked 8 hours, and will carry over the remaining time to June.

I spent some time triaging security issues for Linux, working out which of them were fixed upstream and which actually applied to the versions provided in Debian 9 "stretch". I rebased the Linux 4.9 (linux) package on the latest stable update, but did not make an upload this month. I started backporting several security fixes to 4.9, but those still have to be tested and reviewed.

Edward Betts: Fixing spelling in GitHub repos using codespell

13 June, 2022 - 22:31
<p>Codespell is a spell checker specifically designed for finding misspellings in source code.</p> <p>I've been using it to <a href="https://github.com/pulls?q=author%3AEdwardBetts+spelling+OR+typo">correct spelling mistakes in GitHub repos</a> sine 2016.</p> <p>Most spell checkers use a list of valid words and highlighting any word in a document that is not in the word list. This method doesn't work for source code because code contains abbreviations and words joined together without spaces, a spell checker will generate too many false positives.</p> <p>Codespell uses a different approach, instead of a list of valid words it has a dictionary of common misspellings.</p> <p>Currently the codespell dictionary includes 34,466 known misspellings. I've <a href="https://github.com/codespell-project/codespell/pulls?q=author%3AEdwardBetts">contributed 300 misspellings</a> to the dictionary.</p> <p>Whenever I find an interesting open source project I run codespell to check for spelling mistakes. Most projects have spelling mistakes and I can send a pull request to fix them.</p> <p>In 2019 <a href="https://blogs.windows.com/windowsdeveloper/2019/03/06/announcing-the-open-sourcing-of-windows-calculator/">Microsoft made the Windows calculator open source</a> and uploaded it to GitHub. I used codespell to find some spelling mistakes, sent them a <a href="https://github.com/microsoft/calculator/pull/106">pull request</a> and they accepted it.</p> <p>A great source for GitHub repos to spell check is Hacker News. Let's have a look.</p> <div class="mb-3 text-center"> <img src="https://edwardbetts.com/img/flarum_hacker_news.png" class="img-thumbnail" width="836" height="683"/> </div> <p>Hacker News has a link to forum software called <a href="https://github.com/flarum/flarum">Flarum</a>. I can use codespell to look for spelling mistakes. When I'm looking for errors in a GitHub repo I don't fork the project until I know there is a spelling mistake to fix.</p> <div style="background: black; padding: 5px; width:800px; margin: auto" class="mb-3"> <pre style="color: white"> <font color="#33DA7A">edward</font>@x1c9 <font color="#26A269">~/spelling</font>&gt; <font color="#A1B56C">git</font> <font color="#D8D8D8">clone</font> <font color="#D8D8D8">git@github.com:flarum/flarum.git</font> Cloning into &apos;flarum&apos;... remote: Enumerating objects: 1338, done. remote: Counting objects: 100% (42/42), done. remote: Compressing objects: 100% (23/23), done. remote: Total 1338 (delta 21), reused 36 (delta 19), pack-reused 1296 Receiving objects: 100% (1338/1338), 725.02 KiB | 1.09 MiB/s, done. Resolving deltas: 100% (720/720), done. <font color="#33DA7A">edward</font>@x1c9 <font color="#26A269">~/spelling</font>&gt; <font color="#A1B56C">cd</font> <font color="#D8D8D8"><u style="text-decoration-style:single">flarum/</u></font> <font color="#33DA7A">edward</font>@x1c9 <font color="#26A269">~/spelling/flarum</font> (master)&gt; <font color="#A1B56C">codespell</font> <font color="#D8D8D8">-q3</font> <font color="#A2734C">./public/web.config</font>:<font color="#A2734C">13</font>: <font color="#C01C28">sensitve</font> ==&gt; <font color="#26A269">sensitive</font> <font color="#33DA7A">edward</font>@x1c9 <font color="#26A269">~/spelling/flarum</font> (master)&gt; <font color="#A1B56C">gh</font> <font color="#D8D8D8">repo</font> <font color="#D8D8D8">fork</font> <font color="#26A269">✓</font> Created fork <b>EdwardBetts/flarum</b> <font color="#33DA7A"><b>? </b></font><b>Would you like to add a remote for the fork? </b><font color="#2AA1B3">Yes</font> <font color="#26A269">✓</font> Added remote <b>origin</b> <font color="#33DA7A">edward</font>@x1c9 <font color="#26A269">~/spelling/flarum</font> (master)&gt; <font color="#A1B56C">git</font> <font color="#D8D8D8">checkout</font> <font color="#D8D8D8">-b</font> <font color="#D8D8D8">spelling</font> Switched to a new branch &apos;spelling&apos; <font color="#33DA7A">edward</font>@x1c9 <font color="#26A269">~/spelling/flarum</font> (spelling)&gt; <font color="#A1B56C">codespell</font> <font color="#D8D8D8">-q3</font> <font color="#A2734C">./public/web.config</font>:<font color="#A2734C">13</font>: <font color="#C01C28">sensitve</font> ==&gt; <font color="#26A269">sensitive</font> <font color="#33DA7A">edward</font>@x1c9 <font color="#26A269">~/spelling/flarum</font> (spelling)&gt; <font color="#A1B56C">codespell</font> <font color="#D8D8D8">-q3</font> <font color="#D8D8D8">-w</font> <font color="#26A269">FIXED:</font> ./public/web.config <font color="#33DA7A">edward</font>@x1c9 <font color="#26A269">~/spelling/flarum</font> (spelling)&gt; <font color="#A1B56C">git</font> <font color="#D8D8D8">commit</font> <font color="#D8D8D8">-am</font> <font color="#F7CA88">&quot;Correct spelling mistakes&quot;</font> [spelling bbb04c7] Correct spelling mistakes 1 file changed, 1 insertion(+), 1 deletion(-) <font color="#33DA7A">edward</font>@x1c9 <font color="#26A269">~/spelling/flarum</font> (spelling)&gt; <font color="#A1B56C">git</font> <font color="#D8D8D8">push</font> <font color="#D8D8D8">-u</font> <font color="#D8D8D8">origin</font> Enumerating objects: 7, done. Counting objects: 100% (7/7), done. Delta compression using up to 8 threads Compressing objects: 100% (4/4), done. Writing objects: 100% (4/4), 360 bytes | 360.00 KiB/s, done. Total 4 (delta 3), reused 0 (delta 0), pack-reused 0 remote: Resolving deltas: 100% (3/3), completed with 3 local objects. remote: remote: Create a pull request for &apos;spelling&apos; on GitHub by visiting: remote: https://github.com/EdwardBetts/flarum/pull/new/spelling remote: To github.com:EdwardBetts/flarum.git * [new branch] spelling -&gt; spelling branch &apos;spelling&apos; set up to track &apos;origin/spelling&apos;. <font color="#33DA7A">edward</font>@x1c9 <font color="#26A269">~/spelling/flarum</font> (spelling)&gt; <font color="#A1B56C">gh</font> <font color="#D8D8D8">pr</font> <font color="#D8D8D8">create</font> Creating pull request for <font color="#2AA1B3">EdwardBetts:spelling</font> into <font color="#2AA1B3">master</font> in flarum/flarum <font color="#33DA7A"><b>? </b></font><b>Title </b><font color="#2AA1B3">Correct spelling mistakes</font> <font color="#33DA7A"><b>? </b></font><b>Choose a template</b><font color="#2AA1B3"> Open a blank pull request</font> <font color="#33DA7A"><b>? </b></font><b>Body </b><font color="#2AA1B3">&lt;Received&gt;</font> <font color="#33DA7A"><b>? </b></font><b>What&apos;s next?</b><font color="#2AA1B3"> Submit</font> https://github.com/flarum/flarum/pull/81 <font color="#33DA7A">edward</font>@x1c9 <font color="#26A269">~/spelling/flarum</font> (spelling)&gt; </pre> </div> <p>That worked. I found one spelling mistake, the word "sensitive" was spelled wrong. I forked the repo, fixed the spelling mistake and submitted the fix as a <a href="https://github.com/flarum/flarum/pull/81">pull request</a>.</p> <div class="mb-3 text-center"> <img src="https://edwardbetts.com/img/flarum_pull_request.png" class="img-thumbnail" width="796" height="778"/> </div> <p>The maintainer of Flarum accepted my pull request.</p> <p>Fixing spelling mistakes in <a href="https://github.com/twbs/bootstrap/pull/24778">Bootstrap</a> helped me unlocked the <a href="https://github.com/EdwardBetts?tab=achievements&amp;achievement=mars-2020-contributor">Mars 2020 Contributor</a> achievements on GitHub.</p> <div class="mb-3 text-center"> <img src="https://edwardbetts.com/img/github_mars_badge.png" class="img-thumbnail" width="806" height="867"/> </div> <p>Why not try running codespell on your own codebase? You'll probably find some spelling mistakes to fix.</p>

Iustin Pop: Somewhat committing to a new sport

12 June, 2022 - 21:00

Quite a few years ago - 4, to be precise, so in 2018 - I did a couple of SUP trainings, organised by a colleague. That was enjoyable, but not really matching with me (asymmetric paddling, ugh!), so I also did learn some kayaking, which I really love, but that’s way higher overhead - no sea around in Switzerland, and lakes are generally too small. So I basically postponed any more water sports 😞, until sometime in the future when I’ll finally decide what I want to do (and in what setup).

I did a couple of one-off SUP rides in various places (2019, 2021), but I really was out of practice, so it wasn’t really enjoyable. But with family, SUP offers a much easier way to carry a passenger (than a kayak), so slowly I started thinking more about doing it more seriously.

So last week, after much deliberation, bought an inflatable board, paddle and various other accessories, and on Saturday went to try it out, on excellent weather (completely flat) and hot but not overly so. The board choosing in itself was something I like to do (research options), so for a bit I was concerned whether I’m more interested in the gear, or the actual paddling itself…

To my surprise, it went way better than I feared - last time I tried it, paddled 30 minutes on my knees (knee-paddling?!), since I didn’t dare stand up. But this time, I launched and then did stand up, and while very shaky, I didn’t fall in. Neither by myself, nor with an extra passenger 😉

And hour later, and my initial shakiness went away, with the trainings slowly coming back to mind. Another half hour, and - for completely flat water - I felt quite confident. The view was awesome, the weather nice, the water cold enough to be refreshing… and the only question on my mind was - why didn’t I do this 2, 3 years ago? Well, Corona aside.

I forgot how much I love just being on the water. It definitely pays off the cost of going somewhere, unpacking the stuff, pumping up the board (that’s a bit of a sport in itself 😃), because the blue-green-light-blue colour palette is just how things should be:

Small lake, but beautiful view

Well, approximately blue. This being a small lake, it’s more blue-green than proper blue. That’s next level, since bigger lakes mean waves, and more traffic.

Of course, this could also turn up like many other things I tried (a device in a corner that’s not used anymore), but at least for yesterday, I was a happy paddler!

Russ Allbery: Review: The Shattered Sphere

12 June, 2022 - 10:48

Review: The Shattered Sphere, by Roger MacBride Allen

Series: Hunted Earth #2 Publisher: Tor Copyright: July 1994 Printing: September 1995 ISBN: 0-8125-3016-0 Format: Mass market Pages: 491

The Shattered Sphere is a direct sequel to The Ring of Charon and spoils everything about the plot of the first book. You don't want to start here. Also be aware that essentially everything you can read about this book will spoil the major plot driver of The Ring of Charon in the first sentence. I'm going to review the book without doing that, but it's unlikely anyone else will try.

The end of the previous book stabilized matters, but in no way resolved the plot. The Shattered Sphere opens five years later. Most of the characters from the first novel are joined by some new additions, and all of them are trying to make sense of a drastically changed and far more dangerous understanding of the universe. Humanity has a new enemy, one that's largely unaware of humanity's existence and able to operate on a scale that dwarfs human endeavors. The good news is that humans aren't being actively attacked. The bad news is that they may be little more than raw resources, stashed in a safe spot for future use.

That is reason enough to worry. Worse are the hints of a far greater danger, one that may be capable of destruction on a scale nearly beyond human comprehension. Humanity may be trapped between a sophisticated enemy to whom human activity is barely more noticeable than ants, and a mysterious power that sends that enemy into an anxious panic.

This series is an easily-recognized example of an in-between style of science fiction. It shares the conceptual bones of an earlier era of short engineer-with-a-wrench stories that are full of set pieces and giant constructs, but Allen attempts to add the characterization that those books lacked. But the technique isn't there; he's trying, and the basics of characterization are present, but with none of the emotional and descriptive sophistication of more recent SF. The result isn't bad, exactly, but it's bloated and belabored. Most of the characterization comes through repetition and ham-handed attempts at inner dialogue.

Slow plotting doesn't help. Allen spends half of a nearly 500 page novel on setup in two primary threads. One is mostly people explaining detailed scientific theories to each other, mixed with an attempt at creating reader empathy that's more forceful than effective. The other is a sort of big dumb object exploration that failed to hold my attention and that turned out to be mostly irrelevant. Key revelations from that thread are revealed less by the actions of the characters than by dumping them on the reader in an extended monologue. The reading goes quickly, but only because the writing is predictable and light on interesting information, not because the plot is pulling the reader through the book. I found myself wishing for an earlier era that would have cut about 300 pages out of this book without losing any of the major events.

Once things finally start happening, the book improves considerably. I grew up reading large-scale scientific puzzle stories, and I still have a soft spot for a last-minute scientific fix and dramatic set piece even if the descriptive detail leaves something to be desired. The last fifty pages are fast-moving and satisfying, only marred by their failure to convince me that the humans were required for the plot. The process of understanding alien technology well enough to use it the right way kept me entertained, but I don't understand why the aliens didn't use it themselves.

I think this book falls between two stools. The scientific mysteries and set pieces would have filled a tight, fast-moving 200 page book with a minimum of characterization. It would have been a throwback to an earlier era of science fiction, but not a bad one. Allen instead wanted to provide a large cast of sympathetic and complex characters, and while I appreciate the continued lack of villains, the writing quality is not sufficient to the task.

This isn't an awful book, but the quality bar in the genre is so much higher now. There are better investments of your reading time available today.

Like The Ring of Charon, The Shattered Sphere reaches a satisfying conclusion but does not resolve the series plot. No sequel has been published, and at this point one seems unlikely to materialize.

Rating: 5 out of 10

Junichi Uekawa: What's in sid?

11 June, 2022 - 12:27
What's in sid? I wanted to check what version of libc was used in current Debian sid. I am lazy and I used podman to check. However I noticed that apt update is very slow for me, 20kB/s. Wondering if something is wrong.

Lisandro Damián Nicanor Pérez Meyer: Qt 6 in Debian bullseye

10 June, 2022 - 23:45

As announced some time ago on Debian Backport’s mailing list I will be backporting Qt 6 to Debian 11 “Bullseye”. This comprises the (so far) 29 source packages that compose Qt 6 and libassimp.

The Qt Company wanted to let us Debian users also enjoy Qt 6 on Bullseye, so they contacted me (and by extension my employer ICS) to bring this forward. As said in the mail I sent to the backports list I’m making the commitment to maintain the packages myself, but I’m really happy the Qt Company asked me for this.

You can download Qt 6 for Debian Bullseye’s backports by following the instructions.

Also a big kudos to IOhannes m zmölnig, the assimp maintainer, who promptly helped me to get it backported.

Enjoy!

Louis-Philippe Véronneau: Updating a rooted Pixel 3a

10 June, 2022 - 22:45

A short while after getting a Pixel 3a, I decided to root it, mostly to have more control over the charging procedure. In order to preserve battery life, I like my phone to stop charging at around 75% of full battery capacity and to shut down automatically at around 12%. Some Android ROMs have extra settings to manage this, but LineageOS unfortunately does not.

Android already comes with a fairly complex mechanism to handle the charge cycle, but it is mostly controlled by the kernel and cannot be easily configured by end-users. acc is a higher-level "systemless" interface for the Android kernel battery management, but one needs root to do anything interesting with it. Once rooted, you can use the AccA app instead of playing on the command line to fine tune your battery settings.

Sadly, having a rooted phone also means I need to re-root it each time there is an OS update (typically each week).

Somehow, I keep forgetting the exact procedure to do this! Hopefully, I will be able to use this post as a reference in the future :)

Note that these instructions might not apply to your exact phone model, proceed with caution!

Extract the boot.img file

This procedure mostly comes from the LineageOS documentation on extracting proprietary blobs from the payload.

  1. Download the latest LineageOS image for your phone.

  2. unzip the image to get the payload.bin file inside it.

  3. Clone the LineageOS scripts git repository:

    $ git clone https://github.com/LineageOS/scripts

  4. extract the boot image (requires python3-protobuf):

    $ mkdir extracted-payload $ python3 scripts/update-payload-extractor/extract.py payload.bin --output_dir extracted-payload

You should now have a boot.img file.

Patch the boot image file using Magisk
  1. Upload the boot.img file you previously extracted to your device.

  2. Open Magisk and patch the boot.img file.

  3. Download the patched file back on your computer.

Flash the patched boot image
  1. Enable ADB debug mode on your phone.

  2. Reboot into fastboot mode.

    $ adb reboot fastboot

  3. Flash the patched boot image file:

    $ fastboot flash boot magisk_patched-foo.img

  4. Disable ADB debug mode on your phone.

Troubleshooting

In an ideal world, you would do this entire process each time you upgrade to a new LineageOS version. Sadly, this creates friction and makes updating much more troublesome.

To simplify things, you can try to flash an old patched boot.img file after upgrading, instead of generating it each time.

In my experience, it usually works. When it does not, the device behaves weirdly after a reboot and things that require proprietary blobs (like WiFi) will stop working.

If that happens:

  1. Download the latest LineageOS version for your phone.

  2. Reboot into recovery (Power + Volume Down).

  3. Click on "Apply Updates"

  4. Sideload the ROM:

    $ adb sideload lineageos-foo.zip

Iustin Pop: Still alive, 2022 version

10 June, 2022 - 21:22

Still alive, despite the blog being silent for more than a year.

Nothing bad happened, but there was always something more important (or interesting) to do than write a post. And I did say many, many times - “Oh, I should write a post about this thing I just did or learned about”, but I never followed up.

And I was close to forgetting entirely about blogging (ahem, it’s a bit much calling it “blogging”), until someone I follow posted something along the lines “I have this half-written post for many months that I can’t finish, here’s some pictures instead”. And from that followed an interesting discussion, and the similarity between “why I didn’t blog recently” were very interesting, despite different countries, continents/etc.

So yes, I don’t know what happened - beside the chaos that even the end of Covid caused in our lives, and the psychological impact of the Ukraine invasion, but all this is relatively recent - that I couldn’t muster the energy to write posts again.

I even had a half-written post in late June last year, never finished. Sigh. I won’t even bring up open-source work, since I haven’t done that either.

Life. Sometimes things just happen. But yes, I did get many Garmin badges in the last 12 months 🙂 Oh, and “Top Gun: Maverick” is awesome. A movie, but an awesome movie.

See you!

Thomas Koch: lsp-java coming to debian

10 June, 2022 - 18:40
Posted on March 12, 2022 Tags: debian

The Language Server Protocol (LSP) standardizes communication between editors and so called language servers for different programming languages. This reduces the old problem that every editor had to implement many different plugins for all different programming languages. With LSP an editor just needs to talk LSP and can immediately provide typicall IDE features.

I already packaged the Emacs packages lsp-mode and lsp-haskell for Debian bullseye. Now lsp-java is waiting in the NEW queue.

I’m always worried about downloading and executing binaries from random places of the internet. It should be a matter of hygiene to only run binaries from official Debian repositories. Unfortunately this is not feasible when programming and many people don’t see a problem with running multiple curl-sh pipes to set up their programming environment.

I prefer to do such stuff only in virtual machines. With Emacs and LSP I can finally have a lightweight textmode programming environment even for Java.

Unfortunately the lsp-java mode does not yet work over tramp. Once this is solved, I could run emacs on my host and only isolate the code and language server inside the VM.

The next step would be to also keep the code on the host and mount it with Virtio FS in the VM. But so far the necessary daemon is not yet in Debian (RFP: #1007152).

In Detail I uploaded these packages:

Thomas Koch: Waiting for a STATE folder in the XDG basedir spec

10 June, 2022 - 18:40
Posted on February 18, 2014 Tags: debian, free software

The XDG Basedirectory specification proposes default homedir folders for the categories DATA (~/.local/share), CONFIG (~/.config) and CACHE (~/.cache). One category however is missing: STATE. This category has been requested several times but nothing happened.

Examples for state data are:

  • history files of shells, repls, anything that uses libreadline
  • logfiles
  • state of application windows on exit
  • recently opened files
  • last time application was run
  • emacs: bookmarks, ido last directories, backups, auto-save files, auto-save-list

The missing STATE category is especially annoying if you’re managing your dotfiles with a VCS (e.g. via VCSH) and you care to keep your homedir tidy.

If you’re as annoyed as me about the missing STATE category, please voice your opinion on the XDG mailing list.

Of course it’s a very long way until applications really use such a STATE directory. But without a common standard it will never happen.

Thomas Koch: shared infrastructure coop

10 June, 2022 - 18:40
Posted on February 5, 2014 Tags: debian, free software

I’m working in a very small web agency with 4 employees, one of them part time and our boss who doesn’t do programming. It shouldn’t come as a surprise, that our development infrastructure is not perfect. We have many ideas and dreams how we could improve it, but not the time. Now we have two obvious choices: Either we just do nothing or we buy services from specialized vendors like github, atlassian, travis-ci, heroku, google and others.

Doing nothing does not work for me. But just buying all this stuff doesn’t please me either. We’d depend on proprietary software, lock-in effects or one-size-fits-all offerings. Another option would be to find other small web shops like us, form a cooperative and share essential services. There are thousands of web shops in the same situation like us and we all need the same things:

  • public and private Git hosting
  • continuous integration (Jenkins)
  • code review (Gerrit)
  • file sharing (e.g. git-annex + webdav)
  • wiki
  • issue tracking
  • virtual windows systems for Internet Explorer testing
  • MySQL / Postgres databases
  • PaaS for PHP, Python, Ruby, Java
  • staging environment
  • Mails, Mailing Lists
  • simple calendar, CRM
  • monitoring

As I said, all of the above is available as commercial offerings. But I’d prefer the following to be satisfied:

  • The infrastructure itself should be open (but not free of charge), like the OpenStack Project Infrastructure as presented at LCA. I especially like how they review their puppet config with Gerrit.

  • The process to become an admin for the infrastructure should work much the same like the process to become a Debian Developer. I’d also like the same attitude towards quality as present in Debian.

Does something like that already exists? There already is the German cooperative hostsharing which is kind of similar but does provide mainly hosting, not services. But I’ll ask them next after writing this blog post.

Is your company interested in joining such an effort? Does it sound silly?

Comments:

Sounds promising. I already answered by mail. Dirk Deimeke (Homepage) am 16.02.2014 08:16 Homepage: http://d5e.org

I’m sorry for accidentily removing a comment that linked to https://mayfirst.org while moderating comments. I’m really looking forward to another blogging engine… Thomas Koch am 16.02.2014 12:20

Why? What are you missing? I am using s9y for 9 years now. Dirk Deimeke (Homepage) am 16.02.2014 12:57

Pages

Creative Commons License ลิขสิทธิ์ของบทความเป็นของเจ้าของบทความแต่ละชิ้น
ผลงานนี้ ใช้สัญญาอนุญาตของครีเอทีฟคอมมอนส์แบบ แสดงที่มา-อนุญาตแบบเดียวกัน 3.0 ที่ยังไม่ได้ปรับแก้