<?xml version="1.0"?>
<feed xmlns="http://www.w3.org/2005/Atom" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:xhtml="http://www.w3.org/1999/xhtml"><title>defanor's notes</title><link rel="self" href="https://thunix.net/~defanor/notes/atom.xml"/><link rel="alternate" href="https://thunix.net/~defanor/notes/"/><id>https://thunix.net/~defanor/notes/</id><updated>2018-05-01T01:00:00Z</updated>
  <entry><link rel="alternate" href="https://thunix.net/~defanor/notes/distributed-systems.html"/><id>https://thunix.net/~defanor/notes/distributed-systems.html</id><author><name>defanor</name></author><title>Distributed systems</title><summary>An overview of—and musings on—distributed computing</summary><published>2015-05-16T12:00:00Z</published><updated>2026-05-28T11:00:00Z</updated><content type="xhtml"><xhtml:div xmlns="http://www.w3.org/1999/xhtml"><header>
      <h1>Distributed systems</h1>
      <p>
        There are quite a few definitions of "distributed" and
        "decentralized" in use; in this note I'm using the following
        ones:
      </p>
      <dl>
        <dt>Centralized</dt>
        <dd>
          Clients interacting with a single server (either physical or
          controlled by the same entity).
        </dd>
        <dt>Decentralized</dt>
        <dd>
          Clients interacting with multiple servers (controlled by different
          entities), which often build a federated network.
        </dd>
        <dt>Distributed</dt>
        <dd>
          Clients interacting with other clients directly, acting as servers
          themselves.
        </dd>
      </dl>
      <p>
        A "system" may also mean different things; here I focus on
        network protocols, on systems of network-connected independent
        actors.
      </p>
      <p>
        Distributed systems are useful for various purposes, but the
        commonly considered and achievable niceties are:
      </p>
      <ul>
        <li>No single point of failure.</li>
        <li>No need for a central authority.</li>
        <li>Potentially good software: more motivation to work on it
          if it doesn't mean putting time and effort into assisting
          unethical activities and wasting it once a service is
          discontinued.</li>
      </ul>
      <p>
        Federated systems mostly share these, but distributed ones take
        them further.
      </p>
      <p>
        The common advantages of centralized systems over these seem
        to be search/discovery, often sort-of-free hosting for end
        users, greater UI uniformity in some cases, and easier/faster
        introduction of new features.
      </p>
    </header><section>
      <h2>Usable systems</h2>
      <p>
        Actually usable (reliably working, specified, having users and
        decent software) systems so far are usually
        federated/decentralized; those can, in principle, be quite
        close to distributed systems (simply by setting up their servers
        on user machines). So, generally it seems more useful to focus
        on those if the intention is to get things done: SMTP (email,
        possibly with OpenPGP), NNTP (Usenet), XMPP (Jabber), IRC, and
        HTTP (World Wide Web; possibly together with RSS or Atom, and
        their aggregators, and/or RDF) are relatively well-supported,
        standardized, and usable for various kinds of communication.
      </p>
      <p>
        Sometimes even centralized but non-commercial projects and
        services are okay: OpenStreetMap, The Internet Archive,
        Wikimedia Foundation projects (Wikipedia, Wiktionary,
        Wikidata, Wikibooks, etc), arXiv, FLOSS projects, possibly
        LibGen and Sci-Hub (though they infringe copyright), possibly
        Libera.Chat (they had issues arising out of centralization,
        which is why it is not Freenode anymore, but it was also a
        good example of handling such issues well). As long as they
        are easy (and legal, and free) to fork and are not in a
        position to extort users, centralization can be
        fine. Conversely, there can be technically distributed systems
        effectively controlled by a single entity (e.g., a distributed
        PKI with a single root, or anything legally restricted). While
        this note is mostly about distributed network protocols, they
        are neither necessary nor sufficient for community control
        over a system; they may just be a useful tool to achieve
        it.
      </p>
    </section><section>
      <h2>Existing systems</h2>
      <p>
        There are quite a few of them; I am going to write mostly
        about those that work over the Internet. There's also the
        <a href="https://en.wikipedia.org/wiki/Category:Distributed_computing_architecture">"Distributed computing architecture" Wikipedia category</a>,
        including things like cluster computing, grid computing, etc.
      </p>

      <h3>Generic networks</h3>
      <p>
        <a href="https://www.torproject.org/">Tor</a> and <a href="https://geti2p.net/">I2P</a>: both support "hidden services", on top of which
        many regular protocols can be used, but it is more about
        privacy (and a bit about routing) than about decentralisation:
        they provide NAT traversal, encryption, and static
        addresses. <a href="https://svn.torproject.org/svn/projects/design-paper/tor-design.pdf">Tor documentation</a> is relatively nice, and there
        are <a href="https://geti2p.net/en/docs">I2P docs</a>. Tor provides a nice C client, I2P uses Java.
      </p>

      <h3>Mesh networks</h3>
      <p>
        Some mesh networks, like <a href="http://telehash.org/">Telehash</a>, provide routing as well,
        though advantages for decentralisation seem to be similar to
        those of Tor and I2P; just better in that they extend it
        beyond the existing networks, aiming to build more. Telehash
        documentation is also pretty nice and full of references.
      </p>
      <p>
        Cjdns (or its name, at least) seems to be relatively well-known, but it
        relies on node.js. <a href="http://netsukuku.freaknet.org/">Netsukuku</a> and <a href="https://www.open-mesh.org">B.A.T.M.A.N.</a> are two more protocols the
        names of which are known.
      </p>
      <p>
        One of the large Wi-Fi mesh networking projects is <a href="https://freifunk.net/en/">Freifunk</a>,
        but apparently it's only widespread in DACH countries.
      </p>
      <p>
        Those would be nice to get someday, but they would require
        quite a lot of users to function, and various government
        restrictions seem to complicate their usage (this varies from
        jurisdiction to jurisdiction and from year to year, but seems
        to be pretty bad in Russia in 2018, and even worse by 2023).
      </p>
      <p>
        And then there are the ones working over the Internet, building
        overlay networks, usually with technologies similar to those
        used for VPNs (though yet again, in Russia by 2023 they seem
        to be about to start blocking protocols used for VPNs, with
        occasional outages/likely testing reported; since about that
        time, IPsec or WireGuard connections to the outside Internet
        are interrupted, based on DPI). Yggdrasil is like that. There
        is an overview of similar mesh networks: "<a href="https://changelog.complete.org/archives/10478-easily-accessing-all-your-stuff-with-a-zero-trust-mesh-vpn">Easily Accessing All
        Your Stuff with a Zero-Trust Mesh VPN</a>".
      </p>

      <h3>IM and other social services</h3>
      <ul>
        <li>Tox implements its own network (DHT, onion routing, NAT traversal,
          etc), and has some <a href="https://github.com/irungentoo/toxcore/tree/master/docs">documentation</a>. Works, though not particularly easy
          to build, and toxic (apparently the primary implementation) ceases to
          work after a few days here, requiring a restart.</li>
        <li>Rival Messenger and Bleep are based on Telehash and BitTorrent,
          respectively. Have not tried those.</li>
        <li><a href="http://retroshare.github.io/">RetroShare</a> provides a bunch of features, but with a
          web-based UI, and I gave up on building it.</li>
        <li><a href="https://matrix.org/">Matrix</a> seems to be getting relatively popular, but
          uses <a href="http-abuse.html">HTTP APIs</a>, the specification is not available without
          JS, there are SDKs (I wonder whether it's ever a useful
          thing to provide an SDK instead of a single documented
          library; usually it's just additional pain to work with),
          web-based clients, etc – seems to be pretty
          unpleasant overall, following poor practices. Though it's
          federated, not distributed; functionally it's similar to
          XMPP with a few XEPs included into the core. <a href="https://hachyderm.io/@erebion@chaos.social/114865902255925899">Apparently
          awkward security issues happen</a>.</li>
        <li><a href="https://en.wikipedia.org/wiki/Ricochet_%28software%29">Ricochet</a> reuses the Tor network; its protocol is documented and doesn't
          seem to be bloated. Unfortunately, it's bundled with a GUI, apparently
          there is no separate library, and it's in C++ anyway, which would make
          bindings harder if there was one. Probably it wouldn't be that hard to
          reimplement, or to extract the non-GUI code bits and make C bindings,
          to get a reusable library.</li>
        <li><a href="xmpp.html">XMPP</a> is nice and is supported relatively widely (with a
          choice of servers, clients, and libraries), but federated,
          rather than distributed, though the former may be converted
          into the latter. The XMPP Standards Foundation is prosecuted
          in Russia in 2025, for not complying with a law intended for
          information distribution systems, and some of the XMPP
          clients' websites (conversations.im, xabber.com) are
          blocked already.</li>
        <li><a href="email.html">Email</a>: likewise, but using it in a distributed fashion wouldn't be
          interoperable with common deployments in most cases, and some
          software may assume a federated setting.</li>
        <li><a href="https://www.w3.org/TR/activitypub/">ActivityPub</a>: federated, replaces <a href="https://en.wikipedia.org/wiki/OStatus">OStatus</a>, partially
          supported by <a href="https://joinmastodon.org/">Mastodon</a> (which seems to be getting popular);
          used for both microblogging and private
          messaging. RDF-compatible (though awkward JSON-LD is used in
          Activity Streams), W3 recommendation. Hence good
          specification, and generally doesn't look too bad, but the
          specification doesn't include authentication and
          authorization as of now (January 2018), and the existing
          implementations seem to be all awkward: rather poor web UIs,
          languages such as JS. I finally gave Mastodon a try in 2023,
          as a user; not bad and generally works, but that "RDF
          compatibility" (as opposed to actually using RDF) shows: for
          instance, to add metadata even within a single instance, <a href="https://glitch-soc.github.io/docs/features/local-only-toots/">the
          Mastodon Glitch edition appends emojis to textual
          messages</a>. I hear it is done that way to keep it compatible
          with the vanilla version. <a href="https://emacs.ch/@defanor/110600400883823853">The primary web-based UI is pretty
          awkward and buggy</a>. Another somewhat popular
          ActivityPub-based project is <a href="https://join-lemmy.org/">Lemmy</a>, a federated link
          aggregator and forum.</li>
        <li>Secure Scuttlebutt is akin to NNTP, RSS, or Atom feeds
          with signed posts, which include hashes of previous ones,
          but uses a gossip protocol, rather than a fixed address per
          feed. Perhaps it is more like a VCS repository with signed
          commits, where posts are only added. But apparently the
          primary client is in JS and buggy, and it does not seem to
          be actively developed (as of 2024), though there is
          the <a href="https://www.tildefriends.net/">Tilde Friends</a> client. The protocol itself is JSON-based
          (relatively awkward for implementations in some languages),
          while not quite human-readable/writable, relies on Ed25519
          (no key agility).</li>
        <li><a href="https://briarproject.org/">Briar</a>'s <a href="https://code.briarproject.org/briar/briar/-/wikis/home">Bramble protocol suite</a> uses a custom binary data
          format, and custom cryptographic protocols, yet it
          piggybacks on Tor for Internet connections, and
          alternatively supports direct Bluetooth or Wi-Fi
          connections. Somewhat similar to Secure Scuttlebutt in
          incorporating sneakernet elements. Apparently will not help
          with private message relaying, but should work for that of
          news groups. Similar to people talking to each other. The
          messenger's website has been blocked in Russia since 2024.</li>
        <li>Other social networking tools:
          the <a href="http://secushare.org/comparison">secushare's
          capability comparison of privacy-oriented distributed
          networking tools</a> and
          the <a href="https://en.wikipedia.org/wiki/Comparison_of_software_and_protocols_for_distributed_social_networking">Wikipedia
          comparison of software and protocols for distributed social
          networking</a>.</li>
      </ul>
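      <p>
        The Secure Scuttlebutt item above describes feeds of posts
        that include hashes of previous ones. A minimal sketch of such
        an append-only, hash-linked feed follows. It is not the actual
        SSB message format: the field names are made up for
        illustration, SHA-256 over sorted-key JSON is an assumed
        encoding, and the Ed25519 signatures are left out, since the
        Python standard library does not provide them.
      </p>
<pre>
```python
import hashlib
import json


def entry_hash(entry: dict) -> str:
    """Hash a post over a canonical JSON encoding (sorted keys)."""
    encoded = json.dumps(entry, sort_keys=True).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()


def append_post(feed: list, text: str) -> None:
    """Append a post that links to the hash of the previous one."""
    previous = entry_hash(feed[-1]) if feed else None
    feed.append({"sequence": len(feed) + 1, "previous": previous, "text": text})


def verify_feed(feed: list) -> bool:
    """Check that every post references the hash of its predecessor."""
    expected = None
    for number, entry in enumerate(feed, start=1):
        if entry["sequence"] != number or entry["previous"] != expected:
            return False
        expected = entry_hash(entry)
    return True


feed = []
append_post(feed, "first post")
append_post(feed, "second post")
print(verify_feed(feed))  # True

# Editing an old post breaks the link stored in the next one.
feed[0]["text"] = "tampered"
print(verify_feed(feed))  # False
```
</pre>
      <p>
        Since each post commits to its predecessor, a peer that
        receives a feed through gossip can detect edits and
        reordering, though not truncation of the newest posts; the
        signatures omitted here would additionally tie the feed to
        its author.
      </p>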
      <p>See also: <a href="https://theta.eu.org/2019/10/10/nea-federation-design.html">Distributed state and network topologies in chat systems</a>.</p>

      <h3>File sharing and websites</h3>
      <ul>
        <li><a href="https://en.wikipedia.org/wiki/BitTorrent">BitTorrent</a>, of course, with <a href="https://en.wikipedia.org/wiki/Mainline_DHT">Mainline DHT</a>.</li>
        <li><a href="https://github.com/ipfs/ipfs">IPFS</a> seems to be getting, well, maybe not popular, but
          mentioned here and there. There are papers and it is
          documented, but the implementations are currently in Go
          (reference), JS (incomplete), and Python (started). So, that
          would involve setting the whole Go thing to try, but
          the <a href="https://github.com/ipfs/papers/raw/master/ipfs-cap2pfs/ipfs-p2p-file-system.pdf">IPFS whitepaper</a> looks nice. There is documentation, and
          a few separate parts (which can be and are isolated into
          libraries; though would be more helpful if they were
          actually reusable C libraries), but they still are a part of
          a single project, which is not small or simple. There's a
          growing number of projects using it, such as <a href="https://orbitdb.org/">OrbitDB</a>, and
          then distributed IMs like <a href="https://berty.tech">Berty</a> (though these projects tend
          to continue the awkward theme of semi-broken websites, Go +
          JS, poor interoperability and documentation). Though later
          it was merged with a cryptocurrency.</li>
        <li><a href="https://freenetproject.org/">Freenet</a> is a distributed data store, apparently not very
          interactive. Or maybe it is; it's in Java, and I didn't try it
          myself.</li>
        <li><a href="https://zeronet.io/">ZeroNet</a>: haven't tried it, and it's in Python, but apparently it's
          popular enough to at least mention. Apparently it doesn't care much
          about security (see <a href="https://news.ycombinator.com/item?id=14041596">a HN thread</a>). There are other similar projects
          (e.g., Beaker Browser), which seem to market slightly disguised WWW as
          a new invention.</li>
        <li>HTTP, rsync, Gopher, etc, possibly over Tor or a similar
          network to mitigate CGNAT. RSS and Atom can be quite useful
          there, along with their aggregators working as hubs (relays)
          for both distribution and discovery, and maybe with the help
          of RDF.</li>
        <li><a href="https://en.wikipedia.org/wiki/Gnutella">Gnutella</a>: see below.</li>
        <li>GNUnet: see below.</li>
        <li>
          <a href="https://dat.foundation/">Dat protocol</a> uses small public keys for addressing, and various
          discovery methods, somewhat similar to using regular file transfer
          protocols over Tor. The primary implementation is in JS, and the
          documentation suggests installing it with <code>curl ... |
          bash</code>. Apparently it gets praised for its documentation, most of
          which is just awkward raster images.
        </li>
      </ul>

      <h3>Search</h3>
      <h4>Web crawling</h4>
      <p>
        <a href="http://yacy.net/en/index.html">YaCy</a> and <a href="http://en.wikipedia.org/wiki/Distributed_search_engine">a few more</a> (some of which are dead by now) distributed search
        engines exist. I have only tried YaCy, and it works, though I haven't
        managed to find its technical documentation, so it's not clear how it
        works.
      </p>
      <h4>Other information</h4>
      <p>
        These networks include search for files by name, rather than by
        content address (so the results can't be easily verified, which brings
        additional challenges).
      </p>
      <ul>
        <li><a href="https://en.wikipedia.org/wiki/Gnutella">Gnutella</a> again: used for file sharing, with query-based search (an
          unstructured system, as opposed to DHT-based and content-addressable
          structured ones). Somewhat limited and hardly secure/reliable for
          search, but seemed to work in practice. The first version used
          query flooding, while gnutella2 uses a random walk.</li>
      </ul>
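      <p>
        To make the difference between query flooding and a random
        walk more concrete, here is a toy sketch over an in-memory
        graph. The peer names, the file name, and the message counting
        are assumptions for illustration; actual Gnutella messages
        carry more fields, and real networks are much larger.
      </p>
<pre>
```python
import random
from collections import deque


def flood_search(graph, start, wanted, files, ttl):
    """Forward the query to every neighbour until the TTL runs out."""
    seen = {start}
    queue = deque([(start, ttl)])
    hits, messages = [], 0
    while queue:
        node, remaining = queue.popleft()
        if wanted in files.get(node, ()):
            hits.append(node)
        if remaining == 0:
            continue
        for neighbour in graph[node]:
            messages += 1  # the query is sent even to peers that saw it
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append((neighbour, remaining - 1))
    return hits, messages


def random_walk_search(graph, start, wanted, files, max_steps, rng):
    """Pass a single query from peer to peer, one random hop at a time."""
    node, messages = start, 0
    while True:
        if wanted in files.get(node, ()):
            return node, messages
        if messages == max_steps:
            return None, messages
        node = rng.choice(graph[node])
        messages += 1


# Hypothetical five-peer network; only "e" has the file.
graph = {
    "a": ["b", "c"],
    "b": ["a", "d"],
    "c": ["a", "d"],
    "d": ["b", "c", "e"],
    "e": ["d"],
}
files = {"e": {"notes.txt"}}

print(flood_search(graph, "a", "notes.txt", files, ttl=3))  # (['e'], 9)
print(flood_search(graph, "a", "notes.txt", files, ttl=2))  # ([], 6)
print(random_walk_search(graph, "a", "notes.txt", files, 50, random.Random()))
```
</pre>
      <p>
        Flooding reaches everything within the TTL at the cost of many
        duplicate messages, while a random walk sends one message per
        step but may take long to find anything, or give up.
      </p>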
      <p>
        Related papers:
      </p>
      <ul>
        <li><a href="http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.10.5444&amp;rank=1">Making Gnutella-like P2P Systems Scalable</a></li>
        <li><a href="http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.132.970&amp;rank=1">Search and Replication in Unstructured Peer-to-Peer Networks</a></li>
        <li><a href="http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.138.7508&amp;rank=1">Assisted Peer-to-Peer Search with Partial Indexing</a></li>
        <li><a href="http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.513.3513&amp;rank=7">Porqpine: A Peer-to-Peer Search Engine</a></li>
        <li><a href="http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.12.8050&amp;rank=5">Rich and Scalable Peer-to-Peer Search with SHARK</a> (this one is
          structured, but apparently prone to spam; so are unstructured ones,
          but at least there are rather simple mitigation techniques)</li>
        <li><a href="https://gnunet.org/szengel2012ms">Decentralized Evaluation of Regular Expressions for Capability
            Discovery in Peer-to-Peer Networks</a> (also structured)</li>
      </ul>

      <h3>Cryptocurrencies</h3>
      <p>
        Plenty of those popped up recently. Bitcoin-like ones (usually with a
        proof of work and block chaining) look like quite a waste of resources
        (and perhaps a pyramid scheme) to me, though the idea itself is
        interesting. I was rather interested in "digital cash" payment systems
        before, but those didn't quite take off so far.
      </p>
      <p>
        As of 2021, Bitcoin-like cryptocurrencies seem to be eating
        other distributed projects: many of those are merged with
        their custom cryptocurrencies, or occasionally piggyback on
        existing ones, but either way they become more complicated and
        commercialized. As of 2022, the "crypto" clipping seems to be
        associated more widely with cryptocurrencies and related
        technologies than with cryptography in general. But as of
        2024, it seems that the hype wave is mostly over, with "AI"
        filling up all the hype slots.
      </p>

      <h3>General P2P networking tools</h3>
      <h4>GNUnet</h4>
      <p>
        Not sure how to classify it, but here are some links: <a href="https://gnunet.org">gnunet.org</a>,
        <a href="https://en.wikipedia.org/wiki/GNUnet">GNUnet article in Wikipedia</a>, "<a href="https://gnunet.org/sites/default/files/NET-2015-02-1.pdf">A Secure and Resilient
        Communication Infrastructure for Decentralized Networking
        Applications</a>". Seems promising, but tricky to build, to figure
        out how it all works, and to do anything with it now (a lack of
        documentation seems to be the primary issue, though probably
        there are others). Apparently it is also being blocked in
        Russia by 2024, at least the gnunet.org website is (via TSPU,
        it seems), which makes it yet harder to debug. Apparently it
        is easier to set up in a single-user mode, but none of the
        retrieved bootstrap peer addresses seem to be
        available. An <a href="https://lists.gnu.org/archive/html/help-gnunet/2023-12/msg00005.html">up-to-date hostlist</a> can be found (having to use
        some proxying to access lists.gnu.org from Russia, where it is
        blocked as well), and then bootstrapping works.
      </p>
      <p>
        <a href="https://taler.net/">Taler</a> and <a href="http://secushare.org/">secushare</a> (using <a href="http://about.psyc.eu/">PSYC</a>) are getting built on top of it, but
        it's not clear how it's going, how abandoned or alive it is, etc.
        Their documentation also seems to be
        obsolete/outdated/abandoned/incomplete. Update (January 2018):
        apparently secushare prototype won't be released this year.
      </p>
      <h4>libp2p</h4>
      <p>
        <a href="https://libp2p.io">libp2p</a> apparently provides common primitives needed for peer-to-peer
        networking in the presence of NATs and other obstructions. At the time
        of writing there's no C API (so it's only usable from a few languages)
        and its website is quite broken. At the same time, worldwide IPv6
        adoption is above 32%, so possibly NATs will disappear before the
        workarounds become usable.
      </p>

      <h3>General tools useful for P2P networking</h3>
      <p>
        Many networking-related tools can be used for peer-to-peer
        networking. <code>socat(1)</code> is among the particularly
        flexible tools for relaying, which can be combined with many
        other Unix tools for ad hoc
        networking: <code>openssl</code>, <code>gnutls-cli</code>,
        and <code>netcat</code> for data encryption and
        transmission, <code>sox</code>, <code>opusenc</code>,
        <code>rec</code>, <code>play</code>, <code>pw-record</code>,
        <code>pw-play</code>, <code>ffplay</code> for audio capture,
        encoding, decoding, and playback.
      </p>

      <h3>Generic protocols</h3>
      <p>
        There are more or less generic network protocols that may be used,
        possibly together with Tor, to get working and secure peer-to-peer
        services.
      </p>
      <p>
        <a href="https://en.wikipedia.org/wiki/Ssh#Architecture">SSH</a> is quite nice and layered. Apparently its authentication is not
        designed for distributed systems (such as distributed IMs or file
        sharing), its connection layer looks rather bloated, and generally it's
        not particularly simple. Those are small bits of a large protocol, but
        they seem to make it not quite usable for peer-to-peer communication.
      </p>
      <p>
        TLS may provide mutual authentication, and there are readily
        available tools to work with it.
      </p>
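      <p>
        As a rough sketch of what mutual authentication looks like
        with a readily available tool, here is how Python's
        standard <code>ssl</code> module can be configured so that
        both peers present certificates. The function names and the
        file path parameters are hypothetical; the certificate files
        would have to be generated and exchanged separately.
      </p>
<pre>
```python
import ssl


def mutual_tls_server_context(certfile=None, keyfile=None, peer_ca=None):
    """Listening side: refuse peers that present no valid certificate."""
    context = ssl.create_default_context(ssl.Purpose.CLIENT_AUTH)
    context.verify_mode = ssl.CERT_REQUIRED  # demand a client certificate
    if certfile:
        context.load_cert_chain(certfile, keyfile)  # our own identity
    if peer_ca:
        context.load_verify_locations(cafile=peer_ca)  # whom we trust
    return context


def mutual_tls_client_context(certfile=None, keyfile=None, peer_ca=None):
    """Connecting side: verify the peer and offer our own certificate."""
    context = ssl.create_default_context(ssl.Purpose.SERVER_AUTH, cafile=peer_ca)
    if certfile:
        context.load_cert_chain(certfile, keyfile)
    return context


server_context = mutual_tls_server_context()
client_context = mutual_tls_client_context()
print(server_context.verify_mode == ssl.CERT_REQUIRED)  # True
print(client_context.verify_mode == ssl.CERT_REQUIRED)  # True
```
</pre>
      <p>
        The contexts can then be used
        with <code>wrap_socket</code> on regular sockets. In a
        peer-to-peer setting the trusted certificates would likely be
        those of known peers rather than of public CAs, and hostname
        checking on the connecting side may need adjusting
        accordingly.
      </p>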
      <p>
        IPsec has uses similar to those of TLS, but is generally a
        better way to solve the same problems. Individual addresses
        (which IPv6 should bring) are needed to use it for P2P widely
        though. IPv6 gets adopted, but slowly. Once computers become
        addressable individually (again), and transport layer
        encryption is there by default, it may render plenty of
        the contemporary higher-level network protocols obsolete.
      </p>
      <p>
        Pretty much every distributed IM tries to reinvent everything,
        and virtually none are satisfactory, but at least some of the
        problems are already solved separately: one can use dynamic
        DNS, Tor, or a VPN to obtain reachable addresses (even if the
        involved IP addresses change, and/or are behind NAT), and then
        use any basic/common communication protocol on top. Or even
        set up a VM and rely on SSH access, communicating inside that
        system.
      </p>
    </section><section>
      <h2>Search, <a href="https://en.wikipedia.org/wiki/FOAF_(software)">FOAF</a>, and the rest of <a href="https://en.wikipedia.org/wiki/Resource_Description_Framework">RDF</a></h2>
      <p>
        Some kind of a distributed search/directory may connect small
        peer-to-peer islands into a usable network. While it is hard to decide
        on an algorithm, lists of known and/or somewhat trusted nodes are common
        for both structured and unstructured networks, as well as for use of
        social graphs: if those would be provided by peers, a client may decide
        by itself which algorithm to apply. This reduces the task to just
        including known nodes into local directory entries, which can be shipped
        over any other protocols (e.g., HTTP, possibly over Tor).
      </p>
      <p>
        Knowledge representation, which is needed for a generic
        directory structure, is tricky, but there is RDF (resource
        description framework) already. There is FOAF (friend of a
        friend ontology), specifically for describing persons, their
        relationships (including linking the persons they know), and
        other social things. A basic FOAF search engine should be fairly
        straightforward to set up: basically a triple store filled with
        FOAF data. See also: <a href="semantic-web.html">Semantic Web</a>.
      </p>
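      <p>
        A toy illustration of "a triple store filled with FOAF data"
        follows: an in-memory set of triples with wildcard matching,
        and a friend-of-a-friend lookup on top. The FOAF namespace
        and <code>foaf:knows</code> are real, while the class, the
        function, and the example.org identifiers are made up for
        this sketch; a practical setup would rather use an existing
        RDF store and SPARQL.
      </p>
<pre>
```python
FOAF = "http://xmlns.com/foaf/0.1/"


class TripleStore:
    """A set of (subject, predicate, object) triples."""

    def __init__(self):
        self.triples = set()

    def add(self, subject, predicate, obj):
        self.triples.add((subject, predicate, obj))

    def match(self, subject=None, predicate=None, obj=None):
        """Return the triples matching a pattern; None is a wildcard."""
        return sorted(
            t for t in self.triples
            if subject in (None, t[0])
            and predicate in (None, t[1])
            and obj in (None, t[2])
        )


def friends_of_friends(store, person):
    """People known by the acquaintances of a person, but not by the person."""
    known = {o for _, _, o in store.match(person, FOAF + "knows")}
    found = set()
    for friend in known:
        for _, _, other in store.match(friend, FOAF + "knows"):
            if other != person and other not in known:
                found.add(other)
    return sorted(found)


# Hypothetical people, identified by made-up IRIs.
alice = "https://example.org/alice#me"
bob = "https://example.org/bob#me"
carol = "https://example.org/carol#me"

store = TripleStore()
store.add(alice, FOAF + "knows", bob)
store.add(bob, FOAF + "knows", alice)
store.add(bob, FOAF + "knows", carol)

print(friends_of_friends(store, alice))  # ['https://example.org/carol#me']
```
</pre>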
    </section><section>
      <h2>Hubs and addressing</h2>
      <p>
        As mentioned in the "usable systems" section above, the
        systems relying on peering seem to fare better in practice:
        they are still distributed, on the level of servers (or hubs
        generally), which then take care of tricky parts on behalf of
        the users. This is also how postal systems, telephone ones,
        and the Internet itself are organized. And some of those
        federated systems can be quite close to distributed ones: for
        instance, it is easy and viable to set up an XMPP or a WWW server
        on one's personal machine, although normally addressing is
        centralized in those cases.
      </p>
      <p>
        The <a href="https://en.wikipedia.org/wiki/Magnet_URI_scheme">Magnet URI scheme</a> combines content addressing, which is
        not centralized, with a list of addresses to bootstrap
        from. Perhaps one can similarly use public keys, with claims
        signed by those, which would be very similar to certificates
        and key servers. No nice and human-readable addresses that
        way, as usually is the case with distributed addressing, but
        this creates a decentralized identity, decoupled from any
        particular nodes.
      </p>
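      <p>
        As an illustration of how a magnet link carries both a
        location-independent content identifier and bootstrap
        addresses, here is a small sketch that builds and parses one
        with Python's standard library.
        The <code>xt</code>, <code>dn</code>, and <code>tr</code>
        parameters are the commonly used ones; the all-zero hash, the
        file name, and the tracker address are placeholders.
      </p>
<pre>
```python
from urllib.parse import parse_qs, urlencode, urlsplit


def parse_magnet(uri):
    """Split a magnet URI into content identifiers and bootstrap hints."""
    parts = urlsplit(uri)
    if parts.scheme != "magnet":
        raise ValueError("not a magnet URI")
    params = parse_qs(parts.query)
    return {
        "content_ids": params.get("xt", []),  # what to fetch
        "trackers": params.get("tr", []),     # where to start looking
        "name": params.get("dn", [None])[0],  # a display name, if any
    }


# Build an example URI; the values are placeholders, not a real torrent.
uri = "magnet:?" + urlencode(
    [
        ("xt", "urn:btih:" + "0" * 40),
        ("dn", "notes.tar"),
        ("tr", "udp://tracker.example.org:6969"),
    ],
    safe=":/",
)

parsed = parse_magnet(uri)
print(parsed["content_ids"])
print(parsed["trackers"])
print(parsed["name"])
```
</pre>
      <p>
        The content identifier alone is enough to verify whatever
        gets downloaded, while the tracker list is merely a hint that
        can be replaced or dropped, for instance in favour of a DHT
        lookup.
      </p>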
      <p>
        There is the similar concept of <a href="https://en.wikipedia.org/wiki/Self-sovereign_identity">self-sovereign identity</a>,
        with <a href="https://www.w3.org/TR/did-core/">decentralized identifiers</a> (DIDs) as a fairly generic
        framework. Similarly to Activity Streams, they are based on
        the awkward (but RDF-compatible) JSON-LD. See <a href="https://www.w3.org/TR/did-spec-registries/#did-methods">DID Methods</a> for
        more specific specifications, though many of those are
        blockchain-based (probably because DID appeared when those
        were particularly hyped/popular).
      </p>
      <p>
        GNUnet's GNS (<a href="https://datatracker.ietf.org/doc/html/rfc9498">RFC 9498</a>) has a DID method defined. It combines
        local "pet names" (aliases) and memorable labels (subdomains),
        with public keys as unique zones (identifiers). For DID
        identifiers, they simply use GNS zone keys, and store DID
        documents as records of type <code>DID_DOCUMENT</code> under
        the "apex label". Zone delegation is similar to that of
        regular DNS. Both GNS and R5N (GNUnet's DHT) look fine. But
        TLSA records don't seem to work with its dns2gns, and even if
        they did, they would not be trusted without DNSSEC, while CAs
        do not support GNS. So the software would have to support GNS
        explicitly, at which point it could as well use GNUnet's CADET
        instead of TLS. But the main GNUnet implementation is under
        AGPL, which is not likely to help wide adoption via
        embedding into existing software.
      </p>
      <p>
        Another effort to organize name lookups not dependent on ICANN
        is OpenNIC, but it has an alternative DNSSEC hierarchy,
        including the keys at root, which breaks usual validation for
        ICANN domains. And it is still a centralized system. Maybe
        memorizable and human-readable addresses are not that
        important anyway: it seems that people rarely remember those,
        do not operate those directly (using non-unique nicknames
        instead), and happily use phone numbers, sometimes even
        preferring those over memorizable addresses.
      </p>
      <p>
        But back to more practical (readily usable) systems, OpenPGP
        certificates actually are quite similar to Magnet links, in
        that they ship a public key, along with one or more
        identities, which usually are email addresses, and those can
        be retrieved by various means (DANE, WKD, various key servers,
        manual exchange, etc). I think it keeps being "pretty good",
        for many use cases.
      </p>
    </section><section>
      <h2>Weather data</h2>
      <p>
        Apart from common messaging and file sharing, one of the
        distributed (or at least federated) system applications I keep
        considering is weather data sharing: it'd be useful, and it's
        quite different from those other applications.
      </p>
      <p>
        Weather data is commonly of interest to people, and it's right
        out there, not encumbered by patents or copyright laws, just
        has to be measured and distributed. But commercial
        organizations working on that try to extract some profit, so
        they don't simply share that data with anyone for free. There
        are state agencies too, paid out of taxes, but at least in
        Russia apparently you can't easily get weather data out of them
        either, only a lot of bureaucracy; and even if it were
        possible, there are many awkward custom formats and ways to
        access the data, which won't make a reliable system. People
        sharing this data with each other would solve that problem.
      </p>
      <p>
        Though there is at least one nice exception: <a href="https://developer.yr.no/">the Norwegian
        Meteorological Institute shares weather data freely and for
        the whole globe</a>. Germany has <a href="https://dwd.api.bund.dev/">Deutscher Wetterdienst:
        API</a>, and the US has <a href="https://www.weather.gov/">weather.gov</a>. Also, <a href="https://open-meteo.com/">open-meteo.com</a> appeared
        recently.
      </p>
      <p>
        The challenges/requirements also differ from those with
        messaging or file sharing, since there's a lot of data
        regularly updated by many people, and potentially being
        requested many times, but confidentiality isn't needed. There
        already are protocols somewhat suitable for that: NNTP (which
        is occasionally used for weather broadcasts, just in a free
        form), DNS, and IRC explicitly aim at relaying; SMTP (with
        mailing lists) and XMPP (with pubsub) may be suitable too,
        possibly with ad hoc relaying.
      </p>
      <p>
        For reference, as of 2022 there are about 1200 cities with a
        population of more than 500 thousand people; individual hourly
        measurements from each of those would amount to one message
        every 3 seconds. It wouldn't hurt to have more than one weather
        station per city, to cover smaller cities, and so on, but the
        order of magnitude seems manageable even with modest resources
        and without much caching or relaying, assuming that there are
        not too many clients receiving all the data just as it arrives.
      </p>
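      <p>
        The arithmetic behind that estimate can be sketched in shell
        (the variable names are made up for illustration, and one
        station per city is assumed):
      </p>
      <pre>cities=1200          # cities above 500 thousand people, as estimated above
reports_per_hour=1   # one measurement per station per hour
interval=$(( 3600 / (cities * reports_per_hour) ))
echo "one message every ${interval} seconds"</pre>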
      <p>
        The links/peering can be set manually, and/or data can be
        signed (DNSSEC, OpenPGP, etc) and verified by end users with a
        PKI/WOT; the former may just be simpler, and appears to work
        in practice.
      </p>
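      <p>
        As a rough sketch of the OpenPGP variant, a station operator
        could publish a detached signature next to each report, and
        recipients would check it against a key they already trust;
        the file name <code>observation.json</code> is made up for
        illustration:
      </p>
      <pre># on the station: writes observation.json.asc
gpg --armor --detach-sign observation.json

# on the receiving side, with the station's public key imported
gpg --verify observation.json.asc observation.json</pre>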
      <p>
        Collaboration/coordination/organization is likely to be
        tricky, though possible: plenty of people contribute their
        computing resources to BOINC projects, OONI, file sharing
        networks, and so on. But weather collection is different in
        requiring special equipment (at least a temperature sensor)
        to be set up outside, which complicates contribution.
      </p>
    </section><section>
      <h2>Post-quantum cryptography</h2>
      <p>
        Many of the protocols mentioned here rely on asymmetric
        cryptography, which is particularly vulnerable to attacks by a
        quantum computer, and it seems that at this rate we may have
        usable quantum computers before widely used distributed
        systems. Use of symmetric cryptography, or at least
        cryptographic agility of the protocols, is needed to mitigate
        that.
      </p>
      <p>
        As a side note, it seems to me that asymmetric cryptography is
        often used where symmetric cryptography would fit better: even
        for messaging confidentiality, there are multiple competing
        standards using asymmetric cryptography of varied complexity
        and awkward failure modes, most of them requiring keys to be
        verified over a safe channel (usually in person) to actually be
        reliable, at which point the parties may as well establish a
        secret shared key. Yet I am not aware of any standard for
        simple PSK-based messaging. Perhaps OpenPGP comes close, as a
        generic format supporting symmetric encryption, though
        anything else capable of symmetric encryption would be
        similar, and still with no special support in email or IM
        clients.
      </p>
    </section><section>
      <h2>Beyond technologies</h2>
      <p>
        Primarily technologies are covered here, but non-technical
        means may be quite helpful as well. Social skills and
        connections may be more useful to stay connected, and to
        actually engage in social activities. A decent government,
        meanwhile, is supposed to help people, rather than to be a
        threat actor, both online and offline. Throw in good ISPs, and
        a few centralized systems maintained by well-meaning and
        competent people, and one wouldn't even need any channel
        encryption for most tasks.
      </p>
      <p>
        People don't quite work that way though, with governments
        apparently trying to turn into autocracies, any non-awful
        ISPs being acquired by awful ones, people in general being
        prone to mischief, and some of them engaging in crime, so
        some technical measures are needed, but some social and
        organizational help is important as well.
      </p>
      <p>
        Additionally, the combination of social connections and
        relatively basic technologies allows building friend-to-friend
        networks, reducing network abuse.
      </p>
      <p>
        Yet another approach to consider is the focus on a more
        delayed communication, through years or centuries, via books
        and similar larger works: near-real-time communication can be
        blocked or otherwise disrupted relatively easily, but if the
        delays are already longer than the regimes that impose network
        blocking, or than transient network issues, such communication
        is unaffected. Related notes: <a href="personal-data-storage.html">personal data storage</a>.
      </p>

      <h3>Users</h3>
      <p>
        Distributed systems, particularly when used for social
        activities, require users – so that there would be
        somebody to send messages to in case of an IM. It is quite a
        problem, since even by sticking to federated protocols it is
        easy to lose or decrease contact with people.
      </p>
      <p>
        People in general are capable of dealing with even more
        complicated and less sensible systems, as digital
        bureaucracies demonstrate, but apparently not motivated
        enough. I am somewhat interested and motivated myself, yet
        occasionally after looking at software with many dependencies,
        which reinvents many parts, and generally goes against what I
        view as good practices, I do not feel motivated enough to try
        it.
      </p>
      <p>
        Search in particular is tricky in such systems, though usually
        some form of communication with strangers and
        self-organization (e.g., via multi-user chats, web pages) is
        possible, so that people can find groups with shared
        interests. Perhaps being sociable is easier and more useful
        than technical solutions there, too.
      </p>

      <h3>Politics</h3>
      <p>
        As observed and stated by Aristotle ("Politics", book V,
        chapter XI), Niccolo Machiavelli ("The Prince"), Michel
        Foucault ("Discipline and Punish"), James C. Scott ("Seeing
        Like a State"), touched on by Benedict Anderson ("Imagined
        Communities"), and likely many others, surveillance is an
        important tool for control (including tyranny, but not only
        that), while orderly, simplified, measured, uniform,
        predictable structures and societies are legible, pliable, and
        manipulable. Distributed systems may help to limit or slow
        down the spread of unjust power (of either governments or
        commercial companies, as two common examples), but using this
        perspective, one may additionally consider other means for the
        same purpose: diverse, custom, and obscure protocols, varied
        and changing means of communication, heterogeneous networks,
        including overlay networks and steganography. Anything
        complicating mapping and analysis, turning the infrastructure
        into jungles instead of cities.
      </p>
      <p>
        Meanwhile, it is also important to advance towards a more just
        society, which is generally desirable, and any network
        protocols will be irrelevant if an oppressive government ends
        up shutting down Internet access or creating other more
        pressing issues for its citizens. John Rawls's "A Theory of
        Justice" is a book I enjoyed on the topic, which also happens
        to remind me of distributed network protocol designs.
      </p>
    </section><section>
      <h2>See also</h2>
      <p>
        Not quite about collaborative protocols like those listed
        above, but just about distributed computing (including
        software design aiming at multiple servers controlled by a single
        entity), there's a nice "<a href="https://scholar.harvard.edu/waldo/publications/note-distributed-computing">A Note on Distributed Computing</a>"
        paper.
      </p>
    </section></xhtml:div></content></entry>
  <entry><link rel="alternate" href="https://thunix.net/~defanor/notes/debian-11-workstation.html"/><id>https://thunix.net/~defanor/notes/debian-11-workstation.html</id><author><name>defanor</name></author><title>Debian 11 (to 13) workstation</title><summary>Debian 11 to 13 workstation maintenance</summary><published>2021-08-17T18:00:00Z</published><updated>2026-05-27T14:00:00Z</updated><content type="xhtml"><xhtml:div xmlns="http://www.w3.org/1999/xhtml"><h1>Debian 11 (to 13) workstation</h1><p>
      These are my notes on setting up and maintaining a
      desktop/workstation system, a successor to the older <a href="centos-7-workstation.html">CentOS 7
      workstation</a>, to be used--among other things--with the <a href="private-server-setup.html">private
      server setup</a> and <a href="simpler-server-setup.html">simpler server setup</a>.
    </p><h2>Installation</h2><p>
      My goals were a working setup, along with an old system, simple
      and close to the standard one, and with
      encrypted <code>/home</code> (see also: <a href="personal-data-storage.html">personal data
      storage</a>). To avoid possible confusion during installation or
      when some repairs are needed, I keep a sheet of paper with
      partitions listed on it.
    </p><p>
      I went for <a href="https://cdimage.debian.org/images/unofficial/non-free/images-including-firmware/">Unofficial non-free images including firmware
      packages</a>, since I need GNU documentation and the Nvidia
      proprietary driver anyway (unnecessary as of Debian 12, since
      proprietary firmware is included in official images, and that
      Nvidia card is not supported anymore), and it is more suitable
      for a rescue USB stick. Picked a live Xfce image, to be able to
      poke it briefly (and ensure that it works fine with the
      hardware) before installation, as well as for possible later use
      as a rescue system. Though live images come with a drawback of
      installing <code>live-task-*</code> packages, including
      localization ones for all the supported languages, so you end up
      with hundreds of additional and unused packages to upgrade
      regularly; <code>netinst</code> produces a cleaner system, but
      they can also be removed manually afterwards. Xfce is not as
      bloated and broken as GNOME and KDE, but not as half-baked and
      broken as most of the others. Apparently MATE and Cinnamon aim at a
      similar level of complexity, and I hear good things about those,
      too. I downloaded the image via BitTorrent, and as
      the <a href="https://www.debian.org/releases/bullseye//installmanual.en.html">Installation Guide</a> suggests, did the equivalent of <code>cp
      debian.iso /dev/sdX &amp;&amp; sync</code>.
    </p><p>
      There is a graphical installer available from the live system
      itself, which is handy for looking up documentation on the web
      while installing, but its functionality differs from that of the
      regular installer: there is no option to make an EFI system
      partition (ESP) explicitly, so I rebooted and used the regular
      installer. Later, though, while installing Debian on another
      machine, I noticed that it would handle a FAT32 partition
      mounted at /boot/efi just fine, without requiring it to be
      marked explicitly as ESP.
    </p><p>
      As usual, I wanted to keep the old system usable and
      independent, so I have set this one on a separate disk, with a
      separate ESP, which I had to add (about 500 MB in size); the
      installer presented a warning about possibly making other
      systems hard to boot into if EFI is forced, but I've installed
      it on a separate disk (and adjusted UEFI boot priorities
      accordingly), so it was fine.
    </p><p>
      I used btrfs for a while, but decided to go with ext4 this time,
      since I use btrfs's advanced features less and less, while a
      simpler filesystem may be more reliable. Decided to minimize
      dealing with partitioning in the installer, and just made a
      single 500 GB partition for everything (not counting ESP, and
      while having 1.5 TB unpartitioned on the disk). No swap
      partition either, since in my experience it's not helpful and
      only freezes the system when something goes wrong. Didn't choose
      a network mirror to download new packages either, so the
      installation went quickly and smoothly.
    </p><p>
      While the <code>en_US.UTF-8</code> locale is very
      common, <code>C.UTF-8</code> may be better to set at once, since
      it has 24-hour time format, sensible string sorting, and DBMSes
      (particularly PostgreSQL) are more portable when set with it,
      not running into collation version mismatches on replication
      between databases hosted on different operating systems. This is
      simply adjusted in <code>/etc/default/locale</code>.
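      A minimal sketch of that adjustment, assuming
      the <code>update-locale</code> helper from
      the <code>locales</code> package is installed (editing the file
      by hand works as well):
    </p><pre>sudo update-locale LANG=C.UTF-8
# the file should now contain the line LANG=C.UTF-8
cat /etc/default/locale</pre><p>
      The new setting applies to sessions started after the change.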
    </p><h2>Initial setup</h2><p>
      As with CentOS about 7 years prior to this setup, apparently the
      nouveau driver was causing the system to freeze, so I installed
      the <a href="https://wiki.debian.org/NvidiaGraphicsDrivers#Debian_11_.22Bullseye.22">NVIDIA Proprietary Driver</a>.
    </p><p>
      Then I've added my user to the <code>sudo</code> group, have
      set the keyboard layout to colemak with <code>sudo
      dpkg-reconfigure keyboard-configuration</code> (since the
      installer doesn't provide that option), have set it in Xfce's
      settings to use the system layout (actually in a couple of
      places, not sure why there are so many). While at it, removed
      the useless bottom panel (application launcher), have set a dark
      theme, nicer icons, disabled icons on the desktop.
    </p><p>
      As with servers, and perhaps more importantly than with those,
      decent and varied nameservers should be set. In this
      case <code>/etc/resolv.conf</code> mentions that it's generated
      by NetworkManager (which is rather awkward and unnecessary, and
      an example of the little bloat that <code>task-xfce-desktop</code>
      pulls in), so one can <a href="https://wiki.debian.org/NetworkConfiguration#DNS_configuration_for_NetworkManager">adjust nameservers with nm-connection-editor</a>.
    </p><p>
      Then I've set the previously mentioned
      encrypted <code>/home</code> (this method is a bit verbose,
      since I've checked that things work as intended):
    </p><pre>sudo fdisk /dev/sda
# created another 500 GB partition for /home, sda3
sudo apt install cryptsetup
sudo cryptsetup luksFormat /dev/sda3
sudo cryptsetup luksOpen /dev/sda3 enchome
sudo mkfs.ext4 -L home /dev/mapper/enchome
sudo cryptsetup close enchome
sudo blkid | grep sda3
sudo -e /etc/crypttab
# added the following:
# enchome		UUID=PARTITION_UUID_HERE none luks
sudo -e /etc/fstab
# added the following:
# /dev/mapper/enchome   /mnt/home          ext4    defaults        0       2</pre><p>
      Then rebooted to ensure that <code>/mnt/home</code> mounts fine,
      moved the files from <code>/home</code> there (with <code>cp
      -a</code>), renamed <code>/home</code>, have
      set <code>fstab</code> to mount it
      into <code>/home</code>. Rebooted again, checked again that
      everything is fine, and removed the old <code>/home</code>.
    </p><p>
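      The final <code>fstab</code> entry would then look something
      like this (a sketch: the same mapped device as above, with only
      the mount point changed):
    </p><pre>/dev/mapper/enchome   /home              ext4    defaults        0       2</pre><p>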
      One may also mount <code>/tmp</code> into memory, reducing the
      data leaking to the unencrypted root filesystem, slightly
      speeding up some tasks, and reducing disk usage; it works for me
      and I like it, but there is plenty of criticism of that
      approach, and there are possible issues with it:
    </p><pre>tmpfs           /tmp            tmpfs   size=1g,nosuid      0       0</pre><p>
      Moved/imported my SSH and GPG keys, <code>~/.authinfo</code>,
      some other files.
    </p><p>
      I had to remap the "menu" key (keycode 135) to left alt, which
      is always awkward and different; in Xfce I had to enter the GUI
      settings, then "session and startup", and add the <code>xmodmap
      -e "keycode 135 = Alt_L"</code> command there. Also had to unmap
      C-M-f to be able to use it in Emacs, in "settings" - "keyboard"
      - "application shortcuts".
    </p><p>
      Xfce's default key bindings for basic tiling functionality target a
      numpad, which I do not have, but those can be adjusted in
      "settings" - "window manager" - "keyboard".
    </p><p>
      To disable GnuPG's annoying requirement to use non-alpha
      characters in a passphrase (which is contrary to <a href="https://pages.nist.gov/800-63-3/sp800-63b.html">NIST SP
      800-63B</a>, and complains about passwords in the style of <a href="https://xkcd.com/936/">XKCD
      #936</a>, such as those generated with xkcdpass), <code>echo
      'min-passphrase-nonalpha 0' &gt;&gt;
      ~/.gnupg/gpg-agent.conf</code>.
    </p><p>
      More software: <code>sudo apt install emacs
      emacs-common-non-dfsg telnet vlc tor mu4e isync rsync xsltproc
      clementine git elpa-magit elpa-haskell-mode cabal-install lynx
      whois nmap ncat dnsutils knot-dnsutils tmux fbreader inkscape
      blender godot3 gimp darktable lmms musescore texlive
      texlive-plain-generic auctex texlive-latex-extra texlive-science
      python3-sympy octave octave-symbolic libxml2-utils
      jmtpfs xkcdpass</code>,
      and <code>better-defaults</code>, <code>mu4e-alert</code>,
      and <code>cdlatex</code> via Emacs's package manager (since they
      weren't in the system repositories). Generally it's a good idea
      to stick to a single package manager, since then you shouldn't
      run into version mismatches. <code>update-alternatives --config
      editor</code> to set vim as the default editor (running a new
      emacs instance may be a bit slow for quick <code>sudo -e</code>
      edits, emacsclient won't always work, setting a small emacs
      clone just for that seems excessive, and the default nano is
      just awkward, so vim is an okay option; though perhaps one can
      also set <code>emacs -Q -nw</code>). Over time a bunch of other
      things were added, including mpd (<a href="https://wiki.debian.org/mpd">running as a user service</a>) and
      mpc, strongSwan, likely more development tools.
    </p><p>
      Then I set xterm and Emacs themes (<code>.Xresources</code>,
      Elisp), from <a href="https://github.com/defanor/dotfiles">my dotfiles repository</a>.
    </p><p>
      By 2022, I had to start using Tor bridges (since Tor is being
      blocked around here, and Internet connectivity is crippled in
      general, with Tor helping to fix some of it):
      install <code>obfs4proxy</code>, then append
      to <code>/etc/tor/torrc</code>:
    </p><pre>UseBridges 1
ClientTransportPlugin obfs4 exec /usr/bin/obfs4proxy managed</pre><p>
      And bridge records received from <a href="https://bridges.torproject.org/">bridges.torproject.org</a> or by
      other means, prefixed with "Bridge" (<code>Bridge obfs4
      ...</code>). Though by 2024, many of those are blocked.
    </p><p>
      Configured Firefox: Sans Serif font, disallowed pages to choose
      their own fonts, increasing monospace font size to be the same
      as others (16), setting a minimal font size equal to those, "wp"
      keyword for Wikipedia search and "wt" for Wiktionary search,
      installing uBlock Origin (with "annoyance" lists additionally
      enabled) to cut out junk, NoScript to cut out more junk,
      FoxyProxy to use Tor for websites blacklisted around here and
      the ones I don't want to track me, HTTPS everywhere to mitigate
      local data retention practices (superseded by Firefox's
      built-in HTTPS-Only Mode, which should be enabled in settings),
      Stylus to set a global dark theme for comfortable browsing when
      it is dark around.
    </p><p>
      Configured isync and Emacs, later installed rexmpp's
      xmpp.el. Attempted a minimal Emacs configuration this time
      (though most likely it'll grow), so used the built-in rcirc
      (with <code>rcirc-track-minor-mode</code> and just
      setting <code>rcirc-server-alist</code>), not much of mu4e
      configuration. Something like this:
    </p><pre>(require 'package)
(add-to-list 'package-archives '("melpa" . "https://melpa.org/packages/") t)
(package-initialize)

(require 'better-defaults)
(global-set-key [mode-line mouse-4] 'previous-buffer)
(global-set-key [mode-line mouse-5] 'next-buffer)

;; https://github.com/defanor/cyrillic-colemak
(require 'cyrillic-colemak)
(add-to-list 'custom-theme-load-path "~/.emacs.d/elisp/")
(load-theme 'blueish t)

(setq org-preview-latex-default-process 'dvisvgm
      org-babel-python-command "python3"
      org-src-preserve-indentation t)
(with-eval-after-load 'org
  (plist-put org-format-latex-options :scale 1.5)
  (require 'ob-python))

(rcirc-track-minor-mode t)
(setq rcirc-buffer-maximum-lines 2000
      rcirc-server-alist
      '(("irc.libera.chat" :port 6697 :encryption tls
         :user-name "defanor" :channels ("#emacs")))
      rcirc-authinfo
      '(("libera.chat" sasl "defanor" "password-here")))

(require 'haskell-interactive-mode)
(require 'haskell-process)
(add-hook 'haskell-mode-hook 'interactive-haskell-mode)
(add-hook 'haskell-mode-hook 'haskell-decl-scan-mode)

(require 'html-wysiwyg)
(add-hook 'html-mode-hook 'html-wysiwyg-mode)

(add-hook 'after-init-hook #'mu4e-alert-enable-mode-line-display)
(setq mail-user-agent 'mu4e-user-agent
      read-mail-command 'mu4e)
(with-eval-after-load "mu4e"
  (require 'smtpmail)
  (setq mml-secure-openpgp-encrypt-to-self t)
  (defun suppress-messages (old-fun &amp;rest args)
    (cl-flet ((silence (&amp;rest args1) (ignore)))
      (advice-add 'message :around #'silence)
      (unwind-protect
          (apply old-fun args)
        (advice-remove 'message #'silence))))
  (advice-add 'mu4e-update-mail-and-index :around #'suppress-messages)
  (advice-add 'mu4e-index-message :around #'suppress-messages)
  (advice-add 'progress-reporter-done :around #'suppress-messages)
  (setq mu4e-change-filenames-when-moving t)
  (add-to-list
   'mu4e-contexts
   (make-mu4e-context
    :name "thunix"
    :enter-func (lambda ()
                  (mu4e-message "Switch to the thunix IMAP context")
                  ;; (mu4e~request-contacts)
                  )
    :leave-func (lambda () (mu4e-clear-caches))
    :match-func (lambda (msg)
                  (when msg
                    (mu4e-message-contact-field-matches
                     msg
                     :to "defanor@thunix.net")))
    :vars '( (user-mail-address            . "defanor@thunix.net")
             (user-full-name               . "defanor")
             (smtpmail-default-smtp-server . "thunix.net")
             (smtpmail-local-domain        . "thunix.net")
             (smtpmail-smtp-user           . "defanor")
             (smtpmail-smtp-server         . "thunix.net")
             (smtpmail-stream-type         . starttls)
             (smtpmail-smtp-service        . 25)
             (message-send-mail-function   . message-smtpmail-send-it)
             (mu4e-get-mail-command        . "mbsync -q thunix")
             (mu4e-update-interval         . 300)
             (mu4e-view-show-addresses     . t)
             (mu4e-maildir                 . "~/Maildir/thunix/")
             (mu4e-mu-home                 . "~/.mu/thunix")
             (mu4e-user-mail-address-list  . ("defanor@thunix.net"))
             )))
;; more contexts here
)</pre><p>
      And <code>.mbsyncrc</code> records like this:
    </p><pre>IMAPAccount thunix
Host thunix.net
Port 993
User defanor
SSLType IMAPS
Pass "password-here"
AuthMechs *

IMAPStore thunix-remote
Account thunix

MaildirStore thunix-local
Path ~/Maildir/thunix/
Inbox ~/Maildir/thunix/inbox/

Channel thunix
Far :thunix-remote:
Near :thunix-local:
Patterns * !drafts
Create Both
Remove Both
Expunge Both
SyncState *</pre><p>Then mu stores can be initialized with commands like <code>mu
        init --muhome=~/.mu/thunix --maildir=~/Maildir/thunix
        --my-address=defanor@thunix.net</code>.</p><p>
      This was a sufficient setup to listen to a radio (<code>vlc
      'http://s3.radionetz.de/1a-rock.mp3'</code>; as of 2025-10-27
      and 2026-01-14, that is blocked here, along with many CDNs and
      hosting companies, some of the alternatives are <code>vlc
      'http://113fm.cdnstream1.com/1740_128'</code>, <code>vlc
      'https://s8.yesstreaming.net:7099/RblLgn'</code>,
      see <a href="https://dir.xiph.org/">dir.xiph.org</a> for other online radios), local music
      collection (which I keep on a separate partition, so just
      mounted it via <code>fstab</code> into the same path as before,
      and the playlist also stored on it contained correct paths),
      communicate (IRC, XMPP, email), do Haskell programming, browse
      WWW relatively comfortably, play Discworld MUD over telnet, and
      publish these notes. At that point I've adjusted <a href="https://github.com/defanor/dwproxy">dwproxy</a> to be
      able to build it using only dependencies from the system
      repositories (for related rants and musings, see the notes
      on <a href="software-packaging-and-deployment.html">software packaging and deployment</a> and <a href="everyday-programming-in-haskell.html">everyday programming in
      Haskell</a>), and built a few work projects: since it's Cabal 3 now,
      had to set <a href="https://cabal.readthedocs.io/en/latest/cabal-project.html#specifying-the-local-packages">cabal.project</a> in order to use internal libraries, and
      made some other minor adjustments to handle newer versions of
      dependencies. C projects (<a href="https://codeberg.org/defanor/rexmpp">rexmpp</a> in particular) also required
      minor adjustments to handle newer versions of the compiler and
      libraries, but those were fairly straightforward.
    </p><h2>Adjustments</h2><p>
      Realtime Policy and Watchdog Daemon (rtkit) can be quite spammy
      in the logs with its debug messages, but that can be fixed by
      overriding its systemd service (<code>sudo systemctl edit
      rtkit-daemon.service</code>, followed by <code>sudo systemctl
      daemon-reload</code> and <code>sudo systemctl restart
      rtkit-daemon.service</code> to apply it) with the following:
    </p><pre>[Service]
LogLevelMax=info</pre><h2>Update to Debian 12</h2><p>
      Following the instructions (<a href="https://www.debian.org/releases/bookworm/amd64/release-notes/ch-upgrading.en.html">Chapter 4. Upgrades from Debian 11
      (bullseye)</a>), I executed <code>apt full-upgrade</code> to
      find out that my graphics card (GTX 660) is not supported by the
      NVIDIA proprietary driver anymore. Chose to not install the new
      nvidia-driver, but that interrupted the process, so had
      to <code>apt --fix-broken install</code>, and then <code>apt
      full-upgrade</code> again. Afterwards
      removed <code>nvidia-driver</code>,
      chose <code>mesa-diverted</code> in <code>update-glx --config
      glx</code> in order to de-blacklist nouveau drivers, and rebooted,
      but the system only worked for a few minutes before freezing,
      rendering it unusable. Fortunately I have integrated graphics
      here (Xeon E3-1275 v2 on ASUS P8C WS), which I picked precisely
      because this sort of thing keeps happening; took the graphics
      card out, connected the display to the motherboard's DVI
      output. Apparently I disconnected the system disk while taking
      the graphics card out, so failed to boot; then reconnected it,
      and saw it via UEFI, but failed to boot still, with different
      priorities (possibly messed up the UEFI boot settings while
      poking them without the disk connected properly). Managed to
      boot into the system by booting grub from a live USB stick, then
      pointing it to the system's grub.cfg using grub shell's
      <code>configfile</code> command. Tried to fix it with
      efibootmgr, that did not work, but it worked to just
      do <code>grub-install</code> and <code>update-grub</code>,
      leading to a working system into which I can boot directly,
      albeit without a graphics card. See <a href="https://wiki.debian.org/GrubEFIReinstall">GrubEFIReinstall</a> for more
      options.
    </p><p>
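      The recovery steps above can be sketched roughly as follows;
      the partition name <code>(hd1,gpt2)</code> is only a placeholder
      (<code>ls</code> in the GRUB shell lists the actual ones), and
      the paths assume a default Debian layout:
    </p><pre># in the GRUB shell started from the live USB stick:
ls
set root=(hd1,gpt2)
configfile /boot/grub/grub.cfg

# then, from the booted system:
sudo grub-install
sudo update-grub</pre><p>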
      Additionally, some texlive packages failed to update, and some
      fcitx5 ones were kept back.
    </p><p>
      Afterwards I did <code>apt autoremove</code>, which removed
      telnet, so had to <code>apt install telnet</code> again.
    </p><p>
      mu4e broke as well: had to update mu4e-alert via Emacs, since it
      came from melpa, but then it kept failing with "Mu server
      process ended with exit code 1". Dug the approximate command out
      of the sources (<code>/usr/bin/mu server --debug
      --muhome=~/.mu/thunix</code>), executed it manually, saw the
      error message: "error: expected schema-version 465, but got 451;
      cannot auto-upgrade; please use 'mu init'", "Please
      (re)initialize mu with 'mu init' see mu-init(1) for
      details". Did <code>mv ~/.mu/ ~/.mu-old/</code>, then <code>mu
      init --muhome=~/.mu/thunix --maildir=~/Maildir/thunix
      --my-address=defanor@thunix.net</code> (and similar ones, for
      other mailboxes), and then it worked. Like many other programs,
      mbsync deprecated "master/slave" terminology, introducing its
      unique alternative: "far/near".
    </p><p>
      Had to <code>M-x customize-group RET ansi-colors RET</code>,
      since <code>ansi-color-names-vector</code> became obsolete.
    </p><p>
      I had an unused PostgreSQL 13 (used primarily for local
      testing), and PostgreSQL 15 was installed by the system upgrade,
      so I just cleaned up the old version: <code>sudo pg_dropcluster
      --stop 13 main</code>, <code>sudo apt remove
      postgresql-13 postgresql-client-13</code>.
    </p><p>
      Then I was left with a bunch of other "installed,local" packages
      (<code>apt list '?narrow(?installed,
      ?not(?origin(Debian)))'</code>), so cleaned some of those up,
      after checking that they do not seem to be necessary: <code>sudo
      apt remove haskell-platform gcc-10 gcc-9-base gcc-10-base
      clang-11 python-numpy-doc openjdk-11-jre openjdk-11-jdk
      openjdk-11-jre-headless openjdk-11-jdk-headless libx264-160
      libx265-192 libwebp6 libvpx6 libswresample3 libssl1.1 libsepol1
      firmware-intelwimax linux-image-5.10.0-8-amd64
      linux-image-5.10.0-23-amd64 iukrainian libffi7 libbpf0
      libprocps8</code>.
    </p><p>
      Had to use <a href="https://www.linuxquestions.org/questions/linux-software-2/fbreader-writes-hyphen-after-each-word-4175679113-print/">a workaround for the FBReader's
      hyphenation-after-each-word bug</a>.
    </p><h2>Update to Debian 13</h2><p>
      Similarly to the previous update, following the Debian 13
      release notes chapter on <a href="https://www.debian.org/releases/trixie/release-notes/upgrading.en.html">Upgrades from Debian 12 (bookworm)</a>, I
      upgraded it to 13, which went mostly smoothly, but slowly,
      taking a couple of hours (with HDD and more than 3000 packages
      to upgrade or install, even though I tried to clean them up a
      little, and generally trying to avoid installing unnecessary
      ones).
    </p><p>
      After the upgrade and reboot, I ran <code>sudo apt
      autoremove</code>, and cleaned up a little more of the leftover
      NVIDIA packages, which I noticed that I still have, and which I
      picked with a combination of <code>apt search</code>
      and <code>aptitude why</code>: <code>sudo apt remove
      xserver-xorg-video-nvidia nvidia-vdpau-driver
      nvidia-kernel-dkms</code>, followed by another <code>sudo apt
      autoremove</code>.
    </p><p>
      mu4e once again required re-running <code>mu init</code> as
      described above. And as before, <code>apt list
      '?narrow(?installed, ?not(?origin(Debian)))'</code> listed a
      bunch of dated packages, some of which I cleaned out
      manually: <code>sudo apt remove janus clang-14 freerdp2-x11
      openjdk-17-{jdk,jre}{,-headless} postgresql-15
      postgresql-client-15</code>.
    </p><p>
      Then I noticed that debian.map.fastlydns.net (199.232.174.132)
      is blocked here, so had to replace deb.debian.org with an
      unblocked (local) mirror in <code>/etc/apt/sources.list</code>.
    </p><p>
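      The replacement itself can be done with sed; a minimal sketch,
      where the <code>switch_mirror</code> helper
      and <code>mirror.example.org</code> are placeholders for
      illustration, and the previous file is kept with
      a <code>.bak</code> suffix (security.debian.org entries are left
      as they are):
    </p><pre># Replace the deb.debian.org host in an apt sources file.
# $1: path to the sources file; $2: mirror host (a placeholder here)
switch_mirror() {
    sed -i.bak "s|//deb\.debian\.org/|//$2/|g" "$1"
}
# Example, to be run as root and followed by "apt update":
# switch_mirror /etc/apt/sources.list mirror.example.org</pre><p>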
      I had magit installed from melpa in Emacs, in addition to the
      one installed from system repositories (IIRC I had to install it
      to keep up with git), but after the update it ceased to work,
      with odd "transient" library issues. I tried to switch back to
      magit from system repositories by removing that from melpa, and
      was able to remove melpa magit itself, but not its dependencies,
      since package.el kept seeing those as dependencies (despite
      there being other versions available). So I had to
      remove <code>~/.emacs.d/elpa/{magit-section,transient}*</code>,
      as well
      as <code>~/.emacs.d/eln-cache/30.1-c7a97098/{transient,magit-section,magit}-*</code>,
      restart Emacs, and then it worked fine.
    </p><p>
      Following this update, I have also upgraded from bird (version
      1.*, which is no longer in repositories, but not automatically
      replaced by a newer version) to bird3, which required (in my
      basic configuration, for use with a VPN) wrapping
      the <code>import</code> and <code>export</code> directives
      within <code>protocol</code> blocks into <code>ipv4</code>
      and <code>ipv6</code> blocks, as shown in examples
      in <code>/etc/bird/bird.conf</code> itself.
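    </p><p>
      A minimal sketch of such a change, assuming a <code>kernel</code>
      protocol with <code>import all</code> and <code>export all</code>
      (the protocol, its instance names, and the filters are
      illustrative, not taken from my configuration):
    </p><pre># bird 1.x style: directives directly in the protocol block
# protocol kernel {
#   import all;
#   export all;
# }

# bird 2.x/3.x style: directives inside per-family channel blocks,
# with a separate protocol instance for each address family
protocol kernel kernel4 {
  ipv4 {
    import all;
    export all;
  };
}
protocol kernel kernel6 {
  ipv6 {
    import all;
    export all;
  };
}</pre><p>
      The result can be checked before restarting the service
      with <code>sudo bird -p -c /etc/bird/bird.conf</code>, which only
      parses the configuration and exits.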
    </p><h2>Servers</h2><p>
      It is handy to host servers locally, particularly for
      communication: they are always available from the primary system
      then, the latency is reduced, and regular TLS allows for
      peer-to-peer connections. As a downside, issues with the primary
      system also lead to downtime of those.
    </p><h3>XMPP server</h3><p>
      Eventually I decided that having a properly configured XMPP
      server locally is useful as a backup, for lower-latency calls,
      and to decrease load on remote servers. Having just an A record
      pointing to my static IP address (a free dyndns service in this
      case, to avoid dependencies on domain names at once), and port
      forwarding configured on the router for ports 80, 5222, 5269,
      5281, 3478, 49152-49155, I have set up nginx and uacme to obtain an
      X.509 certificate for TLS, configured nftables to decrease spam
      in the logs (only accepting connections on port 80 when renewing
      a certificate), then configured Prosody and coturn. <code>sudo
      apt install nginx uacme nftables prosody
      coturn</code>. My <code>/etc/nftables.conf</code>, slightly
      abridged to focus on relevant parts:
    </p><pre>#!/usr/sbin/nft -f

flush ruleset

table inet filter {
  set not-clients {
    type ipv4_addr
    flags interval
    elements = { 1.0.0.0/8 }
  }
  set blocks {
    type ipv4_addr
    flags interval
    elements = { 1.1.1.1 }
  }
  set open-ports-s2s {
    type inet_service
    flags interval
    elements = { 5269 }
  }
  set open-ports-c2s {
    type inet_service
    flags interval
    elements = { 5222, 5281, 3478, 49152-49155 }
  }
  chain input {
    type filter hook input priority 0; policy drop;

    # Mitigate TCP reset attacks performed by the ISP.
    ip saddr @blocks tcp sport 443 tcp flags rst drop;

    # Allow traffic from established and related packets.
    ct state established,related accept

    # Allow loopback traffic.
    iifname lo accept

    # Allow incoming TCP and UDP packets on @open-ports-s2s.
    tcp dport @open-ports-s2s accept;
    udp dport @open-ports-s2s accept;

    # Drop connections from spammy addresses.
    ip saddr @not-clients drop;

    # Allow incoming TCP and UDP packets on @open-ports-c2s.
    tcp dport @open-ports-c2s accept;
    udp dport @open-ports-c2s accept;
  }
  chain forward {
    type filter hook forward priority 0;
  }
  chain output {
    type filter hook output priority 0;
  }
}</pre><p>
      Then set <code>/usr/local/bin/uacme-hook.sh</code>,
      modifying <code>/usr/share/uacme/uacme.sh</code>:
    </p><pre>--- /usr/share/uacme/uacme.sh   2023-02-15 23:31:43.000000000 +0300
+++ /usr/local/bin/uacme-hook.sh        2024-01-30 09:49:06.505761694 +0300
@@ -16,7 +16,7 @@
 # You should have received a copy of the GNU General Public License
 # along with this program.  If not, see &lt;http://www.gnu.org/licenses/&gt;.
 
-CHALLENGE_PATH="${UACME_CHALLENGE_PATH:-/var/www/.well-known/acme-challenge}"
+CHALLENGE_PATH="${UACME_CHALLENGE_PATH:-/var/www/html/.well-known/acme-challenge}"
 ARGS=5
 E_BADARGS=85
 
@@ -37,6 +37,8 @@
         case "$TYPE" in
             http-01)
                 printf "%s" "${AUTH}" &gt; "${CHALLENGE_PATH}/${TOKEN}"
+                # Temporarily allow connections to port 80
+                sudo nft add element inet filter open-ports-s2s {80}
                 exit $?
                 ;;
             *)
@@ -48,7 +50,10 @@
     "done"|"failed")
         case "$TYPE" in
             http-01)
+                sudo nft delete element inet filter open-ports-s2s {80}
                 rm "${CHALLENGE_PATH}/${TOKEN}"
                 exit $?
                 ;;
             *)</pre><p>Then:</p><pre>sudo mkdir -p /var/www/html/.well-known/acme-challenge
sudo mkdir /etc/prosody/certs/example.com/
sudo touch /etc/prosody/certs/example.com/{fullchain,privkey}.pem
sudo chmod 640 /etc/prosody/certs/example.com/{fullchain,privkey}.pem
sudo chown root:prosody /etc/prosody/certs/example.com/{fullchain,privkey}.pem
sudo uacme -v new
sudo uacme -h /usr/local/bin/uacme-hook.sh issue example.com
sudo -e /etc/cron.daily/uacme-cert-update
sudo chmod +x /etc/cron.daily/uacme-cert-update</pre><p>With the following in <code>/etc/cron.daily/uacme-cert-update</code>:</p><pre>#!/bin/sh
set -e
/usr/bin/uacme -h /usr/local/bin/uacme-hook.sh issue example.com
cp /etc/ssl/uacme/example.com/cert.pem /etc/prosody/certs/example.com/fullchain.pem
cp /etc/ssl/uacme/private/example.com/key.pem /etc/prosody/certs/example.com/privkey.pem</pre><p>In <code>/etc/turnserver.conf</code> I have only set <code>external-ip</code>, <code>static-auth-secret</code>, <code>use-auth-secret</code>, <code>max-port=49154</code>.</p><p>Relevant lines of <code>/etc/prosody/prosody.cfg.lua</code>:</p><pre>interfaces = { "192.168.1.8", "127.0.0.1", "::1" }
modules_enabled = {
--- [...]
	-- Other modules
                "turn_external";
                "http";
}
-- TURN
turn_external_host = "example.com"
turn_external_secret = "secret here"

-- HTTP
http_host = "example.com"

VirtualHost "example.com"

Component "upload.example.com" "http_file_share"</pre><p>
      Then restart or reload the services, add users with <code>sudo
      prosodyctl adduser &lt;jid&gt;</code>, and it works.
    </p><h3>Voice conferences</h3><p>
      For <a href="voice-conferences.html">voice conferences</a>, apparently a particularly easy to set up and
      properly working option is Mumble. <code>sudo apt install
      mumble-server mumble</code>, set a password
      in <code>/etc/mumble-server.ini</code>, open UDP and TCP ports,
      and it is ready to use with desktop clients or Mumla on Android.
    </p><h3>IRC</h3><p>
      Similarly to XMPP and voice conferences, one may set up an IRC
      server (or a small network) for private chatting. InspIRCd is
      available from Debian repositories and easy to configure, simply
      by setting the desired hosts, names, and passwords in its
      configuration file, plus links (the spanningtree module) for use
      with multiple servers. Anope IRC services seem popular, and are also
      available from Debian repositories, but perhaps unnecessary for
      a small private (and possibly local) network. To make it
      available over the Internet, one may want to both enforce TLS and
      add restrictions for those connection classes; to do so, one may
      define a single connection class allowing no connections, then
      inherit one for plain connections, and one for TLS connections
      on a different port (corresponding to the Internet-facing
      endpoint), with additional restrictions (e.g., requiring a
      password).
    </p><h2>Shared machines</h2><p>
      If a machine is shared among multiple users, one may prefer to
      encrypt home directories, or at least subdirectories within
      those, individually in addition to the block device
      (LUKS/dm-crypt) encryption. That can be done with fscrypt or
      eCryptfs (an older option; there are also other stacked file
      systems). For instance, to create an encrypted directory:
    </p><pre># with fscrypt
# Enable and check the "encrypt" feature for the target ext4 file system
sudo tune2fs -O encrypt /dev/sda1
sudo dumpe2fs /dev/sda1 | grep features
# Install fscrypt and its libpam module at once
sudo apt install fscrypt libpam-fscrypt
# Set up fscrypt for the root partition (globally)
sudo fscrypt setup
# Create and encrypt a directory
mkdir private
fscrypt encrypt private/

# with eCryptfs
sudo apt install ecryptfs-utils
# Load the module
sudo modprobe ecryptfs
# Load it on boot as well
echo ecryptfs | sudo tee /etc/modules-load.d/ecryptfs.conf
# Set up a private directory, in ~/Private/
ecryptfs-setup-private
# Mount it
ecryptfs-mount-private</pre><p>
      See <code>ecryptfs-migrate-home(8)</code> for encryption of the
      whole home directory.
    </p></xhtml:div></content></entry>
  <entry><link rel="alternate" href="https://thunix.net/~defanor/notes/public-data-backups.html"/><id>https://thunix.net/~defanor/notes/public-data-backups.html</id><author><name>defanor</name></author><title>Public data backups</title><summary>On data hoarding, Internet blackout preparedness</summary><published>2026-05-27T09:00:00Z</published><updated>2026-05-27T09:00:00Z</updated><content type="xhtml"><xhtml:div xmlns="http://www.w3.org/1999/xhtml"><h1>Public data backups</h1><p>
      These notes grew out of those on <a href="personal-data-storage.html">personal data storage</a>, which
      cover the technical means. I have kept a local music
      collection since the times before broadband and unmetered
      connectivity around here, and generally preferred to avoid
      reliance on online services, particularly commercial ones, since
      those tend to let users down. As <a href="another-blocked-website.html">the local censorship</a> advanced,
      complete with a partial Internet blackout, and threatening to
      impose a complete blackout, while inexpensive storage device
      capacities increased, I started storing more public data, in
      addition to my private data backups.
    </p><p>
      Apart from Internet blackouts or individual resource blocking by
      a government, usual data sources may become unavailable because
      of a technical issue (along with the rest of the Internet if the
      issue is near the user), or due to the publisher changing their
      policies. These notes include suggestions on the kinds of public
      data to back up, along with links to some of them and their size
      estimates.
    </p><h2>Texts</h2><p>
      Written works tend to be the most information-dense, making it
      easy to collect and store much more of those than one could hope
      to read in a lifetime.
    </p><p>
      <a href="https://www.kiwix.org/en/">Kiwix</a> (with its <a href="https://github.com/openzim/">OpenZIM</a> archives) is a nice project. Its primary
      viewer may seem awkward for use in normal circumstances, but
      apparently it aims to be useful to the general public and in bad
      circumstances: it provides archives as packages, while the
      viewer—with versions for every common OS—can also serve those to
      others in a local network via a web browser. <a href="https://library.kiwix.org/">library.kiwix.org</a>
      provides, among others, indexed archives of Project Gutenberg
      (about 75,000 public domain books by 2026), Wikipedia,
      Wikisource, Wikibooks, Wikiversity, Wiktionary, ready.gov,
      WikiHow, various StackExchange projects, Khan Academy, and many
      smaller bits like ArchWiki, RationalWiki, Explain XKCD (contains
      the comics).
    </p><p>
      <a href="http://www.textfiles.com/">textfiles.com</a> provides archives of files grouped by category,
      which are well-compressed, curious, and entertaining. <a href="https://www.rfc-editor.org/retrieve/bulk/">RFC Editor
      bulk retrieval</a> ceased to serve readily available archives by
      2026, but one can rsync it, optionally archiving and compressing
      afterwards, e.g.:
    </p><pre>rsync -avz --delete rsync.rfc-editor.org::rfcs-text-only/ rfcs-text-only/
tar --group=nogroup --owner=nobody -czf rfcs-text-only.tgz rfcs-text-only/</pre><p>
      The POSIX (SUS) specification is useful to have at
      hand: <a href="https://pubs.opengroup.org/onlinepubs/9799919799/">POSIX.1-2024</a> is available as an archive (see
      "Downloads"). Along those lines, there are programming language
      specifications (reports), and other relevant specifications and
      references: <a href="https://www.open-std.org/jtc1/sc22/wg14/www/standards">ISO C</a>, <a href="https://www.haskell.org/documentation/">Haskell Language Report</a>, <a href="http://www.scheme-reports.org/">Scheme
      Reports</a>, <a href="https://docs.python.org/3/download.html">Python documentation downloads</a>, <a href="https://riscv.org/specifications/ratified/">RISC-V
      specification</a>, <a href="https://software.intel.com/en-us/download/intel-64-and-ia-32-architectures-sdm-combined-volumes-1-2a-2b-2c-2d-3a-3b-3c-3d-and-4">Intel
      64 and IA-32 Architectures Software Developer's
      Manual</a>, <a href="https://docs.amd.com/v/u/en-US/40332-PUB_4.08">AMD64
      Architecture Programmer's Manual</a>, <a href="https://refspecs.linuxfoundation.org/index.shtml">Linux Foundation
      Referenced
      Specifications</a>, <a href="https://www.usb.org/documents">USB
      specifications</a>, <a href="https://www.bluetooth.com/specifications/specs/">Bluetooth
      specifications</a>, <a href="https://uefi.org/specifications">ACPI
      and UEFI specifications</a>, <a href="https://www.postgresql.org/docs/">PostgreSQL
      manual</a>, <a href="https://github.com/xsf/xeps/">XMPP Extension
      Protocols</a>, etc.
    </p><p>
      Then there are copyright-infringing but much larger libraries
      like <a href="https://en.wikipedia.org/wiki/Library_Genesis">Library Genesis</a> (a trimmed down, txt-only version used to
      be available at offlineos.com, but apparently not
      anymore), <a href="https://the-eye.eu/public/Books/">the-eye.eu books</a>, <a href="https://annas-archive.li/">Anna's Archive</a>, <a href="https://z-library.sk/">Z-library</a>. The
      Pirate Bay or similar torrent trackers may help to find book
      collections, including MIT mathematics and physics books,
      Cambridge Histories and philosophy companion books, Oxford "Very
      Short Introductions", Routledge books, as well as works grouped
      by author (e.g., Gardner, Feynman). Other topics on which to consider
      acquiring modern (text)books: <a href="https://en.wikipedia.org/wiki/List_of_publications_in_philosophy">major philosophy works</a>,
      electronics and radio, engineering, sociology,
      economics, <a href="computing-context.html">computing</a>, <a href="food.html">cooking</a>, <a href="physical-exercises.html">physical exercises</a>, survival,
      fiction, medicine (e.g., the Merck manual), any topics of
      interest and other sciences. Other individual books on
      <a href="online-courses-and-math-notes.html">physics and mathematics</a>,
      history. Consider <a href="../files/complementary-books.txt">the
      list of books complementary to Wikisource and PG</a> (about 30
      GB). Literary awards and charts can be handy for finding books:
      Pulitzer, Nebula, Locus, Bentley, Booker, <a href="https://www.nature.com/news/the-top-100-papers-1.16224">Nature's analysis of
      the 100 most cited papers</a>, <a href="https://www.theguardian.com/world/2002/may/08/books.booksnews">The Guardian's top 100 books of all
      time</a>, <a href="https://www.theguardian.com/books/2015/aug/17/the-100-best-novels-written-in-english-the-full-list">The
      Guardian's 100 best novels written in
      English</a>, <a href="https://www.nytimes.com/interactive/2024/books/best-books-21st-century.html">The
      NYT's 100 Best Books of the 21st Century</a>, and similar
      lists. UN and other organizations' reports may also be of
      interest.
    </p><p>
      One can <a href="https://askubuntu.com/questions/207447/how-to-reduce-the-size-of-a-pdf-file-by-reducing-the-quality-of-the-images#626301">reduce PDF size</a> (compress the images) with Ghostscript
      or ImageMagick, among others, sometimes reducing the size by an
      order of magnitude: see "<a href="https://transloadit.com/devtips/efficient-pdf-optimization-with-ghostscript-cli/">Efficient PDF optimization with
      Ghostscript CLI</a>". For instance: <code>gs -q -sDEVICE=pdfwrite
      -dPDFSETTINGS=/screen -dCompatibilityLevel=1.4 -o out.pdf
      in.pdf</code> (possibly with <code>-dCompressFonts=true</code>
      and other options). Its <code>-dFirstPage=$START
      -dLastPage=$END</code> options are also handy sometimes, to
      extract pages of interest (including cases when some
      crackpottery is attached to books: that is one of the ways in
      which the crackpots try to promote it). EPUBs (basically
      ZIP archives with HTML and images), in turn, can be shrunk by
      compressing individual images within those. Sometimes files can
      be removed from an EPUB archive, and it can be trimmed down by
      passing it through pandoc (which would remove included fonts, for
      instance).
    </p><p>
      <a href="https://openstax.org/">OpenStax</a> provides good and freely available textbooks under the
      CC BY license, available for download in PDF. See <a href="https://github.com/openstax">OpenStax
      GitHub repositories</a> for their CNXML sources and related tools,
      though in 2024 I found it tricky to build HTML out of those, and
      then it still was not good enough for printing. <a href="https://libretexts.org/">LibreTexts</a> is
      supposed to be similar, though the licensing information is
      unclear in some cases, some links lead to HTTP 404 errors, and
      some of the books are quite messy (attempting to embed YouTube
      videos into PDFs, having every other page filled with listings
      of undeclared licenses, or with "welcome" messages). While its
      subdomains (math, phys, etc) geo-block direct requests from
      Russia, the books are available without proxying via
      commons.libretexts.org. One can also search for libre book
      sources on platforms like GitHub, possibly <a href="https://github.com/search?q=textbook+language%3ATeX&amp;type=repositories">querying for TeX
      sources</a>: there are occasional seemingly decent and not
      well-known textbooks, like <a href="https://github.com/OSTP/PhysicsArtofModelling">Introductory Physics: Building Models
      to Describe Our World</a>, <a href="https://github.com/vEnhance/napkin">An Infinitely Large Napkin</a>.
    </p><p>
      As of 2026, all those (Wikipedia, Wikisource, Wiktionary,
      Project Gutenberg, OpenStax and other complementary books) would
      take just 400 to 500 GB, even with images and some non-English
      versions added. Much of programming documentation,
      particularly manuals, library references, and sources, is
      also available from system repositories.
    </p><h2>Software</h2><p>
      Apart from censoring books and the Internet, dictatorships like
      to issue "national operating systems" and mandate their spyware,
      or simply disrupt connections to system repositories as
      collateral damage, so backing up software can also be useful.
    </p><p>
      Software sources are particularly useful to back up for potential
      isolated usage, ensuring the ability to study and customize
      those, but one needs some binaries to bootstrap a system. Some
      of the options to consider are (with size estimates from January
      of 2026):
    </p><ul>
      <li><a href="https://www.debian.org/mirror/ftpmirror">Debian archive mirroring</a>: about 230 GB when done
        with <code>debmirror</code>, for amd64 trixie (13.3) with
        sources. The <a href="https://www.debian.org/mirror/size">Mirror Size page</a> lists numbers for
        mirroring all suites. One may also consider usage of a caching
        proxy server, <code>apt-cacher-ng</code>, and <a href="https://wiki.debian.org/DebianInstaller/Modify/CD">Modifying Debian
        CD</a>. Unlike most others, Debian repositories contain all the
        source packages, which include upstream sources.</li>
      <li><a href="http://www.slackware.com/getslack/">Slackware downloads</a>: a whole mirror (for a single version)
        is under 20 GB, but it has few packages, which are rather dated
        as well. It seems to be one of the few distributions with
        complete sources, though.
      <li><a href="https://www.gentoo.org/downloads/mirrors/">Gentoo source mirrors</a>, particularly distfiles, almost 600
        GB. Those include multiple versions of the same programs.</li>
      <li><a href="https://wiki.archlinux.org/title/Mirrors">Arch Linux Mirrors</a> take a little over 110 GB for packages
        ("pool"), and 31 GB for sources (though the wiki claims it is
        80 GB and 110 GB, respectively; also most mirrors do not seem
        to host sources); apparently sources for many packages are not
        present.</li>
      <li><a href="https://fedoraproject.org/wiki/Infrastructure/Mirroring">Fedora mirroring</a>: about 356 GB for "Everything" x86_64
        packages, 123 GB for source ones.</li>
      <li><a href="https://www.openbsd.org/ftp.html">OpenBSD mirrors</a>: the sources may be in distfiles directories
        (as used by <a href="https://www.openbsd.org/faq/ports/ports.html">OpenBSD ports</a>), but I have not found mirrors with
        such directories available via rsync.</li>
      <li><a href="https://netbsd.org/mirrors/">NetBSD mirrors</a>: about 200 GB in distfiles, under 70 GB for
        precompiled amd64 packages. Those include multiple versions of
        the same programs.</li>
    </ul><p>
      Debian, in addition to being an all-around good system, seems to
      be a good option for such mirroring as well. The mirroring
      itself is done rather easily:
    </p><pre>sudo apt install debmirror debian-keyring
gpg --no-default-keyring --keyring trustedkeys.gpg --import /usr/share/keyrings/debian-archive-keyring.gpg
gpg --list-keys --keyring trustedkeys.gpg
debmirror -v -d trixie -a amd64 --source -h mirror.mephi.ru --method=rsync /mnt/backup/debian/mirror/</pre><p>
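      To install packages from such a mirror while offline, apt can be
      pointed at the directory; a minimal sketch, assuming the path
      used above, only the <code>main</code> component, and a
      hypothetical file name:
    </p><pre># /etc/apt/sources.list.d/local-mirror.list
deb file:/mnt/backup/debian/mirror trixie main
deb-src file:/mnt/backup/debian/mirror trixie main</pre><p>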
      An up-to-date <a href="https://www.debian.org/distrib/">live Debian CD/USB image</a> is useful to store along
      with it, and perhaps <a href="https://wiki.debian.org/moin_dump/">a Debian wiki dump</a>, as well as necessary
      additional firmware for one's hardware, and possibly firmware
      for devices other than regular computers, such as <a href="https://openwrt.org/">OpenWRT</a> images
      for routers, <a href="https://grapheneos.org/">GrapheneOS</a> or <a href="https://www.lineageos.org/">LineageOS</a> images for phones and
      tablets (along with individual program distributions, APKs; some
      software I use is listed in the note on <a href="mobile-computing.html">mobile
      computing</a>), <a href="https://github.com/koreader/koreader">KOReader</a> for e-readers. Consider <a href="https://f-droid.org/en/docs/Running_a_Mirror/">F-Droid mirroring</a>
      and <a href="https://openwrt.org/docs/guide-developer/source-code/start">OpenWRT source code</a> saving, or backups of individual
      packages.
    </p><h2>Audio</h2><p>
      As mentioned in the introduction, I always kept a music
      collection, and probably this is quite common. While musical
      records may seem less important than books and other written
      works, they still have cultural value and provide
      entertainment. My <a href="music-discovery.html">music discovery</a> note seems relevant here.
    </p><p>
      Audiobooks (including BBC radio collections) may also be useful
      to collect, even if one does not listen to those normally.
    </p><h2>Video</h2><p>
      Also for cultural and entertainment purposes, there are movies,
      and particularly long TV series may be suitable for hoarding;
      out of nice sci-fi ones, there are Doctor Who, Star Trek, Red
      Dwarf, Farscape, Lexx, Firefly, Defiance, Battlestar Galactica,
      Babylon 5, The X-Files, First Wave; plenty more can be found in
      Wikipedia; for humorous ones, see Black Books, The IT Crowd,
      Taskmaster, plenty of sitcoms.
    </p><p>
      Music videos are nice to have around, for the same reasons.
    </p><p>
      Lectures and educational videos on varied subjects can be both
      useful in addition to books (such as <a href="https://www.3blue1brown.com/">3blue1brown</a>, providing useful
      illustrations and intuitive explanations, or various arts and
      crafts, or exercises, demonstrating how to do something), and
      work as book substitutes, to share with those who do not read
      much, or in case there are not many books on a given topic
      (say, recent local legal practices). Unfortunately those are
      often hosted on YouTube, which, in addition to being blocked
      here (and in other places, see <a href="https://en.wikipedia.org/wiki/Censorship_of_YouTube">censorship of YouTube</a>), tries to
      prevent downloads itself, but there is <code>yt-dlp</code>,
      which may work. I usually download videos for archival at 480p
      if the visual details matter (perhaps 2 to 5 MB per minute), or
      even 360p if it is mostly a speaker standing and talking for the
      whole long video (under 2 MB per minute), which is done with
      the <code>-S "res:480"</code> option. I have collected
      some <a href="../links.html#Videos">video links</a>, including interesting YouTube channels. One
      may consider relatively information-dense ones (lectures, online
      lessons) first, possibly followed by entertainment-education,
      pop-sci, and documentaries.
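    </p><p>
      A small wrapper around that can look as follows; it is only a
      sketch, in which the <code>archive_video</code>
      and <code>estimate_gib</code> names,
      the <code>downloaded.txt</code> file, and the output template
      are arbitrary choices for illustration:
    </p><pre># Download a video, preferring formats up to the given height, and
# record its ID to skip it on later runs.
# $1: maximum height (e.g., 480); $2: video or playlist URL
# YT_DLP can be overridden, e.g. YT_DLP=echo for a dry run.
archive_video() {
    ${YT_DLP:-yt-dlp} -S "res:$1" --download-archive downloaded.txt \
        -o '%(uploader)s/%(title)s [%(id)s].%(ext)s' "$2"
}

# Rough storage estimate in GB (MB divided by 1024, rounded down).
# $1: hours of video; $2: MB per minute
estimate_gib() {
    echo $(( $1 * 60 * $2 / 1024 ))
}</pre><p>
      For instance, <code>archive_video 480 "$URL"</code> fetches a
      video at up to 480p where available,
      and <code>estimate_gib 100 5</code> shows that 100 hours at 5 MB
      per minute take about 29 GB.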
    </p><h2>Other</h2><p>
      Other large and legal archives to consider for backing
      up: <a href="https://dumps.wikimedia.org/">Wikimedia Downloads</a>, <a href="https://planet.openstreetmap.org/">Complete OSM Data</a>, <a href="https://arxiv.org/help/bulk_data">arXiv</a> and other Open
      Access sources. If one gets into tape storage, <a href="https://commoncrawl.org/">Common Crawl</a> can
      be considered. For select website downloads, I use <code>wget
      --mirror --page-requisites --convert-links --no-parent
      --continue --adjust-extension https://example.com/~foo/</code>,
      occasionally adding something
      like <code>--exclude-directories=photos,pictures</code> or just
      listing URLs manually (since it can be hard to separate heavy
      bits of little interest from the others otherwise), and
      sometimes having to add <code>--compression=gzip</code> if wget
      gets confused otherwise, or <code>--max-redirect=0</code> if
      there are redirects to semi-blocked websites with freezing
      connections (and while trying to download those directly, given
      that wget does not support SOCKS proxies). But some websites
      make archives available (as mine does,
      see <a href="../files/archive.tgz">../files/archive.tgz</a>), or they are hosted at
      GitHub/Codeberg/Tilde/etc "pages", making the archive available
      for download (also as mine does,
      see <a href="https://codeberg.org/defanor/pages">codeberg.org/defanor/pages</a>). Some wiki-based websites also
      provide data dumps, static HTML or database ones.
    </p><p>
      Statistical ("ML", "AI") models for LLMs (llama.cpp) and speech
      recognition (whisper.cpp) may be useful to collect as well. LLMs
      in particular, while they do hallucinate, also contain plenty of
      information, and in a way that may make it easier to retrieve in
      some cases.
    </p></xhtml:div></content></entry>
  <entry><link rel="alternate" href="https://thunix.net/~defanor/notes/computer-hardware.html"/><id>https://thunix.net/~defanor/notes/computer-hardware.html</id><author><name>defanor</name></author><title>Computer hardware</title><summary>Personal notes on computer hardware</summary><published>2019-04-16T00:00:00Z</published><updated>2026-05-05T00:00:00Z</updated><content type="xhtml"><xhtml:div xmlns="http://www.w3.org/1999/xhtml"><h1>Computer hardware</h1><p>
      The following is my hardware shopping list, more or
      less. Observations and rants are included.
    </p><h2>Workstation</h2><p>
      The term "workstation" can mean many things, but for brevity,
      here I use it to denote a relatively reliable desktop computer
      for daily usage and work, rather than for gaming, with ECC
      memory.
    </p><dl>
      <dt>CPU</dt>
      <dd>
        (Low-end) Intel Xeon processors are generally nice and
        suitable for a workstation: ECC memory support, fine TDP, and
        all the perks of being mainstream. There are security
        vulnerabilities, potential backdoors (particularly enterprise
        features, ME), vulnerabilities in backdoors, and numerous
        backwards compatibility warts, but there are comparable ones
        in other affordable CPUs suitable for common computing tasks
        (PSB in AMD CPUs). Though as of 2019, it seems that AMD
        CPUs may be a generally better option: <a href="https://old.reddit.com/r/Amd/comments/5x4hxu/we_are_amd_creators_of_athlon_radeon_and_other/def6vs2/">ECC is not disabled</a>
        even in Ryzen (desktop, unlike the considerably more expensive
        EPYC or Threadripper) CPUs, and they seem to beat Intel in
        benchmarks/specifications at the same price. After 2022,
        intel.com geo-blocks me, nudging me even closer to
        AMD. <a href="https://www.cpubenchmark.net/">cpubenchmark.net</a> provides a variety of benchmarks,
        including "best value" ones, useful for budget builds. <a href="https://www.tomshardware.com/">Tom's
        Hardware</a> has the "best picks" category with good pointers,
        aimed at different needs (budget, workstation, gaming). As a
        side note, some suggest choosing by performance/watt, rather
        than by announced TDP, and then possibly throttling a CPU with
        software.
      </dd>
      <dt>Memory</dt>
      <dd>
        Software keeps taking all the available memory, and even if
        one manages to avoid memory hogs, it is still nice to cache
        more. So it is usually a good idea to have plenty of
        memory. Kingston seems to be relatively reliable and produces
        ECC memory (not only their server-oriented models, but also
        the embarrassingly named "FURY Renegade Pro" line); Crucial and
        SuperMicro seem fine; personally I have only had issues with
        Corsair (which makes non-ECC memory anyway). All DDR5 memory
        has in-chip ECC, but the "ECC" versions still come with
        additional lanes, to allow detection of in-transit
        errors. Dual rank (possibly double-sided) memory tends to be a
        little faster, more expensive, and possibly to heat up more. A
        common suggestion is to prefer using 2 modules over 4, to put
        less stress on the memory controller. Some advise against
        mixing different memory kits, though it is unclear how risky
        it actually is; ECC memory rarely comes in kits, and when it
        does, those are large ones: 4 or 8 modules.
      </dd>
      <dt>Storage</dt>
      <dd>
        It is probably time to move to SSDs, but I am still using
        HDDs as the primary storage. There are reliability statistics
        around (usually it is, from least reliable to most: Seagate,
        WD, Hitachi and Toshiba, which is also reflected in prices);
        it's hard to deduce reliability from the vendor alone, but WD Red disks
        work fine for me: by 2024, I only had one faulty WD disk,
        after about 15 years of regular usage.
      </dd>
      <dt>Graphics card</dt>
      <dd>
        Integrated CPU graphics are useful as a backup, and sufficient
        if you do not do heavy gaming, video editing, or things like
        that. They also take the price down and reduce the number of
        components, including moving parts, so there is less noise,
        less heat, lower power consumption, fewer possible
        failures. As for discrete video cards, the primary issue for
        me is software support (both drivers and higher-level software
        such as X compositors). NVIDIA is most problematic:
        proprietary drivers are not supported for long, and
        reverse-engineered libre ones are not usable at all for some
        cards, and slow for others. AMD is better: in addition to
        proprietary drivers, there are mostly working open
        ones. Integrated Intel graphics seem to be the most
        reliable. h-node.org listing alone does not guarantee that
        drivers will work smoothly.
      </dd>
      <dt>Motherboard</dt>
      <dd>
        ASUS motherboards seem to be fine, and usually there are a few
        to choose from, with ECC support. Non-workstation ones tend to
        come with LEDs and other things one may prefer to not
        have. Though generally it is better to check reviews and
        benchmarks for motherboards on a chosen chipset at the time of
        buying. As of 2024, AMD "workstation" ASUS motherboards are
        quite expensive, while non-workstation ones support ECC as
        well. With other manufacturers, workstation motherboards
        (e.g., those with sockets and chipsets for Threadripper CPUs)
        also tend to support ECC and to be more expensive, but it is
        harder to pick a motherboard for a more modest yet reliable
        computer. Though there are cheaper mATX ASRock Pro ones, also
        with ECC support.
      </dd>
      <dt>CPU heat sinks and fans</dt>
      <dd>
        Noctua is nice: CPU mounting is painless, it is silent,
        and it cools CPUs well. Newer AMD stock coolers are not so bad
        either (except for LEDs), though still behind Noctua.
      </dd>
      <dt>Power supply</dt>
      <dd>
        Since a PSU malfunction can fry a motherboard and components
        on it, it may be a good idea to attempt to pick a reliable
        one, which would easily handle the used hardware. "80 Plus"
        ratings can be consulted, and Thermaltake PSUs are not the
        worst, though their newer models are covered in gaudy
        LEDs. ATX PSUs are most common for desktop computers, but SFX
        ones may be preferable for smaller builds, like those with
        microATX motherboards.
      </dd>
      <dt>Chassis</dt>
      <dd>
        Full-tower metal cases are good for building and for cooling,
        and often come with handy features that are less common on
        smaller cases (e.g., front panel ports for SATA HDDs and other
        I/O, large/slow/silent fans), though tend to be
        heavy. Unfortunately annoying and ugly LEDs are common these
        days, especially on full-towers. Proper internal 3.5-inch bays
        for HDDs are increasingly hard to find on computer cases, as
        of 2025, with online stores counting places for bolting HDDs
        onto the case's walls as "bays", but adding a filter for cases
        having a 5.25-inch external bay helps to find those with
        proper internal 3.5-inch bays in front.
      </dd>
      <dt>UPS</dt>
      <dd>
        APC by Schneider Electric is nice (except for its software, as
        usual for software shipped by hardware vendors, but it is
        usable without that software). An RBC7 battery lasts for about
        3 to 5 years (and it is recommended to change them every 3
        years), though it is a pain to recycle one properly. I hear
        Falcon Electric and Eaton are nice as well. But APC ones tend
        to make regular beeping noises, and may not be quite suitable
        for bedrooms. Also heavier ones are quite inconvenient to deal
        with: even if you rarely move them or their batteries, it
        happens sometimes, and it is nice to have something more
        manageable then. After my larger APC UPS started
        malfunctioning (after about 15 years of usage), I switched to
        more home-oriented, quieter, and lighter CyberPower (1300 VA,
        which is still overkill). This model (CP1300EPFCLCD) was
        handled by Debian 12 easily, without any tweaking, and
        estimated to keep my computer setup (85 W) running on battery
        (while it is new) for about 40 minutes.
      </dd>
      <dt>Keyboard</dt>
      <dd>
        The "Truly Ergonomic" keyboard has a relatively nice layout,
        though <a href="building-a-keyboard.html">custom keyboards</a> may suit one better (and are fun to
        build). Split keyboards seem nice too, but I haven't tried
        them yet.
      </dd>
      <dt>Mouse</dt>
      <dd>
        Gaming hardware tends to be unreliable, but mice advertised as
        gaming ones tend to be handy. Logitech mice seem to live
        longer than others (and particularly than those made by gaming
        companies, like Razer). They have gaudy LED lights, but those
        can be controlled with Piper (available from Debian
        repositories), at least on G102.
      </dd>
      <dt>Home router</dt>
      <dd>
        So far I had D-Link and ASUS routers that died, Linksys that
        lived until it got outdated, a TP-Link router (TL-WR841N/ND
        v8) that worked for a while and started hanging up after years
        of use, followed by another TP-Link router (Archer C7
        v5). Apparently <a href="https://www.eyecontrol.nl/blog/undocumented-user-account-in-zyxel-products.html">Zyxel shipped backdoored firmware</a>, so it may
        be better to avoid them. <a href="https://librecmc.org/">LibreCMC</a> and <a href="https://openwrt.org/">OpenWRT</a> maintain supported
        hardware lists, which are handy for choosing from. OpenWRT
        seems to be better at supporting router models long-term,
        while LibreCMC drops support sooner and supports far fewer
        models. And there are interesting router projects like <a href="https://www.turris.com/en/omnia/overview/">Turris
        Omnia</a> (open and quite overpowered, by CZ.NIC). <a href="https://openwrt.org/toh/openwrt/one">OpenWrt One</a>
        looks like a particularly nice option in 2025, though it only has
        a single LAN Ethernet port.
      </dd>
      <dt>Printer</dt>
      <dd>I don't have a printer, but apparently Brother makes nice
        and inexpensive black-and-white laser printers with working
        Linux drivers. Unlike HP ones, they come without chipped and
        locked-down ink cartridges: third-party ones can be used, and
        their own can be refilled. Though by 2025, Brother started locking down the ink
        cartridges as well, via forced firmware updates. And there are
        horror stories about HP printers.</dd>
      <dt>Computer speakers</dt>
      <dd>Heavy computer speakers are heavy to move around, loud ones
        malfunction loudly, small ones tend to make annoying
        noises. So now I prefer medium-sized ones, with a volume knob
        and an accessible on/off switch, maybe a headphone output, and
        a sensible volume range. Though there are many more aspects of
        both the speakers and the overall setup (including the room
        around them), and there is the whole "audiophile" group of
        people occasionally overdoing it in weird and silly ways.</dd>
      <dt>Microphone</dt>
      <dd>While not using a dedicated microphone, I've investigated
        those. Apparently (and as one may expect) decent microphones
        are standalone (not embedded into headsets, cameras, etc) and
        fully analog (that is, don't include sound cards and USB
        interfaces, but just focus on being microphones, usually with
        an XLR interface). Dynamic microphones are said to be more
        suitable for non-studio setups, and condenser/capacitor ones
        -- for studio setups. Condenser microphones require phantom
        power, so a suitable audio interface is required; for dynamic
        ones one may get away with just an XLR-to-TRRS cable (although
        a preamplifier is commonly recommended, so it may be better to
        get a basic audio interface anyway).  The popular options (for
        speech, basic and inexpensive ones) seem to be Shure SM58 for
        a dynamic microphone, Audio-Technica AT2020 and plenty of
        others for a condenser microphone, Focusrite Scarlett external
        audio interfaces.</dd>
      <dt>Power cords</dt>
      <dd>Apparently accidental unplugging is a fairly common issue,
        so IEC locks may be nice to have (even though the IEC 60320
        appliance coupling has no interlocking, unlike the industrial
        IEC 60309): locks on C13 work like finger traps, on C14 they
        work like tension sleeves, but perhaps they are better than
        nothing. APC also makes cords, but they come either with no
        locking at all, or with non-standard interlocking locks
        (requiring support on both ends). It also seems that contacts
        become loose with older female connectors, so occasionally
        replacing those may be useful. They all are supposed to handle
        10A, but one may also check <a href="https://www.sab-cable.com/cables-wires-harnessing-temperature-measurement/technical-data/cables-and-wires/instructions-for-the-safe-application-of-cables/boundary-conditions/calculate-wire-cross-section-current-carrying-capacity-table.html">current-carrying capacity tables</a>,
        as well as their claimed certification (some companies,
        including Cablexpert/Gembird, violate the standard and make
        C13-C14 cord versions for other maximum currents as
        well). Apparently APC cords are good and expensive, Cisco ones
        are similarly priced, Tripp Lite is inexpensive and seemingly
        okay, others (not counting weird audiophile ones) are
        inexpensive and their quality varies.</dd>
      <dd>Since C13 and C14 connectors can be rewirable, one can also
        acquire those and make cords of a desired length (and
        potentially be more picky about the connectors and wires
        themselves, paying more attention to plating, insulation,
        etc), but they can be fiddly, and it may be challenging to
        find good ones (just as with premade cords).</dd>
    </dl><p>
      Generally it is a good idea to look up the models on websites of
      vendors in order to get accurate and complete specifications,
      though it doesn't guarantee availability in local stores, and
      may take a few iterations. As of 2019, tech companies hadn't
      adopted structured/machine-readable data exchange/publishing, so
      hardware search/picking services tend to provide and use
      incomplete information. Though they still may be easier to get
      information from, since official websites tend to be infested
      with JS and marketing. I've considered composing a table with
      various vendors, indicating whether they cover hardware in LEDs,
      make websites unusable and drivers hard to download, etc, but
      it's basically as bad as it gets for every major vendor.
    </p><p>
      It might be worthwhile to pay attention to capacitors on
      motherboards and in PSUs, and possibly it is even more important
      to keep them relatively cool and dry in order to prolong their
      lifespan.
    </p><p>
      One can also get a small server rack and server hardware, which
      generally aims for reliability and is less prone to silly designs,
      but it may be more challenging to keep it quiet than a desktop
      computer, and there are likely to be minor annoyances: for
      instance, usually there's no analog audio I/O in server
      motherboards.
    </p><h2>Media/gaming/entertainment centre</h2><p>
      A basic setup can be quite similar to that of a workstation: a
      computer, a screen, speakers, some input devices. The major
      issues are content retrieval and manipulation (documented
      separately, in the <a href="home-entertainment-centre.html">Home entertainment centre</a> note), and awkward
      hardware (documented below).
    </p><h3>A computer</h3><p>
      It is much easier to begin with giving up on workstation
      priorities (such as ECC memory and not having gaudy LEDs), since
      there are plenty of compromises to be made even without
      those. At the end of 2019, I went for a build with Ryzen 7 3700X
      (because of a relatively low TDP, and a stock cooler; although
      later that turned out to be quite annoying, with its bright
      LEDs), ASUS TUF GAMING X570-PLUS (WI-FI), HX432C16PB3K2/32
      memory (which seemed a bit strange, with my workstation from
      2012 also having 32 GiB, though this memory is faster),
      GV-R57XTGAMING OC-8GD graphics card, Corsair HX750 PSU, a couple
      of NVMe SSDs, and just a voltage stabilizer instead of a UPS
      (which probably was a mistake: brief power cuts happen quite
      frequently here; or possibly it's just voltage going too far
      down sometimes, but either way it's not quite fixable and leads
      to computers losing power). Finally tried an NZXT case (H710);
      it's indeed quite nice, though heavy for a mid-tower.
    </p><h3>Input devices</h3><p>
      The Xbox One controller works easily with MS Windows 10 over
      Bluetooth (though the batteries only lasted for 40 hours of
      gaming, and one has to select "mice, keyboards, etc" when adding
      a device, despite MS Windows suggesting to pick a separate
      option for Xbox controllers) and over a USB cable
      (Micro-USB). For some reason (which I have no idea how to debug
      with a reasonable effort, and likely it would violate long and
      unreadable game licenses) games lag when it vibrates, but
      disabling vibration gets rid of the lags. Seems to work well on
      Linux as well.
    </p><p>
      Wireless input devices may be particularly convenient for a
      setup like that, but one should keep in mind that they tend to
      use proprietary protocols, which are almost always insecure
      (see, for instance, <a href="http://kth.diva-portal.org/smash/record.jsf?pid=diva2%3A1701492&amp;dswid=-5022">Penetration testing wireless keyboards</a> from
      2022, and <a href="https://news.ycombinator.com/item?id=33123406">HN comments</a>, though I think it was pretty much common
      knowledge before that).
    </p><p>
      M-Audio Keystation 88 MK3 is an inexpensive MIDI keyboard; I
      don't have other MIDI keyboards to compare it to, and only
      played a regular piano before, but it seems fine. Both Yoshimi
      and LMMS work easily with it, on both Windows and
      Linux. Synthesia mostly works with it on Android too (though
      apparently misses some events, especially key releases, and then
      almost hangs; no idea where the issue is). Z-shaped keyboard
      stands are sometimes recommended for their stability and
      independent height and width adjustments, which indeed seem nice
      (I went for an OnStage one, which seems nice -- but once again,
      I don't have much to compare it to). I've also acquired an
      M-Audio SP-2 pedal, with its switch either being broken before
      it arrived or breaking on the first attempt to use it (and given
      that it's pretty cheap, attempting to replace it looks like more
      trouble than it's worth); fortunately a MIDI pedal is just a
      basic on-off switch, so one can try to replace it with a
      paperclip or two, but that's rather junky.
    </p><h3>A screen</h3><p>
      OLED matrices seem to be used relatively commonly for
      media-oriented "TVs", but modern "TVs" are monitors with
      built-in computers, loaded with proprietary software, malware,
      and even advertisements (see also: <a href="https://news.ycombinator.com/item?id=21002745">HN thread discussing spyware
      on smart TVs</a>). Apparently there are similar screens marketed as
      "conference room" or "commercial" ones, and perhaps non-OLED can
      be fine too. With comparable specifications, regular screens
      seem to be quite a bit more expensive than TVs; possibly that's
      because TVs can feature frame interpolation and double frame
      rate in their specifications, and/or advertise resolutions with
      interlacing. Though it's commonly suggested that preinstalled
      spyware and adware lead to lower prices as well.
    </p><p>
      I went for a gaming LG screen (32GK850F-B, VA matrix) in 2019,
      which seems rather nice and not particularly expensive.
    </p><h3>Old cable television</h3><p>
      While OTT services may make more sense these days, one may want
      to preserve regular TV (such as DVB-C). There are receivers (aka
      "set-top box") that can output video over HDMI and sound
      separately (e.g., over RCA), as well as speakers with dual
      inputs (e.g., also RCA), and computer screens commonly support
      multiple inputs, so that both a DVB-C receiver and a computer can
      be connected to both a screen and speakers (so that TV can
      function independently of a computer). There are PCI and USB TV
      tuners too, but according to comments on the Internet their
      quality is very low (both hardware and software), so solving it
      with additional wires seems like a better option. See
      also: <a href="https://www.mythtv.org/">MythTV</a>, <a href="https://www.linuxtv.org/">LinuxTV</a>, <a href="https://www.linuxtv.org/wiki/index.php/DVB-C_devices">DVB-C devices in LinuxTV wiki</a>. See
      the <a href="home-entertainment-centre.html">home entertainment centre</a> notes for more on those.
    </p><h2>Builds</h2><p>
      I decided to put together approximate builds I would consider,
      so that I will have those at hand in case I need to
      replace a computer urgently, and just as a reference. I have not
      tried those though, so there may be compatibility
      issues. Historical ones (which I built) are explicitly
      marked. The approximate prices I refer to are taken mostly from
      Russian stores, where hardware is more expensive, but
      pcpartpicker links with similar builds are provided for
      reference.
    </p><ul>
      <li>
        2012, a workstation (built), $2000: Xeon E3 1275 v2, 32 GB (4
        * 8) of memory (Kingston, ECC, DDR3, unbuffered), ASUS P8C WS
        (ATX) motherboard, 1 old WD Green HDD (2 TB, died after 12
        years), 3 new WD Red HDDs (3 TB each), GeForce GTX 660 (ASUS;
        switched to integrated after Nvidia EOL'd it; their
        proprietary drivers were always a pain), an overkill
        Thermaltake PSU, Noctua NH-D14 CPU cooler, Thermaltake
        Overseer RX-I case.
      </li>
      <li>
        2019, a gaming computer (built), about $2000 including
        peripherals: Ryzen 7 3700X with a stock cooler, 32 GB of DDR4
        non-ECC memory (HX432C16PB3K2/32), ASUS TUF GAMING X570-PLUS
        (WI-FI) motherboard, Radeon RX 5700 XT (8 GB, GV-R57XTGAMING
        OC-8GD), Corsair HX750 PSU, two NVMe SSDs, NZXT H710 case
        (okay for SSDs, but perhaps too big, and would not work well
        for HDDs or optical disc drives).
      </li>
      <li>
        2024, budget computer, $600 (<a href="https://pcpartpicker.com/list/283WMV">pcpartpicker</a>): AMD Ryzen 5
        5600GT, 64 GB non-ECC memory, the B550 chipset (e.g., MSI PRO
        B550M-VC WIFI, microATX), Kingston NV2 1 TB M.2 SSD. Virtually
        any PSU (300 W should suffice), CPU cooler (TDP 65 W), and
        case.
      </li>
      <li>
        2024, cheap computer, $250: AMD Athlon 200GE, 8 or 16
        GB non-ECC memory, MSI A520M-A PRO (microATX), maybe a 500 GB
        SSD, possibly SATA (some motherboards may not support NVMe
        with this CPU), any PSU, CPU cooler, case.
      </li>
      <li>
        2024, modest workstation, $900 to $1800 (<a href="https://pcpartpicker.com/list/zhjqYN">pcpartpicker</a>): AMD
        Ryzen 5 9600X, 32 to 128 GB of ECC memory (e.g.,
        KSM48E40BD8KM-32HM), ASUS TUF GAMING B650M-E WIFI or ASUS
        PRIME B650M-A WIFI II (microATX, ECC support), Kingston NV2 1
        TB M.2 and optionally WD Red 4 TB. PSU, CPU cooler, and case
        do not matter much (CPU TDP 65/88 W, relatively little overall
        power consumption).
      </li>
      <li>
        2025, prebuilt computers: mini PC, $120 to $240: Intel N150,
        16 to 32 GB DDR4, 0.5 to 1 TB SSD, okay I/O including
        Wi-Fi. Apparently Lenovo ThinkPad are good and Linux-friendly
        options for laptops ($900 to $1000 for the low-end E series),
        IdeaPad are similar (but cheaper and without Ethernet ports,
        $600 to $900), and Lenovo ThinkCentre are small desktop
        computers (nettops, slim desktop) with similar qualities.
      </li>
    </ul></xhtml:div></content></entry>
  <entry><link rel="alternate" href="https://thunix.net/~defanor/notes/personal-data-storage.html"/><id>https://thunix.net/~defanor/notes/personal-data-storage.html</id><author><name>defanor</name></author><title>Personal data storage</title><summary>Storage and backup notes</summary><published>2021-03-23T12:00:00Z</published><updated>2026-05-01T17:00:00Z</updated><content type="xhtml"><xhtml:div xmlns="http://www.w3.org/1999/xhtml"><h1>Personal data storage</h1><p>
      These are my data storage notes, targeting primarily personal
      data backups: regular files (documents, photo and music
      collections, not databases), moderate volume, added or edited
      rarely, backups are managed manually.
    </p><p>
      Notes on <a href="public-data-backups.html">public data backups</a> are extracted into a separate
      document.
    </p><h2>General approach</h2><p>
      The "3-2-1 rule" for backups suggests keeping at least 3 copies
      of data, on at least 2 different storage devices, with at least
      one copy off-site.
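    </p><p>
      As a minimal sketch, the rule can be checked mechanically against
      a list of existing copies. The <code>Copy</code> type, its
      fields, and the <code>check_321</code> function are hypothetical
      names made up for this illustration, and "different storage
      devices" is simplified here to "different device labels".
    </p><pre>
```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Copy:
    """One copy of the data; both fields are illustrative."""
    device: str    # a label of the storage device holding this copy
    offsite: bool  # True if the copy is kept at another location


def check_321(copies):
    """Return a list of 3-2-1 rule violations (empty if satisfied)."""
    problems = []
    if not len(copies) >= 3:
        problems.append("fewer than 3 copies")
    if not len({c.device for c in copies}) >= 2:
        problems.append("fewer than 2 distinct devices")
    if not any(c.offsite for c in copies):
        problems.append("no off-site copy")
    return problems


inventory = [
    Copy("workstation HDD", offsite=False),
    Copy("external HDD", offsite=False),
    Copy("USB flash drive", offsite=True),
]
print(check_321(inventory))  # prints []
```
</pre><p>
      Such a check only covers the counts: it says nothing about
      whether the copies are readable, encrypted, or up to date.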
    </p><p>
      The exact requirements and methods to achieve those may depend
      on one's threat model: in addition to device failures, bit rot,
      and unauthorized access by scrapers, one may have to consider
      fire or flooding, burglaries and robberies, book burning
      campaigns and censorship with isolation, hardware seizures and
      imprisonment without ability to maintain the remaining backups
      for years, inability--or a limited ability--to acquire
      replacement storage devices, and even uncommon and hypothetical
      scenarios, such as a global high energy EMP.
    </p><p>
      Considering the information security "CIA" triad
      (confidentiality, integrity, availability), we need encryption,
      so that lost or decommissioned drives will not leak personal
      data (i.e., <a href="https://en.wikipedia.org/wiki/Crypto-shredding">crypto-shredding</a> can be employed); integrity
      checking, so that we will either read back the data that was
      written or detect <a href="https://en.wikipedia.org/wiki/Data_corruption">data corruption</a> (and preferably even repair
      it); varied and common technologies (hardware interfaces,
      drivers, filesystems, file formats), so that there will be a
      good chance that at least some of the backups can be accessed
      with reasonable effort in different situations in the future.
    </p><p>
      Most of the technologies covered here are usable for both
      backups and working storage. I prefer to use more general tools,
      since they tend to be better maintained, and learning them
      usually is a more useful time investment than learning
      specialized backup systems (but for those, see Bacula, Borg,
      restic, DAR), some of which are quite similar to actual file
      systems (e.g., Borg is), while apparently often lacking error
      correction codes and redundancy within a single repository, but
      those may still be suitable for the task. Fortunately in this
      case the variety is preferable, and one can combine those. See
      also: <a href="https://www.debian.org/doc/manuals/debian-reference/ch10.en.html#_backup_and_recovery">Debian Reference Manual - 10. Backup and
      recovery</a>, <a href="https://wiki.debian.org/BackupAndRecovery">BackupAndRecovery - Debian Wiki</a>, <a href="https://wiki.archlinux.org/title/Synchronization_and_backup_programs">Synchronization and
      backup programs - ArchWiki</a>.
    </p><p>
      As for portability, judging by experimentation in 2024, Android
      (as on Google Pixel phones) and Windows only support single
      (Ex)FAT partitions on USB drives, and probably only with MBR or
      without a partition table; no LUKS or filesystems such as Btrfs
      and ext4. So I have to give up on compatibility with those for
      my regular backups, though for data transfer, or when it is
      otherwise unavoidable, one can use VeraCrypt (open-source, but
      not always considered FLOSS; for Windows; its volumes can also
      be opened by cryptsetup, but creating them requires additional
      tools: e.g., VeraCrypt itself or zuluCrypt) and
      exFAT. The <code>/\:*?"&lt;&gt;|</code> characters must be avoided
      in file names to stay compatible with exFAT.
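    </p><p>
      A rough sketch of how one could look for such names before
      copying files to an exFAT drive: it checks only the characters
      listed above, and the function names are made up for this
      example.
    </p><pre>
```python
import os

# The characters listed above; "\x3c" and "\x3e" are the two angle
# brackets, written as escapes here.
EXFAT_FORBIDDEN = frozenset('/\\:*?"|' + "\x3c\x3e")


def forbidden_chars(name):
    """Return the forbidden characters found in a single file name
    (not a path), sorted and without duplicates."""
    return sorted(set(name).intersection(EXFAT_FORBIDDEN))


def find_incompatible(root):
    """Walk a directory tree and yield (path, characters) for every
    file or directory name containing a forbidden character."""
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            chars = forbidden_chars(name)
            if chars:
                yield os.path.join(dirpath, name), chars


print(forbidden_chars("notes: draft?.txt"))  # prints [':', '?']
```
</pre><p>
      Running <code>find_incompatible</code> on a directory before
      copying it lists the names that would need renaming first.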
    </p><h2>Hardware</h2><p>
      Reliable <a href="computer-hardware.html">computer hardware</a> is desirable to minimize errors and
      hardware failures: a UPS, ECC memory, and quality hardware
      (including storage) in general.
    </p><p>
      External HDDs (or combinations of internal ones and external
      boxes) are inexpensive and handy for local backups, allowing one to
      keep them safely disconnected most of the time, and to easily
      plug into virtually any computer when needed.
    </p><p>
      USB flash drives seem more suitable for off-site backups, being
      more robust for physical transfer. Flash memory is not suited
      for long-term storage without power though, so it is suggested
      to have them powered up at least for a few hours per year,
      letting the controllers do maintenance, or even do data
      scrubbing (via a filesystem, if it supports that, or simply by
      forcing reading of all the files, possibly by verifying
      checksums) to nudge the rewrites. Writing onto cheap Kingston
      USB thumb drives (e.g., 256 GB DT Exodia) can be very slow,
      especially once about 2/3 of space is used and with ext4 on top
      of LUKS: writing at about 200 KB/s (less than 1 GB per
      hour). Even if you are not in a hurry, it makes one wonder
      whether the device malfunctions, so perhaps it is better to not
      neglect the write speed completely, even for backup storage
      devices. I saw Apacer USB flash drives of the same capacity,
      which are even cheaper, having sustained write speeds of about
      10 MB/s, at least with exFAT.
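    </p><p>
      The "forcing reading of all the files, possibly by verifying
      checksums" approach mentioned above can be sketched as follows.
      This is only an illustration: the function names and the use of
      SHA-256 are my assumptions, and it detects corruption without
      repairing it.
    </p><pre>
```python
import hashlib
import os


def file_sha256(path, chunk_size=1024 * 1024):
    """Hash a file in chunks, so that large files do not fill memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root):
    """Map each file's path (relative to root) to its SHA-256 digest."""
    manifest = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            manifest[os.path.relpath(path, root)] = file_sha256(path)
    return manifest


def verify(root, manifest):
    """Re-read every file listed in the manifest and return the sorted
    relative paths that are missing or whose digest has changed."""
    bad = []
    for rel, expected in manifest.items():
        path = os.path.join(root, rel)
        if not os.path.isfile(path) or file_sha256(path) != expected:
            bad.append(rel)
    return sorted(bad)
```
</pre><p>
      One would call <code>build_manifest</code> once after writing a
      backup, store the result somewhere, and call <code>verify</code>
      during the yearly power-up. Reads done right after writing may be
      served from the page cache instead of the device, so verification
      makes most sense after re-plugging the drive.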
    </p><p>
      Having an erratic USB port, bus, or wires (built into a
      Thermaltake chassis) that occasionally disconnects devices
      during active writing, I had a Transcend JetFlash (64 GB) thumb
      drive apparently dying (hanging on any writing attempt, "Device
      not responding to setup address.") after such a disconnect,
      while Kingston ones survived a few of those. As a side note,
      this seems more hazardous than non-ECC memory.
    </p><p>
      Optical drives (CD, DVD, Blu-ray) are commonly suggested for
      archival, though they seem less convenient for updates and for
      usage in general, and it is not quite clear whether the
      recordable ("burned" with a laser and a dye, as opposed to being
      stamped at a factory) CDs and DVDs are that long-lasting, but
      apparently they are still quite durable (see <a href="https://blog.dshr.org/2024/08/2024-optical-media-durability-update.html">2024 Optical Media
      Durability Update</a>). And some are aimed at archival storage
      explicitly (e.g., <a href="https://en.wikipedia.org/wiki/M-DISC">M-DISC</a>, mostly with BD).
    </p><p>
      Paper backups may be useful as well, and quite reliable,
      particularly for texts and images. <a href="https://en.wikipedia.org/wiki/Acid-free_paper">Acid-free paper</a> should be
      used for those, and one may play with bookbinding then. Some use
      QR codes and other two-dimensional barcodes to store arbitrary
      digital data on paper. As for hardware, one would need a printer
      and a scanner for those, though I should investigate that
      better. To combine human-readability with relative
      machine-readability, special fonts like <a href="https://en.wikipedia.org/wiki/OCR-A">OCR-A</a> and <a href="https://en.wikipedia.org/wiki/OCR-B">OCR-B</a> can be
      useful, possibly combined with error correction codes.
    </p><p>
      One may also consider keeping backup storage devices and related
      items in a specialized storage shelf, a Faraday cage, or a
      fire-resistant and/or waterproof safe.
    </p><p>
      To go further than that, including storage of physical items,
      one may also look into general archival- and collection-related
      materials, such as the <a href="https://psap.library.illinois.edu/">Preservation Self-Assessment Program</a>.
    </p><h2>Backup operating system</h2><p>
      I find it useful (for the peace of mind, at least) to set up a
      bootable operating system on at least one of the backup drives,
      with all the necessary software to read the backups. So there
      usually is an EFI system partition (ESP), an unencrypted partition
      for <code>/boot</code> (GRUB2 can handle encrypted ones, but it
      would not make much difference), an encrypted partition for the
      rest of the system (to prevent possible data leaks via cache,
      for instance, after backups are accessed from it), and a
      separate encrypted partition for the backup itself.
    </p><p>
      When installing a system using an installer, on a machine with more than
      one disk and some existing systems present, the installer would often use
      a seemingly random ESP on one of the internal disks, instead of the one on
      the backup drive. Fixing it may involve booting via the GRUB shell after
      GRUB fails to find or access its config from the
      <code>/boot</code> partition, remounting (and fixing in
      <code>/etc/fstab</code>) <code>/boot/efi/</code>, to point to the correct
      drive's ESP, and then running <code>grub-install</code> to install it
      there. Also removing undesirable directories from ESP manually, and
      adjusting things with <code>efibootmgr</code>. Or one can opt for a more
      involved/manual installation, setting it properly at once: see, for
      instance, "<a href="https://www.debian.org/releases/stable/amd64/apds03.en.html">Installing Debian GNU/Linux from a Unix/Linux System</a>" and
      "<a href="https://cryptsetup-team.pages.debian.net/cryptsetup/encrypted-boot.html">Full
      disk encryption, including /boot: Unlocking LUKS devices from GRUB</a>".
    </p><p>
      Alternatively, or additionally, one may set a personalized live
      system image, as described in the <a href="https://live-team.pages.debian.net/live-manual/html/live-manual/index.en.html">Debian Live Manual</a> and similar
      documents for other systems.
    </p><h2>Storage setups</h2><p>
      I do partitioning with <code>fdisk</code>, mostly because other
      common tools (or at least their fancy user interfaces) tend to
      be buggy, and/or to hide technical information, neither of which
      is desirable when partitioning storage
      devices. <code>fdisk</code> is nice, commonly available, and
      works well. With the setups described below, it works to set
      LUKS or an encrypted filesystem directly on a block device,
      without any partitioning, but it may also be desirable to store
      some <a href="public-data-backups.html">public data backups</a> on a separate partition of the same
      storage device, unencrypted.
    </p><p>
      RAID 1 (or possibly 5, 6) is nice to set if there are spare
      disks, but usually not as critical for redundant personal
      backups as it is, for instance, for a production server.
    </p><p>
      As of 2021 and for Linux-based systems, some of the common
      software options are:
    </p><ul>
      <li>
        <a href="https://en.wikipedia.org/wiki/Linux_Unified_Key_Setup">LUKS</a> and friends: <a href="https://en.wikipedia.org/wiki/Logical_Volume_Manager_(Linux)">LVM</a> or mdadm (software RAID),
        cryptsetup/dm-crypt (encryption), integritysetup/dm-integrity
        (integrity)
      </li>
      <li>
        ZFS (software RAID, encryption, integrity, added redundancy)
      </li>
      <li>
        Btrfs (software RAID, integrity, added redundancy)
      </li>
      <li>
        Regular checksums, such as <code>sha256sum</code> (integrity)
      </li>
    </ul><p>
      Those can be combined, even the ones serving the same purpose:
      for instance, storing file checksums would not harm even if the
      underlying filesystem supports those already. Likewise, it
      should not harm to encrypt the more important files
      (cryptographic keys, passwords), even while storing those on
      encrypted disks.
    </p><p>
      Below are notes and command cheatsheets for the setups I use.
    </p><h3>LUKS and ext4</h3><p>
      This is probably the most basic and widely supported setup for
      Linux-based systems. Only authenticated integrity checks are
      supported by cryptsetup (and those are experimental), so no CRC
      and no recovery from minor errors without RAID. Perhaps
      dm-integrity can be set separately to use CRC32C, but that would
      complicate the setup. Or it can be skipped altogether, since
      integrity checking is experimental, and wiping can slow down the
      process considerably (while skipping the wiping easily leads to
      errors).
    </p><p>
      Initial setup:
    </p><pre># Optionally, add: --type luks2 --integrity hmac-sha256
cryptsetup luksFormat /dev/sdXY
cryptsetup open /dev/sdXY backup2
mkfs.ext4 /dev/mapper/backup2
cryptsetup close backup2
mkdir /var/lib/backup2</pre><p>
      A typical session (CLI-based, though this is also handled by
      graphical file managers, such as Thunar):
    </p><pre>cryptsetup open /dev/sdXY backup2
mount -t ext4 /dev/mapper/backup2 /var/lib/backup2/
# synchronize backups
umount /var/lib/backup2/
cryptsetup close backup2</pre><p>
      When done, in order to safely eject a device, run <code>eject
      /dev/sdX</code>, or possibly <code>udisksctl power-off -b
      /dev/sdX</code>.
    </p><p>
      To change a passphrase, <code>cryptsetup luksChangeKey
      /dev/sdXY</code>.
    </p><p>
      For RAID with mdadm, see "<a href="https://gist.github.com/MawKKe/caa2bbf7edcc072129d73b61ae7815fb">dm-crypt + dm-integrity + dm-raid = awesome!</a>".
    </p><h3>ZFS</h3><p>
      ZFS is not modular like LUKS and friends, there are license
      compatibility issues, and it is rather unusual overall, but
      apparently a good filesystem containing all the features needed
      here. Be warned that installing it on Debian involves building a
      kernel module, which takes notable time on updates, and heats up
      the CPU (leading to laptop fans spinning loudly), discharging
      the battery at once, so it may be a good idea to have one
      dedicated machine to deal with it, but avoid it on others.
    </p><p>
      Initial setup:
    </p><pre># Ensure that linux headers are installed, needed for zfs-dkms
apt install linux-headers-amd64
# Install zfsutils-linux (from "contrib" repositories)
apt install zfsutils-linux
# Find a partition ID
ls -l /dev/disk/by-id/ | grep sda4
# Use that ID to create a single-device pool. The "mirror" keyword
# should be added to set RAID 1.
zpool create tank usb-WD_Elements_...-part4
# Create an encrypted file system.
mkdir /var/lib/backup/
# For redundancy within a dataset, add to the command below: -o copies=2
zfs create -o encryption=on -o keyformat=passphrase -o mountpoint=/var/lib/backup tank/backup</pre><p>
      ZFS comes with its own mounting and unmounting commands, and if
      it is to be used from different systems, the pools should be
      exported and imported (or just force-imported). A typical
      session, assuming that it is used from different systems:
    </p><pre># List pools available for import
zpool import
# Import the pool
zpool import tank
# Mount an encrypted file system
zfs mount -l tank/backup
# (Synchronize backups here)
# Unmount the file system (or it will happen on export)
zfs unmount tank/backup
# Unmount the pool (also unnecessary to do manually though)
zfs unmount tank
# Export the pool
zpool export tank
# And eject or udisksctl power-off -b, as mentioned above</pre><p>
      To change a passphrase, <code>zfs change-key tank/backup</code>.
    </p><h3>LUKS with Btrfs</h3><p>
      This one is set with the DUP profile for both metadata and data,
      adding redundancy, and with sha256 checksums (instead of the
      default crc32c), to reduce chances of collisions.
    </p><p>
      Initial setup:
    </p><pre># LUKS, as with ext4
cryptsetup luksFormat /dev/sdXY
cryptsetup open /dev/sdXY backup
# The file system
mkfs.btrfs --csum sha256 -m dup -d dup -L backup /dev/mapper/backup
cryptsetup close backup
mkdir /mnt/backup</pre><p>A session:</p><pre>cryptsetup open /dev/sdXY backup
mount -t btrfs /dev/mapper/backup /mnt/backup/
# synchronize backups here
umount /mnt/backup/
cryptsetup close backup
# eject, or possibly power off with udisksctl
eject /dev/sdX
udisksctl power-off -b /dev/sdX</pre><h2>Bit rot</h2><p>
      As mentioned above, it is important to be able to detect errors
      with some integrity checks, but one may also aim for single-device
      redundancy for a recovery using that single device (and a better
      overall chance of successful data recovery), as well as
      calculate checksums on top of a filesystem (e.g., for ext4,
      which does not support those on its own).
    </p><p>
      For integrity checking with basic checksums, one can
      use <code>find</code> and <code>sha256sum</code> or similar
      tools:
    </p><pre># Store checksums
mkdir checksums
find . -type f ! -path './checksums*' -exec sha256sum {} \; \
  &gt; checksums/sha256
# Check them
sha256sum --quiet --check checksums/sha256
# Add new ones
find . -type f -newer checksums/sha256 ! -path './checksums*' \
  -exec sha256sum {} \; &gt;&gt; checksums/sha256</pre><p>Alternatively:</p><pre># Store (new) checksums
mkdir -p checksums
find . -type f ! -path './checksums*' -exec sha256sum {} \; \
  &gt; checksums/$(date -I).sha256
# Compare them to old ones
diff -U 0 &lt;(sort checksums/2026-01-12.sha256) \
  &lt;(sort checksums/2026-01-23.sha256) | less</pre><p>
      For redundant error correction codes (forward error correction,
      FEC), with ability to repair, one may employ <code>par2</code>,
      <code>dvdisaster</code> (aimed at optical discs),
      <code>zfec</code> (a library with Python, C, Haskell APIs),
      libfec (a C library), GNU Radio FEC API, though those may be
      quite inefficient to use for collections of files that are
      updated. There are projects like blockyarchive (blkar), but just
      like specialized backup systems, they tend to require specialized
      tools to access the files backed up with them at all. A software
      RAID (1, 5, or 6) set on different partitions of the same device
      is a more time-efficient way to achieve some redundancy within a
      storage device, though less space-efficient, and protecting
      against different bit rot patterns. ZFS's "copies" parameter and
      Btrfs's DUP profile (for both data and metadata) do something
      similar, storing multiple copies of blocks within a dataset.
    </p><h2>Other useful tools</h2><p>
      <a href="https://en.wikipedia.org/wiki/S.M.A.R.T.">S.M.A.R.T.</a> monitoring and testing can be done with
      smartmontools, and is usually supported even by external and older
      USB drives.
    </p><p>
      I normally use just <code>rsync --archive</code> for the initial
      backup, then <code>rsync --exclude='lost+found' --archive
      --verbose --checksum --dry-run --delete</code> to compare
      backups and for data scrubbing, and
      without <code>--dry-run</code> afterwards, if everything looks
      fine. Using <code>-rt</code> or <code>-r</code> instead
      of <code>-a</code> may be preferable sometimes though, if file
      permissions and ownership data are not to be preserved.
    </p><p>
      For <a href="https://en.wikipedia.org/wiki/Data_erasure">data erasure</a>, <code>dd</code> is handy for wiping both disks
      and partitions (before decommissioning drives, or if there were
      unencrypted partitions before), e.g.:
    </p><pre>dd status=progress if=/dev/urandom of=/dev/sdX bs=1M
dd status=progress if=/dev/urandom of=/dev/sdXY bs=1M</pre><p>
      GnuPG is there for individual file encryption, as well as for
      signing. In some cases it may be useful together with tar and
      gzip.
    </p><p>
      For more compact music backups, one may wish to back up just the
      files referenced from a playlist, and not the whole archive. An
      example command for counting the total size of files involved in
      a playlist:
    </p><pre>xmllint --xpath '//*[local-name()="location"]/text()' music.xspf |
  sed -E 's/&amp;amp;/\&amp;/g' |
  tr '\n' '\0' |
  du -s --files0-from=- |
  awk '{ sum += $1 } END { print sum }'</pre><p>
      While rsync has the <code>--files-from</code> option, to work
      with a given list of files only:
    </p><pre>xmllint --xpath '//*[local-name()="location"]/text()' music.xspf |
  sed -E 's/&amp;amp;/\&amp;/g' |
  rsync --dry-run -avz --files-from=- . ~/mnt/</pre><h2>Remote backups</h2><p>
      When backing up private data to a remote (and usually less
      trusted) machine, it should be encrypted and verified
      client-side (so options like plain rsync over SSH are not
      suitable), but preferably still allowing for incremental backups
      (so tar and gpg are not suitable in general, either). One can
      still employ LUKS or ZFS though, by accessing remote block
      devices via iSCSI (in particular, <code>tgt</code>
      and <code>open-iscsi</code> seem to work smoothly on Debian),
      NBD, or similar protocols, possibly on top of IPsec or WireGuard
      (though as of 2024, those are blocked in Russia between local
      and foreign machines), tunnels made with SSH port forwarding,
      TLS (e.g., with stunnel), or anything else establishing a secure
      channel, to add encryption and a more secure authentication.
    </p><p>
      A test iSCSI setup example:
    </p><pre># server (192.168.1.2)
apt install tgt
dd if=/dev/zero of=/tmp/iscsi.disk bs=1M count=128
tgtadm --lld iscsi --op new --mode target --tid 1 --targetname iqn:2024-07:com.example:tmp-iscsi.disk
tgtadm --lld iscsi --op show --mode target
tgtadm --lld iscsi --op new --mode logicalunit --tid 1 --lun 1 -b /tmp/iscsi.disk
tgtadm --lld iscsi --op new --mode account --user foo --password bar
tgtadm --lld iscsi --op show --mode account
tgtadm --lld iscsi --op bind --mode target --tid 1 --initiator-address 192.168.1.3 --initiator-name foo
tgtadm --lld iscsi --op unbind --mode target --tid 1 --initiator-address 192.168.1.3 --initiator-name foo
tgtadm --lld iscsi --op bind --mode target --tid 1 --initiator-address 192.168.1.3

# client (192.168.1.3)
apt install open-iscsi lsscsi
iscsiadm --mode discovery --type sendtargets --portal 192.168.1.2
iscsiadm  --mode node  --targetname iqn:2024-07:com.example:tmp-iscsi.disk --portal 192.168.1.2 --login
iscsiadm --mode session --print=1
lsscsi
# a block device is available at this point
iscsiadm  --mode node  --targetname iqn:2024-07:com.example:tmp-iscsi.disk --portal 192.168.1.2 --logout</pre><p>
      Apart from own (or rented) remote machines, such a setup can be
      used with "backup buddies", exchanging some of your local
      storage space for someone else's. Sneakernet-based backup
      buddies (that is, occasionally exchanging storage devices) is a
      fine and easier option for remote backup storage.
    </p><p>
      A popular option for remote backups is online services (aka "the
      cloud" and a few other names), with many people relying on those
      even in place of local backups, or any local storage (as with
      music and video streaming, hosted photo albums, password
      managers, book collections, general document storage),
      delegating all those worries to somebody else. It seems
      convenient, but decreases direct control over the data,
      introduces dependencies on the service providers' continued
      existence and continued acceptable terms of service, on network
      connectivity to them, on ability to transfer payments. In
      my--possibly unrepresentative--experience, all those are
      unreliable, but it may still work as a redundant backup copy for
      some, particularly in predictable democratic countries, with a
      reputable service provider. Throw in the rule of law and
      sensible laws (or some kind of a hypothetical anarchist or
      communist utopia), and one may worry less about keeping some
      information private, as well as about aiming for long-term isolated
      backups of public information.
    </p><h3>Data sharing</h3><p>
      For less private data (perhaps for almost everything but
      cryptographic keys and passwords -- that is, explicit secrets),
      a good way to preserve it is by sharing with others: for
      instance, pictures from an event or gathering are commonly
      shared among all the participants, while creative works
      (particularly books and music) can be shared among people with
      similar interests or tastes. Everything work-related can be
      backed up on work machines. While the data that is not private
      at all, like this very note, or other own creative works under
      permissive licenses, is generally useful to publish, sharing
      even more widely.
    </p><h3>Adverse services</h3><p>
      One may consider use of relatively adverse services for both
      storage and transfer, such as censored and monitored
      ones. Usually they are best to avoid, but they may still be
      useful for redundancy, or when there are no other working
      options.
    </p><p>
      For file storage or sharing services, storage on block devices,
      as described above, can be handled with a file-backed loop
      device. GnuPG and other file- or stream-oriented methods would
      also work, but since encrypted data may attract unwanted
      attention, it is better at least to not advertise the encryption
      with headers. An easy option to do that (using a passphrase
      without additional files, the widely available openssl CLI tool)
      with a single file or stream is <code>openssl enc -aes-256-ctr
      -nosalt -pbkdf2</code>, but PBKDF2 is relatively weak (argon2id
      is recommended), and this skips a salt entirely, so the
      passphrase must be high-entropy then. Alternatively, one may
      consider using and storing the salt, and employing argon2
      manually, e.g.:
    </p><pre>sudo apt install openssl argon2
SALT=$(openssl rand -hex 16) # 128 bits
# https://www.rfc-editor.org/rfc/rfc9106#name-parameter-choice
# The first recommended option
argon2 $SALT -id -t 1 -p 4 -m 21 -l 32
# The second recommended option
argon2 $SALT -id -t 3 -p 4 -m 16 -l 32
# Use -e or -r options for scripting</pre><p>
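      As for the simpler <code>openssl enc</code> invocation mentioned before
      the argon2 commands, below is a minimal round-trip sketch. The directory
      and file names are made up for illustration, and the passphrase is
      generated randomly rather than chosen by hand, since no salt is used.
      Note that without a salt the same passphrase always yields the same key
      and IV, so in CTR mode it should probably not be reused for more than one
      file or stream.
    </p><pre>
```shell
# Hypothetical working directory and file names, for illustration only.
mkdir -p enc-demo
printf 'some private notes\n' > enc-demo/notes.txt

# 128 bits of randomness, hex-encoded; passed via the environment
# rather than on the command line.
BACKUP_PASS=$(openssl rand -hex 16)
export BACKUP_PASS

# Encrypt: without a salt, no "Salted__" header is written.
openssl enc -aes-256-ctr -nosalt -pbkdf2 -pass env:BACKUP_PASS \
  -in enc-demo/notes.txt -out enc-demo/notes.bin

# Decrypt with the same parameters.
openssl enc -d -aes-256-ctr -nosalt -pbkdf2 -pass env:BACKUP_PASS \
  -in enc-demo/notes.bin -out enc-demo/notes.restored.txt

# CTR mode does not pad, so the sizes should match,
# and the restored file should be identical to the original.
wc -c enc-demo/notes.txt enc-demo/notes.bin
cmp enc-demo/notes.txt enc-demo/notes.restored.txt
```
</pre><p>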
      Another option is cryptsetup (dm-crypt) without LUKS: in its
      plain mode, on a loop device, with some basic non-journaling
      file system (such as ext2, or ext4 with journal disabled) on
      it. The passphrase must also be high-entropy, since it does not
      use a KDF: ideally matching the 128-bit key size, which would
      take 8 random words picked out of a dictionary of 65536
      words. Or a separately derived (or generated, simply random) key
      must be supplied, perhaps with the <code>--key-file</code>
      option. An example:
    </p><pre>dd if=/dev/urandom of=test.img bs=1M count=128
sudo losetup --find --show test.img
sudo cryptsetup open --type plain /dev/loop0 test
# No journal, no reservation
sudo mke2fs -t ext4 -O ^has_journal -m 0 /dev/mapper/test
mkdir test
sudo mount /dev/mapper/test ./test
# Add the files here
sudo umount ./test
sudo cryptsetup close test
sudo losetup -d /dev/loop0</pre><p>
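      The passphrase length estimate given before that example (8 words out of
      65536) can be checked, or redone for another dictionary size, with a bit
      of shell arithmetic. This is a rough sketch: it assumes uniformly random
      word choice, rounds the bits per word down (so it may ask for one word
      more than strictly needed), and <code>target_bits</code>
      and <code>dict_size</code> are just illustrative names.
    </p><pre>
```shell
target_bits=128
dict_size=65536

# Bits of entropy per uniformly random word: floor(log2(dict_size)).
bits_per_word=0
n=$dict_size
while [ "$n" -gt 1 ]; do
  n=$((n / 2))
  bits_per_word=$((bits_per_word + 1))
done

# Round up to the number of whole words that reaches the target.
words=$(( (target_bits + bits_per_word - 1) / bits_per_word ))
echo "$bits_per_word bits per word, $words words for $target_bits bits"
```
</pre><p>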
      To make use of audio channels or audio data storage services
      (including the ones that re-encode audio files), a
      straightforward way is to use modem software, such
      as <code>minimodem</code>. That can be combined with forward
      error correction for reliability, and mixed with another audio
      stream to stay more covert. One may also try setting a system in
      the style of numbers stations, using TTS (text-to-speech,
      festival or espeak) and STT (speech-to-text, CMU Sphinx or
      more advanced ones).
    </p><p>
      For data encoding as text, one can use plain base64, some
      words-based encoding, maybe Markov chains (which would require
      custom tools and data though).
    </p><p>
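      The plain base64 option mentioned above can be sketched as follows,
      assuming GNU coreutils (other implementations may spell the decoding
      flag differently); the directory and file names are hypothetical. The
      encoded text comes out about a third larger than the input.
    </p><pre>
```shell
# Hypothetical working directory and file names, for illustration only.
mkdir -p base64-demo
head -c 1024 /dev/urandom > base64-demo/data.bin

# Encode into ASCII text, wrapped at 76 columns (the coreutils default).
base64 base64-demo/data.bin > base64-demo/data.txt

# Decode, then compare checksums of the original and the restored copy.
base64 -d base64-demo/data.txt > base64-demo/restored.bin
sha256sum base64-demo/data.bin base64-demo/restored.bin
```
</pre><p>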
      For video, a similarly ad hoc and basic approach is to encode a
      sequence of QR codes (or other matrix barcodes). But as for
      other options, one may also consider more involved steganography.
    </p><p>
      HMAC can be useful for authenticated integrity checks with such
      services. Common CLI tools for that include
      <code>hmac256(1)</code> from the <code>libgcrypt20-dev</code>
      Debian package
      and <code>openssl-dgst(1ssl)</code>: <code>openssl dgst -sha256
      -hmac &lt;key&gt; [file ...]</code>.
    </p><p>
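      A minimal sketch of such a check with the <code>openssl dgst</code>
      command above: the tag is computed and stored locally before uploading,
      then recomputed and compared after downloading. The file names and
      the <code>hmac_status</code> variable are hypothetical, and the key is
      passed on the command line here, which exposes it to other local users
      via the process list, so this is only suitable for a single-user
      machine.
    </p><pre>
```shell
# Hypothetical working directory and file names, for illustration only.
mkdir -p hmac-demo
printf 'backup contents\n' > hmac-demo/archive.bin

# A random 256-bit key, hex-encoded; to be kept locally, not on the service.
key=$(openssl rand -hex 32)

# Compute the tag before uploading, and store it locally.
openssl dgst -sha256 -hmac "$key" hmac-demo/archive.bin > hmac-demo/archive.hmac

# After downloading, recompute the tag and compare it with the stored one.
if openssl dgst -sha256 -hmac "$key" hmac-demo/archive.bin |
    cmp -s - hmac-demo/archive.hmac
then
  hmac_status=ok
else
  hmac_status=mismatch
fi
echo "$hmac_status"
```
</pre><p>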
      I would not recommend relying on online services generally, but
      using them for added redundancy, particularly for public data,
      may be okay, and potentially fun to play with. If arbitrary and
      random-looking data storage is not explicitly allowed by a
      service, it may lead to account suspension, including other
      services on that account (though it occasionally happens with
      online services even without such a trigger).
    </p></xhtml:div></content></entry>
  <entry><link rel="alternate" href="https://thunix.net/~defanor/notes/network-abuse.html"/><id>https://thunix.net/~defanor/notes/network-abuse.html</id><author><name>defanor</name></author><title>Network abuse</title><summary>A log of dealing with network abuse</summary><published>2022-09-07T09:00:00Z</published><updated>2026-04-28T09:00:00Z</updated><content type="xhtml"><xhtml:div xmlns="http://www.w3.org/1999/xhtml"><h1>Network abuse</h1><p>
      Here is my log of spotted and reported network abuse incidents.
      It started as private notes aiming to keep track of those being
      fixed, and to block the hosts if they keep spamming. I decided
      to make it public, since there is no private information in it
      (though I'm omitting the bits I may discover that aren't public,
      such as server administrator email addresses), and it may be of
      interest for people trying to decide whether reporting is
      worthwhile.
    </p><h2>Spam messages</h2><p>
      Below are incidents with spam messages that got through the
      usual filters: dates, hosts, the abuse contact and other report
      information, other notes.
    </p><h3>XMPP</h3><ul>
      <li>2021-09-12, 188.243.192.232, abuse@sknt.ru: no response and
        spam kept coming, submitted <a href="https://github.com/JabberSPAM/blacklist/pull/26">a JabberSPAM blacklist PR</a>.</li>
      <li>2021-09-12, 138.201.50.174, stian@barmen.nu: replied that he
        will investigate. Probing from ether@jabber.no.</li>
      <li>2021-09-12, 54.36.115.48, info@xmpp.gg and abuse@ovh.net: no
        reply from either and the spam kept coming, submitted <a href="https://github.com/JabberSPAM/blacklist/pull/27">a
          blacklist PR</a>. Probing from ink@jabber.gg.</li>
      <li>2022-08-25, 138.201.25.9, abuse@hetzner.com followed
        by <a href="https://abuse.hetzner.com/">Hetzner abuse reporting form</a>. Subscription requests and
        OMEMO-encrypted messages, similar ones from multiple services
        and JIDs, with occasional plaintext being just silly. This one
        is from klassic@isgeek.info. Those kept coming for at least a
        month.</li>
      <li>2022-08-25, 185.146.232.56, vesselwave@protonmail.com: they
        deleted the user and started looking more closely for
        spammers. From klassic@satisprivacy.org.</li>
      <li>2022-08-25, 95.168.217.72, support@jabbim.zendesk.com and
        abuse@superhosting.cz (since the first one had no
        effect). From multiks@jabbim.sk.</li>
      <li>2022-09-06, 170.187.181.190, abuse@linode.com. From
        multiks@rows.im.</li>
      <li>2022-09-10, 86.250.242.174. Did not notice at first, and
        then it ceased. Probing (presence subscription requests) from
        multiks@im.azurs.fr.</li>
      <li>2022-10-01, 89.147.108.127, info@outerrealm.net on
        2022-10-06, within 30 minutes received a reply saying that it
        will be looked into, and apparently it was solved. From
        ehf@msg.outerrealm.net: subscription requests at first, an odd
        message saying "Request Subscription" (followed by
        opportunistic OTR's whitespaces, similarly to some of the past
        spammy/probing messages) on 2022-10-06.</li>
      <li>2022-10-18, 78.72.102.36. Have not reported, but then it
        disappeared; possibly somebody else did. From swe@qwik.space,
        a subscription request.</li>
      <li>2022-10-18, 78.72.102.36. Same as above: haven't reported,
        but then it disappeared. From basik@qwik.space, a subscription
        request.</li>
      <li>2022-11-01, 138.201.50.174, stian@barmen.nu. From
        floki@jabber.no: "Hi there, free for chat?". Then a
        subscription request from the same JID arrived on
        2023-01-03.</li>
      <li>2022-11-16, 138.201.25.9, the Hetzner reporting form (since
        I have not found administrator contact information). Received an
        acknowledgement on 2023-01-11, a reply from the XMPP server
        administrator on 2023-01-13 saying that it doesn't look like
        spam; described the issue in more detail, another reply saying
        that it sounds like "complete nonsense" and suggesting to use
        iptables. <a href="https://logs.xmpp.org/operators/2023-01-13#2023-01-13-6730a27d988f0e8d">Asked on operators@muc.xmpp.org to ensure that my
        approach is sensible</a>, and replied to abuse@hetzner.com, asking
        about their policy on XMPP spam; no reply, as of
        2023-05-05. Unexpected presence subscription request and no
        message (likely probing) from basik@isgeek.info.</li>
      <li>2022-12-13, 138.201.50.174, stian@barmen.nu. Then again on
        2023-03-08 (after an additional message from the same XMPP
        address). From prtship@jabber.no/_, a presence subscription
        request, and a "Hi, Free for chat?" message 3 months
        later.</li>
      <li>2023-01-18, 167.179.180.180, abuse@octothorn.com (on
        2023-01-19). Received a reply on 2023-02-15, mentioning that
        the user is being kicked off, and the account had more than
        1000 contacts in the roster, most of which were pending a
        subscription approval. From aus@jabber.octothorn.com/_, a
        presence subscription request. The last one arrived on
        2023-01-31.</li>
    </ul><h3>Email</h3><ul>
      <li>2021-02-09, 103.66.105.237, noc@cmjainimpex.in.</li>
      <li>2021-03-31, 205.201.133.233, abuse@mailchimp.com.</li>
      <li>2021-06-24, 2a00:1450:4864:20::641, <a href="https://support.google.com/mail/contact/abuse">Gmail abuse reporting
          form</a>. Apparently reporting didn't work, nothing happened
          on "submit".</li>
      <li>2021-06-25, 91.223.3.194, admin@skynode.pl.</li>
      <li>2022-04-25, 146.19.173.107, abuse@ipconnect.services.</li>
      <li>2022-04-28, 5.181.80.128, noc@4vendeta.com.</li>
      <li>2022-05-29, 200.93.248.119, rolfex@powerfast.net.</li>
      <li>2022-05-30, 193.218.204.206, abuse@heficed.com. The client
        replied that it was solved a long time ago.</li>
      <li>2022-05-31, 2607:f8b0:4864:20::e41, Gmail abuse reporting
          form.</li>
      <li>2022-06-30, 211.100.47.38. A Chinese ISP, probably not worth
        reporting. Blacklisted
        in <code>postscreen_access.cidr</code>.</li>
      <li>2022-08-15, 159.183.196.221, abuse@sendgrid.com.</li>
      <li>2022-11-01, 2607:5500:3000:1176::2,
        support@hostwinds.com.</li>
      <li>2023-05-05, 106.75.10.112, ipas@cnnic.cn. From
        ucmail25.sendcloud.io.</li>
      <li>2023-05-30, 69.12.91.126, abuse@quadranet.com.</li>
      <li>2023-06-16, 117.50.66.12, ipas@cnnic.cn. From
        ucmail17.sendcloud.io, added <code>sendcloud.io REJECT
        spammers</code> into the file referenced by
        <a href="https://www.postfix.org/postconf.5.html#check_client_access">postfix's check_client_access</a>. dnswl.org returned
        127.0.15.0 for it, reported it to them as spam.</li>
      <li>2023-06-22, 192.119.65.137, abuse@hostwinds.com. Their mail
        server (Gmail) rejects messages with the spam message
        attached, reported without an attachment.</li>
      <li>2023-07-21, 220.133.13.91,
        hostmaster@twnic.net.tw. According to the received mail
        headers, it originated from 185.225.74.219.</li>
      <li>2023-09-15, 46.17.43.50, noc@baxet.ru. With valid SPF for
        tiaohu.net: apparently a Chinese organization's domain name,
        but a Russian hoster's IP address. Quickly received a reply
        saying "Blocked" from support@justhost.asia.</li>
      <li>2023-09-15, 2607:f8b0:4864:20::935, Gmail abuse reporting form.</li>
      <li>2023-09-22, 2607:f8b0:4864:20::72c, Gmail abuse reporting
        form. Same address as the previous one
        (polachek@squadhelp.co), a follow-up.</li>
      <li>2023-09-23, 2607:f8b0:4864:20::72a, Gmail abuse reporting
        form. Same address as the previous two, the spammer claimed it
        is the last message.</li>
      <li>2023-09-25, 2607:f8b0:4864:20::f29, Gmail abuse reporting
        form. A new subdomain, polachekg@go.squadhelp.co, but
        continuation of the previous 3, and Gmail does nothing;
        blacklisted the domain in postfix (check_sender_access).</li>
      <li>2023-10-19, 209.85.128.177, Gmail abuse reporting form. From
      masonlambert190@gmail.com.</li>
      <li>2023-11-01, 209.85.128.172, Gmail abuse reporting form. From
      katherinesophia523@gmail.com.</li>
      <li>2023-12-05, 31.192.235.11, abuse@profitserver.ru. Phishing,
        envelope-from abuse@q03.1cooldns.com, with valid DKIM and
        SPF.</li>
      <li>2023-12-11, 31.192.237.60, abuse@profitserver.ru. Phishing
        again, envelope-from abuse@origin.1cooldns.com.</li>
      <li>2023-12-11, 209.85.219.180, Gmail abuse reporting form. From
        haileyjtanner@gmail.com, asking to add a link to some
        furniture selling website (which supposedly has a blog post on
        astronomy) from my "links" page.</li>
      <li>2023-12-18, 209.85.128.170, Gmail abuse reporting form. From
        haileyjtanner@gmail.com again, Gmail does not seem to do much
        about outgoing spam.</li>
      <li>2023-12-19, 31.192.239.9, abuse@profitserver.ru. Phishing
        yet again, envelope-from=no-replies@batixtaneve.com this
        time. Blacklisted 31.192.232.0/21.</li>
      <li>2023-12-26, 209.85.128.169, Gmail abuse reporting form. From
        haileyjtanner@gmail.com yet again, Gmail still does
        nothing. Blacklisted the address in postfix
        (check_sender_access).</li>
      <li>2024-02-29, 204.152.197.177, abuse@quadranet.com. Spam about
        electric bicycles.</li>
      <li>2024-03-12, 185.218.100.84, abuse@ipxo.com.</li>
      <li>2024-03-18, 194.53.136.174, abuse@virtono.com. Spam about
        electric bicycles, same as on 2024-03-12.</li>
      <li>2024-03-20, 104.223.121.26, abuse@quadranet.com. Same as the
        last two, and as on 2024-02-29: e-bikes.</li>
      <li>2024-04-25, 2024-04-26, 216.9.224.143,
        abuse@dchost.com. Scam, 3 messages. And one more message from
        the misconfigured mail server, notifying about a failed
        delivery (the "from" address matched the "to" address).</li>
      <li>2024-05-09, 173.249.144.124, abuse@liquidweb.com. Posing as
        a Docusign notification.</li>
      <li>2024-06-12, 193.188.192.139, abuse@pipenet.hu.</li>
      <li>2024-07-31, 47.90.198.34, abuse@alibaba-inc.com.</li>
      <li>2024-08-08, 103.224.90.82, abuse@nexcess.net. Phishing.</li>
      <li>2024-09-23, 208.234.3.27, abuse@verizon.net,
        abuse@ait.com. A scam, as described in "<a href="https://www.insercorp.com/blog/post/december/09/2010/beware-of-chinese-domain-scams">Beware of Chinese
        Domain Scams</a>" or "<a href="https://nonewwars.co.uk/blog/2021/10/chinese-domain-registration-emails/">Chinese domain registration emails</a>". Verizon
        pointed to AIT.com, I wrote there, the "support ticket" was
        closed quickly without a comment.</li>
      <li>2024-09-24, 2a00:1450:4864:20::42b, Gmail abuse reporting
        form. From saracody9@gmail.com, a request to link some
        irrelevant website from mine.</li>
      <li>2024-10-27, 219.134.170.101,
        anti-spam@chinatelecom.cn. Router advertisements.</li>
      <li>2024-11-18, 46.23.108.219, abuse@bullethost.net. Electric
        bicycle advertisement.</li>
      <li>2024-11-19, 192.154.230.159,
        abuse@host4yourself.com. Electric bicycle advertisement.</li>
      <li>2024-11-22, 181.214.99.201, abuse@ipxo.com. E-bikes.</li>
      <li>2024-11-30, 188.127.247.224, abuse@smartape.net
        (though <a href="https://krebsonsecurity.com/2022/04/double-your-crypto-scams-share-crypto-scam-host/">SmartApe is reported to be a Russian hosting for
          cybercriminals</a> itself). Probing.</li>
      <li>2024-12-01, 120.241.40.88, abuse@chinamobile.com. Spam about
        shipping from China.</li>
      <li>2024-12-04, 91.193.18.13,
        abuse@hostzealot.com. E-bikes.</li>
      <li>2024-12-06, 181.214.99.132,
        report@abuseradar.com. E-bikes.</li>
      <li>2024-12-10, 84.32.41.141,
        report@abuseradar.com. E-bikes.</li>
      <li>2024-12-13, 162.250.189.12, complaints@servarica.com. The
        ticket was automatically created and automatically closed
        without response in 36 hours; blacklisted its subnet
        in <code>postscreen_access.cidr</code>.</li>
      <li>2024-12-29, 222.125.131.176, xujing@topway.cn. Shipping from
        China.</li>
      <li>2025-01-12, 39.189.22.39, abuse@chinamobile.com.</li>
      <li>2025-01-14, 45.147.167.60,
        abuse@thinkhuge.net. E-bikes.</li>
      <li>2025-01-15, 217.12.203.132,
        abuse@greenfloid.com. E-bikes.</li>
      <li>2025-02-03, 39.189.22.212, abuse@chinamobile.com. Blocked
        SMTP connections from Chinese IP addresses via nftables at
        this point, since there is a lot of spam and no ham at all
        coming from those.</li>
      <li>2025-02-11, 85.120.223.178, abuse-nav@rnc.ro and
        abuse@nav.ro. E-bikes. Connections to the rnc.ro mail server
        time out.</li>
      <li>2025-02-15, 209.85.208.176, Gmail abuse reporting form. From
        svcodie@gmail.com, again about adding some supposedly
        astronomy-related links to my "links" page (a URL that has
        been dead for a few months), with a shady "unsubscribe" link.</li>
      <li>2025-02-17, 85.120.223.139, abuse@nav.ro. E-bikes
        again.</li>
      <li>2025-02-23, 209.85.218.42, Gmail abuse reporting form. From
        jessigfrost@gmail.com, again on astronomy links.</li>
      <li>2025-02-24, 85.120.223.179, abuse@nav.ro. E-bikes again,
        repeatedly from the same ISP. nav.ro rejected my report as
        spam. Added a rejection rule for 85.120.223.0/24
        into <code>postscreen_access.cidr</code>.</li>
      <li>2025-02-27, 209.85.208.47, Gmail abuse reporting form. From
        the previously reported jessigfrost@gmail.com.</li>
      <li>2025-02-27, 194.102.104.66. Phishing, but nowhere to report:
        it lists abuse-alexhost@rnc.ro, and rnc.ro's mail servers are
        not responsive, as discovered recently. Blacklisted the
        subnet, as with the other rnc.ro one.</li>
      <li>2025-03-08, 209.85.214.175, Gmail abuse reporting
        form. Probing of active mailboxes, apparently (sent to two of
        my email addresses), from insangleeq@gmail.com.</li>
      <li>2025-03-14, 209.85.222.53, Gmail abuse reporting
        form. Follow-up probing, from mrsirishboudreau86@gmail.com.</li>
      <li>2025-03-21, 209.85.160.52, Gmail abuse reporting
        form. Probing again, from ukpabimberi892@gmail.com.</li>
      <li>2025-03-24, 39.189.23.79. Chinese spam again: I had
        nftables.service disabled, so the filters were not applied
        after the reboot.</li>
      <li>2025-03-26, 2607:f8b0:4864:20::b44, Gmail abuse reporting
      form. From mberiukpabi611@gmail.com.</li>
      <li>2025-03-28, 209.85.208.66, Gmail abuse reporting form. Still
        probing, which seems to be quite regular (weekly), from
        ukpabimberi353@gmail.com. It was sent to two of my email
        addresses.</li>
      <li>2025-04-04, 179.61.221.11,
        report@abuseradar.com. E-bikes.</li>
      <li>2025-04-09, 209.85.208.196, Gmail abuse reporting
      form. Probing yet again, sent to at least two of my email
        addresses, from noorawilliams015@gmail.com.</li>
      <li>2025-04-17, 209.85.219.171, Gmail abuse reporting form. Some
        probing again, from quydai079@gmail.com, referencing a phone
        number for use with Telegram.</li>
      <li>2025-05-31, 193.52.142.199, certsvp@renater.fr.</li>
      <li>2025-06-20, 185.130.249.144, abuse@smartape.ru. "Unpaid
        invoice" scam.</li>
      <li>2025-06-23, 209.85.160.43, Gmail abuse reporting form. From
        mrsirishboudreau5@gmail.com, probing for live email
        addresses.</li>
      <li>2025-07-09, 193.42.36.71, abuse@hostzealot.com. E-bikes.</li>
      <li>2025-07-09, 209.85.219.196, Gmail abuse reporting form. From
        mrsirishboudreau288@gmail.com, probing for live email
        addresses.</li>
      <li>2025-07-28, 38.45.89.36, abuse@cogentco.com. Posing as a
        Docusign notification.</li>
      <li>2025-07-29, 191.252.13.209, abuse@locaweb.com.br. Posing as
        booking.com.</li>
      <li>2025-07-29, 191.252.13.197, 191.252.12.56, 177.153.3.113,
        179.188.6.145 (all Locaweb, as the one above). Blacklisted
        191.252.0.0/16, 177.153.0.0/16, 179.188.0.0/16.</li>
      <li>2025-08-04, 192.154.230.149,
        abuse@host4yourself.com. E-bikes.</li>
      <li>2025-08-05, 79.141.174.230, abuse@hostzealot.com.</li>
      <li>2025-08-07, 45.86.230.19, abuse@bluevps.com. E-bikes. They
        promptly responded "This client is blocked". I
        added <code>/e-bike/i REJECT E-bike spam</code> into
        postfix's <code>body_checks</code>.</li>
      <li>2025-08-08, 179.61.221.2, report@abuseradar.com. E-bikes,
        adjusted the body_checks rule to <code>/e-?bike/ REJECT E-bike
        spam</code> (the <code>i</code> flag actually turns
        case-insensitivity off), blacklisted 179.61.221.0/24
        in <code>postscreen_access.cidr</code>.</li>
      <li>2025-08-12, 108.165.213.11,
        abuse@dartnode.com. Phishing.</li>
      <li>2025-08-22, 209.85.217.68, Gmail abuse reporting form. From
        mrs.info.jashok@gmail.com.</li>
      <li>2026-02-24, 77.83.39.16, abuse@lanedo.net.</li>
      <li>2026-03-08, 209.85.216.48, Gmail abuse reporting form. From
        maviswanczykp82@gmail.com.</li>
      <li>2026-03-11, 163.223.211.186, hm-changed@vnnic.vn.</li>
      <li>2026-03-11, 163.223.211.186, hm-changed@vnnic.vn. Same
        message as earlier, blacklisted 163.223.210.0/23
        in <code>postscreen_access.cidr</code>.</li>
      <li>2026-04-28, 160.30.136.30 and 160.30.136.43,
        hm-changed@vnnic.vn. The messages pretended to be from gmail,
        with a DKIM signature failing verification. Blacklisted
        160.30.136.0/23 as well.</li>
    </ul><h2>General observations</h2><p>
      A lot of network abuse (spam, vulnerability scans, brute-force
      attacks) comes from China, plenty from Russia as well. As a side
      note, <a href="https://qz.com/978037/china-publishes-more-science-research-with-fabricated-peer-review-than-everyone-else-put-together">Chinese researchers similarly spam the world with
      fabricated research papers</a> (though apparently they try to combat
      it, <a href="https://www.statnews.com/2017/06/23/china-death-penalty-research-fraud/">up to a death penalty for researchers who commit fraud if it
      harms people</a>). Apparently wider agreements, policies, and
      cultures help to fight network abuse about as well as
      technological methods do. I think it is okay to rate-limit
      regional IP address blocks (as described in the <a href="private-server-setup.html">private server
      setup</a> and <a href="simpler-server-setup.html">simpler server setup</a> notes), though one should think
      twice before blocking them completely: if there are non-abusive
      users, it would be unfair to them. And then there are large mail
      providers, particularly Gmail, not caring much about outgoing
      spam, while blocking them is a bad option, given the number of
      legitimate users: the ham-to-spam ratio is less than 1, but more
      than 0.
    </p></xhtml:div></content></entry>
  <entry><link rel="alternate" href="https://thunix.net/~defanor/notes/information-security-basics.html"/><id>https://thunix.net/~defanor/notes/information-security-basics.html</id><author><name>defanor</name></author><title>Information security basics</title><summary>A brief guide on information security and literacy</summary><published>2025-03-15T12:00:00Z</published><updated>2026-04-14T17:00:00Z</updated><content type="xhtml"><xhtml:div xmlns="http://www.w3.org/1999/xhtml"><h1>Information security basics</h1><p>
      There are information security guides for different audiences
      around, including EFF's <a href="https://ssd.eff.org/">Surveillance Self-Defense</a>, FSF's <a href="https://emailselfdefense.fsf.org/en/">Email
      Self-Defense</a>, and <a href="https://www.nist.gov/cybersecurity">NIST's Cybersecurity</a>. But I fail to find concise,
      relatively general, and sensible guidelines on personal
      <a href="https://en.wikipedia.org/wiki/Information_security">information security</a> and <a href="https://en.wikipedia.org/wiki/Information_literacy">information literacy</a> to refer people
      to, so I wrote down the suggestions I would normally share.
    </p><p>
      I am not a security expert, but a programmer and a small-scale
      system administrator paying attention to security. So it is a
      good idea to consider these suggestions critically, just as any
      others, but I think that they will improve the average state of
      such guides.
    </p><h2>General advice</h2><p>
      Question things, do not trust blindly, require evidence and
      verifiability of claims, check those, do not share personal
      information or give away control without a good reason to,
      assume that "anything that can go wrong will go wrong" (<a href="https://en.wikipedia.org/wiki/Murphy%27s_law">Murphy's
      law</a>). That is, employ scientific and engineering approaches, and
      try to stay honest: do not nudge things to look better (e.g.,
      more trustworthy or certain) than they are; better to err on the
      side of safety, assuming that they may be worse than they
      seem. A lack of understanding makes one vulnerable to deception,
      so study the relevant subjects: how computers, banks, online
      stores, governments and scammers work, how software and relevant
      systems are developed, how the research used by those is
      done. <a href="computing-context.html">Computing context</a> is a part of it. Try to avoid <a href="https://en.wikipedia.org/wiki/List_of_fallacies">fallacies</a>
      and <a href="https://en.wikipedia.org/wiki/List_of_cognitive_biases">cognitive biases</a>, as they tend to be exploited by
      adversaries.
    </p><p>
      Do not shy away from learning. It is tempting (and commonly
      suggested) to stick to certain newbie-friendly tools, but that
      is a very fragile approach: without sufficient understanding,
      people easily lose the tools (e.g., when the government blocks
      their secure messengers), or manage to misuse them (e.g., by
      ascribing strange and unexpected properties to the tools: no
      user-friendly UI or API will protect from a user assuming that
      anything going through a system becomes "secure" in all senses
      and for all purposes, for instance).
    </p><p>
      Conversely, when providing a service, publishing software,
      asking for information, sharing information or software, it is
      nice to make it easy for others to follow that: provide
      references, evidence, source code, explain why the requested
      information is required (and ensure that it actually is
      required); generally, do not ask to believe or trust blindly, do
      not encourage and normalize dangerous practices.
    </p><p>
      And as with any other pursuit, give it a try, do not give up, do
      not view it as "all or nothing": learning a little, paying some
      attention to security, and avoiding some of the potential losses
      that way is already better than being successfully attacked all
      the time.
    </p><h2>Threats</h2><p>
      Information security includes a few areas, but personal security
      usually revolves around privacy and confidentiality. Some of the
      common <a href="https://en.wikipedia.org/wiki/Threat_actor">threat actors</a> targeting individuals are scammers,
      oppressive governments, and thrill seekers. All those seem to be
      commonly underestimated: scammers' victims think that they
      cannot be scammed, and are surprised afterwards; thrill seekers
      are often neglected because "why would anybody want to do
      that?"; governments are often ignored because of one's political
      views (loyalty to the regime, beliefs that it will not turn
      authoritarian, is not authoritarian even after it turned so,
      abandoning presidential term limits, introducing numerous
      censorship laws and persecution of dissent, and so on; belief
      that they will not reach you) or <a href="https://en.wikipedia.org/wiki/Learned_helplessness">learned helplessness</a>. "I have
      nothing to hide" is another common sentiment, often extended to
      the private information of one's friends and family that they
      possess, useful to threat actors. That usually implies a
      certainty that the government is on your side and will stay that
      way, in addition to one's immunity to the other risks. And then
      there are the likes of "<a href="https://en.wikipedia.org/wiki/Just-world_fallacy">the world is just</a>, I am good, so nothing
      bad can happen to me"; a variety of denial strategies and
      excuses, religious beliefs.
    </p><p>
      Entities collecting information, even if they do not use it
      against you intentionally and immediately, may also be viewed as
      threats, since they tend to leak it via <a href="https://en.wikipedia.org/wiki/Data_breach">data breaches</a>, or to
      abuse it themselves later. Those include commercial companies,
      government organizations, and individuals.
    </p><p>
      People may also engage in a crime of opportunity if the
      conditions for that are created: e.g., someone picking up or
      buying a discarded unencrypted storage device may access
      (recover) the private data stored on it. Same with information
      made available online: apparently even IT professionals manage
      to accidentally allow unauthenticated access to databases quite
      regularly, making it a common source of data breaches.
    </p><h2>Mitigation</h2><h3>Principle of least privilege</h3><p>
      The <a href="https://en.wikipedia.org/wiki/Principle_of_least_privilege">principle of least privilege</a> is generally useful: share the
      minimum required information to receive a service, or give
      minimal required and controlled access to your system. E.g.,
      buying most items, using most public transport, or visiting most
      public places should not require identifying yourself: doing so
      imposes an unnecessary risk. Likewise with running custom
      software to access online services, especially if it is
      closed-source (and possibly proprietary), so you cannot (and
      possibly are not allowed to by the license) check what it is
      doing with your system. Communicating over the Internet does not
      require providing your full name or phone number, or
      identifying yourself at all. Identifying yourself by sending
      pictures of documents is one of the sillier and more dangerous
      practices. Software should not run with superuser (root)
      privileges, and generally the usual security mechanisms must not
      be bypassed, unless there is a good reason to.
    </p><p>
      If someone asks you to take unnecessary risks like that, that
      itself is a cause for suspicion, and a reason to look for other
      options. Avoiding such risks often involves accepting
      inconveniences (such as
      visiting places and standing in queues instead of using
      proprietary software, dealing with paper documents, possibly
      with cash, missing some online conversations), resisting peer
      pressure (e.g., "just set a sensible password like 1234",
      "install our software with <code>curl | sh</code> and run its
      custom updater to be up to date", "let's run everything as root
      to avoid dealing with permissions").
    </p><p>
      If the private information is not requested by a service, or
      superuser privileges are not requested by software, it is safest
      to not volunteer to provide those: e.g., use screen names for
      online services and as a system user name (which is used as the
      default name for information sent online occasionally: the best
      way to ensure that the real name is not leaked accidentally is
      to never enter it), use dedicated system users or sandboxing
      facilities to run programs.
    </p><h3>Cryptography</h3><p>
      Cryptography provides useful tools, with encryption being
      perhaps the most notable one: it is useful for <a href="personal-data-storage.html">personal data storage</a> (including
      encrypted backups), as well as for communication (over <a href="email.html">email</a> or
      instant messengers, such as <a href="xmpp.html">XMPP</a>), and for channel security (for
      network connections). Another common use of cryptography is for
      data integrity checks.
    </p><p>
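      The integrity checks mentioned above can be sketched with a
      cryptographic hash function. This is only a minimal
      illustration in Python, using the standard <code>hashlib</code>
      and <code>hmac</code> modules; the function names
      (<code>sha256_of_file</code>, <code>matches_digest</code>) are
      made up for this example. Note that a bare checksum only helps
      if the expected digest comes from a trusted source: otherwise
      it detects accidental corruption, but not deliberate tampering,
      for which signatures (such as OpenPGP ones) are more suitable.
    </p><pre><code>import hashlib
import hmac


def sha256_of_file(path, chunk_size=65536):
    """Return the hex SHA-256 digest of the file at `path`."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # Read in chunks, so that large files do not have to fit in memory.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_digest(path, expected_hex):
    """Compare the file's digest with one obtained from a trusted source."""
    actual = sha256_of_file(path)
    # Normalize whitespace and case of the published digest before comparing.
    return hmac.compare_digest(actual, expected_hex.strip().lower())
</code></pre><p>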
      Following the general advice given above, one should look for
      trustworthy (transparent, verifiable, openly developed) tools,
      ideally using free and open-source software exclusively,
      retrieving it from trusted sources (such as operating system's
      repositories, where the packages are signed), preferably
      checking the code, but at least preferring the tools used and
      inspected by many.
    </p><p>
      I personally use mostly LUKS for disk encryption and OpenPGP for
      file and mail encryption and signing, on a Debian system, and
      TLS, SSH, IPsec, and WireGuard for channel security. Those are
      widely available, well-known tools.
    </p><p>
      The usage of LUKS with <code>cryptsetup(1)</code> is described
      in the personal data storage notes linked above, while that of
      OpenPGP is described in <a href="https://www.gnupg.org/documentation/guides.html">GnuPG's user guides</a>; it is supported out
      of the box in mail clients such as mu4e (an Emacs client), mutt
      (a standalone TUI client), Thunderbird (a standalone GUI
      client), and GnuPG's <code>gpg(1)</code> command-line tool
      is fairly easy to use. For email, one may want to ensure that
      the messages are encrypted not just for recipients, but also for
      the sender, so that the sender can read them later: mutt does it
      by default (the <code>pgp_self_encrypt</code> option), for mu4e
      one should enable it by setting
      <code>mml-secure-openpgp-encrypt-to-self</code>.
    </p><p>
      There are endless alternatives, which tend to incorporate the
      newest and shiniest algorithms (which is dangerous by itself:
      better to stick to heavily analyzed ones), to be written in this
      month's trendiest language (possibly to be abandoned soon), clean
      of the backwards- or standards-compatibility cruft accumulated by
      older tools, and supposedly easier to use, providing fun colors
      and supportive emojis. Some also like to write their own
      software, but there are many gotchas and cryptographic attacks
      that basic algorithm descriptions do not mention, which may
      easily compromise the system. Both scammers and governments like
      to advertise malware as security software, occasionally to
      disguise attacks as security measures. More legitimate
      commercial companies, meanwhile, tend to sell virtually useless
      security products that are not necessarily malware, but rather a
      placebo. <a href="https://en.wikipedia.org/wiki/Security_theater">Security theater</a> is a shady practice along those lines.
    </p><p>
      OpenPGP is criticized quite persistently, and it is indeed
      imperfect, as even its name points out: merely "pretty
      good". But as with other things, the "best" kind, as judged for
      a particular situation, is often that which is actually used at
      all, while OpenPGP usually beats the proposed alternatives in
      its applicability and (continued) availability, and in many
      cases its issues are irrelevant. There is room for improvement
      though. For an alternative OpenPGP implementation,
      see <a href="https://sequoia-pgp.org/">Sequoia-PGP</a>. Out of standalone (but incompatible with
      OpenPGP) encryption and signing alternatives, <a href="https://github.com/FiloSottile/age">age</a> and <a href="https://jedisct1.github.io/minisign/">Minisign</a>
      are somewhat prominent, while the OpenSSL CLI tool is more
      widely available and versatile. And then there are OTR, OMEMO,
      and MLS for IMs specifically. But I think it can be quite a
      rabbit hole, while GnuPG is versatile and good enough for most
      tasks, so at least it is worthwhile to look into first.
    </p><h3>Other tactics</h3><p>
      There are minor tactics and useful habits, some of which can be
      described as simply common sense:
    </p><ul>
      <li>
        Use strong passwords (e.g., generate those
        with <code>xkcdpass</code>), do not reuse those across
        services, maybe do not reuse logins and other identifying
        information, either. That may include things like the IP
        address, web browser fingerprints, and so on.
      </li>
      <li>
        Update software (including firmware) regularly to ensure that
        known vulnerabilities are fixed in it, and pick reputable
        FLOSS options in the first place. Look into software projects
        such as <a href="https://www.gnu.org/">GNU</a>, <a href="https://kernel.org/">Linux</a>, <a href="https://www.debian.org/">Debian</a>, <a href="https://openwrt.org/">OpenWrt</a>, <a href="https://f-droid.org/">F-Droid</a>. And
        security-focused alternatives such as <a href="https://www.openbsd.org/">OpenBSD</a> and <a href="https://qubes-os.org">QubesOS</a>,
        though be careful: some people jump into rather radical,
        demanding, and possibly experimental setups, do not study
        those sufficiently, run back into bloated and proprietary
        systems, and possibly keep switching between those. I
        personally use <a href="debian-11-workstation.html">Debian stable with Xfce</a>.
      </li>
      <li>
        Think twice before publishing or otherwise sharing any private
        or sensitive information, as it is practically irreversible.
      </li>
      <li>
        If you have to use public services and expose sensitive
        information (possibly correspondence) to them, prefer the ones
        that are not easily accessible by entities that can harm
        you. For instance, it would be reckless to discuss civil
        liberties over unencrypted email while living in a
        dictatorship and under surveillance, and using a domestic mail
        server on top of that.
      </li>
      <li>
        Rely on yourself, do not assume that arbitrary systems are
        properly designed and make sense: it may seem like systems
        (software, services) made by professionals are supposed to be
        that way, but often they are not. Not only because of
        programmers' incompetence or malice, but also because odd
        decisions are made when multiple developers, managers,
        multiple interacting commercial companies, poorly composed
        requirements, cost-cutting, hurried development and following
        changes, pressure to make things more "user-friendly" are
        involved: there can be mostly competent and well-meaning
        people creating an insecure mess. Common and visible issues
        include password restrictions, mandatory recovery mechanisms,
        their silly combinations with multi-factor authentication. So
        do not rely on others for security, try to ensure it yourself:
        do not hand them private information, use end-to-end
        encryption when applicable, ensure that the software does not
        run with unnecessarily high privileges, etc.
      </li>
      <li>
        Try to reduce the impact of possible compromises: do not "put
        all your eggs in one basket", do employ other risk management
        tactics. For instance, do not tie all your online accounts to
        a single email address, domain name, identity provider, or
        phone number. And reduce the amount of sensitive information
        that you have written down (even encrypted), especially on
        Internet-connected devices.
      </li>
      <li>
        Pay attention to incentives. In particular, marketing of
        security-related services or software, often employed by
        commercial companies, tends to focus on selling things that
        are free otherwise (such as X.509 certificates, usually called
        "SSL certificates" by those, many years after SSL was renamed
        to TLS), or on features that are not particularly useful, but
        help them to stand out, since they are not used by others; the
        usefulness of such features may then be exaggerated. Even
        non-commercial projects may engage in a light version of that,
        with their developers looking for ways to improve existing
        systems, convincing themselves that some properties they could
        add are desirable, then promoting them.
      </li>
      <li>
        Avoid unnecessary risks (complexity): as with engineering in
        general, the more complexity there is, the harder it is to
        analyze, and the more likely it is that something will go
        wrong. As an example, to turn on the lights, usually a basic
        mechanical switch would suffice: there is no need for complex
        controllers, Wi-Fi, Internet connection, some remote servers
        controlling your lights, and you asking them to operate those,
        using additional software. Yet such <a href="https://en.wikipedia.org/wiki/Rube_Goldberg_machine">Rube Goldberg machines</a>
        seem to be worryingly common these days. This is also related
        to unnecessary loss of control, and to poor availability,
        which is another aspect of security. In programming, this
        usually amounts to avoiding unnecessarily complex
        architectures and tools, as well as unnecessary dependencies.
      </li>
      <li>
        Employ proper (usually standard and built-in) mechanisms when
        available: database roles and security policies, system users
        (with properly set file permissions) and capabilities. Often
        those are neglected by programmers, who implement such
        mechanisms from scratch, usually poorly, with risks and
        consequences similar to implementing custom cryptographic
        software.
      </li>
      <li>
        Employ <a href="https://en.wikipedia.org/wiki/Defense_in_depth_(computing)">defense in depth</a>.
      </li>
      <li>
        Reduce the <a href="https://en.wikipedia.org/wiki/Attack_surface">attack surface</a>.
      </li>
      <li>
        Keep learning, extending and revising practices.
      </li>
    </ul><h2>Further application</h2><h3>Sharing</h3><p>
      Ensuring secure practices can be interesting and fun, and one
      may be enthusiastic about it, which helps to follow them. Then
      it is tempting to share that with others, improve their security
      practices, which is what I am trying to do by writing this. But
      keep in mind that people may simply not care about it, as many
      do not care about their health enough to take care of it, of
      the environment (ecology, as well as politics), of self-improvement,
      and of a variety of other topics that yet others do care
      about. Even among those who do care about information security,
      the threat models and views on ways to achieve it may differ
      considerably, also as with the other mentioned topics. And it
      can be difficult to idly observe people you care about doing
      what you think is bad for them. I think a fine balance between
      being unhelpful and annoying is to let people know that you are
      willing to help, to answer and explain things when asked to, but
      not to try to force those onto others. And maybe to work on
      useful tools, infrastructure, and documentation in order to
      satisfy the impulses to share and help, as well as to learn more
      in the process.
    </p><h3>At work</h3><p>
      The same principles apply to information security in
      organizations, when setting up a company's servers or developing
      enterprise software, just as with software and hardware
      generally. There may be more bureaucratic approaches (with
      occasional checklists for compliance checks), scales are
      different, NIST's frameworks are more useful there, but it is
      basically the same thing.
    </p></xhtml:div></content></entry>
  <entry><link rel="alternate" href="https://thunix.net/~defanor/notes/mobile-computing.html"/><id>https://thunix.net/~defanor/notes/mobile-computing.html</id><author><name>defanor</name></author><title>Mobile computing</title><summary>Experiences of doing computing on mobile devices</summary><published>2017-07-08T12:00:00Z</published><updated>2026-04-13T11:00:00Z</updated><content type="xhtml"><xhtml:div xmlns="http://www.w3.org/1999/xhtml"><h1>Mobile computing</h1><p>
      Mobile computing can be a pain, especially when done in
      uncomfortable positions, on downsized and/or underpowered
      hardware, possibly in a noisy environment and while being
      distracted. Unsuitable conditions can also make it much harder
      to focus on computing-related activities. Yet a mobile computer
      is often better than nothing, and a comfortable workplace is not
      always available exactly where you want or need it to be.
    </p><p>
      These are my notes on dealing with mobile computers over time:
      mostly the software for underpowered computers with poor input
      and output capabilities, focusing on Linux-based systems.
    </p><h2>A netbook in 2017</h2><p>
      I've been stuck with an old netbook (Intel Atom, 1 GB of main
      memory) for a couple of weeks, so wrote down some of the things
      I've learned. That's on Debian stable (Stretch was just
      released; using it with "non-free" repositories to get GNU
      documentation), with i3 window manager, and using Emacs for most
      of the tasks.
    </p><p>
      Wi-Fi is one of the most important things to set up. This time,
      both
      <code>wpa_cli</code> and <code>wicd</code> claimed that the password is
      wrong, but <code>nmtui</code> (NetworkManager TUI) has connected just
      fine – though maybe it has messed up some settings for others somehow.
      Wicd was hogging resources even while not doing anything useful, as Python
      programs tend to do, so I've disabled it – it rarely worked
      anyway. <code>wpa_supplicant</code> writes log messages such as "result=4"
      and doesn't document those codes in its man page, requiring source code to
      see what's going on. And NetworkManager just repeats those.
    </p><p>
      Firefox takes 30-40 seconds just to start, and then lags even
      without JS. I gave up on it, and switched to w3m (emacs-w3m);
      web services such as online banking don't work with it, but it
      is keyboard-friendly, generally works, and does not lag too
      much. To use DDG for search, one should
      customize <var>w3m-search-default-engine</var>.
    </p><p>
      As for maps, there is FoxtrotGPS – an OpenStreetMap client that
      can cache and pre-download maps. It's pretty lightweight and
      usable.
    </p><p>
      For video playback, VLC appears to be more reliable than
      mplayer, even though it has its issues (including bloat, lack of
      documentation, and resource hogging even while
      idle). Unfortunately, many videos are not available via
      bittorrent, being only hosted on youtube.com or similar
      websites; youtube-dl works to extract those.
    </p><p>
      One of the painful tasks to perform without a mouse is to copy
      and paste things between a terminal emulator and other programs
      (such as GUI Emacs).  Actually it's somewhat awkward with a
      mouse, but even worse without it.  Well,
      Emacs-to-terminal-emulator is easy: there are
      <kbd>M-w</kbd> to copy from Emacs and <kbd>shift + insert</kbd> to paste
      into a TE. Copying from a TE can be done by selecting with a touchpad, and
      then <kbd>M-: (mouse-yank-primary (point)) RET</kbd> in Emacs, though that
      does not help with inserting into a TE; but it turns out that one can emulate the
      middle mouse button by pressing the two touchpad keys simultaneously. It's
      not great, but works; perhaps a nicer way is to use a terminal multiplexer
      functionality for that, though then one may have to use nested terminal
      multiplexers, if they are also using those remotely. Or one could use an
      Emacs TE instead of a separate one, but that could also get awkward.
    </p><p>
      Speaking of terminal multiplexers: even though normally I'm not
      using <code>tmux</code>, it is more useful to run remotely with
      an unstable connection: a remote persistent session partially
      compensates for the lack of a persistent connection and/or local
      session.
    </p><p>
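      As a minimal sketch of that (the host name and the session name
      are placeholders), <code>tmux new-session -A</code> attaches to a
      named session if it exists, and creates it otherwise, so the same
      command works both for the first connection and for reconnecting
      after a drop:
    </p><pre>ssh -t user@host tmux new-session -A -s main</pre><p>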
      Doing Haskell programming would be a pain on a netbook because
      the REPL and <code>cabal</code> would require too many
      resources, so I've planned to use a remote server for that: just
      run both Emacs and a REPL process there. Didn't have to do that
      in those two weeks though.
    </p><p>
      xpdf, mupdf, and zathura are relatively lightweight and portable
      PDF viewers. Xpdf has ugly GUI buttons and a mostly useless left
      pane that takes space, others use partially qwerty-oriented
      (vi-style) key bindings (while I'm using Colemak), and the
      scrolling is quite messy in both mupdf and zathura (in mupdf,
      there's no way to tell whether you're at the end of a page or
      not, but scrolling by a little amount would jump a page if
      you're at the end; zathura may skip a line when scrolling with
      spacebar).  Both xpdf and mupdf allow adjusting colors, zathura
      doesn't. So I've used both mupdf and zathura, but then
      discovered Emacs pdf-tools; didn't try it on a netbook, but it
      works nicely on a desktop: the colors are adjustable, it is
      keyboard-friendly, and there are no notable issues like those
      with scrolling in the others.
    </p><p>
      Bittorrent clients are not so nice to set up and use: both rTorrent and
      Transmission (transmission-daemon with transmission-cli) have broken Emacs
      interfaces, which I gave up on after brief attempts to debug, since using
      a netbook doesn't make debugging more fun. Transmission is nicer in that
      it uses a daemon, which is more suitable for a program like that. To
      simplify authentication, one should either use netrc
      (<code>.authinfo.gpg</code>), or disable authentication and only allow
      local connections:
    </p><pre>"rpc-authentication-required": false,
"rpc-bind-address": "127.0.0.1",</pre><p>
      Then it's not so bad to control
      with <code>transmission-remote</code>: <code>-a</code> and
      <code>-w</code> options to add a torrent and write files into a
      specified path, <code>-l</code> to list tasks, etc. The
      Transmission IRC channel (#transmission at Freenode/Libera.chat)
      is quite helpful, and minor bugs get fixed quickly there.
    </p><p>
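      For instance, assuming a torrent file in the current directory
      and a download directory under the home directory (both names
      are made up for illustration), adding a task and then listing
      the tasks may look like this:
    </p><pre>transmission-remote -a example.torrent -w ~/Downloads/torrents
transmission-remote -l</pre><p>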
      The situation with music players is pretty similar. I've tried
      mpd multiple times before, and it never worked, but worked this
      time (well, after <code>mpc update</code>); mpc is usable to
      control it, even if not that fancy (i.e., plain CLI). There are
      some Emacs packages: <code>emms</code> supports mpd, but tries
      to handle all kinds of players, so the support is not so
      great; <code>bongo</code> seems to have nicer UI, but doesn't
      support mpd at all; <code>mingus</code> appears to work, but it
      refreshes its whole buffer all the time, resulting in annoying
      blinking and rendering it unusable. And there
      is <code>ncmpc</code>, which is fine;
      though <code>ncmpc-lyrics</code> has a lot of dependencies,
      including Ruby. Music playback seems to be one of the most
      CPU-intensive tasks in a system with relatively little bloat.
    </p><p>
      The rest of my regular software is keyboard-oriented and
      lightweight: mu4e with mbsync for mail, circe/erc/rcirc for IRC,
      bitlbee and circe (later rexmpp) for XMPP, org-mode for notes
      and things like that, and other Emacs-based and CLI/TUI tools.
    </p><p>
      Later, in 2023, I installed Debian 12.2 with Xfce on it. It
      takes almost 600 MB of main memory, leaving 400 for work. But by
      2025, even Debian 13 dropped support for 32-bit systems.
    </p><h2>A tablet computer in 2022</h2><p>
      During the unfortunate events in Russia in early 2022, I
      decided to finally get a tablet computer while they are still
      available here and while I can afford one. At first I looked
      into ones supported by <a href="https://lineageos.org/">LineageOS</a>, but those were rather old
      ones, so I went for a model that is newer, and possibly can be
      supported later -- Samsung Galaxy Tab A8. I don't have much to
      compare it to (only used one Android phone out of similar
      devices, and just as a phone, for calls), but it appears to work
      and to be a tablet.
    </p><p>
      Samsung groups the awkward software required to be installed by
      the local government into the "law" group, so it's easy to
      remove it all at once. Avoiding Google and Samsung account
      creation, and intending to use it as both a general household
      appliance (maybe for use in the kitchen, to read in bed, etc)
      and a useful device in an isolated wasteland if/when desktop
      computers break and have no replacement, I've set up <a href="https://f-droid.org/">F-Droid</a>
      by downloading its APK, and then installed most of the software
      from it (though occasionally with APKs from their official
      websites too): <a href="https://osmand.net/">OsmAnd</a> for maps (including offline ones, from
      OSM); <a href="http://koreader.rocks/">KOReader</a> (as I use on an e-ink
      reader), <a href="https://librera.mobi/">Librera</a>, <a href="https://opendocument.app/">OpenDocument Reader</a>, and <a href="https://www.kiwix.org/en/">Kiwix</a> to read
      things; <a href="https://www.videolan.org/vlc/">VLC</a> as a music and video player; Fennec (a Firefox
      version available from F-Droid); Sketches for basic sketching;
      Notes for note taking; a couple of fancier calculators with
      graphing; <a href="https://conversations.im/">Conversations</a> as an XMPP client; the Wikipedia client
      out of curiosity, but it turned out to be handy. Also Synthesia
      to try it out with a MIDI keyboard, which mostly worked, but
      that's proprietary. <a href="https://termux.dev/en/">Termux</a> provides plenty of regular GNU/Linux
      system functionality, including Emacs in its repositories.
    </p><h2>A laptop in 2022</h2><p>
      I hear that ThinkPad (IBM originally, Lenovo now) laptops are
      nice for Linux, but they are expensive; Dell and Lenovo ones are
      commonly suggested for Linux-based systems too. Lenovo IdeaPad
      seem to be Linux-compatible, but cheaper than ThinkPad, with
      less advanced I/O (targeting consumers, not businesses). Here is
      one of the articles on the topic, linking more: <a href="https://notes.volution.ro/v1/2022/04/remarks/41dc175e/">On modern laptop
      requirements</a>.
    </p><p>
      Issues with Wi-Fi hardware support are common; see <a href="https://wireless.docs.kernel.org/en/latest/en/users/drivers.html">Existing
      Linux Wireless drivers</a>, ensure that there are drivers for a
      given laptop's hardware. <a href="https://linux-hardware.org/">Linux Hardware Database</a> is another
      potentially helpful database.
    </p><p>
      One can also look into <a href="https://fwupd.org/lvfs/vendors/">fwupd's vendor list</a> to estimate Linux
      driver support from vendors, or perhaps the <a href="https://linux-laptop.net/">Linux on Laptops</a>
      website, and other relevant websites linked from <a href="https://old.reddit.com/r/linuxhardware/">the
      linuxhardware subreddit</a>.
    </p><p>
      I've picked a Dell Vostro 3515, which seems suitable for
      non-gaming tasks and is relatively inexpensive: a 15.6-inch
      display, plastic, no discrete graphics card, Ryzen 5 3450U and 8
      GB of main memory (2 of those are used as video memory, leaving
      about 6 for the rest of the system), 512 GB SSD, and an
      8P8C/Ethernet port (many laptops don't have those anymore), in
      addition to the common set of I/O ports.
    </p><p>
      To boot from a USB stick with a Debian 11 installer, I tried to
      add it in the boot options in the UEFI menu, but that was rather
      confusing: it asked to choose an exact <code>.efi</code> file,
      and then failed with a "Something has gone seriously wrong:
      shim_init() failed" message. Apparently that's common on
      laptops, with different Linux distributions and laptop vendors,
      but I haven't found descriptions of any working solutions,
      except for installing an older version first. What worked for me
      is just to choose a different <code>.efi</code> file, and then
      hold F12 during the boot to enter a boot menu, selecting the USB
      stick from it.
    </p><p>
      I'm always uncertain about the size of a boot partition (and
      sometimes about that of the ESP partition too), and how exactly
      to set encryption (e.g., apparently one can encrypt even the
      boot partition while using grub, but it doesn't seem that
      useful, and would lead to double password prompts). And about
      the swap partition too: usually just disabling it, but perhaps
      it's more useful on a laptop, and it's commonly suggested to
      use. I've settled on about 500 MB for ESP
      (<code>/boot/efi</code>), 500 MB for <code>/boot</code>,
      encrypted swap and ext4 root partition (<code>/</code>), without
      a separate <code>/home</code>. Then tried Debian's guided
      partitioning, and it did exactly that (after selecting use of
      encryption and of a single partition), so I just went with
      it. Though as of 2024, some recommend 1 GB or 2 GB
      for <code>/boot</code>, with Ubuntu apparently defaulting to
      almost 2 GB, and it is likely to be a pain to change later in
      such a setup, without reinstalling everything. After updating to
      Debian 13, which suggests at least 768 MB for /boot, I recreated
      those, reducing EFI to 200 MB, and increasing the boot partition
      to 800 MB.
    </p><p>
      In this case it was a Debian Xfce Live version, with non-free
      software and documentation (just as for the <a href="debian-11-workstation.html">Debian 11
      workstation</a>). It is nice and almost everything works well out of
      the box, though DPI tends to be wrong on laptops: it is 96 by
      default, while laptop screens have something closer to 144. That
      can be adjusted in the "Appearance" settings, the "Fonts" tab. I
      have also adjusted the touchpad behaviour.
    </p><p>
      In 2023, after hardly any use, the laptop ceased to charge the
      battery (it is on the "pending-charge" status all the time, even
      at 0% charge, with any UEFI charging settings), unclear why. I
      have not found a way to fix it so far. Also attempts to update
      the UEFI/BIOS firmware via "BIOS flash update" lead to an
      "invalid file" error. Some suggest running it from FreeDOS, but
      it relies on BIOS, and the laptop appears to only support UEFI
      boot. Another option is Windows (possibly the live and
      lightweight version, Windows PE), though microsoft.com bans
      Russian addresses from downloading it, and bans hoster addresses
      where proxies are hosted as well, as of 2023 (while dell.com
      also refuses to serve requests from Russian addresses, but
      proxies work with it). Plenty of images on The Pirate Bay (which
      is blocked in Russia, but at least does not refuse to serve
      requests coming from non-residential addresses, so proxies work)
      though. I managed to install Windows ADK on a Windows 10
      machine, then to prepare a Windows PE USB stick from it. Had to
      add firmware files into the "media" directory (actually added
      into a few locations, initially failing to find any), then to
      run <code>diskpart.exe</code> and its <code>rescan</code>
      command to find the firmware (I think it was on disk C). The
      firmware complained that "The AC adapter and battery must be
      plugged in before the system bios can be flashed", had to run it
      with <code>/forceit</code> option. Then it seemed to be working,
      but got stuck on "update progress: completed". I ended up
      resetting the laptop, then it complained that "battery pack is
      removed or less than 10%". I turned it off, unplugged the cable,
      plugged it back again, and the charging LED finally stayed
      on. Waited for half an hour, turned it on, it ran the BIOS
      (UEFI) and EC update process again, but then rebooted itself. It
      forgot where the boot media is, I pointed it manually to a
      Debian's <code>.efi</code> file again. Then it booted and was
      charging. Better to look for laptops with a sane firmware update
      process.
    </p><p>
      With this laptop, I have also experienced odd touchpad issues,
      which unfortunately seem quite common: in this case, it ceases
      to move the cursor after a seemingly random time after the boot,
      though clicking works, and it is fine again after a
      reboot. Sounds similar to <a href="https://askubuntu.com/questions/1233543/touchpad-stops-working-after-a-while">the "Touchpad stops working after a
      while" issue</a>, but there is no touchpad mode setting in this
      laptop's BIOS/UEFI settings. Later noticed that Bluetooth does
      not work well, either, at least with a Bluetooth speaker: there
      are occasional audible interruptions, and a stream of kernel module
      error messages in the logs.
    </p><h2>A smartphone in 2022</h2><p>
      I acquired a Google Pixel 6a (not exported here officially, so
      without a warranty, and no spare parts available; but at least
      not certified in Russia, so no mandatory malware installed on
      it), which has a plain Android system, and is supported by most
      of the alternative Android distributions. The software to set on
      it is similar to that on a tablet: F-Droid (with Guardian
      Project repositories), then Conversations, ConnectBot, OsmAnd+
      (with pale road style, 150% text size), Compass
      (com.bobek.compass), Wikipedia, VLC, Fennec (+ uBlock Origin,
      noscript, HTTPS everywhere), Tor Browser (with a bridge set
      manually), Notes, Librera, Yaaic, Termux (with Emacs on it, as
      well as openssh and rsync, and allowing it access to storage, so
      that pictures and other files can be transferred over SSH with
      rsync: for instance, to synchronize the pictures -- <code>rsync
      -av -e 'ssh -p 8022' --exclude='.trashed-*'
      user@host:storage/dcim/OpenCamera/
      ~/Pictures/OpenCamera/</code>; but by 2025 it ceased to work,
      since Android increasingly locks everything down). Later I added
      strongSwan and WG Tunnel (to connect to a home network as a
      "road warrior") and baresip (though still mostly using
      Conversations for calls), Just Another Workout Timer, Open
      Camera, WiFiAnalyzer, Kiwix, Aegis Authenticator (HOTP/TOTP),
      Orgzly (an org-mode viewer/editor), Material Files (a file
      manager with WebDAV, FTP, SFTP, SMB support), a couple of games
      (Shattered Pixel Dungeon, Mindustry), K-9 Mail, Briar.
    </p><p>
      The camera on this phone appears to produce rather bleak (washed
      out, desaturated) pictures, which is particularly apparent after
      enabling raw (DNG) picture writing. There are multiple ways to
      saturate it in darktable, but "tone curve" with independent
      CIELAB channels in particular is handy and versatile; the
      "denoise" module then helps to get rid of the produced
      noise. Perhaps one may also change the input color profile:
      colors look almost fine with sRGB instead of the embedded one;
      apparently it is a common problem with Pixel phones, see
      "<a href="https://discuss.pixls.us/t/colours-washed-out-from-pixel-7-dng/38455">Colours washed out from Pixel 7 DNG</a>".
    </p><h2>A laptop in 2026</h2><p>
      <a href="https://linux-hardware.org/?probe=e087c08de0">Lenovo IdeaPad Slim 3 16AHP10</a> looks like a fine option: the I/O
      is not as good as on ThinkPad or IdeaBook ones, but it is
      inexpensive, has a power-efficient CPU, okay specifications, and
      Debian runs well on it. I have set it to "battery saving mode"
      in the UEFI settings, booted from a Debian live Xfce USB stick,
      partitioned its 512 GB disk on installation (using Debian's
      regular installer, not the GUI Calamares one) as follows: 1 GB
      for ESP, 1 GB for <code>/boot</code>, then LUKS with LVM on top:
      80 GiB for the root file system, the rest for home; used ext4
      this time, with <code>noatime,nodiratime</code> mount options
      (see <a href="https://wiki.debian.org/SSDOptimization">SSDOptimization in Debian Wiki</a>). Then I have set a more
      suitable DPI (142 for a 16-inch screen with 1920 by 1200
      resolution; see also: <a href="https://wiki.archlinux.org/title/HiDPI">Arch Wiki HiDPI</a>, <a href="https://wiki.debian.org/MonitorDPI">Debian Wiki
      MonitorDPI</a>, <a href="https://wiki.archlinux.org/title/LightDM#HiDPI_or_4K_configuration">Arch Wiki LightDM HiDPI</a>) and larger fonts (12) in
      Xfce settings, as well as <code>xft-dpi=142</code>
      in <code>/etc/lightdm/lightdm-gtk-greeter.conf</code>,
      ran <code>sudo dpkg-reconfigure console-setup</code> to set a
      larger tty font size (DejaVu, 16x30),
      added <code>/usr/bin/setxkbmap -option "ctrl:nocaps"</code> into
      Xfce startup commands (another option is to set it
      via <code>/etc/default/keyboard</code>,
      e.g.: <code>XKBOPTIONS="ctrl:nocaps,grp:shifts_toggle"</code>,
      <code>XKBLAYOUT="us,us,ru"</code>, <code>XKBVARIANT="colemak,,"</code>),
      have set locale to <code>C.UTF-8</code>
      in <code>/etc/locale.conf</code>, disabled the loud PC speaker
      with
      <code>sudo rmmod pcspkr &amp;&amp; echo 'blacklist pcspkr' |
      sudo tee /etc/modprobe.d/nobeep.conf</code>, set additional DNS
      servers (74.82.42.42, 208.67.222.222, 8.8.8.8) for
      NetworkManager, installed a few Firefox extensions (uBlock
      Origin, noscript, FoxyProxy), configured input methods and
      touchpad behavior in Xfce settings, configured some of its
      panels, generated an SSH key with <code>ssh-keygen</code>, added
      it to the agent with <code>ssh-add ~/.ssh/id_ed25519</code>,
      disabled menu access keys in Xfce terminal preferences (so that
      it does not intercept shortcuts like M-f), removed some of the
      unnecessary packages, installed useful ones, configured a little
      more:
    </p><pre>sudo apt install task-laptop task-english smartmontools dkms
sudo apt remove live-task-localisation live-task-localisation-desktop
sudo apt autoremove
sudo apt remove 'hunspell*'
sudo -e /etc/apt/sources.list # Add "contrib non-free non-free-firmware"
sudo apt update
sudo apt install systemd-timesyncd openssh-server rsync emacs mu4e isync git \
  elpa-{magit,haskell-mode,nov} ghc cabal-install texinfo mtr-tiny nftables \
  mpv vlc telnet xsltproc clementine lynx mutt irssi whois nmap ncat dnsutils \
  knot-dnsutils tmux fbreader inkscape gimp lmms musescore libxml2-utils \
  xkcdpass wireguard tinc tor obfs4proxy shadowsocks-libev kiwix kiwix-tools \
  autoconf autoconf-doc libtool pkgconf libexpat1-dev libgsasl-dev \
  libssl-dev libcurl4-openssl-dev build-essential dino-im goldendict \
  dict-freedict-{deu-eng,fra-eng,lat-eng,eng-rus,deu-rus,fra-rus,eng-deu}  \
  dict-gcide transmission pandoc audacity festival sox postgresql \
  nginx libnginx-mod-http-dav-ext aptitude jmtpfs emacs-common-non-dfsg \
  texlive texlive-plain-generic texlive-xetex texlive-lang-cyrillic \
  blueman sqlite3 libsqlite3-dev gcc gcc-doc glibc-doc-reference \
  python3-{sympy,scipy,numpy,matplotlib,psycopg,doc} \
  info guile-3.0 guile-3.0-doc oathtool iotop \
  opus-tools vorbis-tools flac cuetools wavpack ffmpeg
# ... darktable blender librecad freecad kicad evince
# prosody coturn uacme inspircd mumble-server mumble qemu-system icecast2
# libvirt-clients libvirt-daemon-system virtinst dnsmasq-base bridge-utils
# debian-reference-en debian-kernel-handbook linux-doc user-mode-linux-doc
sudo -e /etc/ssh/sshd_config # "PasswordAuthentication no"
# Disable some services: going to run them manually, as needed.
sudo systemctl disable --now tor tinc shadowsocks-libev nginx postgresql \
  bluetooth
killall pulseaudio # restart to load pulseaudio-module-bluetooth
# (for PipeWire, use libspa-0.2-bluetooth instead)
# Optionally, upload hardware information to the linux hardware database
# (hw-probe is not among the packages installed above: sudo apt install hw-probe)
sudo hw-probe --all --upload
# Enable battery conservation mode (remembered between boots),
# so it does not charge above 80%.
# Also can be done with TLP, STOP_CHARGE_THRESH_BAT0=1.
echo 1 | sudo tee /sys/bus/platform/drivers/ideapad_acpi/VPC2004:00/conservation_mode</pre><p>
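      The DPI value used above can be estimated from the screen
      resolution and the diagonal size; a quick check with
      <code>awk</code>, assuming square pixels and an exactly 16-inch
      diagonal (it comes out at about 141.5, which rounds to 142):
    </p><pre>awk 'BEGIN { print sqrt(1920^2 + 1200^2) / 16 }'</pre><p>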
      Then it was left to copy personal files (dotfiles, documents,
      books, music, etc) onto it, and configure things further, but
      this is a basic initial setup.
    </p><p>
      One may try to select "Advanced install options", "Text
      installer", "Expert Install" in the installer, so that there
      will be an option to install a "normal" system instead of
      "live", but it still installs some of those "live-task"
      packages.
    </p><p>
      Though the hardware seems to work well generally, eventually I
      noticed an I/O error during reading from its SSD, reporting a
      timeout and a controller reset, similar to "<a href="https://askubuntu.com/questions/1557696/ubuntu-24-04-freezes-with-nvme-nvme0-i-o-timeout-error">Ubuntu 24.04 freezes
      with "nvme nvme0: I/O" timeout error</a>" or "<a href="https://community.frame.work/t/nvme-timeout-woes/54999">NVME timeout woes</a>";
      have not tried adding <code>pcie_aspm=off</code> or
      <code>nvme_core.default_ps_max_latency_us=100
      nvme_core.io_timeout=3000</code> into <code>/etc/default/grub</code> myself
      yet, but probably will try it if it keeps
      happening. Apparently those things happen on some laptops.
      Wi-Fi, Bluetooth, touchpad, and SD card reader work smoothly, though.
    </p><p>
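      If it comes to that, a sketch of the change (untested here, as
      mentioned above; the <code>quiet</code> option is just Debian's
      default, kept as is) would be to extend the kernel command line
      in <code>/etc/default/grub</code>, and then to regenerate the GRUB
      configuration with <code>sudo update-grub</code> and reboot:
    </p><pre>GRUB_CMDLINE_LINUX_DEFAULT="quiet nvme_core.default_ps_max_latency_us=100 nvme_core.io_timeout=3000"</pre><p>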
      I have also set up Debian 13 and Windows 11 on another 16AHP10
      laptop (not for myself), and surprisingly, while Wi-Fi worked on
      Debian out of the box, on Windows the drivers had to be
      installed manually.
    </p><h2>E-readers</h2><p>
      Kobo devices are supported by <a href="https://github.com/koreader/koreader">KOReader</a>, among a few
      others. Apparently both Kobo and reMarkable are suitable for
      running Linux and custom software on them; see "<a href="https://rmkit.dev/eink-is-so-retropunk/">E-ink is so
      Retropunk</a>".
    </p><h2>Power</h2><p>
      Some devices, particularly infrequently used and mostly
      stationary ones, such as radio receivers, may be awkward to
      power: batteries may be wasteful for stationary ones, and
      inconvenient to keep in a usable state for infrequently used
      devices, while inbuilt cheap AC-to-DC converters are unreliable
      and occasionally humming, and mimicking batteries with external
      AC-to-DC converters is tricky (they tend to provide higher
      voltages than 1.5 V of common batteries, and connecting those
      snugly would be tricky). A good option is to pick devices
      relying on external AC-to-DC converters, as laptops, phones, and
      e-readers do: it is usable for both direct usage and battery
      recharging, and battery chargers can be very simple, not adding
      complexity; many devices use USB for DC power input these days.
    </p><p>
      By 2026, there are laptops supporting USB-C for power input.
    </p></xhtml:div></content></entry>
  <entry><link rel="alternate" href="https://thunix.net/~defanor/notes/email.html"/><id>https://thunix.net/~defanor/notes/email.html</id><author><name>defanor</name></author><title>Email</title><summary>Email usage notes, including mail server maintenance</summary><published>2016-06-28T12:00:00Z</published><updated>2026-03-29T09:00:00Z</updated><content type="xhtml"><xhtml:div xmlns="http://www.w3.org/1999/xhtml"><h1>Email</h1><p>
      I quite like email: perhaps not so much because of its design or
      technical qualities, but because nice tools exist and there are
      plenty of users, so it can be used for communication
      easily. Though even the design is not bad: SMTP by itself is
      quite usable, OpenPGP is better than plain text messages (though
      could be much better, and there is criticism), it is all open
      and federated. Some of the email criticism goes as far as to
      propose to replace it with something, but without proposing any
      viable alternative, so it does not seem like the time to abolish
      it yet, and here are some email-related notes.
    </p><h2>Server</h2><ol>
      <li>Configure (and install if needed – though usually it's
        present, but barely used) <a href="http://www.postfix.org/">Postfix</a> or other <a href="https://en.wikipedia.org/wiki/Message_transfer_agent">MTA</a>. There are
        guides around, it is pretty simple, and actually that's it:
        the rest builds around it.</li>
      <li>To not look like a spammer to other servers:
        <ul>
          <li>Set <a href="https://en.wikipedia.org/wiki/DomainKeys_Identified_Mail">DKIM</a>: DNS record and <a href="http://www.opendkim.org/">OpenDKIM</a></li>
          <li>Set <a href="https://en.wikipedia.org/wiki/Sender_Policy_Framework">SPF</a> DNS record</li>
          <li>Set <a href="https://en.wikipedia.org/wiki/DMARC">DMARC</a> DNS record</li>
          <li>Set <a href="https://en.wikipedia.org/wiki/Reverse_DNS_lookup">reverse DNS</a> records</li>
          <li>Get into <a href="https://www.dnswl.org/">DNSWL</a></li>
          <li>If IPv6 is used, make sure that a /64 subnet is assigned (as
            per <a href="https://tools.ietf.org/html/rfc6177">RFC6177</a>)</li>
        </ul>
      </li>
      <li>To filter spam, set <a href="http://www.postfix.org/POSTSCREEN_README.html">postscreen</a> and regular Postfix settings
        (see <a href="http://jimsun.linxnet.com/misc/postfix-anti-UCE.txt">Postfix Anti-UCE Cheat Sheet</a> and <a href="http://rob0.nodns4.us/postscreen.html">rob0's postscreen(8)
        configuration</a>; a local caching DNS server is useful to speed
        things up a bit). It works well to filter the spam,
        while <a href="https://spamassassin.apache.org/">spamassassin</a> (via <a href="https://savannah.nongnu.org/projects/spamass-milt/">spamass-milt</a>, for instance) may hog
        too much memory for a small VM, leading to OOM killer
        rage. Other options include <a href="http://bogofilter.sourceforge.net/">bogofilter</a>, which would require
        training, and <a href="https://www.rspamd.com/">Rspamd</a>. <a href="http://postgrey.schweikert.ch/">Postgrey</a> may also be used.</li>
      <li><a href="https://letsencrypt.org/">LE</a> to obtain <a href="https://en.wikipedia.org/wiki/X.509">X.509</a> certificates for <a href="https://en.wikipedia.org/wiki/Transport_Layer_Security">TLS</a>. ACME clients are
        mostly poor, but <a href="https://github.com/ndilieto/uacme/">uacme</a> and <a href="https://certbot.eff.org/">certbot</a> are fine after some
        tweaking (particularly setting them to run as a dedicated
        user, rather than root).</li>
      <li><a href="http://dovecot.org/">Dovecot</a> or something else for IMAP, possibly for SMTP
        submission, and/or synchronization over SSH (optionally: as an
        alternative, one can read messages via ssh on a server,
        retrieve them into a local maildir with rsync, or just read
        and compose them on the server).</li>
      <li>Optionally, set <a href="https://wiki.gnupg.org/WKD">Web Key Directory</a>, <a href="https://en.wikipedia.org/wiki/DNS-based_Authentication_of_Named_Entities">DANE</a> (<a href="https://tools.ietf.org/html/rfc7929">RFC 7929</a>), or other
        OpenPGP key discovery method.</li>
    </ol><p>
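      As a rough illustration of the DNS records from the second item
      (with <code>example.org</code> and the <code>mail</code> DKIM
      selector as placeholders, the public key elided, and the policies
      being merely common starting points, not recommendations), a zone
      file may contain something like this:
    </p><pre>example.org.                 IN TXT "v=spf1 mx -all"
mail._domainkey.example.org. IN TXT "v=DKIM1; k=rsa; p=..."
_dmarc.example.org.          IN TXT "v=DMARC1; p=none; rua=mailto:postmaster@example.org"</pre><p>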
      Dovecot can also be used for SASL (for both Dovecot and
      Postfix). See the <a href="private-server-setup.html">private server setup</a> and <a href="simpler-server-setup.html">simpler server setup</a>
      documentation for more precise instructions, and possibly the
      "<a href="user-authentication.html">user authentication</a>" note for more options.
    </p><h3>IPv6 and DNSBLs</h3><p>
      DNSBL records appear for no apparent (or discoverable) reason in
      Spamhaus's CSS blacklist (part of ZEN), covering whole /64 IPv6
      subnets at once; the delisting procedure is automated, but
      complicated by Google captcha and partially broken (it reports
      success without
      actually delisting, and sometimes reports a captcha error even
      after solving the captcha, which is quite hard when using
      Tor). See also: <a href="http://forum.spamcop.net/topic/21566-blacklisted-by-spamhaus-sblcss/">Blacklisted by Spamhaus SBLCSS</a>.
    </p><p>
      One way to mitigate it is to stick to
      IPv4: <code>smtp_address_preference = ipv4</code>
      in <code>/etc/postfix/main.cf</code>. Another one is to get a
      /64 IPv6 subnet, assuming that they don't just blacklist subnets
      at random.
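    </p><p>
      With <code>postconf</code>, the former can be set and applied
      without editing the file manually (a sketch, assuming a regular
      system-wide Postfix installation):
    </p><pre>sudo postconf -e 'smtp_address_preference = ipv4'
sudo postfix reload</pre><p>
      The effective value can then be checked
      with <code>postconf smtp_address_preference</code>.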
    </p><h3>Being marked as spam</h3><p>
      Gmail (and maybe other large email providers) would occasionally
      mark/hide messages coming from smaller servers (and/or just not
      from themselves) as spam, even with SPF, DKIM, whitelists, and
      messages being sent/delivered from them to you first. Not much
      can be done about it: once a mail server accepts a message, it
      is its responsibility to deliver it. Large commercial companies
      just keep messing up interoperability, as they always do.
    </p><h3>Spam that gets through</h3><p>
      Not much spam gets through with just Postfix and postscreen
      configured, but when it does, it should be possible to report
      the abuse to its ISP.  Though spam from those who accept such
      reports and resolve the issues is unlikely, and as a last resort
      there are <code>client_checks</code> (or a firewall) to reject
      messages from spammy IP addresses or subnets. But one should be
      careful with that, since it is rather frustrating (and all too
      common) when you're a good actor being treated as a bad one.
    </p><p>
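      A sketch of such a check, assuming a CIDR table in
      <code>/etc/postfix/client_checks</code> (the file name is
      arbitrary, and the subnet below is a documentation one, standing
      in for a spammy one); in <code>/etc/postfix/main.cf</code>, next
      to the other client restrictions:
    </p><pre>smtpd_client_restrictions =
    check_client_access cidr:/etc/postfix/client_checks</pre><p>
      And in the table itself:
    </p><pre>192.0.2.0/24    REJECT Too much spam from this network</pre><p>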
      Dealing with spam coming from large providers is about as tricky
      as sending messages to them: they deliver spam just as regular
      messages, don't get blacklisted by honeypots automatically, and
      you probably don't want to blacklist them manually because of
      all the legitimate users. Yet Gmail's abuse report form seems to
      be broken (simply nothing happens when I hit "submit": no
      network requests or UI changes, even with JS enabled), and their
      support is infamously unreachable even by their own users. Then
      there's IP address's abuse contact (ripe-contact@google.com),
      but since they are their own hoster, it's probably also broken
      (as with the web UI, there's no visible reaction, not even
      automated; though at least there's a possibility of it working).
    </p><p>
      For more on incoming spam, see my <a href="network-abuse.html">network abuse</a> notes.
    </p><h3>Port 25 redirection</h3><p>
      Residential ISPs tend to block incoming SMTP connections, which
      is supposed to stop spam somehow, but if it were not for that, an
      IP address without NAT (and preferably static) would be
      sufficient at least to receive email directly, without a remote
      server. To get around that, there are services for port
      redirection, though I have not tried any, and they seem to be
      odd and/or to cost about as much as a remote VM (similarly to
      paid email).
    </p><h2>Client</h2><p>
      Both <a href="http://notmuchmail.org">notmuch</a> and <a href="http://www.djcbsoftware.nl/code/mu/mu4e.html">mu4e</a> use <a href="http://xapian.org/">xapian</a>, which provides fast search. It
      is also nice to compose and read messages in <a href="https://www.gnu.org/software/emacs/">Emacs</a> (unless you
      are a vi user, perhaps), so I target those.
    </p><p>
      Some prefer mutt, which has a simpler configuration and is less
      modular, more self-contained. But its default key bindings are
      based on those of Vim, QWERTY-oriented, which is awkward if you
      use a different keyboard layout. Thunderbird is quite bloated,
      but perhaps more suitable for casual users, including mail
      services that require OAuth. It also supports OpenPGP now, but
      Maildir is not quite supported. Evolution looks similar to
      Thunderbird. Claws Mail looked odd and half-baked all around to
      me each time I tried it over the years, but it is a relatively
      lightweight GUI client, supporting OpenPGP and Maildir, but not
      OAuth, being similar in that to most other lightweight
      clients. But I focus on simpler Emacs clients (such as mu4e) in
      the following sections.
    </p><h3>Option 1: IMAP + SMTP</h3><p>
      <a href="http://isync.sourceforge.net/mbsync.html">mbsync</a> can be used to retrieve messages via IMAP, and Postfix
      can also be set up locally to get more flexibility and better SASL
      options than the Emacs <code>smtpmail</code> library provides (see
      the <a href="user-authentication.html">user authentication</a> note).
    </p><h3>Option 2: SSH</h3><p>
      An SSH-only setup allows using just SSH keys, with no SMTP or IMAP
      between client and server. Messages can be sent with a remote
      sendmail, while a remote Maildir can be accessed via sshfs, or
      messages can be retrieved with, for instance, <a href="https://wiki2.dovecot.org/Tools/Doveadm/Sync">doveadm sync</a>. An
      example with relevant mu4e context variables:
    </p><pre>(message-send-mail-function   . message-send-mail-with-sendmail)
(sendmail-program             . "/home/defanor/bin/example-sendmail.sh")
(mu4e-get-mail-command        .
,(concat "doveadm sync sh -c "
"\"SSH_AUTH_SOCK=$SSH_AUTH_SOCK ssh mail.example.com doveadm dsync-server\""))</pre><p>
      And <code>example-sendmail.sh</code>:
    </p><pre>#!/bin/sh
ssh mail.example.com /usr/sbin/sendmail "$@"</pre><p>
      Though an issue with this method of synchronisation (as
      described here, without additional customisations) is that
      messages removed from <code>mu4e</code> would be reloaded
      by <code>doveadm sync</code>, and one would have to
      use <code>doveadm search</code> and <code>doveadm expunge</code>
      instead, or switch to IMAP for cleanup. Or use the sshfs method.
    </p><p>
      Another caveat is that even setting the remote sendmail script
      as <code>sendmail</code> in <code>$PATH</code> won't necessarily
      make all programs use it: for instance, git still requires it
      to be set explicitly (in <code>.gitconfig</code> or as a
      command-line argument), as "smtp-server":
    </p><pre>[sendemail]
    smtpServer = /home/defanor/bin/example-sendmail.sh</pre><p>
      Later it turned out to be handy to send mail this way while
      retrieving it via IMAP: a public provider (Yandex) that I used
      for work email with my own domain, to avoid dependencies on a
      personal server, decided to charge for using a custom domain
      and disabled SMTP. That way, the work server still does not
      have to accept SMTP connections, and it was already
      configured to send mail notifications from local clients (for
      both a website hosted there and munin), while incoming mail is
      handled as it used to be.
    </p><h3>OpenPGP</h3><p>
      GnuPG can be used with mu4e (and perhaps most of the other
      common Emacs MUAs) out of the box, and does not require any
      special setup.
    </p><h3>mu4e with git</h3><p>
      While <code>git-send-email(1)</code> bypasses mu4e, receiving
      patches still requires pointing git (or another DVCS) to a
      message that is normally first seen in one's MUA. I find it
      handy to define a custom mu4e message action that simply
      does <code>(kill-new (mu4e-message-field msg :path))</code>, so
      that the result can then be fed into <code>git-am(1)</code>.
    </p><h3>MIME part detachment</h3><p>
      Sometimes people attach large files (particularly
      high-resolution images of their pets) to messages, which quickly
      inflates the total mail archive size, complicating its backups
      and migrations. When the messages also contain text, it is
      undesirable to remove the correspondence, but some MUAs can
      remove individual parts. Mutt in particular can: back up the
      maildir, save the attachments separately if needed, run <code>mutt
      -f maildir-path</code>, then open a message, press <code>v</code> to
      view attachments, select an image, and press <code>d</code> to delete
      it. Then one may have to rebuild indexes and synchronize messages.
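    </p><p>
      Where mutt is not at hand, the same idea can be sketched with
      Python's standard <code>email</code> package. This is only an
      illustration of a different technique, not a tested tool:
      the <code>strip_images</code> name and the placeholder text are
      made up here, and it should only be tried on a backup copy of a
      message file.
    </p><pre>from email import message_from_bytes, policy


def strip_images(raw, placeholder="[image removed]"):
    """Replace every image part of a raw message with a short text note."""
    msg = message_from_bytes(raw, policy=policy.default)
    for part in msg.walk():
        if part.get_content_maintype() == "image":
            # Read the name first: set_content() drops the old Content-* headers.
            name = part.get_filename() or "unnamed"
            part.set_content(f"{placeholder}: {name}")
    return msg.as_bytes()</pre><p>
      The bytes of a single message file can be passed
      through <code>strip_images</code> and written back, after which
      the indexes would likely need rebuilding, as with mutt.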
    </p><h2>Etiquette</h2><p>
      While there are different views and advice on email etiquette,
      relatively common ones are to use plain text, to properly quote
      relevant parts of messages when needed, to avoid bloating
      messages with signatures, and of course to adhere to general
      writing practices. Or, in other words, to be considerate and
      make minimal assumptions about readers' MUAs. <a href="https://www.ietf.org/rfc/rfc1855.txt">RFC 1855
      (Netiquette Guidelines)</a> is worth reading.
    </p><h2>Public providers</h2><p>
      With seemingly decent email providers (e.g., <a href="https://fastmail.com/">fastmail.com</a>
      (banned in Russia), <a href="https://www.migadu.com/">migadu.com</a>), accounts cost like a hosted VM
      (VPS, VDS, or whatever they are called this year) or more, so it
      may be desirable to get a remote VM at once. Although there are
      slightly cheaper (or even partially free) ones as
      well: <a href="https://mailbox.org/">mailbox.org</a> (blocked in Russia), <a href="https://runbox.com/">runbox.com</a>, <a href="https://mailfence.com/">mailfence.com</a>
      (also blocked in Russia), <a href="https://posteo.de/en">posteo.de</a>, maybe <a href="https://www.mailo.com/">mailo.com</a> (blocked in
      Russia). As for free ones, there are a few seemingly fine
      options, though usually they don't seem that nice after an
      attempt to use them; the ones commonly advertised as secure
      and/or ethical tend to not even provide SMTP and/or IMAP, not to
      mention SSH. Domain registrars tend to provide email services,
      though the quality varies. And there are ones like <a href="http://sdf.org/">sdf.org</a> and
      other pubnixes, including tildeverse ones, financed primarily
      with donations. Also <a href="https://disroot.org/">disroot.org</a>, <a href="https://dismail.de/">dismail.de</a> (no new
      registrations since 2021-05-28 though), <a href="https://riseup.net/">riseup.net</a> (rather
      politicized, blocked in Russia).
    </p><p>
      In 2024, I registered at Microsoft's hotmail.com, but my account
      (which only received one confirmation message from
      OpenStreetMap) was locked in a couple of days, with Microsoft
      claiming that it violated an unidentified part of the agreement,
      and that they need my phone number in order to resolve
      it. Apparently <a href="https://old.reddit.com/r/microsoft/comments/1aurot8/your_account_has_been_locked/">people who provide their phone numbers are
      unexpectedly locked out of Microsoft accounts as well</a>, and there
      are regular stories like that about Google's Gmail, too. Though
      it also looks like many people do manage to use those larger
      services. As mentioned above, I ran into an unpleasant change of
      service terms with Yandex as well. I also keep receiving Gmail
      spam: I report it, but spam from the same addresses keeps coming
      afterwards. Interaction with those larger commercial IT
      companies is generally a bad experience.
    </p><h2>On reliability</h2><p>
      My primary concern with using private email for everything has
      been reliability, which is actually a broader issue than
      just email. And if it is set up on a single machine that you also
      use for everything else, that is a single point of failure for
      many things.
    </p><p>
      There are potential issues with public services as well: the
      companies that maintain those can go out of business, usually
      can do whatever they want with user accounts and data (commonly
      selling the data, <a href="https://news.ycombinator.com/item?id=30051054">messing up authentication and blocking
      accounts for strange reasons, with no way to contact customer
      support</a>, <a href="http://news.bbc.co.uk/2/hi/science/nature/2138014.stm">sometimes mangling messages</a>, restricting access to
      accounts until you provide more personal data after a policy
      change), with the services they provide (including turning
      unlimited plans into limited ones, free into paid, cheap into
      more expensive), etc. Even technical issues with larger services
      may be equally or more common: though they have dedicated staff,
      larger setups tend to be considerably more complex and unusual,
      hence less reliable.
    </p><p>
      But private ones require regular payments and maintenance. It is
      not much harder than maintaining your personal machine, and
      usually cheaper than paying for an internet connection,
      electricity, and so on, but it is an additional burden. A very
      small one, but collecting things like that is always unpleasant:
      there is no shortage of other ways to get into trouble simply by
      staying idle.
    </p><p>
      Using 2-3 servers instead of one and teaming up with others (for
      both payments and maintenance) may be helpful to mitigate those
      issues, but that requires some trust. That is a hard part, since
      not many people seem to care about service providers, control,
      etc. Maybe it is a good approach though: worrying about all the
      small things and possibilities may be too much, whether one uses
      a private or a public service.
    </p><p>
      It is particularly unfortunate when other online services depend
      on email, allowing email-based account recovery: that way, the
      loss of an email address compromises those accounts as
      well. Sometimes it is possible to set up two-factor
      authentication, with the second factor being something
      relatively sensible, like TOTP, effectively disabling
      email-based account recovery that way, since that usually only
      allows resetting the password.
    </p></xhtml:div></content></entry>
  <entry><link rel="alternate" href="https://thunix.net/~defanor/notes/music-studies.html"/><id>https://thunix.net/~defanor/notes/music-studies.html</id><author><name>defanor</name></author><title>Music studies</title><summary>Music theory, practice, and software</summary><published>2022-03-13T09:00:00Z</published><updated>2026-03-22T09:00:00Z</updated><content type="xhtml"><xhtml:div xmlns="http://www.w3.org/1999/xhtml"><h1>Music studies</h1><p>
      While music itself is pleasant to listen to, the theory behind
      it, along with maths for processing or synthesizing it, as well
      as the process of performing it, can be quite fun.
    </p><h2>Music theory</h2><p>
      <a href="https://eev.ee/blog/2016/09/15/music-theory-for-nerds/">Music theory for nerds</a> is a great starting point. "<a href="https://dmitri.mycpanel.princeton.edu/files/pdfs/MUS105handouts.pdf">What Makes
      Music Sound Good?</a>" is another overview and introduction,
      though perhaps more opinionated.
    </p><p>
      Some of the related and interesting research areas are those of
      music origin and purpose, such as <a href="https://en.wikipedia.org/wiki/Evolutionary_musicology">evolutionary musicology</a>, and
      how it's perceived by humans: <a href="https://en.wikipedia.org/wiki/Psychoacoustics">psychoacoustics</a>, <a href="https://en.wikipedia.org/wiki/Music_psychology">music
      psychology</a>, <a href="https://en.wikipedia.org/wiki/Music_and_emotion">music and emotion</a>.
    </p><p>
      The <a href="https://news.ycombinator.com/item?id=35272536">Ask HN: Tools to learn music theory?</a> discussion contains a
      few more relevant links.
    </p><p>
      <a href="https://openmusictheory.github.io/">Open Music Theory</a> looks like a nice textbook.
    </p><p>
      <a href="computing-context.html">As with computing</a> and maths, it is useful to study the history of
      the subject as well, so that more of it makes sense, and it
      is easier to put into perspective. Videos on the history of
      music can be found on YouTube, as well as on PeerTube, where
      some of the <a href="https://www.pianotv.net/">PianoTV</a> videos are available.
    </p><h2>Generation and processing</h2><p>
      The <a href="https://en.wikipedia.org/wiki/Pulse-code_modulation">PCM</a> format is to audio basically what <a href="https://en.wikipedia.org/wiki/Netpbm">netpbm/PPM/PGM/PBM/PNM</a>
      is to graphics: very simple and straightforward, can be played
      with ffplay and others, easy to generate programmatically and
      write into a file without any encoder libraries, as well as to
      read without a special decoder. Audio I/O libraries (e.g.,
      PortAudio) and codec libraries (e.g., libopus) tend to work with
      it.
    </p><p>
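      As a rough illustration of how little is needed to produce PCM,
      here is a minimal Python sketch that writes one second of a 440
      Hz sine tone as raw signed 16-bit little-endian mono samples.
      The <code>sine_pcm</code> function and the <code>tone.pcm</code>
      file name are made up for this example.
    </p><pre>import math


def sine_pcm(freq_hz, seconds, rate=44100, amplitude=0.5):
    """Return a sine tone as raw signed 16-bit little-endian mono PCM."""
    frames = bytearray()
    for n in range(int(seconds * rate)):
        value = amplitude * math.sin(2 * math.pi * freq_hz * n / rate)
        # Scale from [-1, 1] to the 16-bit range and append two bytes.
        sample = round(value * 32767)
        frames += sample.to_bytes(2, "little", signed=True)
    return bytes(frames)


if __name__ == "__main__":
    # Raw PCM has no header: the file is just the samples, back to back.
    with open("tone.pcm", "wb") as f:
        f.write(sine_pcm(440, 1.0))</pre><p>
      Since there is no header, a player has to be told the format,
      with something like <code>ffplay -f s16le -sample_rate 44100
      tone.pcm</code>; the exact option names may differ between
      FFmpeg versions.
    </p><p>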
      DCT/DFT are often involved in processing (and in compression,
      also similarly to graphics); the Mel-frequency cepstrum can
      also be useful and/or interesting to look into.
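    </p><p>
      A minimal sketch of the DFT side, assuming plain Python without
      any libraries: a naive quadratic-time transform, used to find
      the strongest frequency in a short run of samples. The function
      names are made up for this example, and practical code would use
      an FFT implementation instead.
    </p><pre>import cmath
import math


def dft_magnitudes(samples):
    """Magnitudes of the first half of the DFT bins (naive algorithm)."""
    n = len(samples)
    mags = []
    for k in range(n // 2):
        acc = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                  for t, x in enumerate(samples))
        mags.append(abs(acc))
    return mags


def dominant_frequency(samples, rate):
    """Frequency (Hz) of the bin with the largest magnitude."""
    mags = dft_magnitudes(samples)
    peak = max(range(len(mags)), key=mags.__getitem__)
    return peak * rate / len(samples)


rate = 8000
tone = [math.sin(2 * math.pi * 440 * t / rate) for t in range(800)]
print(dominant_frequency(tone, rate))  # prints 440.0</pre><p>
      With 800 samples at 8000 Hz the bins are 10 Hz apart, so a 440
      Hz tone falls exactly on a bin; frequencies between bins would
      be reported as the nearest bin.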
    </p><h2>Analysis</h2><p>
      Audacity is handy for checking the spectrum and notes in it, for
      music transcription and other checks.
    </p><h2>MIDI keyboard</h2><p>
      To practice playing piano using a MIDI keyboard, one needs at
      least a software synthesizer and some music scores.
    </p><p>
      The keyboard in this case is M-Audio Keystation 88 MK3, which
      worked easily with Linux (5.10, Debian), Windows 10, and an
      Android tablet (Samsung Galaxy Tab A8, connected with a
      USB-A-to-USB-C adapter). For a synthesizer, I've used Yoshimi on
      Linux, LMMS (mostly with its sf2/soundfont plugin) on Linux and
      Windows, and Synthesia (not in F-Droid repositories, and I don't
      have a Google account, but grabbed an apk from their website) on
      Android.
    </p><p>
      MuseScore allows composing sheet music and exporting it to MIDI
      rather quickly and easily, and there are more editors and
      converters of that kind available from Debian repositories.
    </p><p>
      PianoBooster looks like a nice trainer, akin to GNU Typist, but
      I found it quite annoying that it counts it as a mistake if you
      press a key too soon, so switched back to just reading scores
      and playing from those.
    </p><p>
      I use my computer screen to read sheet music, with the keyboard
      stand placed behind my computer chair, so it has to be zoomed in
      (Xfce's zooming in is quite handy when software can't zoom in on
      its own), and scrolling is needed for larger compositions, but
      the regular computer keyboard and mouse are out of reach. The
      MIDI keyboard has directional keys, messages from which come
      from a separate MIDI port; I haven't found readily available
      software (possibly LMMS plugins) helping to scroll the notes
      from a MIDI keyboard, but it took just a small script to
      achieve:
    </p><pre>import mido
from xdo import xdo

# apt install python3-mido python3-xdo libportmidi-dev python3-rtmidi

# perhaps can be done in bash, with something like amidi + xdotool

# https://gitlab.com/dkg/python-xdo/-/blob/main/xdo/__init__.py
# https://gitlab.com/cunidev/gestures/-/wikis/xdotool-list-of-key-codes

# print(mido.get_input_names())

mapping = {
    96: 'Page_Up',
    97: 'Page_Down',
    98: 'Left',
    99: 'Right',
    100: 'space'
}

x = xdo()

with mido.open_input('Keystation 88 MK3:Keystation 88 MK3 MIDI 2 24:1') as port:
    for msg in port:
        # print(msg)
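        # Not every MIDI message has a note or velocity, so use getattr.
        if (getattr(msg, 'note', None) in mapping
                and getattr(msg, 'velocity', None) == 127):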
            x.send_keysequence_window(mapping[msg.note])</pre><p><a href="https://pianoguidelessons.com/fingering-scales-on-the-piano/">Fingering Scales on the Piano</a> is a handy outline.</p><h2>Sheet music</h2><p>
      <a href="https://imslp.org/">IMSLP.org</a> is a nice source of public domain or otherwise freely
      available scores (including solo piano
      arrangements). Additionally, there are MIDI music collections
      around, which are lightweight, but encode melodies, which can
      then be viewed as scores (e.g., with MuseScore 2). <a href="https://musopen.org/sheetmusic/">Musopen</a> also
      provides sheet music, as well as recordings of classical music.
    </p><h2>Composition</h2><p>
      Music composition seems to be rather similar to poetry, and to
      arts in general: a creative process, but one can reuse a <a href="https://en.wikipedia.org/wiki/Musical_form">musical
      form</a>, learn and use a variety of approaches and tricks (by
      analyzing existing works, in addition to just reading about
      techniques), experiment and try things out.
    </p><p>
      <a href="https://en.wikipedia.org/wiki/Music_appreciation">Music appreciation</a> seems useful to study as well; "<a href="https://www.youtube.com/@InsidetheScore">Inside the
      Score</a>" is one of the YouTube channels focusing on that.
    </p><p>
      <a href="https://www.youtube.com/c/RyanLeach/about">Ryan Leach on YouTube</a> makes nice videos explaining the
      composition process. <a href="https://www.youtube.com/c/DavidBennettPiano">David Bennett Piano</a> brings up plenty of
      interesting subjects and analyzes songs.
    </p><h2>On singing</h2><p>
      While I don't sing, a brief look into it suggests that as with
      most other skills, it's primarily about learning and
      practicing, exercising.
    </p><p>
      Yet even without singing, it is interesting to learn about <a href="https://en.wikipedia.org/wiki/Vocal_register">vocal
      registers</a> and related topics.
    </p><p>
      Possibly it is a wrong way to learn and practice, but I found it
      fun to set a tuner program on a phone (e.g., <a href="https://f-droid.org/packages/org.billthefarmer.tuner/">Tuner</a> from F-Droid
      repositories), and try hitting notes.
    </p><h2>Ear training</h2><p>
      For Android, there is the Open Ear program: it seems to be a
      little buggy (making noises), but allows practicing recognition
      of scale degrees.
    </p><p>
      The <a href="https://www.musictheory.net/">musictheory.net</a> website also provides exercises, including
      those for ear training. A similar one to train playing from
      memory is <a href="https://lend-me-your-ears.specr.net/">lend-me-your-ears.specr.net</a>.
    </p><p>
      And I composed <a href="https://codeberg.org/defanor/ear-training/">a shell script using SoX</a> for practice of
      identification of scale degrees and intervals.
    </p><h2>Motivation</h2><p>
      Sometimes I find myself questioning the usefulness of these
      amateur music studies, particularly of playing instruments
      (while the theory and composition may conceivably be applied
      somehow), but it helps to view it as a recreational activity,
      quite similar to a game: the process itself should be enjoyable.
    </p></xhtml:div></content></entry>
</feed>
