
petbrain

I discovered twtxt by accident while glancing at the tilde.club server stats. I found the idea behind twtxt useful for talking to myself; it just needed some extensions.

First of all, I use AMW instead of plain text. AMW looks like the best format for raw source data, something I have needed for years.

Unlike twtxt, there is no well-known tw.amw file. Instead, tw.amw has the concept of channels. The well-known file is named twchan.amw, and its root object is a map with the following structure:

<channel name>:
    filename:  # relative name of channel file, may include directory
    archive:   # relative path to a directory with YYYY-MM archives
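
To illustrate how a client might use this map, here is a minimal Python sketch. It assumes twchan.amw has already been parsed into a plain dict by some AMW parser, and that the relative paths are resolved against the location of twchan.amw itself. The channel names, the example URL and the `channel_urls` helper are all made up for the example.

```python
from urllib.parse import urljoin

# Hypothetical result of parsing twchan.amw: channel name -> entry.
TWCHAN = {
    "main": {"filename": "main.amw", "archive": "archive/main"},
    "links": {"filename": "chan/links.amw", "archive": "archive/links"},
}


def channel_urls(index_url: str, index: dict) -> dict:
    """Resolve each channel's relative paths against the URL of twchan.amw."""
    resolved = {}
    for name, entry in index.items():
        resolved[name] = {
            "file": urljoin(index_url, entry["filename"]),
            # The trailing slash makes the archive URL usable as a base
            # for later joins with YYYY-MM/<file name>.
            "archive": urljoin(index_url, entry["archive"].rstrip("/") + "/"),
        }
    return resolved


urls = channel_urls("https://example.org/~user/twchan.amw", TWCHAN)
print(urls["links"]["file"])
print(urls["main"]["archive"])
```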
            

All the gibberish is stored in channel files. The root object is a mapping:

channel:
    file_id:   # unique identifier of channel file
    about:     # channel description
    avatar:    # channel avatar

items:
    # list of items

    - id:      # unique item identifier (optional)
      parent:  # parent item identifier for replies
      ts::isodate:  # timestamp
      source:  # URL of the source if this item is fetched from somewhere
      text:    # the message
      data:    # source data in any other format
          type:     # JSON, Markdown, etc.
          content:  # the data
      tags:    # list of tags
      media:   # links to media, as in fedi, TBD
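
As a rough illustration of how `id` and `parent` fit together, the Python sketch below groups replies under the items they refer to. The sample items are invented, and they are shown as plain dicts as a hypothetical parser might return them; this is not a real AMW API.

```python
from collections import defaultdict

# Hypothetical parsed items, in file order.
items = [
    {"id": "a1", "ts": "2024-05-01T10:00:00Z", "text": "first note"},
    {"id": "a2", "parent": "a1", "ts": "2024-05-01T10:05:00Z", "text": "a reply"},
    {"ts": "2024-05-01T11:00:00Z", "text": "a note without an id"},
    {"id": "a3", "parent": "a1", "ts": "2024-05-01T12:00:00Z", "text": "another reply"},
]


def build_threads(items):
    """Return (top-level items, replies grouped by parent id), in file order."""
    roots = []
    replies = defaultdict(list)
    for item in items:
        parent = item.get("parent")
        if parent is None:
            roots.append(item)
        else:
            replies[parent].append(item)
    return roots, replies


roots, replies = build_threads(items)
for root in roots:
    print(root["text"])
    # Items without an id simply have no replies to look up.
    for reply in replies.get(root.get("id"), []):
        print("  ->", reply["text"])
```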
            

New items are always appended to the end of the file, so the requester may download only the latest changes. However, the entire file can be re-created when it goes to the archive. That's why it contains file_id at the very beginning, and the requester must check it against the local copy. If file_id does not match, the requester moves the local copy to the archive and downloads the new file.
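
The update procedure can be sketched in Python roughly as follows. This is only a sketch under several assumptions: file_id is written as `file_id: <value>` within the first 512 bytes, `fetch(start, end)` is a hypothetical callable returning the remote bytes in that range (in practice it could be backed by an HTTP Range request), and the archived local copy is named after the old file_id rather than by record dates.

```python
import re
import shutil
import tempfile
from pathlib import Path

# Assumption: the identifier appears near the top as "file_id: <value>".
FILE_ID_RE = re.compile(rb"^[ \t]*file_id:[ \t]*(\S+)", re.MULTILINE)
HEAD_SIZE = 512  # assumed to be enough bytes to cover the channel header


def read_file_id(head: bytes):
    match = FILE_ID_RE.search(head)
    return match.group(1) if match else None


def sync(local: Path, archive_dir: Path, fetch) -> str:
    """Update the local copy of a channel file and report what was done."""
    if local.exists():
        local_id = read_file_id(local.read_bytes()[:HEAD_SIZE])
        remote_id = read_file_id(fetch(0, HEAD_SIZE))
        if local_id is not None and local_id == remote_id:
            # Same file: download only what was appended since the last sync.
            offset = local.stat().st_size
            with local.open("ab") as f:
                f.write(fetch(offset, None))
            return "appended"
        # The channel file was re-created: keep the old copy in the archive.
        archive_dir.mkdir(parents=True, exist_ok=True)
        label = local_id.decode() if local_id else "unknown"
        shutil.move(str(local), str(archive_dir / f"{label}-{local.name}"))
    local.write_bytes(fetch(0, None))
    return "downloaded"


# Demo with an in-memory "remote" instead of a real server.
remote = {"data": b"channel:\n    file_id: abc\n\nitems:\n    - text: one\n"}


def fake_fetch(start, end):
    return remote["data"][start:end]


with tempfile.TemporaryDirectory() as tmp:
    local = Path(tmp) / "main.amw"
    archive = Path(tmp) / "archive"
    results = [sync(local, archive, fake_fetch)]        # no local copy yet
    remote["data"] += b"    - text: two\n"
    results.append(sync(local, archive, fake_fetch))    # same file_id
    in_sync = local.read_bytes() == remote["data"]
    remote["data"] = b"channel:\n    file_id: xyz\n\nitems:\n"
    results.append(sync(local, archive, fake_fetch))    # file_id changed
    archived = sorted(p.name for p in archive.iterdir())
print(results, archived)
```

Comparing identifiers before appending matters because a byte offset into the old file means nothing once the remote file has been re-created.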

The data can be archived when the size goes beyond some limit or when the channel preferences are changed. Thus, there is no need to include channel info in each post, as fedi does for users.

Archive files are kept in subdirectories named YYYY-MM. The file name has the following format:

CHANNEL-YYYYMMDD[HHMM]-YYYYMMDD[HHMM].amw
            

The first date is the date/time of the first record (UTC), and the second date is the date/time of the last record. The HHMM part is optional; it is used when there are multiple large files for the same day.
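
A small Python sketch of this naming scheme is shown below. The `archive_path` helper and its `with_time` flag are hypothetical, and placing the file in the subdirectory of the first record's month is an assumption, since nothing above says which month applies when a file spans two.

```python
from datetime import datetime, timezone


def archive_path(channel: str, first: datetime, last: datetime,
                 with_time: bool = False) -> str:
    """Build YYYY-MM/CHANNEL-YYYYMMDD[HHMM]-YYYYMMDD[HHMM].amw in UTC."""
    # Timestamps are expected to be timezone-aware; convert them to UTC.
    first = first.astimezone(timezone.utc)
    last = last.astimezone(timezone.utc)
    stamp = "%Y%m%d%H%M" if with_time else "%Y%m%d"
    # Assumption: the directory is taken from the first record's month.
    subdir = first.strftime("%Y-%m")
    return f"{subdir}/{channel}-{first.strftime(stamp)}-{last.strftime(stamp)}.amw"


first = datetime(2024, 5, 1, 10, 0, tzinfo=timezone.utc)
last = datetime(2024, 5, 1, 18, 30, tzinfo=timezone.utc)
print(archive_path("main", first, last))
print(archive_path("main", first, last, with_time=True))
```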

Files in the archive can be compressed. LZMA is the preferred method.
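
For example, compression could be done with Python's standard `lzma` module, as in the sketch below. The `.xz` suffix, the helper names and the choice to delete the uncompressed file afterwards are assumptions made for the example, not part of the format.

```python
import lzma
import tempfile
from pathlib import Path


def compress_archive(path: Path) -> Path:
    """Write an LZMA-compressed copy next to the file and remove the original."""
    target = path.with_name(path.name + ".xz")  # ".xz" suffix is an assumption
    with path.open("rb") as src, lzma.open(target, "wb") as dst:
        for chunk in iter(lambda: src.read(64 * 1024), b""):
            dst.write(chunk)
    path.unlink()
    return target


def read_archive(path: Path) -> bytes:
    """Read an archive file whether or not it is compressed."""
    if path.suffix == ".xz":
        with lzma.open(path, "rb") as f:
            return f.read()
    return path.read_bytes()


with tempfile.TemporaryDirectory() as tmp:
    original = Path(tmp) / "main-20240501-20240531.amw"
    payload = b"items:\n" + b"    - text: hello\n" * 1000
    original.write_bytes(payload)
    packed = compress_archive(original)
    packed_name = packed.name
    round_trip_ok = read_archive(packed) == payload
    smaller = packed.stat().st_size < len(payload)
print(packed_name, round_trip_ok, smaller)
```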

Intended use and TODO: