diff --git a/content/images/remwhareadFF.png b/content/images/remwhareadFF.png
new file mode 100644
index 0000000..5779e0d
Binary files /dev/null and b/content/images/remwhareadFF.png differ
diff --git a/content/posts/keep-track-of-what-you-have-read-online-with-remwharead.adoc b/content/posts/keep-track-of-what-you-have-read-online-with-remwharead.adoc
new file mode 100644
index 0000000..d6121cf
--- /dev/null
+++ b/content/posts/keep-track-of-what-you-have-read-online-with-remwharead.adoc
@@ -0,0 +1,143 @@
+---
+title: "Keep track of what you've read online with remwharead"
+description: "How to archive articles you read online locally and how to find them again."
+date: "2019-09-26T06:10:07+02:00"
+draft: false
+tags: ["remwharead", "bookmarks", "archive", "Tooting my own horn"]
+---
+
+:wp-asciidoc: https://en.wikipedia.org/wiki/AsciiDoc
+:wp-rss: https://en.wikipedia.org/wiki/RSS
+:wp-json: https://en.wikipedia.org/wiki/JSON
+:uri-bookmarks: https://msdn.microsoft.com/en-us/library/aa753582(VS.85).aspx
+:uri-remwharead: https://schlomp.space/tastytea/remwharead
+:uri-ff-addon: https://addons.mozilla.org/firefox/addon/remwharead/
+:uri-archive: https://archive.org/
+:uri-perlre: https://perldoc.perl.org/perlreref.html#SYNTAX
+:uri-sqlitebrowser: https://sqlitebrowser.org/
+
+Today I'd like to talk to you about how I archive articles I read online and how
+I find them again.
+
+I've repeatedly found myself in situations where I wanted to reference an
+article I knew I had read, but couldn't find it anymore, either because I didn't
+remember the right search terms or because the article had gone offline. I searched
+for solutions to my problem, but could only find web services, nothing that would
+allow me to keep an archive on my local computer. So I decided to fill that gap
+and write remwharead. It runs on Linux, probably BSD and maybe macOS.
+
+== What is remwharead?
+
+remwharead is a tool that allows you to save URIs of things you want to remember
+in a local database, along with a URI to the archived version, the current date
+and time, title, description, the full text of the page and optional tags. You
+can then export all or a portion of your aggregated hyperlinks to different
+formats, including {wp-asciidoc}[AsciiDoc], {wp-rss}[RSS], {wp-json}[JSON] and
+{uri-bookmarks}[Netscape Bookmark File Format].
+
+.Output of `remwharead -e asciidoc | asciidoctor --backend=html5 -o file.html -`
+[alt="AsciiDoc output of remwharead, formatted to HTML"]
+[link="https://doc.schlomp.space/.remwharead/example_dates.png"]
+image::https://doc.schlomp.space/.remwharead/example_dates.png[]
+
+== Get remwharead
+
+You can download the latest release from {uri-remwharead}/releases[]. If your
+CPU architecture is x86_64 (if you don't know, it probably is) and you use
+Debian, Ubuntu, or a distribution based on Debian or Ubuntu, you can use the
+attached `.deb` package. Download it and install it with
+`apt install ./remwharead_*.deb`. Gentoo users can use my repository as described
+in the {uri-remwharead}#gentoo[readme].
+
+If there is no package for your distribution / operating system yet, you have to
+compile it yourself, as described in the {uri-remwharead}#from-source[readme].
+
+The extension for Firefox is available from {uri-ff-addon}[addons.mozilla.org].
+
+== How to use it
+
+=== Adding an entry
+
+.remwhareadFF
+image::/images/remwhareadFF.png[Screenshot of remwhareadFF,233,117,role="right"]
+
+Saving things is simple: Just type `remwharead` followed by the URI into your
+terminal and press “Enter”. To add tags, use the command-line switch `-t` or
+`--tags`.
+
+But most of the time you'll probably want to use {uri-ff-addon}[remwhareadFF],
+the Firefox extension.
+
+.Example: Save this article with the tags remwharead, bookmarks and archive.
+----
+{{< highlight shell >}}
+remwharead -t remwharead,bookmarks,archive https://blog.tastytea.de/posts/keep-track-of-what-you-have-read-online-with-remwharead/
+{{< / highlight >}}
+----
+
+remwharead will automatically ask the Wayback Machine of the
+{uri-archive}[Internet Archive] to archive the page and store the URI of
+the archived page, unless you run it with `-N` or `--no-archive`.
+
+=== Retrieving / Exporting entries
+
+To display the saved things using the export format “simple”, type
+`remwharead -e simple`. You can filter by date and time with `-T` or
+`--time-span`, filter by tags with `-s` or `--search-tags` and perform a
+full-text search with `-S` or `--search-all`. You can also use `--search-tags`
+and `--search-all` with {uri-perlre}[regular expressions] by adding `-r` or
+`--regex`.
+
+.Example: Display all things you saved on 2019-09-23.
+----
+{{< highlight shellsession >}}
+% remwharead -e simple -T 2019-09-23,2019-09-24
+2019-09-23: Keep track of what you've read online with remwharead
+
+2019-09-23: Another good article
+
+{{< / highlight >}}
+----
+
+Times are in the format _YYYY-MM-DDThh:mm:ss_. `2019-09-23` is short for
+`2019-09-23T00:00:00`.
+
+.Example: Display all things you tagged with “apple” or “onion”.
+----
+{{< highlight shellsession >}}
+% remwharead -e simple -s "apple OR onion"
+2019-08-03: The best onion soup recipe of the whole internet!
+
+2019-04-12: 5 funny faces you can carve into YOUR apple today!
+
+{{< / highlight >}}
+----
+
+Most export formats show only a portion of the available data for readability
+reasons. If you want the full datasets, use `-e json` or `-e csv`. You can also
+access the SQLite database at `${XDG_DATA_HOME}/remwharead/database.sqlite`, for
+example with {uri-sqlitebrowser}[sqlitebrowser].
+
+NOTE: `${XDG_DATA_HOME}` is usually `~/.local/share`.
+
+==== Create an RSS feed
+
+Want to share what you read? With the “rss” export you can create an RSS feed
+for your friends to subscribe to.
+Unfortunately, remwharead can't create a valid RSS
+feed out of the box, because it can't know what content the “link” element
+should have. You probably also want to change the title from “Visited things” to
+something more descriptive.
+
+.Example: Shell script to create a valid RSS feed of the last week.
+----
+{{< highlight shell >}}
+#!/bin/sh
+# Assumes the empty element is written as <link/>; adjust the pattern if not.
+remwharead -e rss -T "$(date -d '-1 week' -I),$(date -Iminutes)" \
+    | sed -e 's|<link/>|<link>https://example.com/</link>|' \
+          -e "s|<title>Visited things|<title>My hyperlink archive|" \
+    > /var/www/feed.rss
+{{< / highlight >}}
+----
+
+TIP: Put that script into `/etc/cron.hourly/` to update your feed once an hour.
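
When the feed is regenerated by cron, the `>` redirection empties `/var/www/feed.rss` before remwharead has produced any output, so a subscriber fetching the feed at that moment could get an empty or partial file. Here is a minimal sketch of one way around that, assuming a POSIX shell with `mktemp` available. `write_atomically` is a hypothetical helper of my own and not part of remwharead: it captures the output of any command in a temporary file next to the target and only renames it over the target if the command succeeded.

```shell
#!/bin/sh

# write_atomically TARGET COMMAND [ARGS...]
# Hypothetical helper (not part of remwharead): runs COMMAND, captures its
# standard output in a temporary file next to TARGET and replaces TARGET
# only if COMMAND succeeded. On failure, TARGET is left untouched.
write_atomically() {
    target="$1"
    shift

    # Create the temporary file in the same directory as TARGET, so that
    # the final mv is a rename on the same file system.
    tmp="$(mktemp "${target}.XXXXXX")" || return 1

    # mktemp creates the file with mode 0600; make it readable for the
    # web server before moving it into place.
    if "$@" > "${tmp}" && chmod 644 "${tmp}" && mv "${tmp}" "${target}"; then
        return 0
    fi

    # Something went wrong: clean up and keep the old TARGET.
    rm -f "${tmp}"
    return 1
}
```

To use it, you could change the feed script so that it prints to standard output instead of redirecting to a file, save it under a name of your choice (say `make-feed.sh`, a made-up name) and call `write_atomically /var/www/feed.rss ./make-feed.sh` from the cron job.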