Hacktoberfest in review

CS SYD participated in Hacktoberfest this year. It has inspired many more contributions than I expected. Here is an overview.

Three people were interested in helping with some issues: two of them with Hastory, and one with Validity. The Hastory contributions have been inspiring, so this post will focus on those.

Hastory

Hastory is a little command-line tool that hooks into your terminal command prompt to save every command you run, along with some metadata about it. You can then use this data to optimise your terminal usage. The gathered data includes:

Which command was run
In which working directory the command was run
When the command was run
Which user the command was run as
Which host the command was run on

Currently only two optimisations are implemented:

Quickly jump to the directory that you will most likely want to go to next, based on the directories that you cd to
Suggest shell aliases based on the commands that you run a lot
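To give an idea of what the alias suggestion boils down to, here is a minimal sketch. It is not Hastory's actual implementation: the names `commandCounts` and `aliasCandidates`, and the two thresholds, are made up for this illustration. It simply counts how often each command occurs and keeps the ones that are both frequent and long enough to be worth an alias.

```haskell
import Data.List (sortOn)
import qualified Data.Map.Strict as Map
import Data.Ord (Down (..))

-- | Count how often each command occurs in the history.
commandCounts :: [String] -> Map.Map String Int
commandCounts cmds = Map.fromListWith (+) [(c, 1) | c <- cmds]

-- | Commands that were run at least 'minRuns' times and that are at
-- least 'minLength' characters long, most frequent first.
-- Both thresholds are hypothetical knobs for this sketch.
aliasCandidates :: Int -> Int -> [String] -> [(String, Int)]
aliasCandidates minRuns minLength cmds =
  sortOn (Down . snd)
    [ (c, n)
    | (c, n) <- Map.toList (commandCounts cmds)
    , n >= minRuns
    , length c >= minLength
    ]

main :: IO ()
main = mapM_ print (aliasCandidates 2 5 history)
  where
    history =
      [ "git status", "ls", "git status", "ls", "ls"
      , "stack build", "git status", "stack build"
      ]
```

With this example history, `ls` is dropped because it is already short, and `git status` is listed before `stack build` because it was run more often.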

SQLite instead of JSON

Hastory used to store its logs in daily files (`.hastory/command-history/.log`) in the following format:

{"w":"/home/syd/cs-syd/code/open-source/easyspec/","d":"2017-06-16T23:35:51.96478113+01:00","t":"coop\n","u":"syd","h":"septus"}
{"w":"/home/syd/cs-syd/code/open-source/","d":"2017-06-16T23:35:54.574981592+01:00","t":"..\n","u":"syd","h":"septus"}
{"w":"/home/syd/cs-syd/code/open-source/wolf/","d":"2017-06-16T23:35:55.934447816+01:00","t":"cd wolf\n","u":"syd","h":"septus"}
[...]

That's one JSON object per line. This format was originally chosen because it is 1. easy to append to and 2. text-based and therefore future-proof. Logs were separated by day because of performance concerns about operating on this data.
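For illustration, reading this format back takes only a few lines with aeson. This is a sketch, not the code that Hastory used: the `LogLine` type, its field names and `decodeLog` are made up here, and only the JSON keys (`w`, `d`, `t`, `u`, `h`) come from the log lines above. It assumes the `aeson`, `bytestring`, `text` and `time` packages are available.

```haskell
{-# LANGUAGE OverloadedStrings #-}

import Data.Aeson (FromJSON (..), eitherDecode, withObject, (.:))
import qualified Data.ByteString.Lazy.Char8 as LB
import Data.Text (Text)
import Data.Time (UTCTime)

-- | One line of the old log format. The record and field names are
-- hypothetical; only the JSON keys are taken from the log files.
data LogLine = LogLine
  { lineWorkingDir :: FilePath -- key "w"
  , lineDateTime :: UTCTime -- key "d"
  , lineText :: Text -- key "t"
  , lineUser :: Text -- key "u"
  , lineHost :: Text -- key "h"
  } deriving (Show, Eq)

instance FromJSON LogLine where
  parseJSON = withObject "LogLine" $ \o ->
    LogLine <$> o .: "w" <*> o .: "d" <*> o .: "t" <*> o .: "u" <*> o .: "h"

-- | Decode a whole log file: one JSON object per non-empty line.
-- Fails with the first parse error, if there is one.
decodeLog :: LB.ByteString -> Either String [LogLine]
decodeLog = traverse eitherDecode . filter (not . LB.null) . LB.lines

-- | Read a log file from standard input and print the decoded entries.
main :: IO ()
main = LB.getContents >>= either putStrLn (mapM_ print) . decodeLog
```

Note that the key of every field has to be spelled out once per line in the file, even though it is the same for every entry.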

I realised that this format has some serious downsides as well:

It has a lot of redundancies. Every line contains the name of every column. It is uncompressed text, and JSON at that.
It is a custom format. Even if it is supposedly future-proof, this format is not standard and that makes tooling more difficult.

As part of Hacktoberfest, Steven Levia has implemented a change from this custom format to a single SQLite database. He has been a joy to work with, and may implement further performance optimisations in the near future.

Now the entries are stored in the following SQLite table, defined using persistent:

Entry
    text Text
    workingDir (Path Abs Dir)
    dateTime UTCTime
    hostName Text
    user Text
    deriving Show Eq Generic

Hastory: Append-only sync server

One of Hastory's purposes is to retain command history. For some purposes, it already does that better than bash and zsh. (Neither of those stores the working directory, for example.) It has been able to use this history for some advanced applied laziness, like smart directory changing. It hasn't (yet) been able to use this history for archiving or logging purposes, because all history is lost when a device loses its database.

This contribution is about creating a centralised server for collecting the Hastory entries. This will allow the data to be retained on the server side in case clients lose their data or disappear.

As part of Hacktoberfest, Yiğit Özkavcı has implemented such a sync server.
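To make the append-only idea concrete, here is a small in-memory model. It is only a sketch under my own assumptions and not the implemented server: the simplified `Entry` type, the `Store` type and the functions `appendEntries` and `entriesFor` are all hypothetical. The point is that the store only ever grows, and that sending the same entries twice does no harm.

```haskell
import qualified Data.Set as Set

-- | A simplified, hypothetical stand-in for a Hastory entry.
data Entry = Entry
  { entryHost :: String
  , entryTime :: Integer -- seconds since some epoch, to avoid extra dependencies
  , entryText :: String
  } deriving (Show, Eq, Ord)

-- | The server-side store. There is no operation that removes entries.
newtype Store = Store (Set.Set Entry)
  deriving (Show, Eq)

emptyStore :: Store
emptyStore = Store Set.empty

-- | Append a batch of client entries. Entries that are already present
-- are ignored, so a client can safely re-send a batch.
appendEntries :: [Entry] -> Store -> Store
appendEntries es (Store s) = Store (Set.union s (Set.fromList es))

-- | Everything the server has retained for one host, for example to
-- hand back to a client that has lost its local database.
entriesFor :: String -> Store -> [Entry]
entriesFor host (Store s) = filter ((== host) . entryHost) (Set.toList s)

main :: IO ()
main = do
  let batch = [Entry "septus" 1 "cd wolf", Entry "septus" 2 "stack build"]
      -- The same batch is sent twice; each entry is still stored once.
      store = appendEntries batch (appendEntries batch emptyStore)
  mapM_ print (entriesFor "septus" store)
```

Deduplicating on the whole entry is a design choice of this sketch; a real server would also need persistence and some notion of which user the entries belong to.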