This post announces the new version 0.2.0.0 of the safe-coloured-text library. The safe-coloured-text library lets you safely output coloured text to a terminal. The idea for version 0.2.0.0 came from a very smart and annoyingly sensible comment on reddit. The first (0.1.0.0) version made the now-considered-erroneous decision to require the user to use UTF8. The newest (0.2.0.0) version relaxes that requirement by using Text instead of ByteString.
A quick primer on character encodings
Human language is seriously complex. Representing human language in computers is even more complex. This has to do with human history but more importantly also the history of computing and (historical) efficiency requirements. Here is an extremely simplified summary.
- Human text consists of Characters.[*]
- Unicode assigns a number to every[*] character. We call these numbers code points.
- A character encoding lets you map a sequence of code points from and to a sequence of octets (bytes[*]).
- We would like encodings to be efficient for common use-cases like "English text only" or "Text with European languages only".[*]
- UTF8 is a common encoding that is a good compromise for most use-cases.
- UTF8 is not the standard everywhere, and even Haskell's text package used UTF16 internally until recently.
- Systems try to specify the encoding that they want programs to use in various ways like, for example, the LANG environment variable.

[*]: Not really, but we've more or less been able to pretend so anyway.
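To make the difference between code points and octets concrete, here is a minimal sketch using the text and bytestring packages. The helper names codePoints, utf8Octets and utf16Octets are made up for this illustration; they are not part of any library.

```haskell
import Data.Char (ord)
import qualified Data.ByteString as BS
import qualified Data.Text as T
import qualified Data.Text.Encoding as TE
import Data.Word (Word8)

-- | The Unicode code points of a piece of text, one per character.
-- (Hypothetical helper, for illustration only.)
codePoints :: T.Text -> [Int]
codePoints = map ord . T.unpack

-- | The octets of the same text when encoded as UTF8.
utf8Octets :: T.Text -> [Word8]
utf8Octets = BS.unpack . TE.encodeUtf8

-- | The octets of the same text when encoded as little-endian UTF16.
utf16Octets :: T.Text -> [Word8]
utf16Octets = BS.unpack . TE.encodeUtf16LE
```

For the single character é (code point 233), codePoints gives [233], utf8Octets gives [195,169] and utf16Octets gives [233,0]: one code point, but different octets depending on the encoding.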
Relevant Haskell types
With that in mind, these are the relevant types in Haskell:
- Char: A Unicode code point
- String: A list of Chars: type String = [Char].
- Text: Like String, but more performant for most use-cases. (Text also doesn't support certain code points, like unmatched UTF16 surrogate code points, in versions before text-2.0.)
- ByteString: Like [Word8], but more performant for most use-cases.
These types are different for Real and Important reasons. Some examples include:
- Programmers often want to be able to talk about single Chars.
- Lists are a fundamental data (and control) structure in Haskell.
- In order for Text to roundtrip with ByteString, one must choose an encoding.
- For some encodings, not every ByteString represents a valid encoding of a sequence of characters. (UTF8, for example.) This means that decoding must be able to fail.
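The last two points can be sketched with the functions in Data.Text.Encoding. This is only an illustration: toUtf8 and fromUtf8 are hypothetical wrapper names, and UTF8 is just one possible choice of encoding.

```haskell
import qualified Data.ByteString as BS
import qualified Data.Text as T
import qualified Data.Text.Encoding as TE

-- | Encoding Text as UTF8 always succeeds.
-- (Hypothetical wrapper, for illustration only.)
toUtf8 :: T.Text -> BS.ByteString
toUtf8 = TE.encodeUtf8

-- | Decoding can fail, because not every ByteString is valid UTF8.
-- The failure is made visible in the type as Nothing.
fromUtf8 :: BS.ByteString -> Maybe T.Text
fromUtf8 = either (const Nothing) Just . TE.decodeUtf8'
```

Here fromUtf8 (toUtf8 t) gives back Just t for any Text value t, while fromUtf8 (BS.pack [0xFF]) is Nothing, because the octet 0xFF never occurs in valid UTF8.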
Notable changes
Version 0.2.0.0 of the safe-coloured-text library
The default output of the safe-coloured-text library is now Text instead of ByteString. Existing functions are deprecated according to the following scheme:
- renderChunks is now a deprecated synonym of renderChunksUtf8BSBuilder.
- renderChunksUtf8BSBuilder is a new function that outputs a ByteString.Builder.
- renderChunksBuilder is a new function that outputs a Text.Builder.
- renderChunksText is a new function that outputs a Text.
- renderChunksBS is now a deprecated synonym of renderChunksUtf8BS.
- renderChunksUtf8BS is a new function that outputs a ByteString.
Note that the new version of the library requires you to choose an encoding in order to continue outputting raw bytes, but does not break reverse dependencies that want to keep using renderChunks or renderChunksBS.
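As a rough sketch of what this looks like in practice, the following assumes that the Text.Colour module exports chunk, fore, green and TerminalCapabilities, and that the rendering functions take the terminal capabilities as their first argument. Check the library's documentation for the exact signatures. The names greeting, greetingText and greetingUtf8 are made up for this example.

```haskell
{-# LANGUAGE OverloadedStrings #-}

import Data.ByteString (ByteString)
import Data.Text (Text)
import qualified Data.Text.Encoding as TE
import Text.Colour

-- | Some chunks to render. (Hypothetical example value.)
greeting :: [Chunk]
greeting = [fore green (chunk "Hello"), chunk " world"]

-- | Render to Text: no encoding has been chosen yet.
greetingText :: TerminalCapabilities -> Text
greetingText tc = renderChunksText tc greeting

-- | Render to raw bytes: the function name makes the UTF8 choice explicit.
greetingUtf8 :: TerminalCapabilities -> ByteString
greetingUtf8 tc = renderChunksUtf8BS tc greeting
```

Code that wants bytes in some other encoding can start from greetingText and encode the result itself.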
Version 0.2.0.0 of the autodocodec-yaml library
The autodocodec-yaml library lets you output a schema for a JSON (and YAML) codec in a nice and colourful way. The functions that output these nicely coloured schemas now produce Text values instead of ByteStrings.
Version 0.11.0.0 of the sydtest library
The sydtest testing framework now tries to respect the system's locale by using the functions in Data.Text.IO instead of outputting UTF8 bytes directly.
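The difference between the two approaches can be sketched as follows. This is not sydtest's actual code; putLocale and putUtf8 are hypothetical names used only to contrast the two ways of writing to stdout.

```haskell
import qualified Data.ByteString as BS
import qualified Data.Text as T
import qualified Data.Text.Encoding as TE
import qualified Data.Text.IO as TIO
import System.IO (stdout)

-- | Write text using the handle's encoding, which GHC initialises
-- from the system's locale by default.
putLocale :: T.Text -> IO ()
putLocale = TIO.hPutStr stdout

-- | Write UTF8 octets directly, whatever the locale says.
putUtf8 :: T.Text -> IO ()
putUtf8 = BS.hPut stdout . TE.encodeUtf8
```

With a UTF8 locale the two should produce the same bytes. With any other locale, only putLocale follows what the system asked for.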