I wrote the first part of this two-part series for FP Complete. It can be found on their website. That post describes how settings should be handled. This post will describe how to do that in Haskell in practice.
Following the suggested approach to passing settings
Following the first two pieces of advice comes very easily in Haskell. Passing settings as an argument, instead of storing them in global state, is almost a given. We would have to resort to System.IO.Unsafe.unsafePerformIO to go against this principle. The second piece of advice, "Use immutable settings", is also mostly a given. I will not even go into how you could go against it in Haskell.
The next two pieces of advice are certainly less trivial. We will use a single example in the rest of this blog post: Suppose we were to write a very simple program with some of the functionality of grep and call it my-grep. We would like to be able to handle two cases:
Find a string
Replace one string by another
Keep in mind that the example is just that: an example. Focus on the principles and use good judgement to write code differently as needed.
Purely functional argument parsing
Purely functional argument parsing may be easier to do in Haskell than in other languages, but we still have to make the conscious decision to do so.
First we define the appropriate types. We need a Command type that contains all the command-specific information that we can get from the command-line arguments. This information should be an accurate, unprocessed reflection of what is present in the command-line arguments. Use simple types, and err on the side of using Maybe for values that are optional instead of using default values.
We will also need a Flags type that represents the non-command-specific flags.
Finally, we add one more type, Arguments, to package up the previous two.
Now we have to write a pure parsing function.
The specifics of the error case are not as important as making sure that the error is a pure value and not just an exception. It is fine if we handle this error by calling die, because it is usually a good idea to stop the program if the argument parsing fails, but a pure function should exist for testing.
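As a rough illustration of the three types and the pure parsing function described above, here is a minimal sketch that uses only base. The constructor and field names, the --directory flag, and the hand-rolled parser are assumptions made for this example. With optparse-applicative, execParserPure could play the role that parseArguments plays here.

```haskell
import System.Environment (getArgs)
import System.Exit (die)

-- | Command-specific information, exactly as found on the command line.
data Command
  = CommandSearch String          -- my-grep search NEEDLE
  | CommandReplace String String  -- my-grep replace OLD NEW
  deriving (Show, Eq)

-- | Flags that are not specific to a command. 'Nothing' means "not given".
data Flags = Flags
  { flagConfigFile :: Maybe FilePath
  , flagDirectory :: Maybe FilePath  -- hypothetical flag for this sketch
  } deriving (Show, Eq)

-- | Everything we learned from the command line.
data Arguments = Arguments Command Flags
  deriving (Show, Eq)

-- | Pure parser: no IO and no exceptions, so it is easy to test.
parseArguments :: [String] -> Either String Arguments
parseArguments = go (Flags Nothing Nothing) []
  where
    go flags positional args = case args of
      "--config-file" : path : rest -> go flags {flagConfigFile = Just path} positional rest
      "--directory" : dir : rest -> go flags {flagDirectory = Just dir} positional rest
      ('-' : '-' : unknown) : _ -> Left ("Unknown or incomplete option: --" ++ unknown)
      arg : rest -> go flags (arg : positional) rest
      [] -> do
        command <- parseCommand (reverse positional)
        Right (Arguments command flags)

-- | Interpret the positional arguments as one of the two commands.
parseCommand :: [String] -> Either String Command
parseCommand ["search", needle] = Right (CommandSearch needle)
parseCommand ["replace", old, new] = Right (CommandReplace old new)
parseCommand other = Left ("Could not parse command: " ++ unwords other)

-- | The only impure part: fetch the arguments and stop on a parse failure.
getArguments :: IO Arguments
getArguments = do
  args <- getArgs
  either die pure (parseArguments args)
```

Because parseArguments returns an Either, a test suite can check both the success and the failure cases without touching the real command line.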
When using optparse-applicative, please use fullDesc generously. We suggest running stack build :my-program --file-watch --exec='my-program' to see what the output looks like while writing this part.
Now that we have gathered the arguments, the next step is to gather the appropriate information from the environment, the configuration file(s), and possibly even other sources like the program name. In Haskell, we have access to the arguments via getArgs :: IO [String], to the program name via getProgName :: IO String, and to the environment via getEnvironment :: IO [(String, String)]. All of these functions live in IO, but it is important that we keep as much of the pre-processing as possible pure.
Gathering the relevant part of the environment
The environment is up first. We will define a new type that represents the relevant part of the environment. This type should be an accurate representation of the information found in the environment. Again: err on the side of using Maybe instead of default values.
We will parse the relevant part of the environment with a function that has the following type:
relevantEnvironment :: [(String, String)] -> Environment
Note that we do not allow the gathering from the environment to differ based on the arguments we just parsed. Also note that relevantEnvironment is pure and not allowed to error. This helps to ensure that it does not perform any processing yet.
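A minimal sketch of this step could look as follows. The field names, the wrapper getRelevantEnvironment, and the MY_GREP_DIRECTORY variable are made up for illustration; only MY_GREP_CONFIG_FILE is a variable this post actually mentions.

```haskell
import System.Environment (getEnvironment)

-- | The part of the environment that my-grep cares about.
-- 'Nothing' means "variable not set"; no defaults are filled in here.
data Environment = Environment
  { envConfigFile :: Maybe FilePath  -- MY_GREP_CONFIG_FILE
  , envDirectory :: Maybe FilePath   -- MY_GREP_DIRECTORY (hypothetical)
  } deriving (Show, Eq)

-- | Pure and total: it only selects values and cannot fail.
relevantEnvironment :: [(String, String)] -> Environment
relevantEnvironment vars =
  Environment
    { envConfigFile = lookup "MY_GREP_CONFIG_FILE" vars
    , envDirectory = lookup "MY_GREP_DIRECTORY" vars
    }

-- | The thin IO wrapper around the pure function.
getRelevantEnvironment :: IO Environment
getRelevantEnvironment = relevantEnvironment <$> getEnvironment
```

Since the function only looks values up, every possible environment maps to a valid Environment value, and interpreting those values is left to a later step.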
Gathering the configuration
We approach gathering from the configuration in the same manner. First we define a type whose values represent the configuration that we may find in the configuration files. Similar to the Arguments and the Environment, the Configuration should use Maybe values to signify when a certain value is not configured.
Using the Arguments and the Environment that we just gathered, we will get the configuration from configuration files with a function of this type:
getConfiguration :: Arguments -> Environment -> IO Configuration
Note that the IO is necessary for reading files, so we may as well die here if anything goes wrong with reading configuration files. You may want to return Maybe Configuration from the getConfiguration function in case you want to handle situations in which no configuration file exists yet. Note also that the previous steps were necessarily completed beforehand, because usually we would want to be able to override the config file path with a --config-file option or a MY_GREP_CONFIG_FILE environment variable.
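To make the override order concrete, here is a hedged sketch of the Maybe Configuration variant of getConfiguration. It uses trimmed-down stand-ins for Arguments and Environment that hold only the config file path. The default path my-grep.cfg, the "key: value" file format, and the rule that the flag wins over the environment variable are all assumptions of this example.

```haskell
import Control.Applicative ((<|>))
import Control.Exception (IOException, try)
import Data.Char (isSpace)
import Data.Maybe (fromMaybe)

-- Trimmed-down stand-ins for the types gathered in the previous steps.
newtype Arguments = Arguments {argConfigFile :: Maybe FilePath}
newtype Environment = Environment {envConfigFile :: Maybe FilePath}

-- | What the configuration file says; 'Nothing' means "not configured".
data Configuration = Configuration
  { confDirectory :: Maybe FilePath
  } deriving (Show, Eq)

-- | The flag wins over the environment variable, which wins over the default.
configFilePath :: Arguments -> Environment -> FilePath
configFilePath args env =
  fromMaybe "my-grep.cfg" (argConfigFile args <|> envConfigFile env)

-- | Pure parser for the assumed "key: value" format.
parseConfiguration :: String -> Configuration
parseConfiguration contents =
  Configuration {confDirectory = lookup "directory" (map keyValue (lines contents))}
  where
    keyValue line =
      let (key, rest) = break (== ':') line
       in (trim key, trim (drop 1 rest))
    trim = dropWhile isSpace . reverse . dropWhile isSpace . reverse

-- | 'Nothing' when the file cannot be read, for example because it does not exist yet.
getConfiguration :: Arguments -> Environment -> IO (Maybe Configuration)
getConfiguration args env = do
  result <- try (readFile (configFilePath args env)) :: IO (Either IOException String)
  pure (either (const Nothing) (Just . parseConfiguration) result)
```

Only the file reading lives in IO; choosing the path and parsing the contents stay pure and can be tested directly.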
The next step is to actually process these Configuration values. We define a type Dispatch that contains all the command-specific settings that the program will use, and a Settings type that contains all the non-command-specific settings.
We then build the settings that our application will use from all this gathered information. The format should be easy for our program to use. We would prefer to get as much of the validation as possible out of the way by calling die early. Building the settings happens with the following function:
combineToInstructions :: Command -> Flags -> Environment -> Configuration -> IO Instructions
Make sure that there is still no application logic in these settings. As an example: our little grep program needs to know in which files to look, and we want to be able to specify that using a directory name. Specifying foo as the directory in which to look should mean 'look in all the files inside this directory'. When processing that concept, we are allowed to pre-process a FilePath into a Path Abs Dir, but not list the directory to get a [Path Abs File]. The first part is considered parsing. The second part is considered application logic.
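The combining step might be sketched as follows. All constructor and field names are hypothetical, the input types are trimmed-down stand-ins, and the precedence order (flag, then environment, then configuration file, then a default of ".") is an assumption. To stay within base, this sketch keeps a plain FilePath where the text speaks of Path Abs Dir.

```haskell
import Control.Applicative ((<|>))
import Data.Maybe (fromMaybe)
import System.Exit (die)

-- Gathered inputs (trimmed-down stand-ins for the earlier types).
data Command = CommandSearch String | CommandReplace String String
newtype Flags = Flags {flagDirectory :: Maybe FilePath}
newtype Environment = Environment {envDirectory :: Maybe FilePath}
newtype Configuration = Configuration {confDirectory :: Maybe FilePath}

-- Settings in the shape the application wants: no 'Maybe' left to resolve.
data Dispatch
  = DispatchSearch String
  | DispatchReplace String String
  deriving (Show, Eq)

newtype Settings = Settings {setDirectory :: FilePath}
  deriving (Show, Eq)

data Instructions = Instructions Dispatch Settings
  deriving (Show, Eq)

-- | Pure core: resolve precedence and validate, but run no application logic.
combine :: Command -> Flags -> Environment -> Configuration -> Either String Instructions
combine cmd flags env conf = do
  dispatch <- case cmd of
    CommandSearch "" -> Left "The search string must not be empty."
    CommandSearch needle -> Right (DispatchSearch needle)
    CommandReplace "" _ -> Left "The string to replace must not be empty."
    CommandReplace old new -> Right (DispatchReplace old new)
  -- Assumed order: flag, then environment, then configuration file, then default.
  let dir = fromMaybe "." (flagDirectory flags <|> envDirectory env <|> confDirectory conf)
  Right (Instructions dispatch (Settings dir))

-- | IO wrapper with the signature from the text: stop early on invalid input.
combineToInstructions :: Command -> Flags -> Environment -> Configuration -> IO Instructions
combineToInstructions cmd flags env conf = either die pure (combine cmd flags env conf)
```

Note that the directory is only chosen here, never listed; listing its files belongs to the application.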
To encapsulate all of this logic, we put it in an OptParse module. We export ReplaceSettings(..) and a function called
getInstructions :: IO Instructions
This function is transitively responsible for all argument parsing, gathering from the environment and configuration file(s), and combining all of that information into the Instructions. Should anything go wrong, this function is allowed to die, so that the program only ever has to deal with valid settings.
Now we can write the application using the easy-to-use settings we wanted:
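As a minimal sketch of what such application code could look like, assume simplified Instructions, Dispatch and Settings types with hypothetical constructor names and a plain FilePath for the directory. The functions myGrep, runOnFile, searchLines and replaceAll are invented for this example. It needs the directory package, which ships with GHC.

```haskell
import Control.Monad (filterM)
import Data.List (isInfixOf, isPrefixOf)
import System.Directory (doesFileExist, listDirectory)

-- Simplified stand-ins for the settings types built by the OptParse module.
data Dispatch = DispatchSearch String | DispatchReplace String String
newtype Settings = Settings {setDirectory :: FilePath}
data Instructions = Instructions Dispatch Settings

-- | Lines of the contents that contain the needle.
searchLines :: String -> String -> [String]
searchLines needle = filter (needle `isInfixOf`) . lines

-- | Replace every non-overlapping occurrence of 'old' by 'new'.
-- An empty 'old' leaves the input unchanged.
replaceAll :: String -> String -> String -> String
replaceAll old new = go
  where
    go [] = []
    go str@(c : rest)
      | not (null old) && old `isPrefixOf` str = new ++ go (drop (length old) str)
      | otherwise = c : go rest

-- | Application logic starts here: only now do we list the directory.
myGrep :: Instructions -> IO ()
myGrep (Instructions dispatch (Settings dir)) = do
  entries <- listDirectory dir
  files <- filterM doesFileExist [dir ++ "/" ++ entry | entry <- entries]
  mapM_ (runOnFile dispatch) files

-- | Run the chosen command on a single file.
runOnFile :: Dispatch -> FilePath -> IO ()
runOnFile dispatch file = do
  contents <- readFile file
  case dispatch of
    DispatchSearch needle ->
      mapM_ (\line -> putStrLn (file ++ ": " ++ line)) (searchLines needle contents)
    DispatchReplace old new -> do
      let replaced = replaceAll old new contents
      -- Force the lazily read contents before overwriting the same file.
      length replaced `seq` writeFile file replaced
```

The application never inspects arguments, environment variables or configuration files itself. A main function would obtain the Instructions from getInstructions and pass them to something like myGrep.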