I wrote the first part of this two-part series for FP Complete. It can be found on their website. That post describes how settings should be handled. This post will describe how to do that in Haskell in practice.
Following the suggested approach to passing settings
Following the first two pieces of advice comes very easily in Haskell. Passing settings as an argument, instead of storing them in global state, is almost a given. We would have to resort to System.IO.Unsafe.unsafePerformIO to go against this principle. The second piece of advice, "Use immutable settings", is also mostly a given. I will not even go into how you could go against it in Haskell.
The next two pieces of advice are certainly less trivial. We will use a single example in the rest of this blog post: Suppose we were to write a very simple program with some of the functionality of grep and call it my-grep. We would like to be able to handle two cases:
Find a string
Replace one string by another
Keep in mind that the example is just that: an example. Focus on the principles and use good judgement to write code differently as needed.
Purely functional argument parsing
Purely functional argument parsing may be easier to do in Haskell than in other languages, but we still have to make the conscious decision to do so.
First we define the appropriate types. We need a Command type that contains all the command-specific information that we can get from the command-line arguments. This information should be an accurate, unprocessed reflection of what is present in the command-line arguments. Use simple types, and err on the side of using Maybe for values that are optional instead of using default values.
Example:
data Command
  = CommandFind FindArgs
  | CommandReplace ReplaceArgs

data FindArgs = FindArgs String

data ReplaceArgs = ReplaceArgs String String
We will also need a type that represents the non-command-specific flags: Flags.
Example:
data Flags = Flags
  { flagConfigFile :: Maybe FilePath
  , flagVerbosity :: Maybe String
  }
Finally, we add one more type to package up the previous two:
data Arguments = Arguments Command Flags
Now we have to write a pure parsing function.
parseArguments :: [String] -> Either ArgError Arguments
The specifics of the error case are not as important as making sure that the error is a pure value and not just an exception. It is fine if we handle this ArgError by calling die, because it is usually a good idea to stop the program if the argument parsing fails, but a pure function should exist for testing.
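To make the idea concrete, here is a minimal hand-rolled sketch of such a pure parser that uses only base. The constructors of ArgError, the --verbosity flag name and the find/replace command names are assumptions made for illustration; only the types and the --config-file option come from this post. In a real program we would probably let a library do this work.

```haskell
-- Types from the post, with instances added so results can be compared.
data Command
  = CommandFind FindArgs
  | CommandReplace ReplaceArgs
  deriving (Show, Eq)

data FindArgs = FindArgs String
  deriving (Show, Eq)

data ReplaceArgs = ReplaceArgs String String
  deriving (Show, Eq)

data Flags = Flags
  { flagConfigFile :: Maybe FilePath
  , flagVerbosity :: Maybe String
  } deriving (Show, Eq)

data Arguments = Arguments Command Flags
  deriving (Show, Eq)

-- Hypothetical error type: the post leaves ArgError unspecified.
data ArgError
  = MissingCommand
  | UnknownCommand String
  | WrongArity String
  | MissingFlagValue String
  deriving (Show, Eq)

-- The pure entry point: no IO, no exceptions, only an Either.
parseArguments :: [String] -> Either ArgError Arguments
parseArguments args = do
  (flags, positional) <- splitFlags args
  command <- parseCommand positional
  pure (Arguments command flags)

-- Separate the known flags (assumed names) from the positional arguments.
splitFlags :: [String] -> Either ArgError (Flags, [String])
splitFlags = go (Flags Nothing Nothing) []
  where
    knownFlags = ["--config-file", "--verbosity"]
    go flags acc [] = Right (flags, reverse acc)
    go flags acc ("--config-file" : path : rest) =
      go (flags {flagConfigFile = Just path}) acc rest
    go flags acc ("--verbosity" : level : rest) =
      go (flags {flagVerbosity = Just level}) acc rest
    go flags acc (arg : rest)
      -- A known flag that reaches this clause had no value after it.
      | arg `elem` knownFlags = Left (MissingFlagValue arg)
      | otherwise = go flags (arg : acc) rest

-- Interpret the positional arguments as one of the two commands.
parseCommand :: [String] -> Either ArgError Command
parseCommand [] = Left MissingCommand
parseCommand ("find" : rest) = case rest of
  [query] -> Right (CommandFind (FindArgs query))
  _ -> Left (WrongArity "find")
parseCommand ("replace" : rest) = case rest of
  [original, replacement] -> Right (CommandReplace (ReplaceArgs original replacement))
  _ -> Left (WrongArity "replace")
parseCommand (other : _) = Left (UnknownCommand other)
```

Because parseArguments never touches IO, a test can feed it a list of strings directly and compare the result, and the program itself can still turn a Left into a call to die at the very edge.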
We suggest using optparse-applicative or optparse-simple to do the actual argument parsing. There are excellent tutorials in the optparse-applicative README, on 24 Days of Hackage, and in the optparse-simple README.
When using optparse-applicative, please use help and fullDesc generously and set prefShowHelpOnError and prefShowHelpOnEmpty to True.
We suggest running stack build :my-program --file-watch --exec='my-program' to see what the output looks like while writing this part.
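As a rough sketch of what this can look like with optparse-applicative (the package has to be installed): the parser below is run purely with execParserPure, and the prefs builder with showHelpOnError and showHelpOnEmpty is how those two preference fields get switched on. The command names, metavars, help texts and the --verbosity flag are assumptions made for illustration.

```haskell
import Options.Applicative

-- Types from the post, with instances added so results can be compared.
data Command
  = CommandFind FindArgs
  | CommandReplace ReplaceArgs
  deriving (Show, Eq)

data FindArgs = FindArgs String
  deriving (Show, Eq)

data ReplaceArgs = ReplaceArgs String String
  deriving (Show, Eq)

data Flags = Flags
  { flagConfigFile :: Maybe FilePath
  , flagVerbosity :: Maybe String
  } deriving (Show, Eq)

data Arguments = Arguments Command Flags
  deriving (Show, Eq)

-- One subcommand per case that my-grep should handle.
parseCommand :: Parser Command
parseCommand =
  hsubparser $
    mconcat
      [ command "find" (info (CommandFind <$> parseFindArgs) (progDesc "Find a string"))
      , command "replace" (info (CommandReplace <$> parseReplaceArgs) (progDesc "Replace one string by another"))
      ]

parseFindArgs :: Parser FindArgs
parseFindArgs =
  FindArgs <$> strArgument (metavar "QUERY" <> help "The string to look for")

parseReplaceArgs :: Parser ReplaceArgs
parseReplaceArgs =
  ReplaceArgs
    <$> strArgument (metavar "ORIGINAL" <> help "The string to replace")
    <*> strArgument (metavar "REPLACEMENT" <> help "The string to put in its place")

-- Optional flags stay Maybe: no defaults are filled in at this stage.
parseFlags :: Parser Flags
parseFlags =
  Flags
    <$> optional (strOption (long "config-file" <> metavar "FILE" <> help "Path to the configuration file"))
    <*> optional (strOption (long "verbosity" <> metavar "LEVEL" <> help "How much to log"))

argumentsParser :: ParserInfo Arguments
argumentsParser =
  info
    (helper <*> (Arguments <$> parseCommand <*> parseFlags))
    (fullDesc <> progDesc "A very small grep-like program")

-- Pure: takes the raw argument list and returns a ParserResult.
runArgumentsParser :: [String] -> ParserResult Arguments
runArgumentsParser = execParserPure parserPrefs argumentsParser
  where
    parserPrefs = prefs (showHelpOnError <> showHelpOnEmpty)
```

In main we could then write getArgs >>= handleParseResult . runArgumentsParser, which prints the help text and exits on failure, while tests can inspect the ParserResult directly, for example with getParseResult.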
Pre-processing settings
Now that we have gathered the arguments, the next step is to gather the appropriate information from the environment, the configuration file(s), and possibly even other sources like the program name. In Haskell, we have access to the arguments via getArgs :: IO [String], to the program name via getProgName :: IO String, and to the environment via getEnvironment :: IO [(String, String)]. All of these functions live in IO, but it is important that we keep as much of the pre-processing as possible pure.
Gathering the relevant part of the environment
The environment is up first. We will define a new type that represents the relevant part of the environment. This type should be an accurate representation of the information found in the environment. Again: err on the side of using Maybe instead of default values.
data Environment = Environment
  { envVerbosity :: Maybe String
  , envConfigFile :: Maybe FilePath
  -- [...]
  }
We will parse the relevant part of the environment with a function that has the following type:
relevantEnvironment :: [(String, String)] -> Environment
Note that we do not allow the gathering from the environment to differ based on the arguments we just parsed. Also note that relevantEnvironment is pure and not allowed to error. This helps to ensure that it does not perform any processing yet.
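A minimal sketch of such a function could be little more than a pair of lookups. The MY_GREP_VERBOSITY variable name is an assumption made for illustration, and getRelevantEnvironment is a hypothetical helper that wires in getEnvironment.

```haskell
import System.Environment (getEnvironment)

-- The relevant part of the environment, exactly as found: no defaults yet.
data Environment = Environment
  { envVerbosity :: Maybe String
  , envConfigFile :: Maybe FilePath
  } deriving (Show, Eq)

-- Pure and total: a missing variable becomes Nothing and nothing is
-- validated or interpreted here.
relevantEnvironment :: [(String, String)] -> Environment
relevantEnvironment env =
  Environment
    { envVerbosity = lookup "MY_GREP_VERBOSITY" env -- assumed variable name
    , envConfigFile = lookup "MY_GREP_CONFIG_FILE" env
    }

-- Hypothetical thin IO wrapper, the only impure part of this step.
getRelevantEnvironment :: IO Environment
getRelevantEnvironment = relevantEnvironment <$> getEnvironment
```

Since the function takes the association list as an argument, a test can pass in any environment it likes without touching the real one.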
Gathering the configuration
We approach gathering from the configuration in the same manner. First we define a type whose values represent the configuration that we may find in the configuration files. Similar to the Arguments and the Environment, Configuration should use Maybe values to signify when a certain value is not configured.
data Configuration = Configuration
  { confVerbosity :: Maybe String
  -- [...]
  }
Using the Arguments and Environment that we just gathered, we will get the configuration from configuration files with a function of this type:
getConfiguration :: Arguments -> Environment -> IO Configuration
Note that the IO is necessary for reading files, so we may as well die here if anything goes wrong with reading configuration files. You may want to return Maybe Configuration from the getConfiguration function in case you want to handle situations in which no configuration file exists yet. Note also that the previous steps were necessarily completed beforehand, because usually we would want to be able to override the config file path with a --config-file option or a MY_GREP_CONFIG_FILE environment variable.
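Here is one way this step could be sketched using only base. Several things are assumptions made for illustration: the flag takes precedence over the environment variable, the configuration file uses a simple "key = value" line format, and a missing path results in an empty Configuration. The helpers configFilePath, parseConfiguration and emptyConfiguration are hypothetical names.

```haskell
import Control.Applicative ((<|>))
import Control.Exception (IOException, try)
import Data.Char (isSpace)
import Data.List (dropWhileEnd)
import System.Exit (die)

-- Types from the post, with instances added so results can be compared.
data Command = CommandFind FindArgs | CommandReplace ReplaceArgs
  deriving (Show, Eq)
data FindArgs = FindArgs String deriving (Show, Eq)
data ReplaceArgs = ReplaceArgs String String deriving (Show, Eq)
data Flags = Flags
  { flagConfigFile :: Maybe FilePath
  , flagVerbosity :: Maybe String
  } deriving (Show, Eq)
data Arguments = Arguments Command Flags deriving (Show, Eq)
data Environment = Environment
  { envVerbosity :: Maybe String
  , envConfigFile :: Maybe FilePath
  } deriving (Show, Eq)
data Configuration = Configuration
  { confVerbosity :: Maybe String
  } deriving (Show, Eq)

-- Nothing is configured at all.
emptyConfiguration :: Configuration
emptyConfiguration = Configuration {confVerbosity = Nothing}

-- Pure: decide which file to read. Assumption: the flag wins over the
-- environment variable.
configFilePath :: Arguments -> Environment -> Maybe FilePath
configFilePath (Arguments _ flags) env = flagConfigFile flags <|> envConfigFile env

-- Pure: read "key = value" lines (an assumed format) without interpreting
-- the values; lines without '=' are ignored.
parseConfiguration :: String -> Configuration
parseConfiguration contents =
  Configuration {confVerbosity = lookup "verbosity" pairs}
  where
    pairs =
      [ (trim key, trim (drop 1 value))
      | line <- lines contents
      , let (key, value) = break (== '=') line
      , not (null value)
      ]
    trim = dropWhileEnd isSpace . dropWhile isSpace

-- The only impure part: read the file, and die if that fails.
getConfiguration :: Arguments -> Environment -> IO Configuration
getConfiguration args env =
  case configFilePath args env of
    Nothing -> pure emptyConfiguration
    Just path -> do
      result <- try (readFile path) :: IO (Either IOException String)
      case result of
        Left err -> die ("Could not read configuration file " ++ path ++ ": " ++ show err)
        Right contents -> pure (parseConfiguration contents)
```

Keeping configFilePath and parseConfiguration pure means that only the file read itself needs IO, which leaves most of this step easy to test.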
Settings
The next step is to actually process these Arguments, Environment and Configuration values. We define a type Dispatch that contains all the command-specific settings that the program will use, and a Settings type that contains all the non-command-specific settings.
data Instructions
  = Instructions Dispatch Settings

data Dispatch
  = DispatchFind FindSettings
  | DispatchReplace ReplaceSettings

data FindSettings = FindSettings
  { findSetQuery :: Text
  -- [...]
  }

data ReplaceSettings = ReplaceSettings
  { replaceSetOriginal :: Text
  , replaceSetReplacement :: Text
  -- [...]
  }

data Settings = Settings
  { setVerbosity :: LogLevel
  -- [...]
  }
We then build the settings that our application will use from all this gathered information. The format should be easy for our program to use. We prefer to get as much validation as possible out of the way here, by calling die early. Building the settings happens with the combineToInstructions function:
combineToInstructions :: Command -> Flags -> Environment -> Configuration -> IO Instructions
Make sure that there is still no application logic in these settings. As an example: our little grep program needs to know in which files to look, and we want to be able to specify that using a directory name. Specifying foo as the directory in which to look should mean 'look in all the files inside this directory'. When processing that concept, we are allowed to pre-process a FilePath into a Path Abs Dir, but not list the directory to get a [Path Abs File]. The first part is considered parsing. The second part is considered application logic.
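The combining step could be sketched as follows. To keep the example self-contained it uses String where the post uses Text, and a local stand-in for LogLevel. The precedence order (flag, then environment, then configuration file), the LevelWarn default and the pure helper combinePure are assumptions made for illustration.

```haskell
import Control.Applicative ((<|>))
import Data.Char (toLower)
import System.Exit (die)

-- Gathered, unprocessed inputs (types from the post).
data Command = CommandFind FindArgs | CommandReplace ReplaceArgs
data FindArgs = FindArgs String
data ReplaceArgs = ReplaceArgs String String
data Flags = Flags {flagConfigFile :: Maybe FilePath, flagVerbosity :: Maybe String}
data Environment = Environment {envVerbosity :: Maybe String, envConfigFile :: Maybe FilePath}
data Configuration = Configuration {confVerbosity :: Maybe String}

-- Stand-in for the LogLevel type used in the post.
data LogLevel = LevelDebug | LevelInfo | LevelWarn | LevelError
  deriving (Show, Eq)

-- Processed settings (String instead of Text to avoid extra packages).
data Instructions = Instructions Dispatch Settings deriving (Show, Eq)
data Dispatch = DispatchFind FindSettings | DispatchReplace ReplaceSettings
  deriving (Show, Eq)
data FindSettings = FindSettings {findSetQuery :: String}
  deriving (Show, Eq)
data ReplaceSettings = ReplaceSettings
  { replaceSetOriginal :: String
  , replaceSetReplacement :: String
  } deriving (Show, Eq)
data Settings = Settings {setVerbosity :: LogLevel}
  deriving (Show, Eq)

-- Validation happens here: an unknown level is an error, not a default.
parseLogLevel :: String -> Either String LogLevel
parseLogLevel raw = case map toLower raw of
  "debug" -> Right LevelDebug
  "info" -> Right LevelInfo
  "warn" -> Right LevelWarn
  "error" -> Right LevelError
  _ -> Left ("Unknown verbosity: " ++ raw)

-- Pure core: the first source that has a value wins, then defaults apply.
combinePure :: Command -> Flags -> Environment -> Configuration -> Either String Instructions
combinePure command flags env conf = do
  level <- case flagVerbosity flags <|> envVerbosity env <|> confVerbosity conf of
    Nothing -> Right LevelWarn -- assumed default
    Just raw -> parseLogLevel raw
  pure (Instructions dispatch (Settings level))
  where
    dispatch = case command of
      CommandFind (FindArgs query) -> DispatchFind (FindSettings query)
      CommandReplace (ReplaceArgs original replacement) ->
        DispatchReplace (ReplaceSettings original replacement)

-- The function from the post: die early if validation fails.
combineToInstructions :: Command -> Flags -> Environment -> Configuration -> IO Instructions
combineToInstructions command flags env conf =
  either die pure (combinePure command flags env conf)
```

This is the first place where defaults appear: the Maybe values gathered earlier are resolved into the concrete values the program will use, and nothing here goes beyond parsing and validation.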
The API
To encapsulate all of this logic, we put it in an OptParse module. We export Instructions(..), Dispatch(..), Settings(..), FindSettings(..), ReplaceSettings(..) and a function called getInstructions:
getInstructions :: IO Instructions
This function is transitively responsible for all argument parsing, gathering from the environment and configuration file(s), and combining all of that information into the Instructions. Should anything go wrong, this function is allowed to die, so that the program only ever has to deal with valid settings.
Now we can write the application using the easy-to-use settings we wanted:
import Control.Monad.Reader (runReaderT)
import MyGrep.OptParse

main :: IO ()
main = do
  Instructions dispatch settings <- getInstructions
  runReaderT (doSomethingWith dispatch) settings