Configuration design style guide

Looking at developer tools, and having written some myself, I have noticed that developers often struggle with the design of their tool's configuration. In this post I would like to summarize some design considerations and give examples of good and bad configuration design.

The need for configuration

Configuration, /kənˌfɪɡəˈreɪʃ(ə)n,-ɡjʊ-/: an arrangement of parts or elements in a particular form, figure, or combination.

Configuration is generally used to accomplish at least these three goals:

Generalized implementations (for maintainability)
Code often contains fixed numbers and default values for certain parameters. Making the implementation general with respect to these values makes the code healthier and easier to maintain. A constant like MAX_ITER = 5 is an example of such a configuration.
Personal customization
The workings of a tool are often customizable for a given user. This allows individual users to make the most out of the tool and the programmer to leave decisions about those configurations to the user. Changing the color of output is an example of such a configuration.
Different outcome
Sometimes tools need to work differently in different scenarios and it's not always possible for them to figure out how on their own. That's when outcomes should be configurable by the user. A --force flag is an example of a different outcome achieved by manual configuration.

Design decisions

Configuration design can take at least three forms: constants in code, command-line flags and configuration files. These forms don't each correspond to exactly one of the goals above, but the goal of a configuration does give an indication as to which form to use.

Configurability is generally desirable, so err on the side of configurability. When in doubt, make it configurable, either with a configuration file or with a flag.

Constants

Replace all 'magic' numbers with constants that describe their use. This will help you to better understand the semantics of those magic numbers. For example, 1 doesn't mean the same thing everywhere. Sometimes it's the LARGEST_BINARY_DIGIT, but other times it is the UNARY_BASE or the MULTIPLICATIVE_IDENTITY.
Wherever possible, use constants defined by a library.
Time libraries often export constants that involve time. Use those instead of defining SECONDS_IN_AN_HOUR yourself.
Don't define a constant if it's not really a constant.
Use a flag instead if it's not really a constant. Example: DECIMAL_BASE = 10 is a good constant but NB_RETRIES = 3 is not. Use a --nb-retries flag for the latter.
Don't make real constants configurable.
Example: Don't make a --decimal-base=INT option. This will only confuse users and is of no value.
Don't use a constant if its name would only refer back to the formula.
Example: discriminant = b ^ 2 - 4 * a * c is acceptable as is, as long as you make a comment about it. There is no need to change it to D = b ^ EXPONENT_OF_B_IN_DISCRIMINANT_FORMULA - FACTOR_OF_SECOND_TERM_IN_DISCRIMINANT_FORMULA * a * c.
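The rules above could look roughly like this in Python. This is only a sketch: the function names digit_sum, hours_to_seconds and discriminant are made up for illustration, and datetime.timedelta stands in for "a constant defined by a library".

```python
from datetime import timedelta

# A real constant: the base of the decimal system never changes.
DECIMAL_BASE = 10


def digit_sum(n: int) -> int:
    """Sum the decimal digits of a non-negative integer."""
    total = 0
    while n > 0:
        total += n % DECIMAL_BASE
        n //= DECIMAL_BASE
    return total


def hours_to_seconds(hours: float) -> float:
    """Let the time library do the conversion instead of a hand-written 3600."""
    return timedelta(hours=hours).total_seconds()


def discriminant(a: float, b: float, c: float) -> float:
    """Discriminant of a*x^2 + b*x + c.

    The 2 and the 4 are part of the formula itself, so naming them
    would only refer back to the formula.
    """
    return b ** 2 - 4 * a * c
```

Note that there is no NB_RETRIES here: a value like that is not really a constant and belongs in a flag.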

Command-line flags

Use the right format for command-line configuration.
Use words or their substrings for commands.
Use a single dash - for short (one-character) options.
Use a double dash -- for long options. Use kebab case names that look-like-this for long options.
Don't put commands in flags.
Commands change the fundamental use of your tool, while flags should only change configuration. The line here is blurry and will require the programmer's judgment, but the way gpg does it, with commands such as --encrypt and --decrypt written as flags, is definitely a bad example.
This has exactly two exceptions. --help should only display usage information and --version should only output the version number.
Don't use short flags if they're not obvious.
Example: -f for --force and -i for --interactive are okay, but -l is not a good abbreviation of --files-with-matches (grep).
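As an illustration, here is a minimal sketch of these rules using Python's argparse. The tool name mytool, its version string and the sync and status commands are hypothetical; only -f/--force and --nb-retries come from the examples in this post.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser for a hypothetical tool called 'mytool'."""
    parser = argparse.ArgumentParser(prog="mytool")
    # argparse adds -h/--help on its own; --version only prints the version.
    parser.add_argument("--version", action="version", version="mytool 0.1.0")

    # Commands are plain words, not flags.
    commands = parser.add_subparsers(dest="command", required=True)

    sync = commands.add_parser("sync", help="synchronize files")
    # An obvious short form: -f for --force.
    sync.add_argument("-f", "--force", action="store_true",
                      help="overwrite existing files")
    # A kebab-case long option with no short form, because none is obvious.
    sync.add_argument("--nb-retries", type=int, default=3,
                      help="number of retries before giving up")

    commands.add_parser("status", help="show the current state")
    return parser


# Parse an explicit argument list so the example does not depend on sys.argv.
args = build_parser().parse_args(["sync", "--force", "--nb-retries", "5"])
```

argparse stores --nb-retries as args.nb_retries, so the kebab-case name on the command line does not leak into the code.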

Configuration files

Make configuration files human-readable.
Configuration files are for external consumption. Making configuration files manually editable is essential to a good user experience.
A configuration file that is only editable via the program is not a configuration file but a data file, even if it is used to configure the program. This does not mean that the program may never modify its own configuration files, for example through a GUI.
Make sure configuration files are modularizable.
This means they must support recursive includes of files given by both relative and absolute paths.
DRY is a powerful concept that is entirely negated if configuration files are not modularizable. Copying parts of configuration files makes for a very bad user experience.
Be considerate when deciding where to put your configuration file.
The program has to make assumptions about where to go looking for the configuration file. By default you don't want to make the user specify the location of the configuration file, so make sure to have a sensible default.
The XDG base directory specification solves this problem elegantly.
Make the location of your configuration file configurable with a flag.
This is essential for users that have more than one configuration file for different occasions. Having to move configuration files around to use a different one makes for a very bad user experience.
Have the program look for the configuration file in more than one place.
Having a sensible default for the location of the configuration file is one thing. If the program has something to do with files in specific locations, having it look for a configuration file recursively upwards in the directory tree can make for a nice user experience. See Stylish Haskell for an example of such a case.
When possible, stick to values that could appear in a JSON object.
Don't use custom objects that you will have to parse with a custom parser. Stick to numbers, booleans, strings, etc.
If you must define a custom format, make it a full-fledged language. See the Super User Spark for an example of such a case.
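Several of the points above (a sensible default location, an overriding flag, an upward search and recursive includes) can be sketched together in Python. Everything named here is an assumption made for the example: the tool name mytool, the file names config.json and .mytool.json, the "include" key, and the lookup order of flag first, then upward search, then the XDG default.

```python
import json
import os
from pathlib import Path
from typing import Optional

APP_NAME = "mytool"          # hypothetical tool name
CONFIG_NAME = "config.json"  # hypothetical file name in the XDG directory
LOCAL_NAME = ".mytool.json"  # hypothetical per-project file name


def xdg_config_path() -> Path:
    """Default location: $XDG_CONFIG_HOME, falling back to ~/.config."""
    base = os.environ.get("XDG_CONFIG_HOME") or os.path.join(
        os.path.expanduser("~"), ".config")
    return Path(base) / APP_NAME / CONFIG_NAME


def find_config(flag_value: Optional[str], start_dir: Path) -> Optional[Path]:
    """Pick a configuration file: flag, then upward search, then XDG."""
    if flag_value is not None:
        return Path(flag_value)
    start = start_dir.resolve()
    # Walk from the starting directory up to the filesystem root.
    for directory in [start, *start.parents]:
        candidate = directory / LOCAL_NAME
        if candidate.is_file():
            return candidate
    default = xdg_config_path()
    return default if default.is_file() else None


def load_config(path: Path, seen: frozenset = frozenset()) -> dict:
    """Load a JSON object, merging the files listed under "include" first."""
    path = path.resolve()
    if path in seen:
        raise ValueError(f"include cycle at {path}")
    data = json.loads(path.read_text())
    merged: dict = {}
    for entry in data.pop("include", []):
        # Relative entries resolve against the including file;
        # absolute entries replace the parent directory entirely.
        merged.update(load_config(path.parent / entry, seen | {path}))
    # Values in the including file win over included ones.
    merged.update(data)
    return merged
```

With this, a file containing {"include": ["base.json"], "color": true} inherits every setting from base.json and overrides only the color, so nothing has to be copied between files.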

Deciding between putting a configuration in the configuration file or in a flag

There is a simple heuristic to decide whether a configuration should be put in the configuration file or in its own flag: "Frequency of configuration". How many times does the user have to manually configure this?

If the user is deploying the program as a system, use a flag. The configuration will most likely be saved in a startup script anyway.

If the user uses the command-line manually to interact with the system, put the configuration in the configuration file. Having to retype a --really-long-command-line-option often makes for a very bad user experience.