When starting a new project in C, I tend to focus on getting a config parser working before I do any real work on the project. Originally, I would dig around online for a header-only library to do this for me, the downside being that these libraries tend to be bloated with features I’ll never use. So I’ve devised my own simple key/value config parser, and this article is a tutorial on implementing it. You can find my more feature-complete version of this here
This implementation is intentionally hack-ish to keep it very simple and short. Towards the end of this article I will go over the issues it suffers from and what you could do to fix them.
First, let’s take a look at an example config file we’ll be parsing.
As you can see, the key/value pairs are separated only by a space. A key is always a string, but a value is either a string or an integer. Parsing this will be pretty straightforward, and because the pairs are separated by a space instead of a “=” symbol, we can cut some corners when it comes to checking syntax.
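A hypothetical config in this format might look like the following. The keys and values are made up purely for illustration: one key and one value per line, separated by a single space.

```
name server01
port 8080
max_clients 32
motd welcome
```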
Before we even think about parsing the config file, we need to load it. We’re taking the header-only approach here as the code required is minimal, hence the frequency of the “static” keyword.
This is a simple function, so I won’t go over it too much; the comments should be sufficient.
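A minimal sketch of such a loader, assuming the whole file fits in memory. The name conf_load and the exact error handling are my own choices for illustration, not necessarily what the original listing used.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Read the whole file at `path` into a NUL-terminated heap buffer.
 * Returns NULL on failure; the caller frees the result.
 * (conf_load is a hypothetical name.) */
static char *conf_load(const char *path)
{
    assert(path != NULL);

    FILE *fp = fopen(path, "rb");
    if (fp == NULL)
        return NULL;

    /* Find the file size by seeking to the end, then rewind. */
    long size = -1;
    if (fseek(fp, 0, SEEK_END) == 0)
        size = ftell(fp);
    if (size < 0 || fseek(fp, 0, SEEK_SET) != 0) {
        fclose(fp);
        return NULL;
    }

    char *buff = malloc((size_t)size + 1);
    if (buff == NULL) {
        fclose(fp);
        return NULL;
    }

    /* Read everything and terminate, so string functions are safe to use. */
    size_t got = fread(buff, 1, (size_t)size, fp);
    buff[got] = '\0';
    fclose(fp);
    return buff;
}
```

A caller would use it along the lines of `char *buff = conf_load("app.conf");`, check the result for NULL, and call free(buff) once parsing is finished.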
Now it’s time to parse the config. To do this we will be using the strtok function to “tokenize” the file contents. strtok mangles the given string, so we’ll create a duplicate of buff and use that with strtok, keeping our buff intact for future use.
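To illustrate why the duplicate matters, here is a small sketch. The helper names conf_dup and conf_first_token_len are hypothetical; conf_dup does by hand what strdup does on platforms that provide it.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Return a heap-allocated copy of `src`, or NULL if allocation fails. */
static char *conf_dup(const char *src)
{
    assert(src != NULL);
    size_t len = strlen(src) + 1;   /* include the terminating NUL */
    char *copy = malloc(len);
    if (copy != NULL)
        memcpy(copy, src, len);
    return copy;
}

/* Return the length of the first token in `buff` without modifying `buff`.
 * Returns 0 if there is no token or the copy could not be allocated. */
static size_t conf_first_token_len(const char *buff)
{
    char *copy = conf_dup(buff);
    if (copy == NULL)
        return 0;

    /* strtok overwrites the delimiter after the token with '\0',
     * but only inside `copy`; `buff` is never written to. */
    char *token = strtok(copy, " \t\n");
    size_t len = (token != NULL) ? strlen(token) : 0;

    free(copy);
    return len;
}
```

Because all the writes land in the copy, buff can safely be tokenized again later from a fresh duplicate.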
We should probably do some basic error checking. As I said previously, this is pretty simple because the key/value pairs are separated by a space. The easiest (albeit hack-ish) method is to count the tokens and check whether the count is divisible by two. If it is not, there’s a key with a missing value somewhere.
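The even-count check could be sketched roughly like this. The function names are hypothetical, and the counting works on a private copy so the caller's buffer stays intact.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Count the whitespace-separated tokens in `buff`.
 * Returns -1 if the scratch copy cannot be allocated. */
static int conf_count_tokens(const char *buff)
{
    assert(buff != NULL);

    size_t len = strlen(buff) + 1;
    char *copy = malloc(len);
    if (copy == NULL)
        return -1;
    memcpy(copy, buff, len);

    int count = 0;
    for (char *tok = strtok(copy, " \t\n"); tok != NULL; tok = strtok(NULL, " \t\n"))
        count++;

    free(copy);
    return count;
}

/* Return 1 if the token count is even (every key has a value), else 0. */
static int conf_tokens_ok(const char *buff)
{
    int count = conf_count_tokens(buff);
    return count >= 0 && count % 2 == 0;
}
```

Note that an odd count only tells you that something is missing, not where. A string value containing a space would also be counted as two tokens and throw the check off.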
Now we can start parsing the tokens. But wait! We are going to need some data structure to store the key/value pairs. For this we’ll use a struct containing a string for the key, an indicator of the value’s type, and a union to store the actual value.
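One way the struct could look is sketched below. The names conf_var_t, conf_type_int and conf_type_string come from this article; the field names, the fixed CONF_MAX_LEN size and the conf_copy helper are assumptions made for the example.

```c
#include <assert.h>
#include <string.h>

#define CONF_MAX_LEN 64   /* assumed upper bound for keys and string values */

typedef enum { conf_type_int, conf_type_string } conf_type_t;

typedef struct {
    char        name[CONF_MAX_LEN];   /* the key */
    conf_type_t type;                 /* tells us which union member is valid */
    union {
        int  i;
        char s[CONF_MAX_LEN];
    } value;
} conf_var_t;

/* Copy `src` into a fixed-size field, truncating if needed
 * and always leaving the result NUL-terminated. */
static void conf_copy(char dst[CONF_MAX_LEN], const char *src)
{
    assert(src != NULL);
    strncpy(dst, src, CONF_MAX_LEN - 1);
    dst[CONF_MAX_LEN - 1] = '\0';
}
```

Storing strings inline keeps cleanup trivial, since freeing the array frees everything, at the cost of silently truncating anything longer than CONF_MAX_LEN - 1 characters.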
Now that we have these, we can load and store our key/value pairs. Because we want to support arbitrarily sized config files, let’s allocate an array of conf_var_t.
Now that we’ve got all of that down, we can actually write the parser, for real this time. The last time we used strtok it mangled our copy of buff, so let’s re-create that copy and find the first token.
Essentially, strtok splits the given string at the given delimiters. In this case we pass the delimiters “ \t\n”: a space, a tab character, or a newline. “token” should now point to the first key in the file. We want to automate this as much as possible, so we’ll need a few temporary variables.
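Since every pair consumes two tokens, the array needs half as many entries as there are tokens. A sketch, assuming the token count is already known; conf_alloc_vars and the struct's field names are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>

#define CONF_MAX_LEN 64   /* assumed upper bound for keys and string values */

typedef enum { conf_type_int, conf_type_string } conf_type_t;

typedef struct {
    char        name[CONF_MAX_LEN];
    conf_type_t type;
    union { int i; char s[CONF_MAX_LEN]; } value;
} conf_var_t;

/* Allocate one zeroed conf_var_t per key/value pair.
 * `ntokens` must be positive and even; `*count` receives the number of
 * entries. Returns NULL (and sets `*count` to 0) on failure. */
static conf_var_t *conf_alloc_vars(int ntokens, size_t *count)
{
    assert(count != NULL);
    *count = 0;
    if (ntokens <= 0 || ntokens % 2 != 0)
        return NULL;

    size_t n = (size_t)ntokens / 2;
    conf_var_t *vars = calloc(n, sizeof *vars);
    if (vars != NULL)
        *count = n;
    return vars;
}
```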
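To make the splitting behaviour concrete, here is a sketch that collects token pointers into an array. conf_split and CONF_DELIMS are hypothetical names.

```c
#include <assert.h>
#include <string.h>

#define CONF_DELIMS " \t\n"   /* space, tab, newline */

/* Split `text` in place and store up to `max` token pointers in `out`.
 * Returns the number of tokens stored. `text` is modified by strtok. */
static size_t conf_split(char *text, char **out, size_t max)
{
    assert(text != NULL && out != NULL);

    size_t n = 0;
    char *token = strtok(text, CONF_DELIMS);   /* first call: pass the string */
    while (token != NULL && n < max) {
        out[n++] = token;
        token = strtok(NULL, CONF_DELIMS);     /* later calls: pass NULL */
    }
    return n;
}
```

Runs of consecutive delimiters count as a single separator, so blank lines and doubled spaces never produce empty tokens.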
So, we start with the first token, which is the first key in the config file. As every key should be followed by a value, we can interpret the first token as a key, the next as a value, the one after that as a key, and so on. We’ll use the ‘t’ variable to indicate whether we are at a key or a value, 0 being key, 1 being value. The following code resides in the while loop.
As you can see, the keys were pretty easy to handle, since they are only ever strings. Next we need some code within the else statement to figure out whether our value is a string or an integer. The easiest way is to write a function that determines whether our value is a number. I won’t go into this much, as what it does should be pretty obvious.
So now we’ve got a way to determine the type of a value, let’s go back to that else branch and put it to use. In the case of a number value, we convert the string to an integer and set the type to conf_type_int. In the case of a string, we copy the string into the value and set the type to conf_type_string. After that we increment i, which is the array index. And so ends the else statement.
The last bit of code for the while loop is small, but it’s also the most important. We flip the t variable so that the next loop iteration knows whether it’s dealing with a key or a value, and we get the next token. The strtok call takes the same delimiters as before, but this time we pass NULL instead of a string. The reason for this is that strtok keeps track of the string it’s working on between calls, so we pass NULL to let strtok know we want to continue tokenizing the same string.
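Putting the key branch, the value branch and the loop tail together, the whole loop might look roughly like this. It is a sketch under stated assumptions: the field names, CONF_MAX_LEN, conf_is_number and conf_parse are hypothetical, and I use strtol for the conversion without handling values outside the range of int.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

#define CONF_MAX_LEN 64
#define CONF_DELIMS  " \t\n"

typedef enum { conf_type_int, conf_type_string } conf_type_t;

typedef struct {
    char        name[CONF_MAX_LEN];
    conf_type_t type;
    union { int i; char s[CONF_MAX_LEN]; } value;
} conf_var_t;

/* Optional '-' followed by at least one digit. */
static int conf_is_number(const char *s)
{
    if (*s == '-') s++;
    if (*s == '\0') return 0;
    for (; *s != '\0'; s++)
        if (!isdigit((unsigned char)*s)) return 0;
    return 1;
}

/* Parse `copy` (a scratch duplicate of the file contents, modified by
 * strtok) into `vars`, which has room for `max` pairs.
 * Returns the number of pairs stored. */
static size_t conf_parse(char *copy, conf_var_t *vars, size_t max)
{
    assert(copy != NULL && vars != NULL);

    size_t i = 0;   /* array index */
    int t = 0;      /* 0: token is a key, 1: token is a value */
    char *token = strtok(copy, CONF_DELIMS);

    while (token != NULL && i < max) {
        if (t == 0) {
            /* Key: always a string. */
            strncpy(vars[i].name, token, CONF_MAX_LEN - 1);
            vars[i].name[CONF_MAX_LEN - 1] = '\0';
        } else {
            if (conf_is_number(token)) {
                vars[i].type = conf_type_int;
                vars[i].value.i = (int)strtol(token, NULL, 10);
            } else {
                vars[i].type = conf_type_string;
                strncpy(vars[i].value.s, token, CONF_MAX_LEN - 1);
                vars[i].value.s[CONF_MAX_LEN - 1] = '\0';
            }
            i++;    /* pair complete, move to the next slot */
        }
        t = !t;                               /* key <-> value */
        token = strtok(NULL, CONF_DELIMS);    /* NULL: continue on the same string */
    }
    return i;
}
```

The `i < max` guard is an extra safety net: if the even-token check was done first and the array has one entry per pair, the loop ends on its own when strtok returns NULL.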
And so ends the parser. The only thing left to do now is to write some small getter functions that take a key and return the value stored for that key in the array. You’ll want one each for string and integer values. I’ll write the integer one and let you figure out the rest on your own, as the code is almost identical. Also remember to free the vars array after you are done with it!
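A possible shape for the integer getter is sketched below. The name conf_get_int and the field names are hypothetical. Returning the value through an out parameter is my own choice, so that a missing key can be told apart from a stored 0.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define CONF_MAX_LEN 64

typedef enum { conf_type_int, conf_type_string } conf_type_t;

typedef struct {
    char        name[CONF_MAX_LEN];
    conf_type_t type;
    union { int i; char s[CONF_MAX_LEN]; } value;
} conf_var_t;

/* Look up `key` among `count` parsed entries. On success, store the
 * integer in `*out` and return 1. Return 0 (leaving `*out` untouched)
 * if the key is missing or its value is not an integer. */
static int conf_get_int(const conf_var_t *vars, size_t count,
                        const char *key, int *out)
{
    assert(vars != NULL && key != NULL && out != NULL);

    for (size_t i = 0; i < count; i++) {
        if (strcmp(vars[i].name, key) == 0) {
            if (vars[i].type != conf_type_int)
                return 0;
            *out = vars[i].value.i;
            return 1;
        }
    }
    return 0;
}
```

The lookup is a linear scan, which is perfectly adequate for the handful of entries a typical config file holds.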
This is a pretty quick and easy implementation of a very simple key/value config parser. As I noted at the start, there are some issues with it, specifically the error checking. Currently we check whether the number of tokens is divisible by two, and if it’s not, an error is thrown and the config isn’t parsed. Ideally the parser should be able to discard any keys that have no value and continue parsing the rest of the config, but I’ll leave that for you to implement should you find it necessary.
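One possible starting point for that improvement is to process the file line by line and keep only the lines that hold both a key and a value. This is a sketch of a different technique from the strtok loop above, and it assumes exactly one pair per line. conf_pair_t and conf_collect_pairs are hypothetical names, and values are kept as plain strings to keep the example short.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define CONF_MAX_LEN 64

typedef struct {
    char key[CONF_MAX_LEN];
    char value[CONF_MAX_LEN];
} conf_pair_t;

/* Walk `buff` one line at a time. Lines holding a key and a value are
 * stored in `out` (up to `max`); lines with a missing value, blank lines
 * and overlong lines are skipped. Returns the number of pairs stored. */
static size_t conf_collect_pairs(const char *buff, conf_pair_t *out, size_t max)
{
    assert(buff != NULL && out != NULL);

    size_t n = 0;
    const char *line = buff;

    while (*line != '\0' && n < max) {
        size_t len = strcspn(line, "\n");   /* length of the current line */
        char tmp[2 * CONF_MAX_LEN + 2];

        if (len < sizeof tmp) {
            memcpy(tmp, line, len);
            tmp[len] = '\0';
            /* %63s reads at most CONF_MAX_LEN - 1 characters per field.
             * A result of 2 means both key and value were present. */
            if (sscanf(tmp, "%63s %63s", out[n].key, out[n].value) == 2)
                n++;
        }

        line += len;
        if (*line == '\n')
            line++;   /* step over the newline */
    }
    return n;
}
```

Anything after the second token on a line is ignored here; whether that should be an error instead is a design decision for your own version.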