yocton
This document is intended as an introductory guide to the Yocton parsing API. Let's start with a basic example that shows how to open an input file and read a single property:
This example shows the basic boilerplate of how to get started with the API. A Yocton document is an object (yocton_object) which contains properties (yocton_prop). We can expand the example into one that prints every property in a file:
However, this example only works when all property values are strings. Property values may instead be objects; these can be accessed using yocton_prop_inner(). Using this we can construct a recursive function that reads and prints all properties of all subobjects:
The APIs for many serialization formats are document-based: data is deserialized into a document object that can then be inspected (examples are the XML DOM and protocol buffers). Yocton instead uses a pull parser. With a pull parser, it is up to the caller to read data one item at a time. This avoids the need for either autogenerated code (as with protobufs) or complicated APIs; Yocton's API is minimalist and simple to learn.
The API has been designed with a particular approach in mind to using input data to populate data structures. It is assumed that Yocton objects will correspond to C structs, and object properties will correspond to C struct fields. Here's a simple example of how a struct might be read and populated; the example struct here is a minimal one containing a single string field:
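The shape of this approach can be sketched without the library itself. In the sketch below, toy_prop, toy_reader, toy_next() and copy_string() are hypothetical stand-ins for a pull-style reader and are not part of the Yocton API; only the pattern (pull one property at a time, match its name, store its value in a struct field) is taken from the text above.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a parsed property: a name and a string value. */
struct toy_prop {
    const char *name;
    const char *value;
};

/* Hypothetical stand-in for a pull-style reader over an in-memory list. */
struct toy_reader {
    const struct toy_prop *props;
    size_t count;
    size_t pos;
};

/* Return the next property, or NULL when there are none left. */
const struct toy_prop *toy_next(struct toy_reader *r)
{
    if (r->pos >= r->count) {
        return NULL;
    }
    return &r->props[r->pos++];
}

/* The struct being populated: a single string field. */
struct foo {
    char *bar;
};

/* Copy a string into freshly allocated memory; NULL if allocation fails. */
char *copy_string(const char *s)
{
    size_t len = strlen(s) + 1;
    char *result = malloc(len);
    if (result != NULL) {
        memcpy(result, s, len);
    }
    return result;
}

/* Pull properties one at a time; a property named "bar" sets out->bar.
   Unrecognized property names are ignored. */
void read_foo(struct toy_reader *r, struct foo *out)
{
    const struct toy_prop *p;

    while ((p = toy_next(r)) != NULL) {
        if (strcmp(p->name, "bar") == 0) {
            free(out->bar);   /* a later "bar" replaces an earlier one */
            out->bar = copy_string(p->value);
        }
    }
}
```

The caller owns the loop, so no intermediate document object is ever built; each struct type gets its own small reading function.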
While this is relatively easy to understand, it looks quite verbose. It is therefore important to note that there are convenience functions and macros to make things much simpler, as will be explained in the sections below.
Yocton is a recursive format where objects can also contain other objects. The assumption is that a subobject likely corresponds to a field with a struct type. Consider the following input:
This might be used to populate structs of the following types:
When subobjects are mapped to struct types in this way, a function can be written to populate each type of struct. In the examples above, read_foo() might be complemented with read_baz() and read_qux() functions. This makes for clear and readable deserialization code; recursion in the programming language is used to handle recursion in the input file. The approach also means that the individual functions can be tested in isolation.
Yocton property values can contain arbitrary strings, the contents of which are open to interpretation. In practice though, the values are often likely to be one of several common base types which every C programmer is familiar with. There are convenience functions to help parse values into these types:
Function | Purpose |
---|---|
yocton_prop_int() | Parse value as a signed integer. Works with all integer types, performs bounds checking, etc. |
yocton_prop_uint() | Parse value as an unsigned integer. Works with all unsigned integer types, performs bounds checking, etc. |
yocton_prop_value_dup() | Returns the value as a plain, freshly allocated string, performing the appropriate checking for memory allocation failure. Useful for populating string fields. |
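To illustrate what bounds checking means for an integer value, here is a minimal sketch in plain C. parse_signed_checked() is a hypothetical helper, not a Yocton function, and it assumes the caller passes the size in bytes of the destination type; how the library itself performs the check is not shown here.

```c
#include <errno.h>
#include <limits.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical helper, not a Yocton function: parse `text` as a signed
   decimal integer that must fit in a signed type of `n` bytes.
   Returns 1 and stores the result in *out on success, 0 on any error. */
int parse_signed_checked(const char *text, size_t n, long long *out)
{
    char *end;
    long long value, min, max;

    if (n == 0 || *text == '\0') {
        return 0;
    }

    errno = 0;
    value = strtoll(text, &end, 10);
    if (errno == ERANGE || end == text || *end != '\0') {
        return 0;   /* out of range for long long, or not a number */
    }

    if (n >= sizeof(long long)) {
        min = LLONG_MIN;
        max = LLONG_MAX;
    } else {
        /* Range of an n-byte two's complement integer. */
        max = (1LL << (n * CHAR_BIT - 1)) - 1;
        min = -max - 1;
    }
    if (value < min || value > max) {
        return 0;
    }

    *out = value;
    return 1;
}
```

Passing sizeof() of the destination variable lets one function serve every signed integer type: a value of 128 is rejected for a one-byte target but accepted for a four-byte one.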
While these functions are useful, in most cases it is more convenient to use the preprocessor macros which are specifically intended for populating variables (and struct fields).
Type | Macro |
---|---|
Signed integer | YOCTON_VAR_INT() |
Unsigned integer | YOCTON_VAR_UINT() |
String | YOCTON_VAR_STRING() |
Consider the following input:
We might want to read this input and populate the following struct type:
In the following example, we populate a struct foo variable named x. A different YOCTON_VAR_... macro is used to match each property name and assign a value to a different struct field:
In the above example the fields of a struct are being populated, but this does not have to be the case; for example, the following sets an ordinary variable named string_value:
It is important to note that these macros are designed to provide a simple and convenient API, not for efficiency. If performance is essential or becomes a bottleneck, it may be preferable to avoid using these macros.
C provides enumerated types (enums) which allow the programmer to define a set of integer values with symbolic names. Yocton provides support for enums through the yocton_prop_enum() function, which maps a property value to an integer value through lookup in an array of strings. For example:
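The lookup itself can be sketched in plain C. lookup_enum(), enum color and color_names are hypothetical names chosen for illustration and are not part of the Yocton API; the sketch only shows how a string is mapped to an integer by its position in an array of names.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not a Yocton function: return the index of `value`
   in the NULL-terminated array `names`, or -1 if it is not present. */
int lookup_enum(const char *value, const char **names)
{
    int i;

    for (i = 0; names[i] != NULL; i++) {
        if (strcmp(value, names[i]) == 0) {
            return i;
        }
    }
    return -1;
}

/* Example enum; the order of the strings below must match it. */
enum color { COLOR_RED, COLOR_GREEN, COLOR_BLUE };

const char *color_names[] = { "RED", "GREEN", "BLUE", NULL };
```

Because the result is simply an array index, the strings must be listed in the same order as the enum constants they name.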
The array of strings must be NULL terminated. As with the other functions described in the previous section, it is usually simpler to use the YOCTON_VAR_ENUM() convenience macro:
Sometimes we might have a pointer variable, and want to initialize that variable when a particular property is read. For example, consider the following input:
We might want to use this to initialize the following pointer variable:
In this scenario, we can use YOCTON_VAR_PTR() to allocate a new struct foo. In the following example, when YOCTON_VAR_PTR() matches a property named foo, a new struct foo is allocated, my_foo is initialized to point to it, and parse_foo() is called to populate it from the property's object value.
The Yocton format has no special way of representing lists. Since property names do not have to be unique, it is simple enough to represent a list using multiple properties with the same name.
As with the previous example that described how to populate variables (and struct fields) with base types, convenience macros also exist for constructing arrays. The main difference is that an extra variable (or struct field) is needed to store the array length.
Type | Macro |
---|---|
String array | YOCTON_VAR_STRING_ARRAY() |
Signed integer array | YOCTON_VAR_INT_ARRAY() |
Unsigned integer array | YOCTON_VAR_UINT_ARRAY() |
Enum array | YOCTON_VAR_ENUM_ARRAY() |
Array of pointers | YOCTON_VAR_PTR_ARRAY() |
Array of structs | YOCTON_VAR_ARRAY() |
Consider the following input:
We might want to parse this input to populate the following struct type:
The following code populates a single struct bar named x:
While the above macros are convenient for building arrays of base types, often it is preferable to construct arrays of structs. The YOCTON_VAR_ARRAY() macro can be used to do this (actually, it can be used to construct arrays of any type; it is what the previous macros were built upon). It does the following:
Consider the following input:
We might want to parse this input into the following array:
In the following example, when YOCTON_VAR_ARRAY() matches a property named item, the items array is reallocated to make space for a new element (items[num_items]). The parse_foo() function is then called to populate the contents of this new struct from the property's inner object value. Finally, the length of the array, num_items, is incremented.
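The three steps described above (grow the array, populate the new element, increment the length) can be sketched in plain C as follows. This is an illustration of the technique, not the macro's actual expansion; append_foo() and fill_foo() are hypothetical names, with fill_foo() standing in for parse_foo().

```c
#include <stdlib.h>
#include <string.h>

struct foo {
    int value;
};

/* Hypothetical stand-in for parse_foo(): fills in one element. */
void fill_foo(struct foo *f, int value)
{
    f->value = value;
}

/* Grow *items by one element, populate the new element, then increment
   *num_items. Returns 1 on success, or 0 if allocation fails, in which
   case the array and its length are left unchanged. */
int append_foo(struct foo **items, size_t *num_items, int value)
{
    struct foo *grown;

    grown = realloc(*items, (*num_items + 1) * sizeof(struct foo));
    if (grown == NULL) {
        return 0;
    }
    *items = grown;

    /* Zero the new slot before handing it to the populate function. */
    memset(&grown[*num_items], 0, sizeof(struct foo));
    fill_foo(&grown[*num_items], value);
    ++*num_items;
    return 1;
}
```

Starting from a NULL array and a length of zero works because realloc() with a NULL pointer behaves like malloc().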
The previous section covered how to construct an array of structs. The analogous YOCTON_VAR_PTR_ARRAY() can be used to construct an array of struct pointers. Consider the following input (same input as the previous section):
We might want to parse this input into the following array (note the difference to the previous section; this is an array of pointers to structs):
In the following example, when YOCTON_VAR_PTR_ARRAY() matches a property named item, a new struct foo is allocated and appended to the items array, and the parse_foo() function is called to populate the struct's contents from the property's inner object value. Finally, the length of the array, num_items, is incremented.
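For comparison, here is a minimal plain C sketch of appending to an array of pointers. As before, this illustrates the technique rather than the macro's actual expansion; append_foo_ptr() is a hypothetical name, and setting the value field stands in for the call to parse_foo().

```c
#include <stdlib.h>

struct foo {
    int value;
};

/* Allocate a zeroed struct foo, populate it, and append a pointer to it
   to *items, incrementing *num_items. Returns 1 on success, or 0 if
   either allocation fails, in which case nothing is changed. */
int append_foo_ptr(struct foo ***items, size_t *num_items, int value)
{
    struct foo *element;
    struct foo **grown;

    element = calloc(1, sizeof(struct foo));
    if (element == NULL) {
        return 0;
    }

    grown = realloc(*items, (*num_items + 1) * sizeof(struct foo *));
    if (grown == NULL) {
        free(element);
        return 0;
    }
    *items = grown;

    element->value = value;   /* stands in for parse_foo() */
    grown[*num_items] = element;
    ++*num_items;
    return 1;
}
```

One practical difference from an array of structs is that each element keeps its address when the array grows, since only the array of pointers is reallocated. The cost is one extra allocation per element, and each element must be freed individually.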
There are many different types of error that can occur while parsing a Yocton file. For example:
Continual checking for error conditions can make for complicated code. The Yocton API instead adopts an "error state" mechanism for error reporting. Write your code assuming success, and at the end, check once if an error occurred.
Here's how this works in practice: most parsing code involves continually calling yocton_next_prop() to read new properties from the file. If an error condition is reached, this function will stop returning any more properties. In effect it is like reaching the end of file. So when "end of file" is reached, simply check if an error occurred or whether the document was successfully parsed.
Here is a simple example of what this might look like:
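The mechanism can be illustrated with a toy reader written in plain C. Everything in this sketch (toy_reader, toy_next(), toy_sum(), and the rule that a negative number counts as bad input) is hypothetical and is not the Yocton implementation; it only demonstrates the "sticky" error state, where an error looks like end of input to the reading loop and is checked once afterwards.

```c
#include <stddef.h>

/* Hypothetical toy reader, not the Yocton implementation: it hands out
   integers from an array and treats a negative number as bad input. */
struct toy_reader {
    const int *data;
    size_t count;
    size_t pos;
    const char *error;   /* NULL until something goes wrong */
};

/* Return a pointer to the next item, or NULL at end of input. Once an
   error has been recorded, every later call also returns NULL. */
const int *toy_next(struct toy_reader *r)
{
    if (r->error != NULL || r->pos >= r->count) {
        return NULL;
    }
    if (r->data[r->pos] < 0) {
        r->error = "negative value";
        return NULL;
    }
    return &r->data[r->pos++];
}

/* Sum all items, written as if nothing can fail. The caller checks
   r->error once, after the loop has ended. */
long toy_sum(struct toy_reader *r)
{
    const int *item;
    long total = 0;

    while ((item = toy_next(r)) != NULL) {
        total += *item;
    }
    return total;
}
```

After toy_sum() returns, a non-NULL error field tells the caller that the loop stopped early and the result should be discarded; the loop body itself contains no error handling at all.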
Some of the API functions will also trigger the error state. It may be tempting to add extra checks in your code to avoid this happening, but it is better that you do not. If an error is triggered in this way, it is likely that it is due to an error in the file being parsed. Your API calls implicitly document the expected format of the input file. If the file does not conform to that format, it is the file that is wrong, not your code.
An example may be illustrative. Suppose your Yocton files contain a property called name which is expected to have a string value. If the property has an object value instead, a call to yocton_prop_value() to get the expected string value will trigger the error state. That is not a misuse of the API; your code is implicitly indicating that a string was expected, and the input is therefore erroneous. The line number where the error occurred is logged, just the same as if the file itself were syntactically incorrect.