Persistent Configuration Tree

1. Overview
2. Source Code
3. Storage
- 3.1. Example: Basics
- 3.2. Example: Linked list
4. Store
5. Tree
6. Schema
7. Transactions
8. Inspiration
- 8.1. Related projects
9. About this documentation

1 Overview

This document describes a file based graph database.

Some of its features:

Atomic (either all changes are commited at once, or none are commited at all)
Persistent (the changes are not visible to the processes that have opened the database before the update)

The database is built on 3 layers: (ça va être renommé)

Storage: the raw storage, that can append new records to a file, and also manage a special pointer that references one of the record. (Finalized)
Store: a low level tree data structure based on the Storage layer, and that allows to perform multiples changes in memory on this tree before commiting everything to the storage. (Finalized)
Tree: a high level graph data structure based on the Store layer, that also provides a mean to apply constraints on the shape and values of the graph, and to simulate extra nodes and values dynamically. (Finalized)

In addition, there is a component to help describe validation schema:

Schema: a layer on top of the schema validation provided by the Tree layer. (WIP)

2 Source Code

The source code is available on GitHub:

https://github.com/fjolliton/ConfigurationTree

3 Storage

The storage is the low level layer that can store an unlimited number of immutable records, and manage a entry point offset. A record is a byte array that should contains neither \t nor \n characters, but can otherwise be of any length. The offset points at one (and only one) of these records.

The idea of the entry point is to point at the "current" value in the storage, and when a new value is to be added, it is first stored then if everything goes fine, the entry point might be updated to point at it.

This storage meets the ACID properties, using optimistic concurrency control.

The primitives to operate on the storage are the following:

open(filename, [options..]): Open (and create if necessary) a storage.
load(offset) → data: Retrieve a record given its offset.
store(data) → offset: Store a record, and get its offset.
get_current() → offset: Get the current offset ("entry point").
set_current(offset): Set the current offset ("entry point").

The actions of store() and set_current() are immediatly visible from other processes.

The storage is targeted at storing JSON encoded data, but no special format is really enforced (except for the exclusion of the \t and \n characters, which are used as markers internally).

3.1 Example: Basics

For the following examples, run a Python shell with:

python3 -i storage.py

We start from an empty storage:

>>> s = Storage.open('/tmp/demo', create_if_missing=True)
>>> s.get_current()
0

At this point the file contains the two lines shown below. The first column indicates the offset in the file. The first line contains the UUID for the version of this file format. The second line contains the current pointer.

>>> s.dump()
   0 | 3dbf4cbc-f015-43d9-b280-ff6962a22198
  37 | 000000000000000

Then we add a record and we change the current pointer:

>>> s.store(b'Hello, World!')
53
>>> s.set_current(53, lease=0)
>>> s.dump()
   0 | 3dbf4cbc-f015-43d9-b280-ff6962a22198
  37 | 000000000000053
  53 | Hello, World!

The datastructure above can be represented as follow:

To change the current value, we begin by adding the new value to the storage:

>>> s.store(b'Bonjour tout le monde.')
68

The new value is stored, but nothing points to it.

Then, we "commit" the change by updating the entry point:

>>> s.set_current(68, lease=53)

The entry point should usually be changed by specifying both the new offset and the previous offset. This is to ensure that the change is performed with a well known state, and to avoid overwritting the pointer set by a concurrent process.

Since this works with optimistic concurrency, the user must be prepared to handle a failure when changing the entry point and to retry the modifications he/she wants to make to the storage until it succeed. Note that on the next attempt, it is perfectly possible to reuse the records that just got stored (assuming they still make sense with the changes that occured from another process).

Since the change on the entry point is atomic, this ensure that:

Anyone reading the storage with the previous entry point value will not be impacted by the new value being added. They will see the value at the time of the access to the storage, even if other modifications occur on the storage at the same time.
The new value, even if complex, it commited all at once.

Once a value is stored, it is never deleted nor modified.

It's up to the user of this layer to build the data structure that fit his/her needs.

3.2 Example: Linked list

Here is an example that use the storage to implement a method to store a value and keep track of all its previous state.

We could do that by using extra records that implement single linked list:

This is achieved with the following session. Here we use a record where we combine two pointers separated by a colon (:).

>>> s = Storage.open('/tmp/demo', create_if_missing=True)

Storing the first value:

>>> s.store(b'Hello, World!')
53
>>> s.store(b'0:53')
68
>>> s.set_current(68)

Storing the second value:

>>> s.store(b'Bonjour tout le monde.')
74
>>> s.store(b'68:74')
98
>>> s.set_current(98)

The content of the storage at this point is:

>>> s.dump()
   0 | 3dbf4cbc-f015-43d9-b280-ff6962a22198
  37 | 000000000000098
  53 | Hello, World!
  68 | 0:53
  74 | Bonjour tout le monde.
  98 | 68:74

Reading back the values, from newest to oldest, can be done this way:

>>> p = s.get_current()
>>> while True:
...     p, val = map(int, s.load(p).split(b':'))
...     print('Record:', s.load(val))
...     if p == 0: break
...
Record: b'Bonjour tout le monde.'
Record: b'Hello, World!'

This will works even if other processes perform changes at the same time.

4 Store

The "store" is a layer on top of "storage" that allows to manage a (potentialy large) tree of JSON encodable values.

Each node of this tree is a record. Each such record contains pointers to other records. The leaf are also records. To distinguish between node and leaf, the record is prefixed with a particular character (@ for node, = for leaf). A node is itself stored using JSON serialization.

All the changes happen in memory, until a call to commit() is made.

The commit is atomic. If the commit succeeds, new processes can see the new value immediately. If the commit fails, then other processes will see no changes at all.

When commiting the changes, the store tries to reuse the old tree structure as much as possible, causing only nodes and leafs that get changed to be updated. The layer is also able to handle node or leaf reparenting.

The abstraction allows to lazily load the data structure, avoiding to load unneeded part of the tree in memory.

There are three main classes.

The Store class has the following operations:

open(filename, [options..]): Open a store
commit(): Permanently store all the changes, and declare this as the new current value.
get_root() → Node|Leaf: Get the root value. Usually a node.
set_root(Node|Leaf): Set the root value.
detach_root() -> Node|Leaf: Detach the root value, allowing to attach it elsewhere.

The Node class has the following operations:

Node() → Node: Create an empty node.
get(key) → Node|Leaf: Get an element.
set(key, Node|Leaf): Set an element.
remove(key) → Node|Leaf: Remove an element and returns it (to make it possible to attach it elsewhere in the tree)
keys() → set: Get the keys of this node.
node(key) → Node: Get a sub node, creating it if necessary. It's an error if the element is a Leaf.

The Leaf class has the following operations:

Leaf(value) → Leaf: Create a leaf.
get() → Any: Get the actual value.
set(Any): Set the value.

4.1 Example: Creating a simple tree

We create a store, with an empty tree at the root:

>>> s = Store.in_memory()
>>> s.root = Node()

Then we add some elements:

>>> s.root.node('alpha').set('delta', Leaf('Alice'))
>>> s.root.node('alpha').node('epsilon').set('zeta', Leaf('Bob'))
>>> s.root.node('alpha').node('epsilon').set('eta', Leaf('Carol'))
>>> s.root.set('beta', Leaf('Dave'))
>>> s.root.node('gamma').node('theta')
Node({})
>>> s.root.node('gamma').set('iota', Leaf('Eve'))
>>> s.root.node('gamma').set('kappa', Leaf('Frank'))

The data structure looks like this:

>>> s.root.dump()
/* UNCOMMITED */
alpha {               /* UNCOMMITED */
  delta "Alice";      /* UNCOMMITED */
  epsilon {           /* UNCOMMITED */
    eta "Carol";      /* UNCOMMITED */
    zeta "Bob";       /* UNCOMMITED */
  }
}
beta "Dave";          /* UNCOMMITED */
gamma {               /* UNCOMMITED */
  iota "Eve";         /* UNCOMMITED */
  kappa "Frank";      /* UNCOMMITED */
  theta {             /* UNCOMMITED */
    ø
  }
}

We see that each part is commented with UNCOMMITED. because none of the values are actually stored on disk.

Now, we store the data on disk:

>>> s.commit()
203

Looking again at the data structure:

>>> n.dump()
/* @203 */
alpha {               /* @104 */
  delta "Alice";      /* @53 */
  epsilon {           /* @81 */
    eta "Carol";      /* @63 */
    zeta "Bob";       /* @73 */
  }
}
beta "Dave";          /* @132 */
gamma {               /* @164 */
  iota "Eve";         /* @141 */
  kappa "Frank";      /* @149 */
  theta {             /* @159 */
    ø
  }
}

We see that every part of the tree got an offset, since everything is on disk now.

We can also look at the raw storage:

>>> s.dump_storage()
   0 | 3dbf4cbc-f015-43d9-b280-ff6962a22198
  37 | 000000000000203
  53 | ="Alice"
  63 | ="Carol"
  73 | ="Bob"
  81 | @{"eta":63,"zeta":73}
 104 | @{"delta":53,"epsilon":81}
 132 | ="Dave"
 141 | ="Eve"
 149 | ="Frank"
 159 | @{}
 164 | @{"iota":141,"kappa":149,"theta":159}
 203 | @{"alpha":104,"beta":132,"gamma":164}

4.2 Example: Accessing an existing tree

In this example, we suppose that we have the tree defined in the previous section on disk, and that we started a new session by opening the storage.

Initialy we just have an offset for the root value. We don't know what type of value is pointed (either a node or a leaf).

By attempting to get access to the root, we effectively loads it, and we now know that it is a node with 3 children:

In turn, loading each children of the root leads to the following state. We now figured that among other thing, beta is a leaf.

Loading everything leads to:

Notice that the node theta has no children.

4.3 Example: Accessing a part of the tree

If the only value that interest us is at the path gamma → kappa, then only a part of the tree will be loaded:

4.4 Example: Modifying the tree

We change one of the leaf:

>>> s.root.node('gamma').set('kappa', Leaf('Felix'))

When we dump the structure, we see that some nodes were marked as UNCOMMITED:

>>> s.root.dump()
/* UNCOMMITED */
alpha {               /* @104 */
  delta "Alice";      /* @53 */
  epsilon {           /* @81 */
    eta "Carol";      /* @63 */
    zeta "Bob";       /* @73 */
  }
}
beta "Dave";          /* @132 */
gamma {               /* UNCOMMITED */
  iota "Eve";         /* @141 */
  kappa "Felix";      /* UNCOMMITED */
  theta {             /* @159 */
    ø
  }
}

If we commit the changes, each uncommited nodes get stored on disk:

>>> s.commit()
291
>>> s.root.dump()
/* @291 */
alpha {               /* @104 */
  delta "Alice";      /* @53 */
  epsilon {           /* @81 */
    eta "Carol";      /* @63 */
    zeta "Bob";       /* @73 */
  }
}
beta "Dave";          /* @132 */
gamma {               /* @252 */
  iota "Eve";         /* @141 */
  kappa "Felix";      /* @242 */
  theta {             /* @159 */
    ø
  }
}

This is the now the current state of the tree for any new process that access the data. Existing processes still see the previous tree (until they open the storage again).

We have the following situation:

We see that we can still examine the old tree if we have the old root pointer. The old tree will exist forever. (Removing unused nodes is another topic.)

4.5 Example: Reparenting

It is possible to reorganize nodes without needing to copy the structure. Note however that a node can have only one parent (it's not a DAG).

Starting from our last tree:

We want to detach the alpha → epsilon node and attach it to the root node.

For that, we first detach it as follow:

>>> n = s.root.node('alpha').remove('epsilon')

Then we attach it at its new place:

>>> s.root.set('epsilon', n)

The structure is now the following:

>>> s.root.dump()
/* UNCOMMITED */
alpha {               /* UNCOMMITED */
  delta "Alice";      /* @53 */
}
beta "Dave";          /* @132 */
epsilon {             /* @81 */
  eta "Carol";        /* @63 */
  zeta "Bob";         /* @73 */
}
gamma {               /* @252 */
  iota "Eve";         /* @141 */
  kappa "Felix";      /* @242 */
  theta {             /* @159 */
    ø
  }
}

Interestingly, since the node has not changed, then only its previous parent and its new parent need to change.

>>> s.commit()
345
>>> s.root.dump()
/* @345 */
alpha {               /* @330 */
  delta "Alice";      /* @53 */
}
beta "Dave";          /* @132 */
epsilon {             /* @81 */
  eta "Carol";        /* @63 */
  zeta "Bob";         /* @73 */
}
gamma {               /* @252 */
  iota "Eve";         /* @141 */
  kappa "Felix";      /* @242 */
  theta {             /* @159 */
    ø
  }
}

The tree is now as follow:

4.6 Example: Reparenting the root node

Starting from the previous tree, we can detach the root itself, and set it as the subnode of a new tree:

>>> n = s.detach_root()
>>> s.root = Node()
>>> s.root.set('lambda', Leaf('Gary'))
>>> s.root.set('mu', n)
>>> s.commit()
406

This gives:

The original tree was kept unchanged and is simply referenced by a new node. This caused only two records to be written (one for the new root, and one for the additional leaf).

We can perform the opposite operation: bringing back the tree as the root one.

>>> n = s.root.remove('mu')
>>> s.root = n
>>> s.commit()
345

We see that we're back to the previous tree as the root.

5 Tree

The "tree" is the last layer, on top of "store", that provides high level access to the tree, along with a schema enforcement.

There are three main classes.

Note that the Tree class has all its operations named with an underscore prefix to avoid confusion with actual configuration keys. However, the restriction doesn't apply when using __getitem__, __setitem__ and __delitem__, so it's still possible to have keys with any name encodable as an attribute name in JSON.

The Tree class has the following operations.

value = conf.key: Access a key, using attribute syntax.
value = conf['key']: Access a key, using index syntax.
conf.key = value: Set a key, using attribute syntax.
conf['key'] = value: Set a key, using index syntax.
del conf.key: Remove a key, using attribute syntax.
del conf['key']: Remove a key, using index syntax.
_keys(): Get the set of keys of the tree.
_has(key): Check if the tree contains the given key.
_query(expr) → dict: Query the tree recursively and return key matching the given expression.

The Configuration class represents the root node. It has the same operations as the Tree class plus the following ones:

Configuration(filename, *, validator=None): Open a configuration tree.
_commit(): Store all the changes to disk.

The Schema class is used to enforce structure and values, and has the following operations, which are all optionals:

descend(name) → Schema: Get the specific schema for the given subtree.
validate(name, value): Check if the value matches the requirements for the given key. This is called whenever a key is about to be changed.
check(): Check that the whole tree matches global requirements. This is called before a commit.
pose(name, value) → ?Tree: Get a specific subtree in place of a leaf value. This might be used for redirection.
choices() → set: Get the set of possible keys. This might be used for completion.
help(name) → str: Get the help string for the given key.
format(name) -> str: Get hint about how the configuration should be printed for the given key.
missing() → set: Get the set of missing but mandatory keys. This is used for annotating the documentation.

5.1 Example: Basic operations

In the following session, we assume that:

we have an object n representing a configuration, which is empty at the beginning, and has no schema defined,
the reset command removes all the existing elements from the configuration,
the commit command applies the changes permanently to the configuration.

We can store values in a hierarchy of keys:

>>> n.server.listen = '127.0.0.1'
>>> n.server.port = 443
>>> n.server.timeout = 600
>>> n.server.log.level = 'INFO'
>>> n.server.log.file = '/var/log/acme.org'

We get:

>>> n
server {
  listen "127.0.0.1";
  log {
    file "/var/log/acme.org";
    level "INFO";
  }
  port 443;
  timeout 600;
}

We can work on a subtree only:

>>> log = n.server.log
>>> log.compress = True
>>> log
compress true;
file "/var/log/acme.org";
level "INFO";

Looking at the whole tree, we also see the change:

>>> n
server {
  listen "127.0.0.1";
  log {
    compress true;
    file "/var/log/acme.org";
    level "INFO";
  }
  port 443;
  timeout 600;
}

Subtree can be reparented. For example, to move the log subtree to the root, we can use the Move special wrapper, to do the following:

>>> n.log = Move(n.server.log)
>>> n
log {
  compress true;
  file "/var/log/acme.org";
  level "INFO";
}
server {
  listen "127.0.0.1";
  port 443;
  timeout 600;
}

We can also copy entire subtree:

>>> n.default.log = n.log
>>> n
default {
  log {
    compress true;
    file "/var/log/acme.org";
    level "INFO";
  }
}
log {
  compress true;
  file "/var/log/acme.org";
  level "INFO";
}
server {
  listen "127.0.0.1";
  port 443;
  timeout 600;
}

And we can also remove subtree:

>>> del n.log
>>> n
default {
  log {
    compress true;
    file "/var/log/acme.org";
    level "INFO";
  }
}
server {
  listen "127.0.0.1";
  port 443;
  timeout 600;
}

Note that copying a subtree then removing the original one is not the same as moving it, because the move operation is optimized to avoid any copy.

5.2 Example: Using Schema

A schema allows to check and enforce various properties on the tree, such as its structure, the syntax of its value, but it also serves for providing documentation and performing redirection (explained later).

The default schema doesn't enfore anything.

Here is a schema that requires that all leafs are integers, and that keys are exactly 3 characters long:

class IntegerOnlySchema(Schema):
    def validate(self, tree, name, value):
        if len(name) != 3:
            raise Schema.Error('A key must be exactly 3 characters long')
        if not isinstance(value, int):
            raise Schema.Error('A leaf must contains an integer')

>>> n = Configuration(…, schema=IntegerOnlySchema())
>>> n.foo = 10
>>> n.bar = 20
>>> n.baz = "Not A Number"
Traceback (most recent call last):
...
tree.Error: A leaf must contains an integer
>>> n.quux = 30
Traceback (most recent call last):
...
tree.Error: A key must be exactly 3 characters long

By default, the schema is applied recursively to the entire subtree, but it is possible to provide specific schema for each part of the tree.

Given these 2 schemas:

class LogSchema(Schema):
    def validate(self, tree, name, value):
        if name != 'filename':
            raise Schema.Error('Only the filename key can be set')
        if not isinstance(value, str):
            raise Schema.Error('Expected a string for the filename key')
    def descend(self, tree, name):
        raise Schema.Error('This tree can only contains leafs.')

class MonitorSchema(Schema):
    def validate(self, tree, name, value):
        if name not in ('enabled', 'passive'):
            raise Schema.Error('Only the enabled or passive keys can be set')
        if not isinstance(value, bool):
            raise Schema.Error('Expected a boolean')
    def descend(self, tree, name):
        raise Schema.Error('This tree can only contains leafs.')
    def missing(self, tree):
        return {'enabled', 'passive'} - tree._keys()
    def check(self, tree):
        if self.missing(tree):
            raise Schema.Error('Mandatory key(s) missing: {!r}'.format(self.missing(tree)))

The root schema dispatches to LogSchema or MonitorSchema according to the path followed in the tree:

class RootSchema(Schema):
    def validate(self, tree, name, value):
        if name not in ('log', 'monitor'):
            raise Schema.Error('Expected either log or monitor key')
    def descend(self, tree, name):
        if name == 'log':
            return LogSchema()
        elif name == 'monitor':
            return MonitorSchema()
        else:
            raise RuntimeError('Invalid key {!r}'.format(name))
    def help(self, tree, name):
        return {
            'log': 'Log general settings',
            'monitor': 'Cluster monitoring settings'
        }.get(name)

Using the schema:

>>> n = Configuration(…, RootSchema())

We can tests that the schema rules are enforced:

>>> n.log.file = '/var/log/acme.log'
Traceback (most recent call last):
...
tree.Error: Only the filename key can be set
>>> n.log.filename = '/var/log/acme.log'

>>> n.monitor.enabled = 1
Traceback (most recent call last):
...
tree.Error: Expected a boolean
>>> n.monitor.enabled = True

>>> n.foo.bar = 'Not Ok'
Traceback (most recent call last):
...
RuntimeError: Invalid key 'foo'

Since we provided an help method and a missing method, dumping the tree also shows documentation and what mandatory keys are missing:

>>> n
##
## Log general settings
##
log {
  filename "/var/log/acme.log";
}
##
## Cluster monitoring settings
##
monitor {
  enabled true;
  /* Warning: missing mandatory key 'passive' */
}

Attempting to commit:

>>> commit
Traceback (most recent call last):
...
tree.Error: Mandatory key(s) missing: {'passive'}

5.3 Example: Schema with redirection

A schema can also provide a method to reference different subtrees at various places in the tree structure.

Here is a (contrived) schema that makes a leaf redirect to the root node if its value is __ROOT__:

class RedirectSchema(Schema):
    def pose(self, tree, name, value):
        if value == '__ROOT__':
            return tree._root

Testing it:

>>> n = Configuration(…, RedirectSchema())
>>> n.foo.bar = 1
>>> n.foo.baz = '__ROOT__'
>>> n
foo {
  bar 1;
  baz @("__ROOT__");
}

We see that the value of foo.baz is displayed with @() which mean that the value is interpreted if we try to access it.

And indeed displaying foo.baz will show the root node:

>>> n.foo.bar
1
>>> n.foo.baz
foo {
  bar 1;
  baz @("__ROOT__");
}
>>> n.foo.baz.foo.baz.foo.bar
1

It's up to the schema to interpret the value of the leaf to perform a redirection of any sort.

Say we have the following structure:

node 0c832a01-f899-4e53-bdef-c832ae498f79 {
  name "DataStore";
}
node 91dfc21d-3469-46e6-98da-7440a4b17be2 {
  name "Capture";
}
zone 1 {
  name "R&D Department";
  rule 1 {
    node @("91dfc21d-3469-46e6-98da-7440a4b17be2");
    vlan 10;
  }
}

with the following schema:

class Conf(Schema):
    def format(self, tree, name):
        return 'arg' if name in {'node', 'zone', 'rule'} else None
    def pose(self, tree, name, value):
        if name == 'node':
            return tree._root.node[value]

Note that we used format to instruct the tree printer to combine node and the ID, and same for zone and rule. This doesn't change the structure of the tree, but only the way it is displayed.

When we go through the node key of zone.1.rule.1, we get redirected and everything happen as if the tree in question was present within rule:

>>> n.zone[1].rule[1].node.name
'Capture'

We can easily change the reference:

>>> n.zone[1].rule[1].node = '0c832a01-f899-4e53-bdef-c832ae498f79'
>>> n.zone[1].rule[1].node.name
'DataStore'

As expected, if the referenced node changes, the change is visible through the reference:

>>> n.node['0c832a01-f899-4e53-bdef-c832ae498f79'].name = 'Central'
>>> n.zone[1].rule[1].node.name
'Central'

If we want to see the actual value of the leaf, before the redirection is applied, we can use the low level method _get():

>>> n.zone[1].rule[1]._get('node', raw=True)
'0c832a01-f899-4e53-bdef-c832ae498f79'

6 Schema

To ease definition of schema, a layer on top of the Schema class allows to describe structure, validation, documentation and redirection in an convenient way.

Each node of the tree is described by an instance of the Type class.

The Type class take two arguments:

a dictionary describing each allowed keys,
an optional predicate to check the overal structure of the tree.

The dictionary passed to Type is of the following form: {<key>: <specification>}.

The key can be either a specific name, or a regular expression. It is the pattern attribute of the specification that specify how to interpret it.

The specification is a dictionary with the following keys:

pattern: A boolean indicating if the key should be interpreted as a regular expression. Default: False.
required: A boolean indicating if the key is mandatory. Default: False.
type: The schema to use for the subtree. Default: the default schema, which does not enforce any constraint.
description: A text describing the meaning of the key. Default: no description.
arg: A boolean indicating how to format the display. If True, the key is assumed to be a prefix for all the sub-trees. Default: False.

If a key should contains a leaf, then its type should derive from ValueValidator. Some validators are already provided:

BooleanValidator: Ensure that the value is either false or true.
IntegerValidator: Ensure that the value is a JSON encodable Python integer.
StringValidator: Ensure that the value is a JSON encodable string.

For example, the following declaration for the root node says that we must have 2 keys: machine and zone. We provide a description, and mark them as mandatory. In turn, the type item leads to another Type instance for validating the respective sub tree.

root_validator = Type({
    'machine': {
        'type': machines_validator,
        'required': True,
        'description': 'Machines (UI, DataStore and Capture)',
        'arg': True
    },
    'zone': {
        'type': zones_validator,
        'required': True,
        'description': 'Zones for tagging captured traffic',
        'arg': True
    }
})

The hypothetical machines_validator could look like this:

machines_validator = Type({
    '^uuid(?:[0-9]|[1-9][0-9]+)$': {
        'pattern': True,
        'type': Type({
            'hostname': {
                'type': HostnameValidator(),
                'required': True,
                'description': 'Hostname'
            },
            'master': {
                'type': BooleanValidator(),
                'required': True,
                'description': 'Indicates if this is the master machine'
            },
            'ntp': {
                'type': Type({
                    'enabled': {
                        'type': BooleanValidator(),
                        'description': 'Indicate if NTP should be enabled on this machine'
                    }
                }),
                'description': 'Settings related to NTP'
            }
        })
    },
})

For convenience we could have split the previous validator into several variables.

7 Transactions

Since transactions are optimistics, changing the tree must be attempted in a loop until it succeed. The point is that usually the configuration is mostly read, but rarely modified.

A commit() will succeed only if no other processes changed the tree in the meantime.

This could be abstracted with such a function:

def try_until_success(fun):
    """
    Apply changes to the configuration.

    Args:
        fun -- (callable) the function that will perform changes on
          the given configuration tree.
    """
    while True:
        conf = Configuration(...)
        fun(conf)
        try:
            conf._commit()
        except ConcurrencyError:
            time.sleep(random.random())
        else:
            break

This allow to perform atomic update:

def changes(conf):
    if conf.foo < 20:
        conf.bar += 10

try_until_success(changes)

8 Inspiration

systemconf (PerformanceVision)
JunOS (Juniper routers)
Falcor (Netflix)
GraphQL (Facebook)
ImmutableJS (Facebook)
xpath (XML)
/sys, /proc/ (Linux)
Append-only BTree, B+Tree, CouchDB
MVCC (PostgreSQL)

8.1 Related projects

EdgeDB

9 About this documentation

This documentation is not finished yet.

While it explains all the layers, it is missing an introduction to understand how this project is useful.

It should also better explain shortcomings (bad for very frequent updating for instance) and advantages (the atomic aspect, the ability to work on very large tree).

Documentation produced by Org-Mode (Emacs) and Graphviz.