An essential part of designing any database is deciding which tradeoffs to make, and where to make them. Every database makes them in one way or another, and I feel it is important to be upfront about them. Being honest about what is won and what is lost makes it easier for users to make an informed decision when picking a database for their use case.
Please keep in mind that all systems make tradeoffs and that just because some are not open about them does not mean they do not exist. In such cases it is more likely that they are being swept under the rug, often at great cost and inconvenience to the user.
Metrics have several unique properties that distinguish them from other types of data. These properties have been embraced to form the bedrock of DalmatinerDB's design.
If those properties hold true for you then the chances are good that DalmatinerDB is a good fit. If they don’t, another system with different tradeoffs might be a better choice.
Metrics are immutable; once a metric is created it isn’t going to change. For example, a low CPU usage last Monday at 05:31 will not start to spike today.
However, writing a metric can be delayed, resulting in straggler values that arrive for a point in the 'past'.
DalmatinerDB allows for metric input in second or even sub-second precision. At those short intervals it is more important to allow the majority of the metrics to be written correctly than to guarantee that every metric has every second accounted for.
DalmatinerDB will interpolate the missing values to the best of its abilities. This is usually acceptable for aggregated data. It is very costly to guarantee complete accuracy for the rare case where a single metric spikes for a very short time before returning to normal. DalmatinerDB considers this too high a price to pay in terms of performance and scalability.
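To make the cost of this tradeoff concrete, here is a minimal sketch of gap filling. The text above does not say which interpolation method DalmatinerDB uses, so linear interpolation is an assumption chosen purely for illustration, and `fill_gaps` is a hypothetical helper, not part of DalmatinerDB.

```python
from typing import List, Optional, Sequence


def fill_gaps(values: Sequence[Optional[float]]) -> List[Optional[float]]:
    """Fill missing points (None) in a fixed-interval series.

    Assumption: gaps between two known points are filled linearly, and
    leading/trailing gaps repeat the nearest known value. This is an
    illustrative choice, not DalmatinerDB's documented behaviour.
    """
    out: List[Optional[float]] = list(values)
    known = [i for i, v in enumerate(out) if v is not None]
    if not known:
        return out  # nothing to interpolate from

    # Leading and trailing gaps: repeat the nearest known value.
    for i in range(known[0]):
        out[i] = out[known[0]]
    for i in range(known[-1] + 1, len(out)):
        out[i] = out[known[-1]]

    # Interior gaps: straight line between the two surrounding points.
    for a, b in zip(known, known[1:]):
        step = (out[b] - out[a]) / (b - a)
        for i in range(a + 1, b):
            out[i] = out[a] + step * (i - a)
    return out


# A missing second between two quiet readings is filled with a quiet value,
# even if the real value spiked during that second.
smoothed = fill_gaps([1.0, None, 1.0])
ramp = fill_gaps([10.0, None, None, 40.0])
```

In the first call the lost point is reconstructed as `1.0`; a one-second spike that was never written is simply invisible. That is the accuracy given up in exchange for cheaper writes.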
Perhaps the first decision to make is whether to pick Consistency or Availability. With metrics and the notion of immutability there is little harm in picking Availability here. DalmatinerDB will stay available for read and write operations - even in the event of a network partition - at the cost of giving stale reads on both sides of the partition until it is healed (given that side A can’t know what was written on side B).
Given the immutability of metrics it can be argued that it is impossible to generate conflicting values on both sides of a split, thus merging is simple and lossless. Conflict resolution is enforced on reads, and currently there is no active anti-entropy.
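The merge argument can be sketched in a few lines. This is an illustration of the reasoning only, not DalmatinerDB's actual code: `merge_sides` and `read_merged` are hypothetical names, and each side of the partition is modelled as a plain mapping from timestamp to value.

```python
from functools import reduce
from typing import Dict, Iterable

Series = Dict[int, float]  # timestamp -> value (hypothetical model)


def merge_sides(side_a: Series, side_b: Series) -> Series:
    """Union of the points written on either side of a partition.

    Because a point never changes once written, the same timestamp can
    only ever carry the same value on both sides, so the union loses
    nothing. A mismatch means the immutability assumption was violated.
    """
    merged = dict(side_a)
    for ts, value in side_b.items():
        if ts in merged and merged[ts] != value:
            raise ValueError(f"conflicting values for timestamp {ts}")
        merged[ts] = value
    return merged


def read_merged(replicas: Iterable[Series]) -> Series:
    """Resolve at read time: merge whatever each replica currently holds."""
    return reduce(merge_sides, replicas, {})


# During the split, side A saw timestamps 1-2 and side B saw 2-3.
side_a = {1: 0.5, 2: 0.7}
side_b = {2: 0.7, 3: 0.9}
healed = read_merged([side_a, side_b])
```

Before the partition heals, a read on side A returns only `side_a`, which is the stale read described above. Once both sides are reachable, the merge on read returns all three points.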
DalmatinerDB is designed to run on ZFS, and the use of other filesystems is strongly discouraged. While DalmatinerDB will operate on any filesystem, performance will degrade significantly without ZFS or a comparable filesystem as a base.
DalmatinerDB relies heavily on taking advantage of ZFS facilities such as the ARC, ZIL, checksums and volume compression. Expecting those things to be handled on a filesystem level makes it possible to remove most of the code for caching, compression and data integrity validation. This improves code simplicity, stability, and performance significantly.
There is no technical reason why DalmatinerDB won’t run on a different filesystem; it will just lose some of the edge it gets by taking advantage of these advanced features. The Linux folks are very proud of btrfs, so even though DalmatinerDB is not tested on it, it should give a comparable experience.