Synchronized Systems Technology: A New Approach to Distributed Ledger Technology
Why Is A New Approach Needed?
TL;DR: Current non-blockchain DLT does not easily or performantly deliver what many
solution designers in the distributed data space actually want and need. Below are
some ideas for a better approach.
Distributed Ledger Technology ("DLT") is a very broad term that in theory
covers both traditional blockchain tech like Bitcoin and Ethereum,
newer entrants such as Corda, and hybridized/layered models such as Quorum.
Some of the design goals of non-blockchain DLT include:
- Private-permissioned engagement to avoid issues caused by anonymity
(which of course was a major design feature of
traditional blockchains)
- Substantially improved data compartmentalization, visibility, and
security (i.e. not everything 100% in the open for any actor to view)
- Substantially improved transactional performance
- Elimination of framework gotchas like gas price fluctuation,
nondeterminism of block commits vs. business transaction commits, etc.
As the space has matured over the past 10 years, it has become increasingly
clear that these goals are not enough. The platform for sharing cryptographically
secured data ("cryptoassurance") cannot be a new data platform because unless
all the data access activity occurs there, a cryptoassurance gap is
created between the new platform and the old. The larger and more complex
(and yes, more legacy) the environment, the more likely that a new cryptoassured data platform will:
- Not be the main locus of operational and analytic activity but rather
just another ETL source point from which data will be extracted and loaded into the
"mainstream" systems
- Require separate infrastructure, operations, security, and vendor management
It has also become clear that the data and software design, development,
testing, and release process of current DLT -- especially at Day 2 -- is at a
minimum markedly different from the SDLC for the rest of the technology footprint
and at worst brittle, making it difficult to orchestrate upgrades and/or changes to
logic, entitled actors, data access permissions, etc.
In other words, current DLT exacerbates the problem instead of solving it.
We believe the Synchronized Systems Technology ("SST") principles described
below address what we have learned.
The Design Principles Of Synchronized Systems Technology ("SST")
- Cryptographically-assured reliable and consistent replicas of data
"The" foundational feature of SST is ensuring consistent replicas of data on multiple
nodes. The data is opaque to the platform but many conventional, secure, and
well-adopted techniques that power https-based e-commerce are employed to provide
cryptowrapping of the data. Cryptowrapping
means applying hashes like SHA2 and digital signatures to arbitrary
content to prove immutability, authenticity, and ownership.
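As a minimal sketch of cryptowrapping (the key type, payload, and names here
are illustrative assumptions, not a prescribed SST format):

    # Hash arbitrary content with SHA-256, then sign the digest with an
    # Ed25519 key to demonstrate immutability, authenticity, and ownership.
    import hashlib
    from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey

    payload = b'{"orderId": 1234, "qty": 7}'   # opaque-to-platform content

    digest = hashlib.sha256(payload).digest()  # immutability check value
    key = Ed25519PrivateKey.generate()
    signature = key.sign(digest)               # authenticity + ownership

    # A verifier holding the public key recomputes the hash and checks it;
    # verify() raises InvalidSignature if content or signature was altered.
    key.public_key().verify(signature, hashlib.sha256(payload).digest())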
- Data is directly and performantly queryable in existing database infrastructure
Arguably the most important goal, SST is implemented such that
data is not only cryptowrapped but also directly queryable
in an existing conventional database of choice
(e.g. Postgres, MongoDB, Oracle, MySQL, SQLite, etc.).
- No supporting database "off to the side", no additional ETL,
and thus no vulnerability to a "cryptogap" where secure material
passes through a workflow step that can suborn the hashes and digital
signatures
- No ORM or other conversion/mapping subsystem to maintain
- All SST data is indexable, partitionable, and available to
functions and extensions in the host database exactly like
other "regular" unshared data
- All SST data is operationally managed, secured, storage optimized, and
integratable exactly like other "regular" data.
Broadly as an example, a Postgres database supporting product
transactions with 16 tables now has 17; the new peer table contains the
SST-managed shared data and is operationally consistent with the
original 16 tables.
Traditional blockchain and even many newer DLT environments
have no capability to do this, so performant querying, reporting, and
analytics must always be performed "off-chain" on some form of a mirror.
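To make "directly queryable" concrete, here is a sketch of querying the
hypothetical 17th table described above. The table and column names
(sst_shared, contract_id, a JSONB state column) are illustrative
assumptions, not a published SST schema:

    # Query SST-managed shared data right next to the other 16 tables.
    import psycopg2

    conn = psycopg2.connect("dbname=products")
    cur = conn.cursor()
    cur.execute("""
        SELECT s.state->>'price' AS price, o.order_date
        FROM   sst_shared s
        JOIN   orders o USING (contract_id)
        WHERE  s.state->>'status' = 'OPEN'
    """)
    for price, order_date in cur.fetchall():
        print(price, order_date)

Because the shared data lives in an ordinary table, it can be joined,
indexed, and partitioned with native Postgres mechanics -- no mirror, no ETL.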
- Data is arbitrarily complex and sized
Current DLT environments, and especially traditional blockchains, are not
designed to handle even modestly sized data records, and many incur
substantial costs for carrying the data on the network:
- As of 1-Aug-2023, 1K of storage in one smart contract costs
$58.64 on Ethereum.
- Rich shape construction in
Ethereum Solidity
is difficult. Definition is straightforward, but the lack
of true append() and keyIter() methods makes
manipulation logic much more cumbersome than that offered by most
other current, popular languages, not to mention that structure
definitions cannot be shared/passed between the contract and
client-side web3 code.
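For contrast, a sketch of the same kind of shape manipulation in a
general-purpose language (the shape itself is illustrative):

    # True append and key iteration -- the operations that are cumbersome
    # in Solidity -- are one-liners here.
    state = {
        "product": "widget-A",
        "bids": [],              # arbitrarily nested, arbitrarily sized
    }

    state["bids"].append({"who": "orgB", "price": 41.50})  # append()

    for key in state:            # key iteration
        print(key, state[key])

The same structure can be passed unchanged between contract logic and
client-side code.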
- Data "records" contain subsections to which access is assigned
Most DLT designs make it difficult to precisely and selectively permit
write and read access to "subsections" of a record. In the pure
blockchain space this isn't even an option; by design everything
is transparent, requiring use of content encryption to "hide" sensitive
data. A standard example is
a negotiated custom product sale. The details of the product are open
to all but the buyer/seller negotiations are private -- and the whole
thing needs to be managed as one record.
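A sketch of what per-subsection access could look like, assuming a simple
section-to-ACL map; the record shape, actor names, and ACL syntax are
illustrative assumptions, not SST's actual permission model:

    # One record, two subsections, different read/write entitlements.
    record = {
        "common":      {"product": "custom gearbox", "rating": "10kW"},
        "negotiation": {"offer": 98000, "counter": 91500},
    }
    acl = {
        "common":      {"read": {"*"},               "write": {"seller"}},
        "negotiation": {"read": {"buyer", "seller"}, "write": {"buyer", "seller"}},
    }

    def can_read(actor, section):
        readers = acl[section]["read"]
        return "*" in readers or actor in readers

    assert can_read("observer", "common")              # open to all
    assert not can_read("observer", "negotiation")     # buyer/seller only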
- State change is actuated through events
Anything that needs to change state does so through an event management
system, not through simple function calls on the smart contract. A
well-designed event model enables:
- All local node changes on a contract to have a sequence master to
arbitrate order of execution and facilitate precise replay when
recoverable errors occur.
- Ability of multiple nodes to coordinate and rationalize changes
initiated by them vs. changes initiated by other nodes. Related to
this is the need to clearly separate intent to change state
(and the processing thereof) vs. actual changed state.
It is crucially important to manage the intent to change even when the
change turns out not to be possible.
- Specific and queryable information on actions (events) that
created state change. This completes the state->event->new_state
model.
- Flexible sync/async calling paradigms. Publishing and
reacting to events is a more "layerable" model. If so desired,
an event dispatch-and-wait mechanism can be easily constructed
on top of it.
There are too many nuances and variations -- timeouts, threading,
reordering, etc. -- to crisply implement this functionality in
the core.
Note that SST essentially has no -- and needs no -- read API/model
because data is queried and analyzed directly in the host database using
the native host database languages and tools, just like "regular"
unshared data.
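A sketch of what a state-change event envelope might carry, separating
intent from applied state and carrying a sequence for ordering and replay.
The field names are assumptions for illustration only:

    import time, uuid

    def make_change_event(contract_id, origin_node, seq, patch):
        return {
            "eventId":     str(uuid.uuid4()),
            "contractId":  contract_id,
            "originNode":  origin_node,
            "seq":         seq,    # sequence master arbitrates order/replay
            "intent":      patch,  # what the submitter wants changed...
            "appliedAt":   None,   # ...vs. whether it actually became state
            "submittedAt": time.time(),
        }

    evt = make_change_event("rfq-778", "nodeA", 42, {"price": 101.25})

Keeping intent and applied state distinct is what lets the platform record,
queue, or reject a change without losing the fact that it was requested.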
- No special domain-specific languages (DSLs)
SST needs to use Groovy, Python, Java,
and other widely known languages. The problem space demands simplicity
and wide availability of talent to manage the artifacts, especially since
multiple organizations are typically involved in a business process.
Lowest common denominator rules apply.
- Smart contracts contain logic and are separate from state
The definition and specific capabilities of a "smart contract" vary, but
for the purposes of SST, it means a piece of expressive code logic that
is also cryptowrapped and shared/agreed amongst participants.
Two or more instances of data can exist as separate states while being
managed by the same contract.
Think of state as the private data in an object and the contract logic
as the method calls in the object.
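The object analogy sketched in code (illustrative names, not a prescribed
SST contract API):

    # One shared contract (the methods) managing two independent
    # state instances (the private data).
    class RFQContract:
        def submit_bid(self, state, who, price):
            if state["status"] != "OPEN":
                raise ValueError("RFQ is closed")
            state["bids"].append({"who": who, "price": price})

    contract = RFQContract()
    state1 = {"status": "OPEN", "bids": []}
    state2 = {"status": "OPEN", "bids": []}
    contract.submit_bid(state1, "orgB", 99.0)   # state2 is untouched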
- Smart contracts are versioned just like state data
This is a significant departure from
Ethereum, where complicated means are required to "move"
data and state from one contract to an updated version. Smart contracts
are not required to change but if they do, the logic to supersede previous
versions is very straightforward in SST.
- Smart contracts are just expressive logic layers to generate lower level state change events.
Most smart contract use cases exist to enforce rules on the underlying
data using highly expressive languages like Java and Python.
Calling a contract method that performs a state change operation is
basically a layer over the existing underlying data section access
permissions; in other words, the same effect could be had by issuing
a lower level change data event directly.
- All actors are known
For practical business concerns, it is much easier to have a
permissioned, authenticated model for actor interaction.
- Event notification is a core platform capability
Kafka, Solace, RabbitMQ, and other providers are pluggable into
the platform. Notification is based on and granular to
contract/business events, not mining or other platform activities.
This is once again
a marked difference from Ethereum where event notifications
occur when a block is mined. You must then dig through the block
to search for a particular transaction and then additionally
determine what business level event precipitated the transaction.
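A sketch of consuming granular business events, assuming they are published
to a Kafka topic as JSON; the topic name and event fields are illustrative:

    import json
    from kafka import KafkaConsumer   # kafka-python package

    consumer = KafkaConsumer(
        "sst.contract.events",
        bootstrap_servers="localhost:9092",
        value_deserializer=lambda b: json.loads(b.decode("utf-8")),
    )
    for msg in consumer:
        evt = msg.value
        if evt.get("contractId") == "rfq-778":   # no digging through blocks
            print(evt["eventId"], evt["intent"])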
- Zero compile time dependencies on data shapes
It is vitally important
to keep the system fully compile-time independent and generic so that
both the core and most subsystems remain small and infrequently
changed. The addition of a field to state cannot require systematic
recompilation of all states, contracts, and/or -- in the worst case
scenario -- framework
code. Only at the "edges" of
the architecture should data shapes be addressed using bespoke code,
and then at the edge maintainer's peril.
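A sketch of what shape-agnostic core code looks like: the logic treats
state as a generic mapping and never names specific fields, so adding a
field is invisible to it:

    import json

    def apply_patch(state, patch):
        # Generic merge; no field names, so nothing to recompile
        # when shapes grow.
        merged = dict(state)
        merged.update(patch)
        return merged

    state = json.loads('{"price": 100}')
    state = apply_patch(state, {"expiryDate": "2024-06-30"})  # new field, zero core change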
- Very low runtime dependency graph
An unfortunate consequence of the adoption of open source software is that
often platforms require literally hundreds of libraries.
This is manageable solely within the context of the out-of-the-box platform
itself, but as integration with user apps and services progresses, an
increasing number of version clashes will occur and require non-revenue
time to resolve. Particularly in the Java space, libs like
com.fasterxml.jackson, netty, org.apache.commons,
and rxjava used in DLT frameworks are not at the same revision
as those
demanded by applications linking with the platform SDK. In addition, in
the enterprise space, all libraries must be subject to security scans,
end-of-service monitoring, and other governance/provenance policies. It
is essential that SST presents the smallest possible
complexity profile so that all efforts can be focused on the
multi-participant system design itself.
Along these lines, it is important that smart contracts also drive
to the smallest possible dependency footprint because the greater
the number of dependencies, the greater the likelihood that different
organizations sharing the same synchronized code will have version,
security, or other conflicts.
- No gas
A synchronized system used by a set of participants involves costs they are
willing to incur because the system adds value to the participants.
The whole notion of creating an incentive / compensation model for
anonymous parties
to mine blocks and charge for new data storage is simply unnecessary for SST.
- Highly integratable with off-SST resources
SST presents a core data and DAL platform that easily integrates with
other data and technologies. The node synchronization,
cryptowrapping, and other mechanics sit on top of this -- and thus
are not part of it. In fact, the SST core data platform is entirely
usable without the synchronization, permissions, and other
pieces. This opens the opportunity for "co-location" of off-SST
data on the same persistence backplane, completely secured with
entitlements, creating an efficient hybrid system.
- Alternate synchronization protocols possible
Related to the above, SST has a default synchronization model for
multiple participants that involves last state event seen, last
initiating node event seen, and remote node comparison of incoming
new event to remote state. It is possible to design other protocols
that are simpler and faster or slower and less race-condition sensitive.
In fact, multiple synchronization protocols can be running at the
same time for a particular contract/state design.
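A very rough sketch of the kind of check the default protocol implies,
assuming per-contract and per-origin-node sequence counters; the real
protocol carries more than this:

    def accept_incoming(local, evt):
        # Compare the incoming event against the last state event seen...
        if evt["stateSeq"] != local["lastStateSeq"] + 1:
            return False   # out of order relative to local state
        # ...and against the last event seen from the initiating node.
        if evt["originSeq"] <= local["lastNodeSeq"].get(evt["originNode"], -1):
            return False   # stale or duplicate from that node
        return True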
Interesting Use Cases For Synchronized Systems Technology
- Third party / regulatory read-only observation of complex data
In this use case, a participant like a regulatory body enjoys complete,
consistent transparency into, and independent analytics on, complex data
such as positions and risk. The effect is like a shared database but
by virtue of node data synchronization, there is no central point of
ownership / failure / compromise, and no participant can technically deny
access to the data by another participant. Regulators no longer have to
"request" reports and verification of data; they are watching the exact
same
business flow and critical data that is driving the actions of the other
participants they are regulating.
- Active management and analysis of long-lived processes
Original blockchain design principles are simply not aligned with tracking
state change of a thing over weeks, months, or years. Instead, blockchains
are focused on single value transfer events, not process flows. As a result,
most smart contract environments still have a legacy root in the
"immutability" of a contract because of the simplicity of a single value
transfer event. The reality of business processes is that they may
be amended over time with a need to track changes.
- Performant vending of data to many, many readers
Most blockchains did not have performance as a priority. SST is designed
around data and being able to quickly move large amounts of it
off the platform, scalable to thousands or more connections on a single node.
- Selective secured sharing of data
Not all participants in a data sharing design need access to absolutely
all the data. For example, in a real estate deal, all parties see
the common info (address, age of home, etc.), but only buyer and seller
can see and change the sale price, and only the inspector is allowed to
change the measured radon value but everyone can see it.
- Simplified, performant business event monitoring
Most technology footprints will have a mix of SST and non-SST
implementations. Because SST contracts are changed only through
the submission of events, the semantics for state change are
directly and easily coupled to the physical artifacts being
published. State change through method calls at first appears
easier (y=f(x)), but too many factors come into play -- timeouts,
threading, retries, reordering -- that are not easily addressed in
such a synchronous-biased invocation model.
- Capturing "time-series" of data in contract state
SST smart contracts (and the underlying state they manage) can use
arrays just as easily as scalar values.
For example, a request-for-quote solution might append a tuple
containing submitter ID, price, and timestamp to an array.
A single state
can thus represent the history of the RFQ process, eliminating
the need to synthesize the history by pulling all the prior states.
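A sketch of the RFQ pattern (field names illustrative):

    import time

    state = {"rfqId": "rfq-778", "quotes": []}

    def submit_quote(state, submitter, price):
        # Append a (submitter ID, price, timestamp) tuple to the array.
        state["quotes"].append((submitter, price, time.time()))

    submit_quote(state, "orgB", 101.25)
    submit_quote(state, "orgC", 100.90)
    # The whole negotiation history lives in the single current state;
    # no replay of prior states is needed to reconstruct it.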
- Simplified document management for modest sized documents
Current DLTs and essentially all blockchains cannot handle storing
actual documents (docx, xlsx, PDF, etc.) or other digital content
such as JPEGs or PNGs; the payloads are simply too large. The current solution is
to take a hash of the material and store that in the smart contract along
with an URI to the actual material, typically stored on BLOB storage.
While the process is functional and scales to even huge
pieces of content, it means that the security scope / controls have to
be extended outside the platform. SST can handle states up to a few
megabytes if a one-platform design is desired.
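For comparison, a sketch of both approaches side by side; the filename,
URI, and field names are illustrative:

    import base64, hashlib

    with open("inspection-report.pdf", "rb") as f:
        content = f.read()

    # Conventional DLT pattern: hash on-platform, bytes off-platform.
    off_platform_ref = {
        "sha256": hashlib.sha256(content).hexdigest(),
        "uri":    "s3://deal-docs/inspection-report.pdf",  # BLOB storage
    }

    # SST option for modest documents: carry the bytes in the state itself,
    # keeping the security scope inside one platform.
    in_sst_state = {"doc": base64.b64encode(content).decode("ascii")}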
SST is achievable using Susynct today
In assessing the capabilities desired in a synchronized system, it should be
clear that a fresh implementation is a better way to go than taking
an existing first-generation blockchain and twisting it and/or layering on
top of it. Ten years have taught us that they are different kinds of
systems for different purposes.