Synchronized Systems Technology: A New Approach to Distributed Ledger Technology
Why Is A New Approach Needed?
TL;DR: Current non-blockchain DLT does not easily or performantly deliver what many
solution designers in the distributed data space actually want and need. Below are
some ideas for a better approach.
Distributed Ledger Technology ("DLT") is a very broad term that in theory
covers both traditional blockchain tech like Bitcoin and Ethereum,
newer entrants such as Corda, and hybridized/layered models such as Quorum.
Some of the design goals of non-blockchain DLT include:
- Private-permissioned engagement to avoid issues caused by anonymity
(which of course was a major design feature of
traditional blockchains)
- Substantially improved data compartmentalization, visibility, and
security (i.e. not everything 100% in the open for any actor to view)
- Substantially improved transactional performance
- Elimination of framework gotchas like gas price fluctuation,
nondeterminism of block commits vs. business transaction commits, etc.
As the space has matured over the past 10 years, it has become increasingly
clear that these goals are not enough. The platform for sharing cryptographically
secured data ("cryptoassurance") cannot be a new data platform because unless
all the data access activity occurs there, a cryptoassurance gap is
created between the new platform and the old. The larger and more complex
(and yes, more legacy) the environment, the more likely that a new cryptoassured data platform will:
- Not be the main locus of operational and analytic activity but rather
just another ETL source point from which data will be extracted and loaded into the
"mainstream" systems
- Require separate infrastructure, operations, security, and vendor management
It has also become clear that the data and software design, development,
testing, and release process of current DLT -- especially at Day 2 -- is at a
minimum markedly different from the SDLC for the rest of the technology footprint
and at worst brittle, making it difficult to orchestrate upgrades and/or changes to
logic, entitled actors, data access permissions, etc.
In other words, current DLT exacerbates the problem instead of solving it.
We believe the Synchronized Systems Technology ("SST") principles described
below address what we have learned.
The Design Principles Of Synchronized Systems Technology ("SST")
- Cryptographically-assured reliable and consistent replicas of data
"The" foundational feature of SST is ensuring consistent replicas of data on multiple
nodes. The data is opaque to the platform but many conventional, secure, and
well-adopted techniques that power https-based e-commerce are employed to provide
cryptowrapping of the data. Cryptowrapping
means applying hashes like SHA2 and digital signatures to arbitrary
content to prove immutability, authenticity, and ownership.
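As a minimal sketch of cryptowrapping (the key type, payload, and names here
are illustrative assumptions, not a prescribed SST format):

    # Hash arbitrary content with SHA-256, then sign the digest with an
    # Ed25519 key to demonstrate immutability, authenticity, and ownership.
    import hashlib
    from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey

    payload = b'{"orderId": 1234, "qty": 7}'   # opaque-to-platform content

    digest = hashlib.sha256(payload).digest()  # immutability check value
    key = Ed25519PrivateKey.generate()
    signature = key.sign(digest)               # authenticity + ownership

    # A verifier holding the public key recomputes the hash and checks it;
    # verify() raises InvalidSignature if content or signature was altered.
    key.public_key().verify(signature, hashlib.sha256(payload).digest())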
- Data is directly and performantly queryable in existing database infrastructure
Arguably the most important goal, SST is implemented such that
data is not only cryptowrapped but also directly queryable
in an existing conventional database of choice
(e.g. Postgres, MongoDB, Oracle, MySQL, SQLite, etc.).
- No supporting database "off to the side", no additional ETL,
and thus no vulnerability to a "cryptogap" where secure material
passes through a workflow step that can suborn the hashes and digital
signatures
- No ORM or other conversion/mapping subsystem to maintain
- All SST data is indexable, partitionable, and available to
functions and extensions in the host database exactly like
other "regular" unshared data
- All SST data is operationally managed, secured, storage optimized, and
integratable exactly like other "regular" data.
Broadly as an example, a Postgres database supporting product
transactions with 16 tables now has 17; the new peer table contains the
SST-managed shared data and is operationally consistent with the
original 16 tables.
Traditional blockchain and even many newer DLT environments
have no capability to do this, so performant querying, reporting, and
analytics must always be performed "off-chain" on some form of a mirror.
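To make "directly queryable" concrete, here is a sketch of querying the
hypothetical 17th table described above. The table and column names
(sst_shared, contract_id, a JSONB state column) are illustrative
assumptions, not a published SST schema:

    # Query SST-managed shared data right next to the other 16 tables.
    import psycopg2

    conn = psycopg2.connect("dbname=products")
    cur = conn.cursor()
    cur.execute("""
        SELECT s.state->>'price' AS price, o.order_date
        FROM   sst_shared s
        JOIN   orders o USING (contract_id)
        WHERE  s.state->>'status' = 'OPEN'
    """)
    for price, order_date in cur.fetchall():
        print(price, order_date)

Because the shared data lives in an ordinary table, it can be joined,
indexed, and partitioned with native Postgres mechanics -- no mirror, no ETL.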
- Data is arbitrarily complex and sized
Current DLT environments, and especially traditional blockchains, are not
designed to handle even modestly sized data records, and many incur
substantial costs for carrying the data on the network:
- As of 1-Aug-2023, 1K of storage in one smart contract costs
$58.64 on Ethereum.
- Rich shape construction in
Ethereum Solidity
is difficult. Definition is straightforward, but the lack
of true append() and keyIter() methods makes
manipulation logic much more cumbersome than that offered by most
other current, popular languages, not to mention that structure
definitions cannot be shared/passed between the contract and
client-side web3 code.
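For contrast, a sketch of the same kind of shape manipulation in a
general-purpose language (the shape itself is illustrative):

    # True append and key iteration -- the operations that are cumbersome
    # in Solidity -- are one-liners here.
    state = {
        "product": "widget-A",
        "bids": [],              # arbitrarily nested, arbitrarily sized
    }

    state["bids"].append({"who": "orgB", "price": 41.50})  # append()

    for key in state:            # key iteration
        print(key, state[key])

The same structure can be passed unchanged between contract logic and
client-side code.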
- Data "records" contain subsections to which access is assigned
Most DLT designs make it difficult to precisely and selectively permit
write and read access to "subsections" of a record. In the pure
blockchain space this isn't even an option; by design everything
is transparent, requiring use of content encryption to "hide" sensitive
data. A standard example is
a negotiated custom product sale. The details of the product are open
to all but the buyer/seller negotiations are private -- and the whole
thing needs to be managed as one record.
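A sketch of what per-subsection access could look like, assuming a simple
section-to-ACL map; the record shape, actor names, and ACL syntax are
illustrative assumptions, not SST's actual permission model:

    # One record, two subsections, different read/write entitlements.
    record = {
        "common":      {"product": "custom gearbox", "rating": "10kW"},
        "negotiation": {"offer": 98000, "counter": 91500},
    }
    acl = {
        "common":      {"read": {"*"},               "write": {"seller"}},
        "negotiation": {"read": {"buyer", "seller"}, "write": {"buyer", "seller"}},
    }

    def can_read(actor, section):
        readers = acl[section]["read"]
        return "*" in readers or actor in readers

    assert can_read("observer", "common")              # open to all
    assert not can_read("observer", "negotiation")     # buyer/seller only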
- State change is actuated through events
Anything that needs to change state does so through an event management
system, not through simple function calls on the smart contract. A
well-designed event model enables:
- All local node changes on a contract to have a sequence master to
arbitrate order of execution and facilitate precise replay when
recoverable errors occur.
- Ability of multiple nodes to coordinate and rationalize changes
initiated by them vs. changes initiated by other nodes. Related to
this is the need to clearly separate intent to change state
(and the processing thereof) vs. actual changed state.
It is crucially important to manage the intent to change even when the
change turns out not to be possible.
- Specific and queryable information on actions (events) that
created state change. This completes the state->event->new_state
model.
- Flexible sync/async calling paradigms. Publishing and
reacting to events is a more "layerable" model. If so desired,
an event dispatch-and-wait mechanism can be easily constructed
on top of it.
There are too many nuances and variations -- timeouts, threading,
reordering, etc. -- to crisply implement this functionality in
the core.
Note that SST essentially has no -- and needs no -- read API/model
because data is queried and analyzed directly in the host database using
the native host database languages and tools, just like "regular"
unshared data.
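A sketch of what a state-change event envelope might carry, separating
intent from applied state and carrying a sequence for ordering and replay.
The field names are assumptions for illustration only:

    import time, uuid

    def make_change_event(contract_id, origin_node, seq, patch):
        return {
            "eventId":     str(uuid.uuid4()),
            "contractId":  contract_id,
            "originNode":  origin_node,
            "seq":         seq,    # sequence master arbitrates order/replay
            "intent":      patch,  # what the submitter wants changed...
            "appliedAt":   None,   # ...vs. whether it actually became state
            "submittedAt": time.time(),
        }

    evt = make_change_event("rfq-778", "nodeA", 42, {"price": 101.25})

Keeping intent and applied state distinct is what lets the platform record,
queue, or reject a change without losing the fact that it was requested.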
- No special domain-specific languages (DSLs)
SST needs to use Groovy, Python, Java,
and other widely known languages. The problem space demands simplicity
and wide availability of talent to manage the artifacts, especially since
multiple organizations are typically involved in a business process.
Lowest common denominator rules apply.
- Smart contracts contain logic and are separate from state
The definition and specific capabilities of a "smart contract" vary, but
for the purposes of SST, it means a piece of expressive code logic that
is also cryptowrapped and shared/agreed amongst participants.
Two or more instances of data can exist as separate states while being
managed by the same contract.
Think of state as the private data in an object and the contract logic
as the method calls in the object.
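The object analogy sketched in code (illustrative names, not a prescribed
SST contract API):

    # One shared contract (the methods) managing two independent
    # state instances (the private data).
    class RFQContract:
        def submit_bid(self, state, who, price):
            if state["status"] != "OPEN":
                raise ValueError("RFQ is closed")
            state["bids"].append({"who": who, "price": price})

    contract = RFQContract()
    state1 = {"status": "OPEN", "bids": []}
    state2 = {"status": "OPEN", "bids": []}
    contract.submit_bid(state1, "orgB", 99.0)   # state2 is untouched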
- Smart contracts are versioned just like state data
This is a significant departure from
Ethereum, where complicated means are required to "move"
data and state from one contract to an updated version. Smart contracts
are not required to change but if they do, the logic to supersede previous
versions is very straightforward in SST.
- Smart contracts are just expressive logic layers to generate lower level state change events.
Most smart contract use cases exist to enforce rules on the underlying
data using highly expressive languages like Java and Python.
Calling a contract method that performs a state change operation is
basically a layer over the existing underlying data section access
permissions; in other words, the same effect could be had by issuing
a lower level change data event directly.
- All actors are known
For practical business concerns, it is much easier to have a
permissioned, authenticated model for actor interaction.
- Event notification is a core platform capability
Kafka, Solace, RabbitMQ, and other providers are pluggable into
the platform. Notification is based on and granular to
contract/business events, not mining or other platform activities.
This is once again
a marked difference from Ethereum where event notifications
occur when a block is mined. You must then dig through the block
to search for a particular transaction and then additionally
determine what business level event precipitated the transaction.
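A sketch of consuming granular business events, assuming they are published
to a Kafka topic as JSON; the topic name and event fields are illustrative:

    import json
    from kafka import KafkaConsumer   # kafka-python package

    consumer = KafkaConsumer(
        "sst.contract.events",
        bootstrap_servers="localhost:9092",
        value_deserializer=lambda b: json.loads(b.decode("utf-8")),
    )
    for msg in consumer:
        evt = msg.value
        if evt.get("contractId") == "rfq-778":   # no digging through blocks
            print(evt["eventId"], evt["intent"])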
- Zero compile time dependencies on data shapes
It is vitally important
to keep the system fully compile-time independent and generic so that
both the core and most subsystems remain small and infrequently
changed. The addition of a field to state cannot require systematic
recompilation of all states, contracts, and/or -- in the worst case
scenario -- framework
code. Only at the "edges" of
the architecture should data shapes be addressed using bespoke code,
and then at the edge maintainer's peril.
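A sketch of what shape-agnostic core code looks like: the logic treats
state as a generic mapping and never names specific fields, so adding a
field is invisible to it:

    import json

    def apply_patch(state, patch):
        # Generic merge; no field names, so nothing to recompile
        # when shapes grow.
        merged = dict(state)
        merged.update(patch)
        return merged

    state = json.loads('{"price": 100}')
    state = apply_patch(state, {"expiryDate": "2024-06-30"})  # new field, zero core change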
- Very low runtime dependency graph
An unfortunate consequence of the adoption of open source software is that
often platforms require literally hundreds of libraries.
This is manageable solely within the context of the out-of-the-box platform
itself, but as integration with user apps and services progresses, an
increasing number of version clashes will occur and require non-revenue
time to resolve. Particularly in the Java space, libs like
com.fasterxml.jackson, netty, org.apache.commons,
and rxjava used in DLT frameworks are not at the same revision
as those
demanded by applications linking with the platform SDK. In addition, in
the enterprise space, all libraries must be subject to security scans,
end-of-service monitoring, and other governance/provenance policies. It
is essential that SST presents the smallest possible
complexity profile so that all efforts can be focused on the
multi-participant system design itself.
Along these lines, it is important that smart contracts also drive
to the smallest possible dependency footprint because the greater
the number of dependencies, the greater the likelihood that different
organizations sharing the same synchronized code will have version,
security, or other conflicts.
- No gas
A synchronized system used by a set of participants involves costs they are
willing to incur because the system adds value to the participants.
The whole notion of creating an incentive / compensation model for
anonymous parties
to mine blocks and charge for new data storage is simply unnecessary for SST.
- Highly integratable with off-SST resources
SST presents a core data and DAL platform that easily integrates with
other data and technologies. The node synchronization,
cryptowrapping, and other mechanics sit on top of this -- and thus
are not part of it. In fact, the SST core data platform is entirely
usable without the synchronization, permissions, and other
pieces. This opens the opportunity for "co-location" of off-SST
data on the same persistence backplane, completely secured with
entitlements, creating an efficient hybrid system.
- Alternate synchronization protocols possible
Related to the above, SST has a default synchronization model for
multiple participants that involves last state event seen, last
initiating node event seen, and remote node comparison of incoming
new event to remote state. It is possible to design other protocols
that are simpler and faster or slower and less race-condition sensitive.
In fact, multiple synchronization protocols can be running at the
same time for a particular contract/state design.
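A very rough sketch of the kind of check the default protocol implies,
assuming per-contract and per-origin-node sequence counters; the real
protocol carries more than this:

    def accept_incoming(local, evt):
        # Compare the incoming event against the last state event seen...
        if evt["stateSeq"] != local["lastStateSeq"] + 1:
            return False   # out of order relative to local state
        # ...and against the last event seen from the initiating node.
        if evt["originSeq"] <= local["lastNodeSeq"].get(evt["originNode"], -1):
            return False   # stale or duplicate from that node
        return True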
Interesting Use Cases For Synchronized Systems Technology
- Third party / regulatory read-only observation of complex data
In this use case, a participant like a regulatory body enjoys complete,
consistent transparency into, and independent analytics on, complex data
such as positions and risk. The effect is like a shared database but
by virtue of node data synchronization, there is no central point of
ownership / failure / compromise, and no participant can technically deny
access to the data by another participant. Regulators no longer have to
"request" reports and verification of data; they are watching the exact
same
business flow and critical data that is driving the actions of the other
participants they are regulating.
- Active management and analysis of long-lived processes
Original blockchain design principles are simply not aligned with tracking
state change of a thing over weeks, months, or years. Instead, blockchains
are focused on single value transfer events, not process flows. As a result,
most smart contract environments still have a legacy root in the
"immutability" of a contract because of the simplicity of a single value
transfer event. The reality of business processes is that they may
be amended over time with a need to track changes.
- Performant vending of data to many, many readers
Most blockchains did not have performance as a priority. SST is designed
around data and being able to quickly move large amounts of it
off the platform, scalable to thousands or more connections on a single node.
- Selective secured sharing of data
Not all participants in a data sharing design need access to absolutely
all the data. For example, in a real estate deal, all parties see
the common info (address, age of home, etc.), but only buyer and seller
can see and change the sale price, and only the inspector is allowed to
change the measured radon value but everyone can see it.
- Simplified, performant business event monitoring
Most technology footprints will have a mix of SST and non-SST
implementations. Because SST contracts are changed only through
the submission of events, the semantics for state change are
directly and easily coupled to the physical artifacts being
published. State change through method calls at first appears
easier (y=f(x)), but too many factors come into play -- timeouts,
threading, retries, reordering -- that are not easily addressed in
such a synchronous-biased invocation model.
- Capturing "time-series" of data in contract state
SST smart contracts (and the underlying state they manage) can use
arrays just as easily as scalar values.
For example, a request-for-quote solution might append a tuple
containing submitter ID, price, and timestamp to an array.
A single state
can thus represent the history of the RFQ process, eliminating
the need to synthesize the history by pulling all the prior states.
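A sketch of the RFQ pattern (field names illustrative):

    import time

    state = {"rfqId": "rfq-778", "quotes": []}

    def submit_quote(state, submitter, price):
        # Append a (submitter ID, price, timestamp) tuple to the array.
        state["quotes"].append((submitter, price, time.time()))

    submit_quote(state, "orgB", 101.25)
    submit_quote(state, "orgC", 100.90)
    # The whole negotiation history lives in the single current state;
    # no replay of prior states is needed to reconstruct it.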
- Simplified document management for modest sized documents
Current DLTs and essentially all blockchains cannot handle storing
actual documents (docx, xlsx, PDF, etc.) or other digital content
such as JPEGs or PNGs; the payloads are simply too large. The current solution is
to take a hash of the material and store that in the smart contract along
with an URI to the actual material, typically stored on BLOB storage.
While the process is functional and scales to even huge
pieces of content, it means that the security scope / controls have to
be extended outside the platform. SST can handle states up to a few
megabytes if a one-platform design is desired.
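For comparison, a sketch of both approaches side by side; the filename,
URI, and field names are illustrative:

    import base64, hashlib

    with open("inspection-report.pdf", "rb") as f:
        content = f.read()

    # Conventional DLT pattern: hash on-platform, bytes off-platform.
    off_platform_ref = {
        "sha256": hashlib.sha256(content).hexdigest(),
        "uri":    "s3://deal-docs/inspection-report.pdf",  # BLOB storage
    }

    # SST option for modest documents: carry the bytes in the state itself,
    # keeping the security scope inside one platform.
    in_sst_state = {"doc": base64.b64encode(content).decode("ascii")}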
SST is achievable using Susynct today
In assessing the capabilities desired in a synchronized system, it should be
clear that a fresh implementation is a better way to go than taking
an existing first-generation blockchain and twisting it and/or layering on
top of it. Ten years have taught us that they are different kinds of
systems for different purposes.