My Architecture Principles

16-Jul-2013

Like this? Dislike this? Let me know

Here are The Five Principles I employ in designing and evaluating software and systems:

How do you turn it off?
Show me a system where a rational, risk-mitigated cutover/shutdown plan can be described and I'll show you a system that has excellent technical integration capabilities and a well-designed and externalizable state and data model. Also: the reality is that systems can (and should!) have lifetimes. In general, senior managers accept system lifecycle and enthusiastic, progressive developers welcome it. Those who do not have a different motivation/incentivization profile.
Focus on the data, not the database
When building a system, start on "the inside" and work hard to define what kinds of objects, data, and other entities you need. Don't start with a set of table definitions and work backward into the code layer. This approach applies to the GUI as well, and in fact all other architecturally peripheral components.
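As a minimal sketch of starting on "the inside", the fragment below defines an entity first and treats persistence as a peripheral detail behind an interface. The names `Contact`, `ContactStore`, and `InMemoryContactStore` are hypothetical and chosen only for illustration; a relational or document-backed store could stand in for the in-memory one without the entity changing.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;

// Domain entity designed from the inside out: it knows nothing about tables or columns.
final class Contact {
    private final String name;
    private final List<String> phoneNumbers;

    Contact(String name, List<String> phoneNumbers) {
        this.name = name;
        // Defensive copy so the entity owns its own data.
        this.phoneNumbers = Collections.unmodifiableList(new ArrayList<>(phoneNumbers));
    }

    String name() { return name; }
    List<String> phoneNumbers() { return phoneNumbers; }
}

// Persistence is expressed in terms of the domain, not the other way around.
interface ContactStore {
    void save(Contact contact);
    Optional<Contact> findByName(String name);
}

// One possible backing store (hypothetical); others can be swapped in later.
final class InMemoryContactStore implements ContactStore {
    private final Map<String, Contact> byName = new HashMap<>();

    @Override
    public void save(Contact contact) { byName.put(contact.name(), contact); }

    @Override
    public Optional<Contact> findByName(String name) {
        return Optional.ofNullable(byName.get(name));
    }
}
```

The table definitions, if any, would be derived from `Contact` later rather than dictating its shape.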
Strongly separate GUIs from everything and assume they are throwaway
GUIs are seductive because they are what most users touch. They are what the user thinks the app and/or system is all about. It's how people get paid. But the GUI is not the system, and there's no such thing as a single GUI to a system. There's the "big desktop" web version, the iPhone and iPad versions, the Flash-free HTML5 version, etc. There are as many GUIs as you need/want: for power users, for admins, for small real estate plug-ins to other apps, etc. They are all views on exactly the same underlying layer of data and functions. Furthermore, GUI technology changes rapidly or has technical deployment limitations. When the noise around Flash vs. HTML5 subsides a bit, it will be replaced with noise around HTML5 vs. HTML6. There will always be noise in the GUI space; accept it and move on. And most importantly, don't let the GUI drive the design of the underlying system (see "Focus on the data, not the database" above).
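One way to picture "many GUIs, one underlying layer" is the sketch below. `TaskService`, `TaskView`, `PlainTextView`, and `HtmlView` are hypothetical names invented for this example; the point is only that each view is a thin, disposable rendering of the same data and functions.

```java
import java.util.ArrayList;
import java.util.List;

// Core layer: data and functions, with no knowledge of any presentation technology.
final class TaskService {
    private final List<String> open = new ArrayList<>();

    void add(String title) { open.add(title); }

    // Returns a copy so views cannot mutate core state.
    List<String> openTasks() { return new ArrayList<>(open); }
}

// A view is a thin, replaceable rendering of the core layer.
interface TaskView {
    String render(TaskService service);
}

final class PlainTextView implements TaskView {
    @Override
    public String render(TaskService service) {
        StringBuilder sb = new StringBuilder();
        for (String title : service.openTasks()) {
            sb.append("- ").append(title).append('\n');
        }
        return sb.toString();
    }
}

final class HtmlView implements TaskView {
    @Override
    public String render(TaskService service) {
        // HTML escaping is omitted to keep the sketch short.
        StringBuilder sb = new StringBuilder("<ul>");
        for (String title : service.openTasks()) {
            sb.append("<li>").append(title).append("</li>");
        }
        return sb.append("</ul>").toString();
    }
}
```

Throwing away `HtmlView` and writing another view costs nothing in `TaskService`, which is the property to protect.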
Assume all data designs are wrong day 1 and have to change
Whether you take 1 day or 1 year to capture requirements for a system, the chances are very high that something will be missed. Leaving aside the process required and appropriate for truly mission-critical systems (space and submersibles, medical, etc.), for most systems the tradeoff between time to market and risk of underspecification favors the former. Therefore, you need to simply embrace this fact and keep your implementations semisoft, especially in the early stages of development. If adding an extra string or double to an entity causes days (or more) of follow-on work (excluding time needed to actually prep, announce, coordinate, and push a tested release), there's something wrong with the design. Some things to watch out for:
- Don't use doubles where you deal with money; use a Money class (see Let's Talk About Money).
- Avoid "scalarization." Lists, arrays, and Maps are great things; use them! If you see things like contactname_1, contactname_2, contactname_3 showing up instead of List<Contact>, chances are you've got some amount of scalarization in the design. And the primary driver for scalarization is ... see "Focus on the data, not the database" above.
- In general, objects do not have keys (see this article).
- Build in versioning to your data. The simple addition of an integer representing version level can go a long way in providing context for data that changes over time.
- There is No Perfect Database, so don't agonize over trying to create one.
- In global systems, date and time are very important. Just capturing date in new systems design is very limiting. For important transactions, make sure you unambiguously and context-free capture both local and GMT time. By unambiguously and context-free I mean the datetime representation needs no other resource, runtime library call, timezone lookup, etc. to manifest itself and does so in a clear way. You need to capture local time for local processing needs and GMT normalized time to do global coordinated processing and full-set transaction time analysis.
- Don't use date (or datetime) as the implicit key for membership in a set of data to be processed. It is perfectly OK to use date as an input or a query/filtering criterion for defining the set, but that filter and the resulting set need to stand on their own, explicitly, so that other set-based operations can use them as the authoritative source of data instead of dropping down to the raw date(time).
- Avoid integer keys. This is a controversial issue, yes. Integers tend to get munged up when they are imported/exported, people depend on them for implied sequentiality, etc.
- Beware of epic scale class inheritance models. The compiler has a greater capacity to grok such things than humans, but we have to ultimately understand and support them.
- Beware of epic scale entitlement models. Rules engines and XACML and the PxP frameworks have a greater capacity to grok such things than humans, but we have to ultimately understand and support them.
- Challenge the need for objects that contain only getter/setters of strings with no underlying logic. This scenario occurs much more often than one might expect. Chances are a simple HashMap would suffice and provide better day 2 flexibility. I can already hear the objections of the Intellisense/IDE lobby on this, but features in IDEs should not drive basic software design. These kinds of "dumb" string-get/set objects are usually markers for database-driven design where the contract for the object is "whatever the DB vends me for table X"; see "Focus on the data, not the database" above. That said, some simple objects are indeed quite valid and useful and should survive such a challenge.
- Don't use flatfiles. It's 2013; use JSON instead. Better still, if you're up to it, use BSON. Unless you're operating in the billion record per file stratosphere, modern networks and storage and CPU can accommodate chunkier formatted files. Formatted files are significantly better at handling non-scalar data, missing fields, new fields, custom/bespoke data, etc. True, flatfiles are easy to produce and consume (strtok!), are very fast (typically no high water marking with memory), and require little or sometimes no external software like generators or parsers. All good for development time but bad for day 2. Their rigidity and lack of explicit structure mean that 99% of the time, changing a flatfile is too risky so a new flatfile is created. One year later you have a dozen flatfiles (and their associated emitters and crackers) for essentially the same information. Three years later you have 50.
- Be wary of implied functionality in the ordering of collections. If a particular order is required and must be deterministic and reproducible, ensure the sorting is clear and explicit and key sets in Maps are not used to drive ordering.
- Separate data you control from data you cannot (or don't want to) control and do not reengineer / remodel the latter. Don't try to take a customer's user-defined data on a feed and create a set of tables to hold it. You've doubled your work and it's not a scalable practice across many customers. If the user-defined data has strong structure and needs to be indexed and accessed by your system, consider persistence in a document store like MongoDB.
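To make the point about doubles and money concrete, here is a minimal sketch of a Money class built on `BigDecimal`. It is an illustration under my own assumptions, not necessarily the design described in Let's Talk About Money: the class shape, the `plus` method, and the HALF_EVEN rounding choice are all hypothetical.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;
import java.util.Currency;

// Hypothetical Money sketch: an exact decimal amount that always travels with its currency.
final class Money {
    private final BigDecimal amount;
    private final Currency currency;

    Money(String amount, String currencyCode) {
        this.currency = Currency.getInstance(currencyCode);
        // Assumption: scale to the currency's default fraction digits (2 for USD),
        // rounding HALF_EVEN. Pseudo-currencies without fraction digits are not handled.
        this.amount = new BigDecimal(amount)
                .setScale(currency.getDefaultFractionDigits(), RoundingMode.HALF_EVEN);
    }

    private Money(BigDecimal amount, Currency currency) {
        this.amount = amount;
        this.currency = currency;
    }

    Money plus(Money other) {
        if (!currency.equals(other.currency)) {
            throw new IllegalArgumentException("currency mismatch");
        }
        return new Money(amount.add(other.amount), currency);
    }

    @Override
    public String toString() {
        return amount.toPlainString() + " " + currency.getCurrencyCode();
    }
}
```

With doubles, `0.1 + 0.2 == 0.3` is false in Java; with this sketch, adding `new Money("0.10", "USD")` and `new Money("0.20", "USD")` yields `0.30 USD`.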
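The scalarization and versioning bullets above work well together. The sketch below assumes data arrives as a generic Map and uses hypothetical field names (`version`, `contactName`, `contactNames`): version 1 held a single scalar name, version 2 holds a list, and the integer version tells the reader which shape to expect.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;

// Hypothetical reader that uses an integer version to interpret data that changed over time.
final class ContactReader {
    static List<String> contactNames(Map<String, Object> doc) {
        // Assumption: documents written before versioning existed are treated as version 1.
        Object v = doc.get("version");
        int version = (v instanceof Integer) ? (Integer) v : 1;

        if (version == 1) {
            // Version 1 stored exactly one scalar contact name.
            return Collections.singletonList((String) doc.get("contactName"));
        }

        // Version 2 and later store a real list instead of contactname_1, contactname_2, ...
        List<String> names = new ArrayList<>();
        for (Object name : (List<?>) doc.get("contactNames")) {
            names.add((String) name);
        }
        return names;
    }
}
```

Callers always receive a list, so moving from one contact to many does not ripple through the rest of the code.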
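For the date and time bullet above, one possible reading of "unambiguous and context-free" is to store two self-describing strings per transaction: the local time with its explicit offset, and the same instant normalized to UTC (GMT in the wording above). The class name `TransactionTime` and its fields are hypothetical; the formatters come from the standard `java.time` package.

```java
import java.time.OffsetDateTime;
import java.time.format.DateTimeFormatter;

// Hypothetical holder for the two representations of one transaction time.
final class TransactionTime {
    final String local; // ISO-8601 with explicit offset; needs no timezone lookup to interpret
    final String utc;   // the same instant normalized to UTC, for global ordering and analysis

    TransactionTime(OffsetDateTime when) {
        this.local = DateTimeFormatter.ISO_OFFSET_DATE_TIME.format(when);
        this.utc = DateTimeFormatter.ISO_INSTANT.format(when.toInstant());
    }
}
```

For example, `OffsetDateTime.of(2013, 7, 16, 9, 30, 15, 0, ZoneOffset.ofHours(-4))` is stored as local `2013-07-16T09:30:15-04:00` and UTC `2013-07-16T13:30:15Z`. The local string serves local processing; the UTC string serves full-set transaction time analysis.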
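The bullet above on collection ordering can be illustrated as follows. Rather than trusting whatever order a Map's key set happens to iterate in, the sketch states the order explicitly: larger values first, ties broken by key. `ExplicitOrdering` and `keysByValueDescending` are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

final class ExplicitOrdering {
    // Returns keys in a stated, reproducible order, independent of Map iteration order.
    static List<String> keysByValueDescending(Map<String, Integer> values) {
        List<Map.Entry<String, Integer>> entries = new ArrayList<>(values.entrySet());
        entries.sort((a, b) -> {
            int byValue = Integer.compare(b.getValue(), a.getValue()); // larger values first
            return byValue != 0 ? byValue : a.getKey().compareTo(b.getKey()); // then by key
        });

        List<String> keys = new ArrayList<>();
        for (Map.Entry<String, Integer> entry : entries) {
            keys.add(entry.getKey());
        }
        return keys;
    }
}
```

The result is the same whether the caller passes a HashMap, a TreeMap, or a LinkedHashMap, because nothing depends on the Map's own ordering.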
Don't be distracted by standards
This is obviously worded so as to be provocative, but it has proven useful in raising the eyebrows (and sometimes blood pressure) in a mixed room of developers and managers. The idea is to design your system first, then reach into the bag of standards to see what can be used for an implementation. All too often, a design session starts with "we will use Java 1.6 and Oracle 11g and XML and ..." There are two critical issues with this approach:
1. The spectrum of technologies that could be used to implement (and support) a solution well is likely artificially narrowed.
2. Declaring components and versions is misinterpreted as "design" and the real work of crafting information architecture and software to manipulate it gets sidelined.
