MongoDB 3.4 and Money: Hello BigDecimal!
This rant was published three years ago, 30-Dec-2013. At the time, support
for an arbitrary precision decimal type was only on the roadmap. Last month,
with the release of MongoDB 3.4, decimal128 (a.k.a. BigDecimal) was formally
introduced into the type system of MongoDB with support in all drivers and
utilities like mongoimport.
The concepts introduced here, however, are as appropriate now as they were back
then and now even easier to implement thanks to decimal128 support.
Contemporary financial systems
data designs need more than just the amount of money involved; they need
the currency code
(see Let's Talk About Money).
Money (and price) without a currency code is not money;
it is just a number. This tight pairing is actually more easily managed
in MongoDB than in traditional RDBMS because the amount and currency code --
and indeed, potentially more day 2 attributes -- can
be well-managed together in a subdocument.
The Traditional Way in RDBMS
Consider a data design that requires us to capture transaction
amount, tax, and rebate amount. Let's start with BigDecimals in an RDBMS:
create table Trade (
amount numeric,
tax numeric,
rebate numeric
);
So right off the bat, we have a problem: there are only "magnitude" columns
with no associated currency codes. And
we don't want to simply assume that everything is in USD (this is 2017).
So perhaps we try this:
create table Trade (
amount numeric,
tax numeric,
rebate numeric,
ccode text
);
This is a step in the right direction but there is a serious
consequence: all numerics "localized" in this table are assumed to
have the same currency code. This may not always be the case! Even if it
is 95% of the time, the hacked-up logic required to deal with the 5% tail will
increase the brittleness of the otherwise adequate 95%. In addition, as the
schema grows over time, the relationship between the ccode column
and the magnitude columns becomes less clear. Other than direct examination of
the way the column values are actually processed in the data access layer,
there are no structural items or even hints that these fields are closely
related.
The next level of completeness would be to provide a ccode for each numeric:
create table Trade (
amount numeric,
amount_ccode text,
tax numeric,
tax_ccode text,
rebate numeric,
rebate_ccode text
);
The data access layer now has everything it needs to make a Money objects
for amount, tax, and rebate. But let's examine some subtle issues:
- Suppose, from a business standpoint, rebate is permitted to be unset.
In this physical implementation, which field should be the indicator of
being set, rebate or rebate_ccode? Let's assume the numeric will be
the indicator and we will take advantage of getBigDecimal()
in the JDBC API to return null or an allocated object instance.
- * Side note: The other JDBC calls that return native types (e.g.
getDouble(), getInt(), getLong(), etc. will return 0 if the
column is NULL. No kidding. To accurately determine if the column
was NULL, you need to call ResultSet.wasNull() after one of the
getter methods.
- If we have an existing rebate and we wish to make it unset, do we
null BOTH the numeric and the text currency code?
- What does it mean if one side of the pair is set (or unset) without
the other?
- How strongly followed will the convention be for the naming of the pair
so that given a request for money field foo, both foo and foo_ccode will
be fetched? How do we know that these are in fact a pair?
The last point above becomes more important as the table grows and the
field names do not necessarily "line up" and make it clear that they are in
fact a pair. Arguably, the more robust way to name the fields is as follows:
create table Trade (
amount_val numeric,
amount_ccode text,
tax_val numeric,
tax_ccode text,
rebate_val numeric,
rebate_ccode text
);
We finally have data structures (peered magnitude and
currency code) and naming conventions (_val and _ccode)
that provide a robust, longer-lived solution than the first attempt --
and it is somewhat more involved than simply "adding a BigDecimal column."
Below is a representative example of the code that would be used
to produce the Money object in a data access layer.
// Let's not dwell on how to construct the query. Assume:
// 1. We are looking for one and only one trade:
// 2. We want to have some semblance of logic to drive pairing
// based on "basename" of the money field, e.g. 'tax'.
// In other words, we don't want 'tax_val' and 'tax_ccode' field names
// as hardcoded strings floating around.
//
String[] mflds = new String[] { "amount", "tax", "rebate" };
// Build a dynamic SQL string along the lines of:
// select amount_val, amount_ccode, tax_val, tax_ccode,
// rebate_val, rebate_ccode from Trade where key = 'AA123'
//
StringBuffer sb = new StringBuffer();
sb.append("select ");
for(int i = 0; i < mflds.length; i++) {
if(i > 0) { sb.append(","); }
sb.append(mfld);
sb.append("_val");
sb.append(",");
sb.append(mfld);
sb.append("_ccode");
}
sb.append(" from Trade where key = '");
sb.append(bizKey);
sb.append("'");
ResultSet rs = stmt.executeQuery(sb.toString());
if(rs.next()) {
// We are using getBigDecimal() to indicate null or not:
for(String mfld : mflds) {
BigDecimal v = rs.getBigDecimal(mfld + "_val");
if(v != null) {
String ccode = rs.getString(mfld + "_ccode");
Money m = new Money(v, ccode);
someMap.put(mfld, m); // e.g. "amount" -> Money()
}
}
}
return someMap;
The MongoDB 3.4 Way
In MongoDB, we would take advantage of rich shape modeling to well-capture
the relationship between val and ccode. There is no explicit schema (the
data itself is the schema) so we will use an insert() function
from the command line interface to demonstrate:
db.trade.insert( {
"amount": {
"val": NumberDecimal("77.09"),
"ccode": "USD"
},
"tax": {
"val": NumberDecimal(".87"),
"ccode": "USD"
},
"rebate": {
"val": NumberDecimal("2000"),
"ccode": "JPY"
}
});
Note the constructor for the new NumberDecimal type. We use string arguments
specifically to avoid problems with literal numbers being captured as 32 bit
integers or floats.
It is important to recognize two use cases here that
are no different than those experienced with a traditional
RDBMS:
- Within currency, all numeric comparisons and sorting
work as expected.
- Cross-currency operations are not possible without a
currency conversion table and such operations are outside the domain
of the persistor anyway.
Here is a representative example of the code that would be used
to produce the Money object in a data access layer. Note that the logic
is simplified because we can treat amount, tax, rebate as "objects":
String[] mflds = new String[] { "amount", "tax", "rebate" };
// Construction of predicate and projection is simpler because
// of the map-of-map structural approach instead of creating a SQL
// string with required whitespace, commas, quoted strings, etc.
DBObject predicate = new BasicDBObject("key", key);
DBObject projection = new BasicDBObject();
for(String mfld : mflds) {
projection.put(mfld, 1); // subdoc will include val and ccode children
}
DBObject v = coll.find_one(predicate, projection);
if(v != null) {
for(String mfld : mflds) {
Map m2;
if((m2 = v.get(mfld)) != null) {
BigDecimal a = ((org.bson.types.Decimal128)m2.get("val")).bigDecimalValue();
String s = m2.get("ccode");
Money m = new Money(a, s);
someMap.put(mfld, m); // e.g. "amount" -> Money()
}
}
}
return someMap;
The object-oriented approach also makes it much more straightforward to both
interrogate for money "setness" and to make money "unset" in MongoDB:
db.trade.find( { "rebate": {"$exists": 1 }});
db.trade.update( { "bizKey": "A2"}, {"$unset": {"rebate": ""} } );
It is particularly useful that unsetting a field with a subdocument removes
all the fields at once. In traditional RDBMS design, data can become
logically "broken" when one or more of the peered fields is removed but one or
more are not.
Day 2 Advantage of MongoDB
In certain scenarios, it may be paramount to efficiently perform numeric
comparisons or sorting cross-currency within the database.
An option is to use a floating point
approximation of the value pre-converted to USD dollars and cents. We cannot
outright substitute the precise version in BigDecimal with the
floating point approximation;
we must make it a peer. Both
implementations require some safe practices around CRUD (i.e. if the precise
value changes, so must the approximation).
In MongoDB, the obvious way to implement this is to make the
approximation a peer within the subdocument:
db.trade.insert( {
"amount": {
"val": NumberDecimal("77.09"),
"ccode": "USD",
"approxUSD": 77.09
},
"tax": {
"val": NumberDecimal("87"),
"ccode": "USD",
"approxUSD": .87
},
"rebate": {
"val": NumberDecimal("2000"),
"ccode": "JPY",
"approxUSD": 18.97
}
});
The RDBMS implementation, however, is becoming less straightforward because
the intrinsic bundling of the components of money are being flattened out,
heavily relying on the naming convention alone to infer structure:
create table Trade (
amount_val numeric,
amount_ccode text,
amount_approxUSD numeric,
tax_val numeric,
tax_ccode text,
tax_approxUSD numeric,
rebate_val numeric,
rebate_ccode text,
rebate_approxUSD numeric
);
This means a non-trivial ALTER TABLE
activity to add the new peer field _approxUSD everywhere you use money.
But even more significant (and daunting) is that any code that might
require _approxUSD must have the query changed to fetch _approxUSD,
resulting in a non-trivial effort to modify all the SELECT
statements to fetch the new peer field.
In comparison, in MongoDB the
existing find() calls do not need to change. The new peer fields,
if they exist, will be automatically fetched as part of the "object."
The naming convention must also "reserve" the suffixes so that they are not
used by other columns to capture non-money data. Failure to do so results
in confusion and possible misinterpretation of the new field as money --
by itself, a "broken" money field (having no
_ccode and _approxUSD peers).
It therefore makes sense to create as explicit and informative a suffix as
possible, ideally using special characters. The tradeoff is that
typically fields so named must be wrapped with quotes. This is not a
"problem" per se but it does complicate SQL statement construction because
typical simple string substitution of column names may break (e.g. need to
escape the extra quotes) and certain tools may react negatively to both the
special characters and the quotes:
create table Trade (
"amount:val" numeric,
"amount:ccode" text,
"amount:approxUSD" numeric,
"someother_val" numeric,
"tax:val" numeric,
"tax:ccode" text,
"tax:approxUSD" numeric,
"rebate:val" numeric,
"rebate:ccode" text,
"rebate:approxUSD" numeric
);
It should be clear that when consistent and proper future-proofing techniques
are applied to modeling money in a persistor, MongoDB is both easier and more
robust than traditional RDBMS.
Like this? Dislike this? Let me know
Site copyright © 2013-2024 Buzz Moschetti. All rights reserved