| Updated 10-Jan-2024 Original post 15-Dec-2017 |
Like this? Dislike this? Let me know |
A collection (pun!) of snippets that I find useful. Some of these are "easier" to deal with if you simply $unwind the arrays but unwinding big arrays in big objects can become very expensive. In general, be kind: don't unwind. Use $map,$reduce, and $filter whenever you can!
{"items": [ [0, 1, 2], [0], [2,5] ]}
db.foo.aggregate([
// This is unfortunately not as straightforward as simply doing
// {newList: {$concatArrays: "$items"}}
// because $items is a single array of arrays, NOT a vararg list of
// arrays that is expected in $concatArrays. In other words,
// {$concatArrays: "$items"}}
// is not the same thing as
// {$concatArrays: [ arr1, arr2, arr3 ]}
// from a call signature perspective.
//
// Solution: We will "walk" $items with $reduce and call $concatArrays over and
// over to build the list. We use $items as the input and overwrite it for output.
// This is a clever use of $reduce by treating an array as the output value instead of a simple scalar:
{$project: {items: {$reduce: {
input: "$items",
initialValue:[],
in:{$concatArrays: [ "$$value", "$$this"]}
}}
}}
]);
{ "_id" : 0, "items" : [ 0, 1, 2, 0, 2, 5 ] }
db.foo.aggregate([
{$project: { items: {"$setUnion": [
{$reduce: {
input: "$items",
initialValue:[],
in:{$concatArrays: [ "$$value", "$$this"]}
}} ] }
}}
]);
{ "_id" : 0, "items" : [ 0, 1, 2, 5 ] }
{_id:0, items: [ {a:0}, {a:0}, {b:7}, {a:1},{b:0},{b:7} ] }
matches = [];
for(i = 0; i < items.size(); i++) {
if(items[i]['b'] == 7) {
matches.push(i);
}
}
db.foo.aggregate([
{$project: {
XX: {$reduce: {
// for $$this in range(0, {$size:"$items"})
input: {$range:[0,{$size:"$items"}]},
initialValue: [], // Clever use of array as value.
in: {$cond: {
// Use getField as of v5.0:
if: {$eq:[7, {$getField: {input: {$arrayElemAt:['$items','$$this']}, field: 'b'}} ]},
then: {$concatArrays: [ '$$value', ['$$this'] ]},
else: '$$value' // nothing to do; return $$value unchanged
}}
}}
}}
]);
{ XX: [ 2, 5 ] }
{_id:0,
lvl1: [
{a:1, lvl2: [
{b:1, lvl3: [
{c:1, z:'A'},
{c:2, z:'B'},
{c:3, z:'C'},
{c:4, z:'D'}
]
},
{b:2, lvl3: [
{c:1, z:'W'},
{c:2, z:'X'},
{c:3, z:'Y'},
{c:4, z:'Z'}
]
}
]},
{a:2, lvl2: [
{b:3, lvl3: [
{c:0, z:'A'},
{c:0, z:'B'},
{c:0, z:'C'},
{c:4, z:'D'}
]
},
{b:4, lvl3: [
{c:4, z:'W'},
{c:3, z:'X'},
{c:2, z:'Y'},
{c:1, z:'Z'}
]
}
]}
]
}
There is of course the initial urge to flatten everything out with `$unwind`:
db.foo.aggregate([
{$unwind: '$lvl1'},
{$unwind: '$lvl1.lvl2'},
{$unwind: '$lvl1.lvl2.lvl3'},
{$match: {'lvl1.lvl2.lvl3.c':{$gt:2}}},
// ... now put it all back together with $group....
]);
A better approach is to have a cascade of $reduce operators to "dig through" the hierarchy and perform a $filter at the end. Conceptually, we want to do this (pseudocode):
for l1 in lvl:
for l2 in l1.lvl2:
l2['lvl3'] = filter(l2['lvl3'], "c > 2")
db.foo.aggregate([
{$addFields: { // $addFields to overwrite lvl1; use $project if you wishh
lvl1: {$reduce: { // set lvl1... open ff
input: '$lvl1', // ...from lvl, i.e. overwrite
initialValue: [], // prep for array loop
/* The basic loop is $concatArrays: [ $$value, [ one object ] ]
Since $reduce is passing us one object at time as $$this, almost always
we will see the second arg wrapped with [] to create an array of one. */
in: {$concatArrays: [ // start ee
'$$value',
/* Our first key-value overlay stage. But the overlay itself is a function;
remember composition! */
[ {$mergeObjects: [ // start dd
'$$this',
{lvl2: {$reduce: { // start cc
input: '$$this.lvl2', // setup lvl2 for overwrite
initialValue: [],
// Second loop on lvl2:
in: {$concatArrays: [ // start bb
'$$value',
/* Our second key-value overlay stage. This time, the overlay does not
set up a $reduce "loop" but instead performs the filter we seek: */
[ {$mergeObjects: [
'$$this',
{lvl3: {$filter: {
input: '$$this.lvl3',
as: 'qq',
cond:{$gt:['$$qq.c',2]}
}}
}
]}
] // list of 1
]} // close bb
}}
} // close cc
]}] // close dd incl array-of-one wrapper
]} // close ee
}} // close ff
}} // close addFields
]);
Automatic carryover:
{$concatArrays: [ '$$value', [ {$mergeObjects: [ '$$this', [ expr resolves to object ] ] }] ]}
Selective carryover:
{$concatArrays: [ '$$value', [ expr resolves to object ]]}
e.g. in the following, fields in $$this are dropped and only
f1 and f2 are carried through; note if f1 and/or
f2 did exist in $$this, they would be overwritten:
{$concatArrays: [ '$$value', [ {f1:v1,f2:v2} ] ]}
Here is a more step-by-step way to approach the solution above:
db.foo.aggregate([
{$addFields: {
lvl1: '$lvl1' // noop
}}
])
{$reduce: {input:'$ARR',initialValue:[],in:{$concatArrays:['$$value',[{$mergeObjects:['$$this', {} ]}]]}}}
db.foo.aggregate([
{$addFields: {
lvl1: {$reduce: {input:'$lvl1',initialValue:[],in:{$concatArrays:['$$value',[{$mergeObjects:['$$this', {} ]}]]}}}
}}
])
db.foo.aggregate([
{$addFields: {
lvl1: {$reduce: {input:'$lvl1',initialValue:[],in:{$concatArrays:['$$value',[{$mergeObjects:['$$this',
{lvl2: '$$this.lvl2'}
]}]]}}}
}}
])
db.foo.aggregate([
{$addFields: {
lvl1: {$reduce: {input:'$lvl1',initialValue:[],in:{$concatArrays:['$$value',[{$mergeObjects:['$$this',
{lvl2: {$reduce: {input:'$$this.lvl2',initialValue:[],in:{$concatArrays:['$$value',[{$mergeObjects:['$$this', {} ]}]]}}} }
]}]]}}}
}}
])
db.foo.aggregate([
{$addFields: {
lvl1: {$reduce: {input:'$lvl1',initialValue:[],in:{$concatArrays:['$$value',[{$mergeObjects:['$$this',
{lvl2: {$reduce: {input:'$$this.lvl2',initialValue:[],in:{$concatArrays:['$$value',[{$mergeObjects:['$$this',
{lvl3: '$$this.lvl3'}
]}]]}}} }
]}]]}}}
}}
]);
db.foo.aggregate([
{$addFields: {
lvl1: {$reduce: {input:'$lvl1',initialValue:[],in:{$concatArrays:['$$value',[{$mergeObjects:['$$this',
{lvl2: {$reduce: {input:'$$this.lvl2',initialValue:[],in:{$concatArrays:['$$value',[{$mergeObjects:['$$this',
{lvl3: {$filter: {
input: '$$this.lvl3',
as: 'qq',
cond:{$gt:['$$qq.c',2]}
}} }
]}]]}}} }
]}]]}}}
}}
]);
db.foo.aggregate([
{$addFields: {
_id:false,
lvl1: {$reduce:
{input:'$lvl1',
initialValue:[],
in:{$concatArrays:['$$value',[
{$mergeObjects:[
'$$this',
// We want to be able to not only get c > 2 but find the
// highest value of c at each lvl2. For this we need to "hang on"
// to the filtered list so that we can assign it AND then traverse
// it to find the max c. Remember you cannot use a variable you
// set in an object in the same "pass", e.g.
// {
// a: some list,
// b: {$size: '$a'} // nope
// }
//
// but you can use $let to set up a value in vars then use it more
// than once in 'in'.
//
{$let: {
vars: {tmp_l2: {$reduce: {input:'$$this.lvl2',initialValue:[],in:{$concatArrays:['$$value',[{$mergeObjects:['$$this',
{lvl3: {$filter: {
input: '$$this.lvl3',
as: 'qq',
cond:{$gt:['$$qq.c',2]}
}} }
]}]]} }} },
in: {
lvl2: '$$tmp_l2', // easy
// Need to create of max for inner lvl3, then the max of
// those for all of lvl2:
max_c: {$max: {$reduce: {input: '$$tmp_l2',
initialValue: [],
in:
// $$this is an object so we can dotpath to lvl3
// and apply one addition dotpath to
// extract an array of ONLY c:
{$concatArrays: ['$$value', [{$max: '$$this.lvl3.c'}]]}
}}
}
} // close 'in'
}}
]}
]]
}
}}
}}
]);
Like this? Dislike this? Let me know