Updated 10-Jan-2024 Original post 15-Dec-2017 |
Like this? Dislike this? Let me know |
A collection (pun!) of snippets that I find useful. Some of these are "easier" to deal with if you simply $unwind the arrays but unwinding big arrays in big objects can become very expensive. In general, be kind: don't unwind. Use $map,$reduce, and $filter whenever you can!
{"items": [ [0, 1, 2], [0], [2,5] ]}
db.foo.aggregate([ // This is unfortunately not as straightforward as simply doing // {newList: {$concatArrays: "$items"}} // because $items is a single array of arrays, NOT a vararg list of // arrays that is expected in $concatArrays. In other words, // {$concatArrays: "$items"}} // is not the same thing as // {$concatArrays: [ arr1, arr2, arr3 ]} // from a call signature perspective. // // Solution: We will "walk" $items with $reduce and call $concatArrays over and // over to build the list. We use $items as the input and overwrite it for output. // This is a clever use of $reduce by treating an array as the output value instead of a simple scalar: {$project: {items: {$reduce: { input: "$items", initialValue:[], in:{$concatArrays: [ "$$value", "$$this"]} }} }} ]); { "_id" : 0, "items" : [ 0, 1, 2, 0, 2, 5 ] }
db.foo.aggregate([ {$project: { items: {"$setUnion": [ {$reduce: { input: "$items", initialValue:[], in:{$concatArrays: [ "$$value", "$$this"]} }} ] } }} ]); { "_id" : 0, "items" : [ 0, 1, 2, 5 ] }
{_id:0, items: [ {a:0}, {a:0}, {b:7}, {a:1},{b:0},{b:7} ] }
matches = []; for(i = 0; i < items.size(); i++) { if(items[i]['b'] == 7) { matches.push(i); } }
db.foo.aggregate([ {$project: { XX: {$reduce: { // for $$this in range(0, {$size:"$items"}) input: {$range:[0,{$size:"$items"}]}, initialValue: [], // Clever use of array as value. in: {$cond: { // Use getField as of v5.0: if: {$eq:[7, {$getField: {input: {$arrayElemAt:['$items','$$this']}, field: 'b'}} ]}, then: {$concatArrays: [ '$$value', ['$$this'] ]}, else: '$$value' // nothing to do; return $$value unchanged }} }} }} ]); { XX: [ 2, 5 ] }
{_id:0, lvl1: [ {a:1, lvl2: [ {b:1, lvl3: [ {c:1, z:'A'}, {c:2, z:'B'}, {c:3, z:'C'}, {c:4, z:'D'} ] }, {b:2, lvl3: [ {c:1, z:'W'}, {c:2, z:'X'}, {c:3, z:'Y'}, {c:4, z:'Z'} ] } ]}, {a:2, lvl2: [ {b:3, lvl3: [ {c:0, z:'A'}, {c:0, z:'B'}, {c:0, z:'C'}, {c:4, z:'D'} ] }, {b:4, lvl3: [ {c:4, z:'W'}, {c:3, z:'X'}, {c:2, z:'Y'}, {c:1, z:'Z'} ] } ]} ] }
There is of course the initial urge to flatten everything out with `$unwind`:
db.foo.aggregate([ {$unwind: '$lvl1'}, {$unwind: '$lvl1.lvl2'}, {$unwind: '$lvl1.lvl2.lvl3'}, {$match: {'lvl1.lvl2.lvl3.c':{$gt:2}}}, // ... now put it all back together with $group.... ]);
A better approach is to have a cascade of $reduce operators to "dig through" the hierarchy and perform a $filter at the end. Conceptually, we want to do this (pseudocode):
for l1 in lvl: for l2 in l1.lvl2: l2['lvl3'] = filter(l2['lvl3'], "c > 2")
db.foo.aggregate([ {$addFields: { // $addFields to overwrite lvl1; use $project if you wishh lvl1: {$reduce: { // set lvl1... open ff input: '$lvl1', // ...from lvl, i.e. overwrite initialValue: [], // prep for array loop /* The basic loop is $concatArrays: [ $$value, [ one object ] ] Since $reduce is passing us one object at time as $$this, almost always we will see the second arg wrapped with [] to create an array of one. */ in: {$concatArrays: [ // start ee '$$value', /* Our first key-value overlay stage. But the overlay itself is a function; remember composition! */ [ {$mergeObjects: [ // start dd '$$this', {lvl2: {$reduce: { // start cc input: '$$this.lvl2', // setup lvl2 for overwrite initialValue: [], // Second loop on lvl2: in: {$concatArrays: [ // start bb '$$value', /* Our second key-value overlay stage. This time, the overlay does not set up a $reduce "loop" but instead performs the filter we seek: */ [ {$mergeObjects: [ '$$this', {lvl3: {$filter: { input: '$$this.lvl3', as: 'qq', cond:{$gt:['$$qq.c',2]} }} } ]} ] // list of 1 ]} // close bb }} } // close cc ]}] // close dd incl array-of-one wrapper ]} // close ee }} // close ff }} // close addFields ]);
Automatic carryover: {$concatArrays: [ '$$value', [ {$mergeObjects: [ '$$this', [ expr resolves to object ] ] }] ]} Selective carryover: {$concatArrays: [ '$$value', [ expr resolves to object ]]} e.g. in the following, fields in $$this are dropped and only f1 and f2 are carried through; note if f1 and/or f2 did exist in $$this, they would be overwritten: {$concatArrays: [ '$$value', [ {f1:v1,f2:v2} ] ]}
Here is a more step-by-step way to approach the solution above:
db.foo.aggregate([ {$addFields: { lvl1: '$lvl1' // noop }} ])
{$reduce: {input:'$ARR',initialValue:[],in:{$concatArrays:['$$value',[{$mergeObjects:['$$this', {} ]}]]}}}
db.foo.aggregate([ {$addFields: { lvl1: {$reduce: {input:'$lvl1',initialValue:[],in:{$concatArrays:['$$value',[{$mergeObjects:['$$this', {} ]}]]}}} }} ])
db.foo.aggregate([ {$addFields: { lvl1: {$reduce: {input:'$lvl1',initialValue:[],in:{$concatArrays:['$$value',[{$mergeObjects:['$$this', {lvl2: '$$this.lvl2'} ]}]]}}} }} ])
db.foo.aggregate([ {$addFields: { lvl1: {$reduce: {input:'$lvl1',initialValue:[],in:{$concatArrays:['$$value',[{$mergeObjects:['$$this', {lvl2: {$reduce: {input:'$$this.lvl2',initialValue:[],in:{$concatArrays:['$$value',[{$mergeObjects:['$$this', {} ]}]]}}} } ]}]]}}} }} ])
db.foo.aggregate([ {$addFields: { lvl1: {$reduce: {input:'$lvl1',initialValue:[],in:{$concatArrays:['$$value',[{$mergeObjects:['$$this', {lvl2: {$reduce: {input:'$$this.lvl2',initialValue:[],in:{$concatArrays:['$$value',[{$mergeObjects:['$$this', {lvl3: '$$this.lvl3'} ]}]]}}} } ]}]]}}} }} ]);
db.foo.aggregate([ {$addFields: { lvl1: {$reduce: {input:'$lvl1',initialValue:[],in:{$concatArrays:['$$value',[{$mergeObjects:['$$this', {lvl2: {$reduce: {input:'$$this.lvl2',initialValue:[],in:{$concatArrays:['$$value',[{$mergeObjects:['$$this', {lvl3: {$filter: { input: '$$this.lvl3', as: 'qq', cond:{$gt:['$$qq.c',2]} }} } ]}]]}}} } ]}]]}}} }} ]);
db.foo.aggregate([ {$addFields: { _id:false, lvl1: {$reduce: {input:'$lvl1', initialValue:[], in:{$concatArrays:['$$value',[ {$mergeObjects:[ '$$this', // We want to be able to not only get c > 2 but find the // highest value of c at each lvl2. For this we need to "hang on" // to the filtered list so that we can assign it AND then traverse // it to find the max c. Remember you cannot use a variable you // set in an object in the same "pass", e.g. // { // a: some list, // b: {$size: '$a'} // nope // } // // but you can use $let to set up a value in vars then use it more // than once in 'in'. // {$let: { vars: {tmp_l2: {$reduce: {input:'$$this.lvl2',initialValue:[],in:{$concatArrays:['$$value',[{$mergeObjects:['$$this', {lvl3: {$filter: { input: '$$this.lvl3', as: 'qq', cond:{$gt:['$$qq.c',2]} }} } ]}]]} }} }, in: { lvl2: '$$tmp_l2', // easy // Need to create of max for inner lvl3, then the max of // those for all of lvl2: max_c: {$max: {$reduce: {input: '$$tmp_l2', initialValue: [], in: // $$this is an object so we can dotpath to lvl3 // and apply one addition dotpath to // extract an array of ONLY c: {$concatArrays: ['$$value', [{$max: '$$this.lvl3.c'}]]} }} } } // close 'in' }} ]} ]] } }} }} ]);
Like this? Dislike this? Let me know