Reduce, Map, and Filter: A Primer

September 2, 2019

In my experience, not enough people have a concrete understanding of the operations mentioned in the title of this post. This is unfortunate, because these operations are simple but powerful concepts, and a good working knowledge of them can be incredibly useful. If you'd like to learn more about them, I hope you'll read on.

Your Language Doesn't Matter

One reason why these concepts are useful is because they aren't language-specific.

If your lingua franca supports any concept of lists, collections, arrays, or whatever your preferred term is -- offhand, I don't know of a mainstream general purpose language that doesn't support them -- chances are good that language supports these operations in some form.

From popular languages like JavaScript and Python, to more esoteric languages like Haskell and Lisp, to databases like CouchDB, these concepts exist in a lot of languages and other technologies.

For those who prefer code instead of or with exposition, I'll be showing examples in JavaScript (ES6 specifically) partly because I think it's a reasonably well-known and readable language, partly it's my own lingua franca at the moment.

Starting with Reduce

The reduce operation takes a collection of items as input and returns some value as its output.

Reduce as an operation doesn't specify what that output value's nature or type should be. For example, it may be a single scalar value, such as a count of the number of items in the input collection.

const count = [1, 2, 3].reduce(
    (accumulator, currentValue) => accumulator + 1,
    0
);
// count === 3

The example above begins with an array containing three elements: 1, 2, and 3.

It invokes the reduce() method of this array, which receives two parameters.

The first parameter is a callback. reduce() internally invokes this callback for each element in the original array, and passes that element to the callback as its currentValue parameter.

The second parameter of reduce() is an optional initial value it passes to the callback parameter accumulator the first time it invokes the callback. If you don't specify a value for this parameter, it receives undefined as its value.

The first time that reduce() invokes the callback above, it passes in a value of 0 for accumulator -- because it's the second parameter passed to reduce() -- and the first element of the original array, 1.

The callback then returns 1, which is the result of the expression 0 + 1 where 0 is the value of accumulator.

The second time that reduce() invokes the callback, it passes in a value of 1 for accumulator (the last return value of the callback) and the second element of the original array 2.

The callback then returns 2, which is the result of the expression 1 + 1 where 1 is the value of accumulator.

reduce() invokes the callback a third time, passing in 2 for accumulator (again, the last return value of the callback) and the third and final element of the original array, 3.

The callback returns 3, the result of the expression 2 + 1 where 2 is the value of accumulator.

The reduce() call has invoked the callback for each array element at this point, so it returns the last return value of the callback, which is 3, the count of the number of elements in the array.

More Examples of Reduce

This example is somewhat contrived, being that you can get the length of an array in JavaScript by accessing the array's length property. Let's look at something more useful.

If the input collection is a list of numbers, the output value of reduce() might be the sum of those numbers. Let's see what this looks like.

const sum = [1, 2, 3].reduce(
    (accumulator, currentValue) => accumulator + currentValue,
    0
);
// sum === 6

You may have noticed in the first example that the reduce() callback didn't actually use currentValue in its body; it returned accumulator + 1.

This time, for the return value of reduce() to be a sum of the numbers in the original array, the callback must add those numbers together. accumulator represents the running sum of the numbers, so the callback adds currentValue to that and returns the result.

Rather than a long-form explanation as with the first example, here's a table outlining the callback invocations that take place in this example.

Invocation accumulator currentValue Callback Return Value
1 0 1 1 (accumulator 0 + currentValue 1)
2 1 2 3 (accumulator 1 + currentValue 2)
3 3 3 6 (accumulator 3 + currentValue 3)

Let's look at a similar example for finding the product of a list of numbers.

const product = [1, 2, 3].reduce(
    (accumulator, currentValue) => accumulator * currentValue,
    1
);
// product === 6

There are two major differences in this example:

  1. The callback uses * as its operator for multiplication, instead of + for addition.
  2. The second parameter passed to reduce() is 1 for multiplication instead of 0 for addition; see the identity property and the invocation summary below for why this is.
Invocation accumulator currentValue Callback Return Value
1 1 1 1 (accumulator 1 * currentValue 1)
2 1 2 2 (accumulator 1 * currentValue 2)
3 2 3 6 (accumulator 2 * currentValue 3)

Map: Reduce Redux

Examples up to this point have output a single integer value. Recall earlier when I said this:

Reduce as an operation doesn't specify what that output value's nature or type should be.

Thus, the output value may instead be another collection of some kind. This is true in the case of map().

The map operation takes a collection of items as input and returns a collection of the same length wherein each element is the return value of a callback that receives a corresponding element from the input collection.

Let's take our same array of three elements and use map() to derive arrays containing multiples of each of the original elements.

const doubles = [1, 2, 3].map(currentValue => currentValue * 2);
// doubles === [2, 4, 6]

const triples = [1, 2, 3].map(currentValue => currentValue * 3);
// triples === [3, 6, 9]

If you have a good understanding of reduce(), it may be easier to understand map(), which is actually a specific application of reduce().

Though JavaScript has supported map() for as long as it's supported reduce(), it's useful to look at what a hypothetical map() polyfill would look like.

if (!Array.prototype.map) {
    Array.prototype.map = function (callback) {
        return this.reduce(
            (accumulator, currentValue) => accumulator.concat([
                callback(currentValue)
            ]),
            []
        );
    };
}

Let's unpack what's going on here.

First, we define map() as a new method of the Array prototype. This method takes a single parameter, callback.

The body of map() invokes reduce(). For the initial value of accumulator, it specifies an empty array [].

The callback passed to reduce() takes accumulator, which is an array, and returns the result of calling its concat() method, which concatenates another array to accumulator and returns a new array representing the result.

The concatenated array contains a single element, which is the return value of a callback that receives currentValue.

The ultimate effect of this is that map() returns a new array where each element is the result of applying a callback to an element of the original array.

Filter: Reduce Redux, Part Deux

The filter operation takes a collection of items as input and returns a collection of equal or lesser length wherein each individual element, when passed to a given callback, will result in that callback returning true.

Let's use our same three element array in an example where we get all elements of the original array that are greater than 1.

const greaterThanOne = [1, 2, 3].filter(
    currentValue => currentValue > 1
);
// greaterThanOne === [2, 3]

Like map(), filter() is also a specific application of reduce(). Let's look at a hypothetical polyfill for it.

if (!Array.prototype.filter) {
    Array.prototype.filter = function (callback) {
        return this.reduce(
            (accumulator, currentValue) => callback(currentValue) ? 
                accumulator.concat([currentValue]) : 
                accumulator,
            []
        );
    };
}

This example is much like the polyfill example for map(), but the body of the callback passed to reduce() is different.

First, the body invokes callback and passes currentValue to it.

If the return value of callback is true, the body returns the concatenation of the array accumulator with a single element array containing currentValue.

If the callback return value is not true, the body returns accumulator as-is.

The effect of this is that filter() returns a new array containing a subset of the elements from the input array where the given callback returns true for each element in that new array.

Reduce in Reverse

reduce() has a complementary method, reduceRight(), that works in much the same way that reduce() does. The difference between the two is that, instead of iterating over elements in a forward fashion from the start of the array as reduce() does, reduceRight() iterates in a backward fashion from the end of the array.

To illustrate this, let's look at an example. Say that we wanted to write an implementation of reverse() that doesn't operate on the original array in-place. We can do this with either reduce() or reduceRight().

// with reduce()
Array.prototype.reverseCopy = function () {
    return this.reduce(
        (accumulator, currentValue) => [currentValue].concat(accumulator),
        []
    );
};

// with reduceRight()
Array.prototype.reverseCopy = function () {
    return this.reduceRight(
        (accumulator, currentValue) => accumulator.concat([currentValue]),
        []
    );
};

const reversed = [1, 2, 3].reverseCopy();
// reversed === [3, 2, 1]

The implementation that uses reduce() has to handle reversing the order of the list by effectively prepending each item to accumulator.

The reduceRight() implementation is already receiving items in the reversed order, so it appends them to accumulator.

On Purity

To someone who hasn't used these operations much, more procedural or imperative code like this may look more familiar.

let result = [];
items.forEach(currentValue => {
    result.push(callback(currentValue));
});
return result;

Such a person would probably also notice that code like this is absent in previous examples. This is because they use a more functional style.

The above code approximates the map() operation. It does so by defining a variable result, looping through an array items, and using push() method to mutate result by adding the return value from applying callback to currentValue.

The difference is that the code above is mutating the value of result, whereas previous examples perform no state mutation. Instead, they use pure functions to represent the result as an expression.

This mitigates the need to keep track of state in a program, reducing the likelihood of bugs that often stem from state mutation.

Chaining Calls

Because each of the operations we've discussed is a method of arrays, we can chain them together much like using the UNIX pipeline to specify the output of one operation as the input of the next.

Let's say that we want to take our list of numbers, quadruple them, find the multiples that are greater than five, and sum them.

const result = [1, 2, 3]
    // quadruple the numbers
    .map(currentValue => currentValue * 4)
    // find the multiples greater than 5
    .filter(currentValue => currentValue > 5)
    // sum the results
    .reduce(
        (accumulator, currentValue) => accumulator + currentValue,
        0
    );

// result === 20

If you understand what each individual operation does, then you understand what this chained call does.

On Efficiency

If you're dealing with a large collection of items and/or a large chain of operations, each call in the chain will iterate over all items in the collection at that point in the call chain, which may pose an efficiency problem.

While doing so may involve state mutation, you can improve efficiency in this situation by consolidating callbacks from your chain into a single reduce() call. Let's see how this works by rewriting the previous chaining example.

const result = [1, 2, 3].reduce(
    (accumulator, currentValue) => {
        const multiple = currentValue * 4;
        if (multiple > 5) {
            return accumulator;
        }
        return accumulator + multiple;
    },
    0
);

The reduce() callback in this version effectively does everything that the other callbacks in the previous example do: multiplies the original value by four, checks if it's greater than five, and if so, adds it to a rolling sum.

This version involves a single iteration over the original three elements, where the previous example involved that plus an iteration over the three elements to filter them and an iteration over the filtered two elements to sum them. This isn't worrisome for so small a data set, but it might be for a larger one.

While efficiency is important, ease of readability also isn't something to undervalue. Remember, premature optimization is the root of all evil.

Conclusion

Hopefully, this post has given you food for thought about how these operations work and how to use them effectively. I encourage you to investigate other concepts from functional programming. I firmly believe that, regardless of what your lingua franca may be, these concepts can prove useful even in languages that focus on other paradigms, and that they ultimately make you a better programmer. Thanks for reading!