Memory-Efficient Merging of One and Two-Fold Arrays

Merging a One-Fold Array and Two-Fold Array

Hello! For this post, we'll be tackling yet another technical challenge . But, this time, it will be a challenge centered around arrays and efficient memory usage.

To start, we'll define our technical challenge description and criteria below.

Write a function that accepts two arguments. Each argument is an array of numbers.
The second array is always twice the size of the first array. Any element in the second array that is located at an index greater than the last index of the first array is simply a placeholder.
The return value of that function should be a sorted array that contains the numbers from both arrays - i.e. a merging of both provided arrays.
The function should optimize for memory usage during its execution.

An example pair of arrays provided to the function would be

[10, 9, 8, 7]

[1, 3, 4, 6, 0, 0, 0, 0]

and the resulting merged array would be

[1, 3, 4, 6, 7, 8, 9, 10]

Knowing the above, we can get started on our approach and code. We'll use good ol' strongly-typed TypeScript to do it, so let's get to it!

The Simple and Not-So-Memory-Optimized Approach

Through a quick read of the technical challenge requirements, and the simplistic example we're given, we can come to understand that the challenge can be solved through existing array utilities.

We know that the second array is always twice the size of the first array
We also know that anything past the Nth index of the second array, where N is the last index of the first array, is simply throwaway placeholder that we don't need to keep track of
We know we need to merge these two arrays, sort the resulting array, and return that

We'll implement pretty quick using the following code

function mergeArraysNaive(
    firstArray: Array<number>,
    secondArray: Array<number>
) {
    // Our approach will be a simplistic approach of
    // 1. Identify the length of the first array, let's say N
    // 2. From the second array, knowing anything past its own N-index
    //    is placeholder that can be thrown away, splice out those items
    // 3. Merge those two portion of arrays
    // 4. Sort the resulting merged array and return that

    const endpoint = firstArray.length;
    const portionOfSecondArray = secondArray.splice(0, endpoint);

    // Merge the spliced off section of the second array onto the
    // first array, and then just do a simple numeric sort
    const mergedAndSortedArray = firstArray
        .concat(portionOfSecondArray)
        .sort((numberOne, numberTwo) => numberOne - numberTwo);
    console.log(`
        Merged and Sorted Array
        ${mergedAndSortedArray}
    `);

    return mergedAndSortedArray;
}

// The below outputs
// Merged and Sorted Array
// 1,3,4,6,7,8,9,10
mergeArraysNaive([10, 9, 8, 7], [1, 3, 4, 6, 0, 0, 0, 0]);

This approach is straightforward and works. We're identifying the length of the first array, and using that to splice off the portion of the second array from its 0th index through to the Nth index (exclusive). We're then using firstArray.concat(...) to add that spliced off portion from the second array onto the first array.

We do a final call to .sort(...) to sort our merged array in ascending order and we've achieved the requirements of the technical challenge. But there's a catch with what we've done here.

Our above implementation is most definitely not optimized for memory usage. There are two memory implications with our naive approach above

Unused placeholder memory usage
We are splicing out the 0th to Nth elements of the second array, which in itself is fine. But we are still left with the remaining placeholder elements. For a minimal example with our arrays above, the unused placeholders isn't a problem.
However, when we're at the magnitude of a second array that has 1,000,000 elements, then that means we have over 500,000 unused elements taking up space in memory. At that scale, such unused memory does become a problem.
Allocation of new memory
We are also allocating completely new memory when we're doing our firstArray.concat(portionOfSecondArray) call.
Under the hood, that array concatenation call is causing our program to allocate new empty slots in our first array to be able to place each item from portionOfSecondArray into each of those empty slots. This is no bueno because when we're on the magnitude of a half-a-million elements in that second array portion, we'd be allocating an additional half-a-million empty slots into our first array.

Knowing these two problems above, our initial naive implementation definitely has room for optimization of memory usage. Let's see how we'll address these issues in the next section.

Merging with an Memory-Optimized Approach

Our main focus as part of this refactor and optimization of our naive implementation of array merging will be mitigating the allocation of new memory.

When we're dealing with arrays that contain millions of items, ensuring our logic and code optimize for such scale is critical to maintaining solid application performance, and helping prevent any out-of-memory problems during real-world usage of our code.

A couple facts to call back to regarding our technical challenge requirements is that

The second array is always twice the size of the first array
Each item in the second array located at an index past Nth index is simply placeholder values that can be thrown away, per se. In this case, Nth index is the last index of the first array

Knowing this, there's quite the glaring optimization we can think about and seek to implement. That optimization is, rather than take the necessary items from the second array and concatenating them onto our first array, we can most definitely do the reverse to yield a more memory-optimized solution.

How does this result in a more memory-optimized solution, one might ask? Well, it has to do with existing allocated memory.

In our naive approach, we understand that our concatenation call to join the items from the second array onto the first array was causing our program to allocate new memory onto the first array. This is a needed operation so that each item from the second array could be placed into those allocated memory slots of the first array.

However, we know that the second array is already twice the size of the first array, and it has placeholder elements in it, at each index past the Nth index. This means we should technically be able to overwrite each of those placeholder elements in the second array, with the items from the first array.

Because the second array size is twice the size of the first array, then we know everything in the first array should be able to be stored within the second array. Subsequently, this would require no new memory allocation, since all of this space in memory has already been allocated for the second array when our function was called.

Let's go ahead and implement this in our memory-optimized solution below.

function mergeArraysOptimized(
    firstArray: Array<number>,
    secondArray: Array<number>
) {
    // Our approach will be a simplistic approach of
    // 1. Identify the length of the first array, let's say N
    // 2. From the second array, knowing anything past its own N-index
    //    is placeholder that is not needed information
    // 3. We'll iterate the first array and store each item from it into
    //    an index of the second array. The index to store into will be calculated
    //    as N + currentIndex and will look something like secondArray[N + currentIndex] = itemFromFirstArray
    // 4. Sort the resulting merged array and return that

    const endpoint = firstArray.length;

    // Iterate the elements of the first array, gathering each of its items
    // and storing the item at an index offset by its own length
    // We're able to do this because we know anything in the second array past
    // the index Nth is placeholders that are not needed, and we can essentially overwrite
    for (
        let currentIndex = 0;
        currentIndex < firstArray.length;
        currentIndex++
    ) {
        const itemFromFirstArray = firstArray[currentIndex];
        const indexToPlaceInto = currentIndex + endpoint;

        // Store into this index in the second array
        secondArray[indexToPlaceInto] = itemFromFirstArray;
    }

    // By this point, we've replaced each placeholder from the second array
    // with each element from the first array. All we have to do now is sort the second array
    const mergedAndSortedArray = secondArray.sort(
        (numberOne, numberTwo) => numberOne - numberTwo
    );
    console.log(`
        Merged and Sorted Array Optimized
        ${mergedAndSortedArray}
  `);

    return mergedAndSortedArray;
}

We'll run both our naive-implementation and our optimized-implementation with the same input and validate that both approaches yield the same result. See below

// The below outputs
// Merged and Sorted Array
// 1,3,4,6,7,8,9,10
const naiveArray = mergeArraysNaive([10, 9, 8, 7], [1, 3, 4, 6, 0, 0, 0, 0]);

// The below outputs
// Merged and Sorted Array Optimized
// 1,3,4,6,7,8,9,10
const optimizedArray = mergeArraysOptimized(
    [10, 9, 8, 7],
    [1, 3, 4, 6, 0, 0, 0, 0]
);

console.log(
    `Both arrays are equal - ${naiveArray.sort().toString() === optimizedArray.sort().toString()}`
);

// Which outputs the below

Merged and Sorted Array
1,3,4,6,7,8,9,10
    

Merged and Sorted Array Optimized
1,3,4,6,7,8,9,10
  
Both arrays are equal - true

And that does it. We've completed our technical challenge question with both a naive implementation that does not take memory allocation and usage into account, and an optimized solution, that mitigates the allocation of net-new memory, and incorporates existing allocated memory to do so.

See the attached code for the GitHub Gist that contains both implementations!

Merging a One-Fold Array and Two-Fold Array

The Simple and Not-So-Memory-Optimized Approach

Merging with an Memory-Optimized Approach

zerochass