August 21, 2024

Is Pass By Reference really that fast?

You may find this article silly. It presents what I learned as a chain of thought: making hypotheses and trying to prove them wrong.

More often than not, I hear people say that when calling a function with some arguments, Pass By Reference is a lot faster than Pass By Value. The common logical (and intuitive) explanation is this: when you use Pass By Value, the argument is copied from one place to another, and that adds to the function call overhead.

While this argument makes sense, there are more caveats involved than we think. Never underestimate the power of modern compilers.

For starters, try to predict which one of these functions will be faster. Assume that array size is huge.

#include <vector>

// Function using pass by value
std::vector<int> passByValue(std::vector<int> arr) {
    return arr;
}

// Function using pass by reference
std::vector<int> passByReference(std::vector<int>& arr) {
    return arr;
}

If your answer is passByReference, this blog is for you. If not, still give it a read; I'll try to make it worthwhile for you advanced folks too.

Here are the actual results -

Pass by value time: 2580.4 ms
Pass by reference time: 2604.08 ms
Pass by reference is 0.990908x faster

Average performance over 100 runs -
Pass by value is 1.00002x faster.

The answer: neither of them is significantly faster than the other. What exactly happened here? Both functions end up doing the same amount of expensive work, which is exactly one deep copy of the vector.

In passByValue, the caller's vector is copied into the parameter arr when the function is called. On return, arr is moved into the result, which for a std::vector only hands over the pointer to the buffer instead of copying the elements. A function parameter is not eligible for Return Value Optimization (RVO), but the language treats it as movable in a return statement, so the outcome is almost as cheap.

In passByReference, nothing is copied on the way in, but return arr has to copy the referenced vector into the result, because the caller still owns the original.

What RVO (more generally, copy elision) does take care of is the returned object itself: it is constructed directly in the place where the caller will store it, so there is no additional move/copy constructor call on the way out. I have tried to explain what is happening here, but you are better off googling copy elision and reading from a better source.

Now you may ask: what if the argument is not returned as is? Let's say some operation is done on it, as most real-world code does. What would the result be then? Let's take this example -

const int ARRAY_SIZE = 100000;
const int ITERATIONS = 1000;

// Function using pass by value
long long passByValue(std::vector<int> arr) {
    long long sum = 0;
    for (int i = 0; i < ITERATIONS; ++i) {
        for (size_t j = 0; j < arr.size(); ++j) {
            sum += arr[j];
        }
    }
    return sum;
}

// Function using pass by reference
long long passByReference(const std::vector<int>& arr) {
    long long sum = 0;
    for (int i = 0; i < ITERATIONS; ++i) {
        for (size_t j = 0; j < arr.size(); ++j) {
            sum += arr[j];
        }
    }
    return sum;
}

This time, we are doing a massive number of sum operations inside the functions, and we repeat them many times, the idea being to amplify any time difference.

So lock in your answer - which one will be faster?

The answer is the same - neither of them is significantly faster.

Pass by value time: 368.079 ms
Pass by reference time: 371.043 ms
Ratio (value/reference): 0.992012

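Now the obvious question - why? Passing by value still copies the vector once per call here; std::vector has no copy-on-write, so in general a read-only function gets its own copy just like any other. But that is a single copy of 100000 ints, while the loop then reads those 100000 ints 1000 times. The copy costs roughly one extra pass over the data next to the thousand passes both versions make, so it disappears into the noise. And since all the operations are READ operations with not a single WRITE, the two loops do identical work once the call has started.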

One slight digression here. If we were to be naive, we could ask: since Pass By Value gives the function its own local copy of the object, shouldn't it save some time compared to Pass By Reference, which has to "look up" the value through the reference? Think about this and how it could be demonstrated. If I figure it out before you, I'll let you know! Now let's move on.

Now the next example is obvious - let's do a plain and simple write operation and see if our hypotheses are true -

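// Function using pass by value: doubles its own copy and returns it
std::vector<int> passByValue(std::vector<int> arr) {
    for (int& num : arr) {
        num *= 2;
    }
    return arr;
}

// Function using pass by reference: doubles the caller's vector in place
void passByReference(std::vector<int>& arr) {
    for (int& num : arr) {
        num *= 2;
    }
}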

No need to lock in your answer this time! Your answer is probably correct.

Pass by value time: 1127.84 ms
Pass by reference time: 891.393 ms
Pass by reference is 1.26525x faster

Now, as expected, we are doing a WRITE operation. In the Pass By Value case the function HAS TO work on its own local copy, so that it doesn't end up modifying the caller's original object. Pass By Reference modifies the original in place, and this time there is not enough other work to hide the cost of that copy.

You can still see that it's not a massive difference, but maybe it's just not well amplified here. Feel free to experiment with this.

Conclusion -

While Pass By Value and Pass By Reference do have performance trade-offs in some cases, I suggest we do not look at the choice primarily as a performance optimization. Pass By Value and Pass By Ref are features that let the programmer choose whether a function may modify the original value or not, along with some other quirks. Modern compilers are very smart and will mostly take care of the performance part in cases like these.
