
Practical C++ value categories

Practical reasons to know and understand C++ value categories.

Definition of value categories

Before the C++11 standard was accepted, value categories were quite simple to grasp: an expression was either an lvalue or an rvalue. There weren’t many reasons to care about them in day-to-day coding. I can’t remember pondering much over whether what I wrote would work correctly with both lvalues and rvalues. Things have changed since then and now we have not two but five value categories. Well, there are really three primary categories (lvalue, xvalue, prvalue) and two mixed ones (glvalue, which covers lvalues and xvalues, and rvalue, which covers xvalues and prvalues), which are best presented by the following figure:

[Figure: value categories]

The first thing to notice is that a value category is a characteristic of an expression. It’s not a characteristic tied to a particular variable or a function. That’s the first, key thing to understand. The other characteristic of an expression is its type. Every expression in C++ has both.
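To make "a property of an expression" a bit more tangible, here is a minimal sketch that asks the compiler for the category of a few expressions. It relies on the fact that decltype((expr)), with the extra parentheses, yields T& for an lvalue, T&& for an xvalue and plain T for a prvalue. The IS_LVALUE, IS_XVALUE and IS_PRVALUE macros and the value_category_checks() function are names made up for this example, not anything from the standard library.

```cpp
#include <string>
#include <type_traits>
#include <utility>

// decltype((expr)) encodes the value category of expr in the resulting type:
// T& for an lvalue, T&& for an xvalue, plain T for a prvalue.
// The macro names are invented for this sketch.
#define IS_LVALUE(expr)  (std::is_lvalue_reference<decltype((expr))>::value)
#define IS_XVALUE(expr)  (std::is_rvalue_reference<decltype((expr))>::value)
#define IS_PRVALUE(expr) (!std::is_reference<decltype((expr))>::value)

void value_category_checks() {
    int n = 0;
    int&& r = 1;   // the TYPE of r is "rvalue reference to int"...
    std::string s;

    static_assert(IS_LVALUE(n), "a variable name is an lvalue");
    static_assert(IS_LVALUE(r), "...but the expression r is still an lvalue");
    static_assert(IS_PRVALUE(n + 1), "arithmetic yields a prvalue");
    static_assert(IS_PRVALUE(42), "so do most literals");
    static_assert(IS_LVALUE("text"), "string literals are the exception");
    static_assert(IS_LVALUE(++n), "pre-increment yields the object itself");
    static_assert(IS_PRVALUE(n++), "post-increment yields a copy");
    static_assert(IS_XVALUE(std::move(s)), "std::move produces an xvalue");
    static_assert(IS_PRVALUE(std::string{}), "a temporary is a prvalue");
}
```

Nothing here runs at all: every check is resolved at compile time, which is a decent hint that value categories live in the type system and not in the objects themselves.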

OK, but why am I writing this? It’s been about 5 years since C++11 was released. The internet is full of articles pertaining to value categories and a bunch of questions on StackOverflow. The problem is most of them are incomplete, inaccurate, misguided or just plain wrong. This isn’t surprising as there’s no single, simple definition of each category. They’re rather defined on a case-by-case basis. Maybe that’s why so many people still ask about them and have problems differentiating xvalues from lvalues. So, instead of writing another weak post on the topic, I’ll give you a list of high-quality articles, written and peer-reviewed (mine aren’t) by experts:

Most of them, as you can see, concern rvalue references. That’s not a coincidence, as rvalue references were introduced to enable move semantics and that required refined value categories that will make move operations sane and safe.

The above list is a lot of reading and the goal of this post is to explain WHY you should read those articles and convince you that it’s worth the effort. At least, if you want to write correct C++. OK, let’s move on to the examples. I’ll start with the basics and then move to the lesser-known stuff.

Function overloading

First, something simple. I think the most important thing is that C++11 lets lvalues and rvalues take part in function overloading. The distinction is made by the reference type of the parameter: an lvalue reference (&) or an rvalue reference (&&).

#include <iostream>

void f(int &&) { std::cout << "&&\n"; }
void f(int &)  { std::cout << "&\n"; }
    
int main() {
    auto n = 10;
    f(n);     // identifier of a variable is almost always an lvalue
    f(42);    // 42 is a prvalue (like most literals)
    f(int{}); // int{} is a prvalue
}

The output of this program is this:

&
&&
&&

Maybe that’s not spectacular but it’s very important. This is the basis for move semantics. When you take a look at a move constructor and a move assignment operator, that’s all there is to it. No new keywords, no special symbols to distinguish them.

struct A {
   A()          { std::cout << "ctor"; }
   A(const A &) { std::cout << "copy-ctor"; }
   A(A &&)      { std::cout << "move-ctor"; }
   A & operator=(const A & ) { std::cout << "copy assignment"; return *this; }
   A & operator=(A &&)       { std::cout << "move assignment"; return *this; }
   ~A() { std::cout << "dtor"; }
};
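To see which of these special member functions is picked for which kind of expression, here is a small sketch. Tracer is a hypothetical type invented for this example; instead of printing, it remembers the last operation that touched it so the result can be checked.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical type that records which special member function ran last.
struct Tracer {
    std::string last = "ctor";
    Tracer() = default;
    Tracer(const Tracer&) : last("copy-ctor") {}
    Tracer(Tracer&&) noexcept : last("move-ctor") {}
    Tracer& operator=(const Tracer&) { last = "copy assignment"; return *this; }
    Tracer& operator=(Tracer&&) noexcept { last = "move assignment"; return *this; }
};

void check_special_members() {
    Tracer a;
    Tracer b{a};             // a is an lvalue: copy constructor
    Tracer c{std::move(a)};  // std::move(a) is an xvalue: move constructor
    assert(b.last == "copy-ctor");
    assert(c.last == "move-ctor");

    b = c;                   // lvalue on the right: copy assignment
    assert(b.last == "copy assignment");
    b = Tracer{};            // prvalue on the right: move assignment
    assert(b.last == "move assignment");
}
```

The selection is ordinary overload resolution on the argument's value category, exactly as with f(int &) and f(int &&) above.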

Efficiency and safety

Move semantics was introduced to, err, move things! The problem is, that’s not always safe, so it can’t be implicit in all scenarios. If it were, we would have a problem:

void f(A && a) { ... }
void g(A & a) {
    a.process(42);
    f(a);            // passing lvalue
    a.process(1729); // oops, is it safe?
}

A function taking an rvalue reference should be able to assume that it has a unique reference to a passed object. From the above code, it’s obvious that’s not the case. Functions taking arguments by rvalue references can modify them in any way that will leave them in “valid but unspecified state”. That means a.process(1729) is only safe after move if the function has no preconditions regarding the state of the object.
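Here is a minimal sketch of what "no preconditions" means in practice, using std::string as the moved-from object. The store() and reuse_after_move() functions are hypothetical helpers written for this example.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical sink that takes over the contents of the string passed to it.
void store(std::vector<std::string>& sink, std::string&& s) {
    sink.push_back(std::move(s));
}

std::string reuse_after_move(std::vector<std::string>& sink) {
    std::string buffer = "first message";
    store(sink, std::move(buffer));
    // buffer is now valid but unspecified: reading its contents, or assuming
    // that it is empty, would be a mistake.
    buffer.clear();             // no precondition, always allowed
    buffer = "second message";  // assignment has no precondition either
    buffer += "!";              // from here on the contents are known again
    return buffer;
}

void check_reuse() {
    std::vector<std::string> sink;
    assert(reuse_after_move(sink) == "second message!");
    assert(sink.size() == 1 && sink[0] == "first message");
}
```

Operations such as clear() or assignment are fine on a moved-from standard library object, while something like front() or operator[] has preconditions and is not.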

Implicit move of lvalues is dangerous and is forbidden by the standard. But that doesn’t mean moving lvalues makes no sense. What if we want to pass a vector we no longer care about to another function? That’s what std::move() was created for:

void g(std::vector<int> arg) { ... }
int f(int n) {
    auto v = std::vector<int>(n); // n elements; std::vector<int>{n} would hold a single element equal to n
    do_stuff(v);
    auto coefficient = std::accumulate(std::begin(v), std::end(v), 0);
    g(std::move(v));
    return coefficient;
}

In this case, we don’t care anymore about the state of v after we pass it to g(). But it’s an lvalue so we have to be explicit that we know what we’re doing. We know that v is potentially big and we want to move it, thus making the code more efficient. std::move() doesn’t really move anything, it just casts an expression to an xvalue, which is a kind of rvalue. This way arg in g(std::vector<int> arg) is move-constructed. Applying std::move when necessary requires at least a basic understanding of value categories.
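The claim that std::move() is only a cast can be checked directly. Below is a sketch with a hand-written equivalent; my_move(), consume() and cast_then_move() are hypothetical names used only here, and my_move() is an illustration of the idea, not a copy of any particular standard library implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>
#include <utility>
#include <vector>

// Hand-written equivalent of std::move: nothing but a static_cast.
template <typename T>
constexpr typename std::remove_reference<T>::type&& my_move(T&& t) noexcept {
    return static_cast<typename std::remove_reference<T>::type&&>(t);
}

// Takes its argument by value, so an rvalue argument is move-constructed into arg.
std::size_t consume(std::vector<int> arg) { return arg.size(); }

std::size_t cast_then_move() {
    std::vector<int> v{1, 2, 3};

    static_assert(std::is_same<decltype(std::move(v)), std::vector<int>&&>::value,
                  "std::move only changes the type and category of the expression");
    static_assert(std::is_same<decltype(my_move(v)), std::vector<int>&&>::value,
                  "and so does the hand-written cast");

    std::vector<int>&& ref = std::move(v);  // a cast and a reference binding, nothing moved
    assert(&ref == &v);
    assert(v.size() == 3);

    return consume(std::move(v));           // the move happens here, when arg is constructed
}
```

The actual move is performed by the move constructor of the parameter, not by std::move() itself.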

More concise code

The fact that we can now distinguish between lvalues and rvalues lets us write more concise code. For example, working with temporary streams is now possible without extra effort:

std::string they, make, what;
std::stringstream{"Cats make chaos"} >> they >> make >> what;
std::ofstream{"note.txt"} << they << make << what;

That’s not a huge improvement, since previously we would have to declare variables to get an lvalue or do other tricks. But code without tricks is better code, right?

Lifetime extension

In C++98 the lifetime of a temporary object can be extended by binding it to a const &. What about binding it to an rvalue reference?

class T{};
const T & lref = T{};         // (1)
T && rref = T{};              // (2)
T && mrref = std::move(T{});  // (3)

(1) and (2) are the simplest examples of temporary lifetime extension. In this context T{} is a prvalue, which is a kind of rvalue. Now, what about (3)? As I said earlier, std::move returns an xvalue. By looking at the code we know that we passed a temporary to std::move() and we might expect that its lifetime is extended. That’s not the case, though. Lifetime extension only applies when a reference binds directly to the temporary. Here mrref binds to the reference returned by std::move(), and since an xvalue may just as well be a cast lvalue, nothing is extended automatically. T{} is destroyed at the end of the full expression and mrref is a dangling reference.
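The three cases can be observed with a type that counts its live instances. This is only a sketch: Probe and the three functions are hypothetical names, and the dangling reference in the last function is deliberately never used (your compiler may still warn about it, which is rather the point).

```cpp
#include <cassert>
#include <utility>

// Hypothetical probe that counts how many instances are currently alive.
struct Probe {
    static int alive;
    Probe() { ++alive; }
    Probe(const Probe&) { ++alive; }
    ~Probe() { --alive; }
};
int Probe::alive = 0;

int alive_while_bound_to_const_ref() {
    const Probe& lref = Probe{};        // (1) lifetime extended to that of lref
    (void)lref;
    return Probe::alive;
}

int alive_while_bound_to_rvalue_ref() {
    Probe&& rref = Probe{};             // (2) lifetime extended to that of rref
    (void)rref;
    return Probe::alive;
}

int alive_after_binding_move_result() {
    Probe&& mrref = std::move(Probe{}); // (3) temporary dies at the semicolon
    // mrref dangles from here on and must not be touched.
    return Probe::alive;
}

void check_lifetimes() {
    assert(alive_while_bound_to_const_ref() == 1);
    assert(alive_while_bound_to_rvalue_ref() == 1);
    assert(alive_after_binding_move_result() == 0);
}
```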

Returning from functions

Rvalue references can confuse newcomers. In C++98 returning a reference to a non-static local object was one way to create a dangling reference. How about returning an rvalue reference?

class T{};
T && f() {
    T t;
    ...
    return std::move(t);
}

T g() {
    T t;
    ...
    return std::move(t);
}

The f() function might be written out of confusion about what std::move() really does. We return an rvalue reference to a local object, which is destroyed at the end of the scope. So, it’s the same story as with lvalue references: a dangling reference.

In the function g() we return by value but we also try to ‘help’ the compiler and tell it to move. That’s bad for several reasons. Although t is an lvalue, the standard says that:

“A copy or move operation associated with a return statement may be elided or considered as an rvalue for the purpose of overload resolution in selecting a constructor.”
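A rough way to watch this rule at work is to count constructor calls. In the sketch below Counted, make_plain() and make_with_move() are hypothetical names. Since (N)RVO is allowed but not required, the only things asserted are those that hold either way: a plain return never copies, and the std::move version always performs at least one move.

```cpp
#include <cassert>
#include <utility>

// Hypothetical type that counts copy and move constructions.
struct Counted {
    static int copies;
    static int moves;
    Counted() = default;
    Counted(const Counted&) { ++copies; }
    Counted(Counted&&) noexcept { ++moves; }
};
int Counted::copies = 0;
int Counted::moves = 0;

Counted make_plain() {
    Counted t;
    return t;             // elided (NRVO) or, failing that, treated as an rvalue and moved
}

Counted make_with_move() {
    Counted t;
    return std::move(t);  // not a plain name any more: no elision, a move is forced
}

void check_returns() {
    Counted::copies = Counted::moves = 0;
    Counted a = make_plain();
    assert(Counted::copies == 0);  // never a copy; moves may be 0 if NRVO kicked in

    Counted::copies = Counted::moves = 0;
    Counted b = make_with_move();
    assert(Counted::copies == 0);
    assert(Counted::moves >= 1);   // the move constructor had to run
    (void)a;
    (void)b;
}
```

Some compilers flag the second function with a "pessimizing move" warning.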

There’s really no point in “helping” a compiler this way. It’s intelligent enough to select the appropriate constructor or even elide the copy/move and create the object in place, thanks to the optimization known as (N)RVO. Writing std::move(t) actually gets in the way: the returned expression is no longer the plain name of a local object, so copy elision cannot be applied and a move constructor call is forced. That makes some people create rules like “never do return std::move(t)”, which is plain wrong. There are cases when it’s desirable to use it. Consider:

std::vector<T> f(std::vector<T> && v) {
    // do something with v
    return std::move(v);
}

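In this case, a compiler can’t assume it can move v: although v is declared as an rvalue reference, the expression v is an lvalue, and it refers to an object owned by the caller. We have to explicitly ask for the move. If std::move is omitted, a C++11/14/17 compiler has to use the copy constructor. (C++20 extended the implicit-move rules to rvalue reference parameters, but spelling out std::move remains correct there too.)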
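A sketch that checks the std::move variant: if the vector is moved out of the function, none of its elements may be copied. Item and pass_through() are hypothetical names for this example.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Hypothetical element type that counts how often it is copied.
struct Item {
    static int copies;
    Item() = default;
    Item(const Item&) { ++copies; }
    Item(Item&&) noexcept {}
};
int Item::copies = 0;

std::vector<Item> pass_through(std::vector<Item>&& v) {
    // v is declared as an rvalue reference, but the expression v is an lvalue,
    // so the move is requested explicitly.
    return std::move(v);
}

void check_sink_return() {
    std::vector<Item> source(3);
    Item::copies = 0;
    std::vector<Item> result = pass_through(std::move(source));
    assert(result.size() == 3);
    assert(Item::copies == 0);  // the buffer changed hands, no element was copied
}
```

Moving a std::vector hands over its buffer without touching the elements, which is why the copy counter stays at zero.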

Summary

Value categories are important in many contexts in C++. Understanding them helps you write better code and leads to less eyebrow raising. The three primary categories (lvalue, xvalue, prvalue) have different characteristics, which can make seemingly similar expressions give different results. I hope the above examples convinced you to spend some time learning about them. If I got something wrong or you’d like to share some other interesting examples, please let me know in the comments.


Header photo “Value” by GotCredit, available under Creative Commons Attribution license.

2 Comments

  1. cubbi cubbi

    Using std::move to move out of a sink parameter (non-forwarding T&&) is of course correct. The actual recommendation against the optimization-breaking use of std::move in return statements is “Never write return move(local_variable);”, see for example https://github.com/isocpp/CppCoreGuidelines/blob/master/CppCoreGuidelines.md#es56-write-stdmove-only-when-you-need-to-explicitly-move-an-object-to-another-scope (to cover all cases, it could have said “return move(local_variable_or_by_value_parameter)”, but that’s a bit unwieldy)

    • kszatan kszatan

      You’re right, that’s the actual recommendation. I’ve included that example because I’ve seen many times people using shorter, incorrect version “never do return std::move(x)”, without adding that ‘x’ is a local variable (or more precisely that it has automatic storage). Thanks for the comment.
