The missing C++ smart pointer

  • Rip-off of this article from 6 years ago?

    https://hackernoon.com/value-ptr-the-missing-c-smart-pointer...

    https://buckaroo.pm/blog/value-ptr-the-missing-smart-ptr

    People have been writing pointer-like value semantic wrappers for type-erasure for decades.

  • Transparently copyable heap-allocated objects are a recipe for introducing invisible performance issues, especially in generic code.

    Rust requires types to explicitly opt-in to being implicitly copied, while C++ requires you to opt-out by deleting the copy-constructor.

    Accidentally copying small structs on the stack is a minor performance problem. Copying a std::box<int> in a hot loop could cause heap fragmentation, lock contention, and huge amounts of wasted memory due to heap alignment requirements (32 bytes on 64-bit arches).

  • I think there's a bit of confusion here around "value semantics".

    No C++ smart pointer has "value semantics", relative to its target T. You can see this because == performs address comparison, not deep comparison, and `const` methods on the smart pointer can be used to mutate the target (e.g. in C++, operator* on unique_ptr is always const, and yields a T&).

    This is in contrast to Rust, where Box performs deep equality, and has deep const/mut. In Rust, Box is basically just a wrapper around a value to have it on the heap (enabling things like dynamic polymorphism, like in C++). In C++, the pointer is its own entity, with its own separate equality, and so on.

    Const-ness of operations, operator==, and assignment/copying behavior all have to be consistent with each other. For example, if `box` was simply `unique_ptr` with a copy constructor (somehow, and as the table in the blog post basically implies), then you would have that after `auto a = b;`, `a != b`, which obviously doesn't work. This means that the hypothetical `std::box` would have to have its comparison and const-ness adjusted as well. In C++ terms, this isn't really a pointer at all. The closest thing to what the author is suggesting is actually `polymorphic_value`, I believe, which IIRC has been proposed formally (note that it does not have pointer in the name).
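
    A minimal sketch of the consistency argument above, assuming C++17 or later. The two helper functions show what `std::unique_ptr` does today (address comparison, shallow const). The `box` class is a hypothetical illustration of what a value-semantic wrapper would have to change; its name and interface are assumptions, not a standard or proposed API.

```cpp
#include <cassert>
#include <memory>
#include <utility>

// std::unique_ptr compares addresses, not targets.
inline bool unique_ptr_compares_addresses() {
    auto a = std::make_unique<int>(1);
    auto b = std::make_unique<int>(1);
    return a != b;  // true: equal targets, distinct addresses
}

// std::unique_ptr has shallow const: operator* is const and yields T&.
inline int mutate_through_const_unique_ptr() {
    const std::unique_ptr<int> p = std::make_unique<int>(1);
    *p = 2;  // compiles even though p itself is const
    return *p;
}

// Hypothetical value-semantic wrapper (illustrative only).
template <typename T>
class box {
public:
    explicit box(T value) : ptr_(std::make_unique<T>(std::move(value))) {}

    // Copying allocates a fresh T, so the copy owns an independent object.
    box(const box& other) : ptr_(std::make_unique<T>(*other.ptr_)) {}
    box& operator=(const box& other) {
        ptr_ = std::make_unique<T>(*other.ptr_);  // new object built before the old one is released
        return *this;
    }

    // Moves leave the source empty; this sketch does not guard against using it afterwards.
    box(box&&) noexcept = default;
    box& operator=(box&&) noexcept = default;

    // Deep const: a const box only hands out a const T&.
    T& operator*() { return *ptr_; }
    const T& operator*() const { return *ptr_; }

    // Deep equality: compare the targets, so a copy compares equal to its source.
    friend bool operator==(const box& a, const box& b) { return *a.ptr_ == *b.ptr_; }
    friend bool operator!=(const box& a, const box& b) { return !(a == b); }

private:
    std::unique_ptr<T> ptr_;
};
```

    With this sketch, `box<int> b(5); box<int> a = b;` gives `a == b`, and writing through `a` leaves `b` untouched, which is the behaviour a copyable `unique_ptr` with address comparison could not provide.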

    Also as an aside, smart pointers are not suitable a) for building data structures in general, and b) building recursive data structures in particular. The former is because meaningfully using smart pointers (i.e. letting them handle destruction) inside an allocator aware data structure (as many C++ data structures tend to be, and even data structures in Rust) would require duplicating the allocator over and over. The latter is because compilers do not perform TCO in many real world examples (and certainly not in debug mode); if you write a linked list using `std::unique_ptr` the destructor will blow your stack.
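
    One common workaround for the destructor recursion mentioned above is to unlink nodes in a loop. The following is a sketch under that assumption; `Node`, `List`, and `build_and_destroy` are illustrative names, and the allocator-awareness concern is not addressed here.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>

struct Node {
    int value;
    std::unique_ptr<Node> next;
};

class List {
public:
    List() = default;
    List(const List&) = delete;
    List& operator=(const List&) = delete;

    void push_front(int value) {
        head_ = std::make_unique<Node>(Node{value, std::move(head_)});
        ++size_;
    }

    std::size_t size() const { return size_; }

    ~List() {
        // Detach one node per iteration. Each destroyed node has an empty
        // `next`, so its destructor does not recurse down the chain.
        while (head_) {
            std::unique_ptr<Node> next = std::move(head_->next);
            head_ = std::move(next);
        }
    }

private:
    std::unique_ptr<Node> head_;
    std::size_t size_ = 0;
};

// Builds a list of n nodes, then lets it go out of scope.
inline std::size_t build_and_destroy(std::size_t n) {
    List list;
    for (std::size_t i = 0; i < n; ++i) list.push_front(static_cast<int>(i));
    return list.size();
}
```

    Without the loop in `~List`, the implicitly generated destructor would destroy `head_`, which destroys its `next`, and so on, using one stack frame per node.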

  • I'm not sure I see the benefit vs std::unique_ptr. In the rare case you do want to deep-copy a unique_ptr, you can always use std::make_unique<T>(*ptr) to invoke the copy constructor.
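
    A small sketch of that approach, assuming T is copy-constructible and not used polymorphically (copying through a base-class pointer this way would slice). The helper name `deep_copy` is made up for illustration.

```cpp
#include <cassert>
#include <memory>
#include <string>

// Returns a new unique_ptr owning a copy of *p, or an empty pointer if p is empty.
template <typename T>
std::unique_ptr<T> deep_copy(const std::unique_ptr<T>& p) {
    if (!p) return nullptr;
    return std::make_unique<T>(*p);  // invokes T's copy constructor
}
```

    The copy is explicit at the call site, e.g. `auto b = deep_copy(a);`, so an accidental copy in generic code still fails to compile.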

  • First of all, Box is a terrible name because the term has been used for boxed pointers (putting tag bits into unused parts of the pointer) and, in some languages, immediates, for four decades at least. Also all C++ standard identifiers are thankfully lowercase snake case.

    I don't understand why the author writes "raw value is straightforward and efficient... However, you can't allocate them dynamically and you can't build recursive data structure such as a linked list or a tree with them." There is clearly something I don't understand here. Consider an int -- you can dynamically allocate one, you can put it in a tree. Putting a box into a tree will still require other data applicable to the tree itself (same as an int), and so on. So I don't understand the point being made here.

    And the deep copy behavior is rarely what I want in a mutable structure anyway (it's always safe, if usually wasteful, in a R/O structure).

  • I hadn't even thought about it, I was like Box<T> is basically std::unique_ptr<T> anyway so what's the point -- but yes, Rust's types all either can't be copied at all, or they implement Clone and thus Clone::clone, which is what you'd call a "deep copy" if you're used to that nomenclature.

    I think the underlying cause is that Rust's assignment semantic is a destructive move, not a copy†, which frees up the opportunity for an actual copy to be potentially expensive, matching reality. In a language where assignment is copy, that operation must be cheap and so we're obliged to make up an excuse for how although this is a "copy" it doesn't behave the way you want, it's just a "shallow copy".

    † Although it will nearly always work to think of Rust's assignments as destructive move, as an optimisation types whose representation is their meaning can choose to implement Copy, a trait which says to the compiler that it's fine to actually just copy my bits, I have no deeper meaning - thus if the type you're using is Copy then assignments for that type are in fact performed just by copying and don't destroy anything. So a byte, a 64-bit floating point number, a 4CC, an IP address: none of those have some larger significance, they're Copy, but a string, a HashMap, some custom object you made (unless it can and did opt in to Copy), those are not Copy.

    Crucially, from an understanding point of view, implementing Copy requires a trivial implementation of Clone. As a result it feels very natural.

  • That's pretty much polymorphic_value <https://wg21.link/p201> or indirect_value <https://wg21.link/p1950>

  • Seems to me that the critical problem with this idea is "deep copy."

    There is no built-in deep copy facility. Without one, a box pointer would be dangerous, leading to weird effects when the copy is too shallow.

    You could solve deep copy with a template that relies on each class providing a deep copy function if one is needed. But again, this will cause bugs if someone forgets to provide the function.

    Rather than make an error-prone feature in the standard library, I think it would be better to just explicitly roll this yourself. A sensible copy constructor should already do a deep copy -- or ensure copy-on-write to simulate a deep copy. So copying is as easy as calling make_shared<T>(original) or make_unique<T>(original).

  • > std::box<T> addresses these issues by offering deep copying and automatic garbage collection

    This is pretty much impossible when holding a pointer to a base class. However, this is a primary reason for having pointers in the first place (polymorphism, and having abstract base classes).

    In all other cases, you're probably better off with either the raw value, std::variant or std::reference_wrapper.

  • > Inspired by Box<T> in Rust, the std::box<T> would be a heap-allocated smart pointer.

    So, is it the pointer that is heap-allocated or the pointee? Frankly, I find this article somewhat incoherent, and importantly, it lacks code examples illustrating what it is talking about.

  • The immutable data-structures library Immer provides such a type:

    https://sinusoid.es/immer/containers.html#box

  • I like the idea. I have wondered why not have a "garbage collected" smart ptr std::gc_ptr<T> that can allow cycles with other such ptrs to avoid the shortcoming of std::shared_ptr<T>. You would need to define which gc_ptr's are in your "root set" to initiate the "mark and sweep" of the graph of ptrs. This would be useful for heavily linked data structures with cycles.

  • There are some (a lot of) changes that C++ should have, inspired by Rust, but as the other comments have said, I really don’t feel like a smart pointer that acts like unique but copies by value is all that necessary.

    That doesn’t seem to fix a memory bug (because if you do this with a unique_ptr, the compiler will yell at you for using copy); it seems to just make it easier than having to write `std::make_unique<T>(*otherptr);`

  • I have written and been using that same smart pointer type for years, under the pretty horrible name of holder_cloner_t<> (at least it's clear). It is indeed the right solution to a very common and important type of problem. Looking forward to something like this in the standard library one of these decades.

  • Low level-capable languages need GC lifecycle hooks and replacements similar to Rust alloc, but flexible enough to plug in BWS, ORCA, or DIY.

    I also think the semantics of shared and unshared const and mutable state need to be made explicit. Pony is very good about this, more so than Rust, by bringing it into the language.

  • I've seen this sort of pointer (assuming the author means it's nullable) be called "clone_ptr<T>" but it called T's clone() method. Because T might be a base class, invoking the pointee's copy constructor in C++ is not a great idea.
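
    A rough sketch of that idea, assuming each class in the hierarchy provides a virtual clone(). The `Shape`/`Circle` hierarchy and this particular `clone_ptr` interface are invented for illustration and are not taken from any specific library.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

struct Shape {
    virtual ~Shape() = default;
    virtual std::unique_ptr<Shape> clone() const = 0;
    virtual std::string name() const = 0;
};

struct Circle : Shape {
    // Copies the dynamic type, so nothing is sliced.
    std::unique_ptr<Shape> clone() const override { return std::make_unique<Circle>(*this); }
    std::string name() const override { return "circle"; }
};

// Nullable owning pointer whose copy operations call T::clone().
template <typename T>
class clone_ptr {
public:
    clone_ptr() = default;
    explicit clone_ptr(std::unique_ptr<T> p) : ptr_(std::move(p)) {}

    clone_ptr(const clone_ptr& other) : ptr_(copy_of(other.ptr_)) {}
    clone_ptr& operator=(const clone_ptr& other) {
        ptr_ = copy_of(other.ptr_);  // clone is made before the old target is released
        return *this;
    }
    clone_ptr(clone_ptr&&) noexcept = default;
    clone_ptr& operator=(clone_ptr&&) noexcept = default;

    T* get() const { return ptr_.get(); }
    T* operator->() const { return ptr_.get(); }
    explicit operator bool() const { return static_cast<bool>(ptr_); }

private:
    static std::unique_ptr<T> copy_of(const std::unique_ptr<T>& p) {
        if (!p) return nullptr;
        return p->clone();
    }

    std::unique_ptr<T> ptr_;
};
```

    Copying a `clone_ptr<Shape>` that holds a `Circle` yields another `Circle`, whereas `std::make_unique<Shape>(*p)` would not even compile for an abstract base and would slice for a concrete one.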

  • I'd rather have an optionally-owned pointer type so I can handle writing virtual methods that can return a value aliasing an already existing value or create a new one on demand. Otherwise you either have to roll your own or bloat your API:

      virtual std::flexptr<Thing> get_expensive_thing();
    
    vs:

      virtual bool has_expensive_thing();
      virtual const Thing& get_expensive_thing();
      virtual std::unique_ptr<Thing> build_expensive_thing();
    
    It'd probably have shared_ptr semantics but you'd have to treat it as a const ref for lifetime purposes, which might make it distasteful to the std library folks.
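
    One way to approximate this with today's library, as a sketch only: return a `std::shared_ptr<const Thing>` and use the aliasing constructor with an empty owner for the "already exists" case. `borrow`, `Source`, `Cached`, and `OnDemand` are hypothetical names; `std::flexptr` itself does not exist. The lifetime caveat from the comment still applies, since a borrowed pointer does not keep its target alive.

```cpp
#include <cassert>
#include <memory>

struct Thing {
    int value;
};

// Non-owning shared_ptr: aliasing constructor with an empty owner.
// The caller must not let the result outlive `t`.
inline std::shared_ptr<const Thing> borrow(const Thing& t) {
    return std::shared_ptr<const Thing>(std::shared_ptr<void>(), &t);
}

struct Source {
    virtual ~Source() = default;
    virtual std::shared_ptr<const Thing> get_expensive_thing() = 0;
};

// Already has the value: hands out an alias, no allocation.
struct Cached : Source {
    Thing cached{42};
    std::shared_ptr<const Thing> get_expensive_thing() override { return borrow(cached); }
};

// Builds the value on demand: the returned pointer owns it.
struct OnDemand : Source {
    std::shared_ptr<const Thing> get_expensive_thing() override {
        return std::make_shared<Thing>(Thing{7});
    }
};
```

    Callers see a single virtual method in both cases; the borrowed pointer reports `use_count() == 0`, which is the only visible hint that it does not own its target.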

  • The author could implement the deep-copying pointer and share the .h on their GitHub. You don’t need a language extension for this in C++ as types can implement operator overloads and copy and move constructors.

    But I doubt many people would use it, and that’s probably why it doesn’t belong in std::.

    In contrast, before C++11, developers would write their own RAII-style smart pointers. So it made sense to save them the labor. I don’t think a pointer that doesn’t allow shallow copies is usually found in codebases. It sounds like a specific use-case pointer.

    It’s a neat type that people coming from other languages could like, but maybe not quite standard library-ready?

  • A

        non_null_unique_ptr<T>
    
    that is enforced at compile time would be far more valuable for me. That would mean some kind of destructive move where the compiler guarantees that you cannot access a moved-from object.
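
    To illustrate why this needs destructive moves: in current C++ a moved-from `std::unique_ptr` remains an accessible object that holds nullptr, so non-nullness cannot be guaranteed by the type alone. A tiny sketch, with a helper name made up for this example:

```cpp
#include <cassert>
#include <memory>
#include <utility>

// After the move, `p` is still in scope and usable; it is simply null.
inline bool moved_from_is_null() {
    auto p = std::make_unique<int>(1);
    auto q = std::move(p);
    return p == nullptr && q != nullptr && *q == 1;
}
```

    The compiler accepts the read of `p` after the move; at best a linter warns about it.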

  • As many have mentioned before, I think the author is mixing a lot of concepts and might be trying to write Rust in C++ (or vice versa).

    From what I understand, the key here is that the author seems to be mixing references and pointers.

    What the author wants here (std::box) is a garbage collected/RAII'ed reference to a heap allocated object.

    smart_pointers are pointers... which is to say their identity is distinct from pointed object.

    I don't think std::box is missing as much as it doesn't really fit well in C++'s current memory model, and its distinction between references, pointers and ownership.

  • I just do my darndest to never ever use the heap. Let the stack be your memory manager.

    I think if you come from other languages you assume the heap is the default when it should be the exception.

  • Without code examples it's really hard to judge.

    The lack of this type can be viewed as a pessimization for copying objects.

  • The problem I see with this is that you don't always know how to make a deep copy. Who knows what happens when you copy a variable of type Foo?

    Taking that aside, I agree it would make a lot of sense to write code in that style^^

  • Is this similar to std::optional? It's a box containing a value. Copying the optional copies the value.

  • I feel like copy by default makes box a weird paradigm in C++.

    In Rust you will move unless you write `Clone`.

  • This kinda reminds me of the COW (copy-on-write) pattern a lot of Swift stdlib types use.

  • What would prevent me from making a std::box<FILE*> and blowing up my program?

  • std::vector<MyType> is a pretty good 'box' like container. Dynamically allocated and applied RAII semantics. If you only want one instance, then dynamic allocation shouldn't (?) be necessary.
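
    A short sketch of that observation, assuming C++17 (which allows `std::vector` of an incomplete type as a member). `Tree` and `sum` are illustrative names. The vector gives heap allocation, RAII cleanup, and deep copying, which is enough to build a recursive structure without any smart pointer.

```cpp
#include <cassert>
#include <vector>

struct Tree {
    int value = 0;
    std::vector<Tree> children;  // heap-allocated, deep-copied with the parent
};

// Sums all values in the tree.
inline int sum(const Tree& t) {
    int total = t.value;
    for (const Tree& child : t.children) total += sum(child);
    return total;
}
```

    Copying a `Tree` copies every descendant, so modifying the copy leaves the original unchanged.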