March 13, 2019

Issues DIP1000 can’t (yet) catch

D’s DIP1000 is its attempt to avoid memory safety issues in the language. It introduces the notion of scoped pointers and in principle (modulo bugs) prevents one from writing memory unsafe code.

I’ve been using it wherever I can since it was implemented, and it’s what makes it possible for me to have written a library allowing for fearless concurrency. I’ve also written a D version of C++’s std::vector that leverages it, which I thought was safe but turns out to have had a gaping hole.

I did wonder if Rust’s more complicated system would have advantages over DIP1000, and now it seems that the answer to that is yes. As the the linked github issue above shows,  there are ways of creating dangling pointers in the same stack frame:

void main() @safe {
    import automem.vector;

    auto vec1 = vector(1, 2, 3);
    int[] slice1 = vec1[];
    vec1.reserve(4096);  // invalidates slice1 here
}

Fortunately this is caught by ASAN if slice1 is used later. I’ve worked around the issue for now by returning a custom range object from the vector that has a pointer back to the container – that way it’s impossible to “invalidate iterators” in C++-speak. It probably costs more in performance though due to chasing an extra pointer. I haven’t measured to confirm the practical implications.

In Rust, the problem just can’t occur. This fails to compile (as it should):

use vec;

fn main() {
    let mut v = vec![1, 2, 3, 4, 5];
    let s = v.as_slice();
    println!("org v: {:?}\norg s: {:?}", v, s);
    v.resize(15, 42);
    println!("res: v: {:?}\norg s: {:?}", v, s);
}

I wonder if D can learn/borrow/steal an idea or two about only having one mutable borrow at a time. Food for thought.

March 07, 2019

The joys of translating C++’s std::function to D

I wrote a program to translate C headers to D. Translating C was actually more challenging than I thought; I even got to learn things I didn’t know about the language even though I’ve known it for 24 years. The problems that I encountered were all minor though, and to the extent of my knowledge have all been resolved (modulo bugs).

C++ is a much larger language, so the effort should be considerably more. I didn’t expect it to be as hard as it’s been however, and in this blog I want to talk about how “interesting” it was to translate C++11’s std::function by hand.

The first issue for most languages would be that it relies on template specialisation:

template<typename>
class function;  // doesn't have a definition anywhere

template<typename R, typename... Args>
class function<R(Args...)> { /* ... */ }

This is a way of constraining the std::function template to only accept function types. Perhaps surprisingly to some, the C++ syntax for the type of a function that takes two ints and returns a double is double(int, int). I doubt most people see this outside of C++ standard library templates. If it’s still confusing, think of double(int, int) as the type that is obtained by deferencing a pointer of type double(*)(int, int).

D is, as far as I know, the only other language other than C++ to support partial template specialisation. There are however two immediate problems:

  • function is a keyword in D
  • There is no D syntax for a function type

I can mitigate the name issue by calling the symbol function_ instead; however, this will affect name mangling, meaning nothing will actually link. D does have pragma(mangle) to tell the compiler how to mangle symbols, but std::function is a template; it doesn’t have any mangling until it’s instantiated. Let’s worry about that later and call the template function_ for now.

The second issue can be worked around:

// C++: `using funPtr = double(*)(int, int);`
alias funPtr = double function(int, int);
// C++: `using funType = double(int, int);`
alias funType = typeof(*funPtr.init);

As in C++, the function type is the type one gets from deferencing a function pointer. Unlike C++, currently there’s no syntax to write it directly. First attempt:

// helper to automate getting an alias to a function type
template FunctionType(R, Args...) {
    alias ptr = R function(Args);
    alias FunctionType = typeof(*ptr.init);
}

struct function_(T);
struct function_(T: FunctionType!(R, Args), R, Args...) { }

This doesn’t work, probably due to this bug preventing the helper template FunctionType from working as intended. Let’s forget the template constraint:

extern(C++, "std") {
    struct function_(T) {
        import std.traits: ReturnType, Parameters;
        alias R = ReturnType!T;
        alias Args = Parameters!T;
        // In C++: `R operator()(Args) const`;
        R opCall(Args) const;
    }
}

void main() {
    alias funPtr = double function(double);
    alias funType = typeof(*funPtr.init);
    function_!funType f;
    double result = f(3.3);
}

This compiles but it doesn’t link: there’s an undefined reference to std::function_::operator()(double) const. Looking at the symbols in the object files using nm, we see that g++ emitted _ZNKSt8functionIFddEEclEd but dmd is trying to link to _ZNKSt9function_IFddEEclEd. As expected, name mangling issues related to renaming the symbol.

We could manually add a pragma(mangle) to tell D how to mangle the operator for the double(double) template instantiation, but that solution doesn’t scale. CTFE (constexpr if you speak C++ but not D) to the rescue!

// snip - as before
pragma(mangle, opCall.mangleof.fixMangling)
R opCall(Args) const;

// (elsewhere at file scope)
string fixMangling(string str) {
    import std.array: replace;
    return str.replace("9function_", "8function");
}

What’s going on here is an abuse of D’s compile-time power. The .mangleof property is a compile-time string that tells us how a symbol is going to be mangled. We pass this string to the fixMangling function which is evaluated at compile-time and fed back to the compiler telling it what symbol name to actually use. Notice that function_ is still a template, meaning .mangleof has a different value for each instantiation. It’s… almost magical. Hacky, but magical.

The final code compiles and links. Actually creating a valid std::function<double(double)> from D code is left as an exercise to the reader.