April 20, 2015

D1->D2 Part 3: magic module

Since the very early discussions of a possible D2 transition process one thing was clear: we want to avoid version statements, or at least minimize their usage as much as possible. Polluting the code with version statements makes it harder to maintain and to keep different branches in sync. It is also impossible to version only part of a declaration in D2, which forces code duplication and/or not-so-obvious workarounds.

But quite a lot of versioning is needed to get something that works in both the D1 and D2 worlds. Fortunately, D has some pretty good capabilities for abstracting stuff away, including the syntax differences. And that is how transition.d was created.

Quick highlight

/*******************************************************************************

    Replacement for `typedef` which is completely deprecated. It generates
    usual `typedef` when built with D1 compiler and wrapper struct with
    `alias this` when built with D2 compiler.

    Used as mixin(Typedef!(hash_t, "MyHash"))

    Template Parameters:
        T       = type to typedef
        name    = identifier string for new type
        initval = optional default value for that type

*******************************************************************************/

template Typedef(T, istring name, T initval)
{
    static assert (name.length, "Can't create Typedef with an empty identifier");
    version(D_Version2)
    {
        mixin(`
            enum Typedef =
                ("struct " ~ name ~
                "{ " ~
                T.stringof ~ " value = " ~ initval.stringof ~ ";" ~
                "alias value this;" ~
                " }");
        `);
    }
    else
    {
        const Typedef = ("typedef " ~ T.stringof ~ " " ~ name ~
            " = " ~ initval.stringof ~ ";");
    }
}

How does it work?

The basic idea is simple: use templates, aliases and string mixins to hide the differences between D1 and D2 behind a common wrapper. That way the code expands to a proper working version as soon as the other compiler is used, without separate git branches for D1 and D2 and without thousands of version blocks everywhere.

There are two bits worth paying attention to in the above-mentioned typedef snippet:

1) The D2 code is hidden behind a string mixin, which looks awkward but is necessary. The problem is that the D1 compiler can’t recognize D2 code as valid language grammar, and even code that is versioned out must still be parsed according to the language spec. Turning it into a string that is immediately mixed in fixes the problem, because a mixin is only expanded during the semantic phase, which never happens for versioned-out code.

Of course, it was never intended to be used that way, but it is an important lesson to learn for any other language—if you want to enable transparent swapping between major versions of your language, ensure that it is possible to suppress lexing/parsing for the code in some way.

2) We don’t use std.typecons.Typedef because it is too error-prone and requires explicit management of a redundant “cookie” argument to create unique types. To be able to replace existing typedef statements automatically with a regexp, it would need to be wrapped in a similar mixin that generates the “cookie” value, but simply declaring a struct instead is much simpler and more reliable at this point.

So far, it seems to work so much better than the Phobos version that it makes me sad the name is already taken in Phobos and I can’t simply make a pull request to replace it.
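To make the two points above concrete, here is a minimal D2-only sketch of the same technique. The helper `typedefDecl` and the function `next` are hypothetical names invented for this illustration and are not part of transition.d; the sketch only assumes that the D2 branch of `Typedef` produces a wrapper struct of roughly this shape.

```d
// Hypothetical D2-only helper (not from transition.d): builds the
// declaration text at compile time, much like the D2 branch of Typedef.
string typedefDecl(string type, string name, string initval)
{
    return "struct " ~ name ~ " { " ~ type ~ " value = " ~ initval ~ "; "
        ~ "alias value this; }";
}

// The generated text is only parsed when it is mixed in.
mixin(typedefDecl("size_t", "MyHash", "0"));

// Takes the underlying type; a MyHash converts to it through `alias this`.
size_t next(size_t x)
{
    return x + 1;
}

unittest
{
    MyHash h;
    assert(h.value == 0);      // default value comes from the init string
    h.value = 41;
    assert(next(h) == 42);     // the wrapper decays to size_t

    static assert(!is(MyHash == size_t));  // a distinct type
    static assert(is(MyHash : size_t));    // implicitly converts to size_t
    static assert(!is(size_t : MyHash));   // but size_t does not convert back
}
```

With a D2 compiler this can be checked with something like `dmd -unittest -main -run example.d`. Because the struct only exists as a string until the mixin is expanded, a D1 compiler that sees the same text inside a versioned-out branch never has to parse `alias value this`.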

More goodies

Of course, that is not the only such wrapper in transition.d. In fact, this module grows rapidly as I keep going through our library sources and discover new interesting cases to take care of.

Sometimes additions are very trivial and totally unexpected:

/*******************************************************************************

    In D1 ModuleInfo is a class. In D2 it is a struct. ModuleInfoPtr aliases
    to matching reference type for each of those.

*******************************************************************************/

version (D_Version2)
{
    alias ModuleInfo* ModuleInfoPtr;
}
else
{
    alias ModuleInfo  ModuleInfoPtr;
}

Sometimes those help to swap between Tango and Phobos utilities in a uniform manner:

/*******************************************************************************

    Helper template that can be used instead of deprecated octal literals. In
    some cases preserving octal notation is really important for readability and
    those can't be simply replaced with decimal/hex ones.

    Template Params:
        literal = octal number literal as string

*******************************************************************************/

version (D_Version2)
{
    static import std.conv;

    template Octal(istring literal)
    {
        static if (literal[0] == '0' && literal.length > 1)
        {
            mixin("enum Octal = Octal!(literal[1..$]);");
        }
        else
        {
            mixin("enum Octal = std.conv.octal!literal;");
        }
    }
}
else
{
    static import tango.text.convert.Integer;

    template Octal(istring literal)
    {
        const Octal = tango.text.convert.Integer.parse(literal, 8);
    }
}

unittest
{
    static assert (Octal!("00") == 0);
    static assert (Octal!("12") == 10);
    static assert (Octal!("1") == 1);
    static assert (Octal!("0001") == 1);
    static assert (Octal!("0010") == 8);
    static assert (Octal!("666") == (6 + 8*6 + 8*8*6));
}

As you can see, some extra logic was needed here because the parsing utilities provided by Tango and Phobos differ subtly in the way they process some formats. I could have used the Tango utilities for both versions, but that would create a bunch of cyclic imports, as Tango needs to use this module, too.
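As an illustration of where octal notation matters for readability, here is a small D2-only sketch that applies the same leading-zero stripping to Unix-style permission masks. It assumes that `istring` can be replaced by plain `string` under D2, and it omits the string mixins because a D1 parser never sees this version; it is a simplified stand-in, not the module's actual code.

```d
static import std.conv;

// Simplified D2-only variant of the Octal helper shown above.
// Assumption: `istring` is replaced by plain `string` here.
template Octal(string literal)
{
    static if (literal.length > 1 && literal[0] == '0')
    {
        // Strip one leading zero and recurse.
        enum Octal = Octal!(literal[1 .. $]);
    }
    else
    {
        enum Octal = std.conv.octal!literal;
    }
}

unittest
{
    // Permission masks read naturally in octal: rw-r--r-- and rwxr-xr-x.
    static assert(Octal!("0644") == 420);
    static assert(Octal!("755") == 0x1ED);
}
```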

And there are also helpers to deal with runtime differences as opposed to language ones:

/*******************************************************************************

    Helper to smooth transition between D1 and D2 runtime behaviours regarding
    array stomping. In D2 appending to array slice after length has been changed
    results in allocating new array to prevent overwriting old data.

    We use and actually rely on stomping for buffer re-usage.
    `assumeSafeAppend` from object.d re-enables stomping in D2; adding this
    wrapper (a no-op in D1) to D1 code now will save time on trying to find
    those extremely subtle issues upon actual transition.

    All places that reset length to 0 will need to call this helper.

    Params:
        array = array slice that is going to be overwritten

*******************************************************************************/

void enableStomping(T)(ref T array)
{
    version(D_Version2)
    {
        assumeSafeAppend(array);
    }
    else
    {
        /* no-op */
    }
}

By the way, I foresee this one being a huge pain once we get to actually porting and debugging deployed services. I also use a custom druntime build that asserts on the extra allocation made to prevent stomping, which makes finding those cases easier.

This is, by the way, one of the worst kinds of breaking changes one could make: code that still compiles, does not crash, does not throw, but silently creates a huge performance regression if left unattended. Such changes must be accompanied by an extremely good migration plan and tools if you want to keep users happy! In this specific case I am not happy at all but, well, someone needs to finally do this.
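The difference is easy to miss, so here is a minimal D2-only sketch of both behaviours. It reuses the `enableStomping` name from the listing above but keeps only the D2 branch; the buffer contents and variable names are made up for the illustration.

```d
// D2-only sketch of the wrapper; the D1 no-op branch is omitted.
void enableStomping(T)(ref T array)
{
    assumeSafeAppend(array);
}

unittest
{
    // Default D2 behaviour: appending after a length reset reallocates.
    int[] buffer = new int[](4);
    buffer[] = 7;
    int[] original = buffer;              // second view of the same memory
    buffer.length = 0;
    buffer ~= 1;
    assert(buffer.ptr !is original.ptr);  // a new block was allocated
    assert(original[0] == 7);             // old data is untouched

    // With the helper: the append reuses (stomps on) the old memory.
    int[] reused = new int[](4);
    reused[] = 7;
    int[] view = reused;
    reused.length = 0;
    enableStomping(reused);
    reused ~= 1;
    assert(reused.ptr is view.ptr);       // same block is reused
    assert(view[0] == 1);                 // old data was overwritten
}
```

The first half is the silent regression described above: nothing fails, but every reset-and-append cycle allocates a fresh block instead of reusing the buffer.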

Tango-D2

Now that there is some basic understanding of what kind of code we need, it should become more obvious why I can’t use the existing port of the Tango library to D2. The main problem is that we don’t need just a D2 version of Tango, but a very specific type of code that works for both D1 and D2 at the same time. Trying to retrofit Tango-D2 for that and port all our internal patches to it would require a totally new set of tools that wouldn’t be useful for any of our other projects.

At the same time, while doing the same port again myself, I can always cheat by looking at the Tango-D2 sources to learn what they decided about some of the trickier points. And that learning experience will help a lot when we move on to our own libraries. The research done by SiegeLord has saved me a lot of time and I am very grateful for that.

It is probably worth mentioning that, as of today, this second specialized port of Tango is almost complete and is being used as a testing ground to polish the porting process.

April 14, 2015

D1->D2 Part 2: -v2 flag

So, we return to the D2 transition topic. This time I’ll write in a bit more detail about the requirements we have defined and the tools used to get it all done.

It was already mentioned that most of our applications and libraries are moving targets: there is no point in time at which you can just stop all development, do the full port and continue forward. We needed something that can be done in small chunks and used in production immediately, even if that means more accumulated effort in total.

Some of the necessary changes were easy in that regard. For example, this is perfectly legal D1:

class A {
    void foo() {}
}

class B : A {
    void foo() {}
}

D2, though, will complain about overriding the method without using the explicit override keyword. But hey, this is trivial to fix, just add the keyword!

class A {
    void foo() {}
}

class B : A {
    override void foo() {}
}

This is legal in both D1 and D2, and switching the compiler won’t produce any errors. And this was exactly the first step: find all the small differences between the two versions that can be fixed just by using a different style of D1 code, and ensure that the compatible style is used.

With the kind help of Walter Bright, a new compiler flag was added to make finding such places easy:

$ dmd1 -v2-list
Available hints for -v2 flags:
  explicit-override    overriding methods need to be explicitly annonatted with 'override'
  syntax               basic syntax differences (reserved keywords, loop syntax, etc)
  octal                octal numeric literals need to be replaced
  const                const storage class can't be used
  switch               implicit case fall-through is not allowed, default statament is mandatory
  volatile             volatile statements are not supported anymore
  static-arr-params    static array parameter will become passed by value

This was an important thing to do because naively using the D2 compiler to find such issues quickly buried us under thousands of unrelated and hard-to-filter messages. Even the errors that mattered often came with hard-to-understand diagnostics, because the compiler is tuned to analyze legal D2 code, not D1 code that is being ported.

Having dedicated diagnostic flags allowed us to go methodically through the libraries, taking care of one specific kind of issue at a time, while still working on the master branch and keeping the code perfectly usable by existing D1 projects. Each kind of change applied to one of the libraries was only a tiny step towards full D2 compatibility, but nevertheless a step forward that kept our services uninterrupted.

Of course, only some of the required changes are that simple.
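To make one of the listed hints concrete, here is a small sketch of what `static-arr-params` warns about, written for a D2 compiler. The function names are hypothetical, and the `ref` variant is included only as a D2-side contrast, not as the compatible style used in the actual port.

```d
// In D2 a static array parameter is copied, so the callee
// modifies its own copy and the caller's array is unchanged.
void setFirst(int[3] arr)
{
    arr[0] = 42;
}

// Contrast (D2 only): passing by reference modifies the caller's array.
void setFirstRef(ref int[3] arr)
{
    arr[0] = 42;
}

unittest
{
    int[3] data = [1, 2, 3];

    setFirst(data);
    assert(data[0] == 1);    // by-value copy, original untouched

    setFirstRef(data);
    assert(data[0] == 42);   // explicit ref changes the original
}
```

Code that relied on the callee modifying the caller's array keeps compiling after the switch but silently stops doing so, which is why a dedicated hint is useful here.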

April 12, 2015

Support for SDL 2.0.4 in Derelict

Thanks to Ben Boeckel, DerelictSDL2 now has support for SDL 2.0.4. It’s available in version tag 1.9.5, so you’ll want to specify that in your DUB configuration. Since SDL 2.0.4 hasn’t actually been released yet, Derelict still attempts to load SDL 2.0.2/3 by default. If you are interested in using SDL 2.0.4 with Derelict, you’ll need to specify it with a SharedLibVersion when you call the load method.

DerelictSDL2.load(SharedLibVersion(2,0,4));

April 06, 2015

Faster GC

D 2.067.0 is out and comes with a much faster GC for most applications. We’re seeing speedups of roughly 30% for GC-heavy applications, with a slight increase in memory consumption.

I recently gave a presentation about the improvements at the Berlin D meetup; slides are here.

As announced in the changelog, the two most important improvements are a better grow strategy and faster tail-recursive marking. More details will follow soon. This work was complemented by many refactorings and a better benchmark suite.