July 13, 2014

Splicing git repositories

Some projects eventually get split up into multiple source repositories for one reason or another. Sometimes, however, it is useful to present the project as a single repository – it’s more difficult to examine a project’s development history when it is scattered among several repositories. Specifically, git bisect will not be very helpful if there is strong inter-dependency between the repositories, as with the reference D programming language implementation. Or you may just not like the layout, which pushes you into a complicated build process you may not want to use; this was the case for me with the new DerelictOrg repositories.

I created d-dot-git a while ago to solve the first problem. This program creates a new repository which contains the D component repositories as submodules, referencing mainline commits in each repository through a history which chronologically follows all D repositories. The result is a repository with a linear history, which you can then easily use with git bisect. You can see the resulting repository here. Digger takes away the chore of setting up git bisect run and building D, and allows you to specify a D source code test case directly. I’ve also covered this during part 2 of my DConf 2014 presentation (“Reducing D Bugs”).
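The chronological interleaving described above can be illustrated with a small sketch. This is not d-dot-git's actual code: the `Commit` struct, the `linearize` function and the sample data are all hypothetical, and the sketch assumes each component's mainline commits are already available as a list with timestamps. Every step of the result records which commit each submodule would point at.

```d
import std.algorithm : sort, SwapStrategy;

/// Hypothetical record of one mainline commit in a component repository.
struct Commit
{
    long time;   // commit timestamp (Unix seconds)
    string repo; // component repository name
    string id;   // commit identifier within that repository
}

/// One step of the linear history: repository name -> commit it points at.
alias Snapshot = string[string];

/// Interleaves the per-repository histories by timestamp and returns,
/// for each commit, the state of all submodules right after it.
Snapshot[] linearize(Commit[][] histories)
{
    Commit[] all;
    foreach (history; histories)
        all ~= history;

    // A stable sort keeps the original order of commits with equal timestamps.
    sort!((a, b) => a.time < b.time, SwapStrategy.stable)(all);

    Snapshot[] result;
    Snapshot current;
    foreach (c; all)
    {
        current[c.repo] = c.id;  // advance one submodule
        result ~= current.dup;   // record the combined state
    }
    return result;
}

unittest
{
    auto dmd = [Commit(10, "dmd", "d1"), Commit(30, "dmd", "d2")];
    auto phobos = [Commit(20, "phobos", "p1")];
    auto steps = linearize([dmd, phobos]);
    assert(steps.length == 3);
    assert(steps[1] == ["dmd": "d1", "phobos": "p1"]);
    assert(steps[2] == ["dmd": "d2", "phobos": "p1"]);
}
```

The block can be checked with `rdmd -main -unittest`. Because every step changes exactly one submodule pointer, the resulting history is linear, which is what git bisect needs.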

I wanted to do something different for Derelict – submodules wouldn’t cut it. The current structure of the Derelict project (it has a history of moves and refactorings) consists of a DerelictOrg GitHub organization, with a repository for each Derelict component. The repositories have a partially-overlapping directory structure (each repository has a source/derelict directory, which then contains the sub-package with the respective component source code). I wanted a single repository whose root is the contents of the derelict package, so that the component sub-packages sit directly at the root level. This would allow using the repository as a git submodule: one would only need to clone a project with --recursive, and no further setup or dependencies would be needed (aside from rdmd, which is included with D).

There are existing approaches and tools that could’ve worked to achieve this:

  • git filter-branch followed by a subtree merge would be OK for a one-off conversion. However, I wanted a live mirror, which would keep itself in sync with the latest DerelictOrg changes.
  • David Fraser's splice-repos and Philippe Bruhat's git-stitch-repo solve the same problem.
    However, they use git fast-export/fast-import to move all the data around, which makes them less efficient than they could be, as they need to shuffle all the file data in the target directories.
  • The program I wrote, git-splice-subtree, works by reading and writing raw Git objects from the Git database. Since the files and the subtrees are not affected, the program only concerns itself with creating git commit and tree objects that point to the existing ones in the right pattern. It does this by fetching all repositories' objects into a single one, then spawning batch git database object readers (git cat-file) and writers (git hash-object) and piping data to them directly. By not touching any of the repositories' actual files, and spawning as few processes overall as possible, it is quite speedy (almost instant on the DerelictOrg repositories, which is great for a cronjob). I've also managed to avoid having to choose between redundant disk writes for temporary files and spawning a process for every commit, by tricking git hash-object into reading from the same named pipe over and over.
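To make the "raw Git objects" idea more concrete, here is a minimal sketch of how Git derives an object's ID and of what a commit object that points at an existing tree looks like. It is not git-splice-subtree's code (which, as described above, pipes data to git cat-file and git hash-object rather than hashing anything itself); `gitObjectId`, `commitText` and the sample signature are hypothetical names and values used only for illustration.

```d
import std.conv : to;
import std.digest.sha : SHA1;
import std.format : format;
import std.string : representation;

/// ID Git assigns to an object: SHA-1 over "<type> <size>\0" followed by the payload.
string gitObjectId(string type, const(ubyte)[] payload)
{
    SHA1 sha;
    sha.start();
    sha.put((type ~ " " ~ payload.length.to!string ~ "\0").representation);
    sha.put(payload);
    ubyte[20] hash = sha.finish();
    return format("%(%02x%)", hash[]); // 40 lowercase hex digits
}

/// Text of a commit object that reuses an existing tree.
/// `signature` has the form "Name <email> <unix-time> <timezone>".
string commitText(string treeId, string[] parentIds, string signature, string message)
{
    string text = "tree " ~ treeId ~ "\n";
    foreach (parent; parentIds)
        text ~= "parent " ~ parent ~ "\n";
    text ~= "author " ~ signature ~ "\n";
    text ~= "committer " ~ signature ~ "\n\n";
    return text ~ message;
}

unittest
{
    // Well-known IDs of the empty blob and the empty tree.
    assert(gitObjectId("blob", null) == "e69de29bb2d1d6434b8b29ae775ad8c2e48c5391");
    assert(gitObjectId("tree", null) == "4b825dc642cb6eb9a060e54bf8d69288fbee4904");

    // A new commit only names a tree ID; no file contents are read or written.
    auto text = commitText("4b825dc642cb6eb9a060e54bf8d69288fbee4904", [],
        "A U Thor <author@example.com> 1405209600 +0000", "Spliced commit\n");
    assert(gitObjectId("commit", text.representation).length == 40);
}
```

Since a commit refers to its tree only by ID, rearranging history in this way never requires touching blobs, which is why the approach avoids moving file data around.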

    DerelictMerge generates the spec file for git-splice-subtree. The resulting repository is on BitBucket (to avoid unintentionally pinging any of the authors on GitHub).

    July 10, 2014

    Derelict and @nogc

    I’ve begun the process of adding support for @nogc in a backward-compatible way for all of the Derelict packages. For now, I’ve added it to all branches of the SDL2, GLFW3 and GL3 bindings. More will come over the next few days. I’ll also try to make sure that I’ve got all of the bindings appropriately updated, branched and tagged.

    Note that the @nogc attribute is only applied to the function pointers and not to any of the loader methods. It’s probably fine to add it to some of them (such as isLoaded). I’ll need to do a sweep at some point to add @nogc and other attributes to the loaders where appropriate, but for now it is not a priority.

    If you are maintaining a Derelictified binding and want to take advantage of backwards-compatible @nogc, you’ll need to import derelict.util.system. Then you can add @nogc to your extern(C) function pointer declarations, but it has to come before nothrow if you want it to compile with anything other than DMD 2.066. To be backwards compatible, it’s implemented as a UDA when not using 2.066, and UDAs must be declared first in the attribute list.
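A minimal sketch of the idea, under stated assumptions: the `enum nogc` stand-in below only imitates what importing derelict.util.system is described as providing on older compilers, and `da_myLibAdd`, `myLibAdd` and `fakeAdd` are hypothetical names, not part of any Derelict package.

```d
// Assumption: compilers before 2.066 do not know the @nogc attribute, so a
// placeholder symbol turns `@nogc` into an ordinary UDA there.
static if (__VERSION__ < 2066)
{
    enum nogc = 1;
}

// @nogc comes before nothrow so that the UDA form also parses on older compilers.
extern (C) @nogc nothrow
{
    // Hypothetical binding: function pointer type for a C function.
    alias da_myLibAdd = int function(int, int);
}

__gshared da_myLibAdd myLibAdd;

// Stand-in for a symbol that a loader would normally fetch from a shared library.
extern (C) @nogc nothrow int fakeAdd(int a, int b)
{
    return a + b;
}

unittest
{
    myLibAdd = &fakeAdd;
    assert(myLibAdd(2, 3) == 5);
}
```

On 2.066 and later the `static if` branch is skipped and `@nogc` is the built-in attribute; on earlier compilers the same source is meant to compile with `@nogc` acting as a plain annotation.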

    July 03, 2014

    std.typecons.wrap Improvements Continued

    My last post mentioned that a change to a Structural template caused a unittest failure. The specific details are as follows.

    dlang
        Human h1 = new Human();
        // structural upcast (two steps)
        Quack qx = h1.wrap!Quack;   // Human -> Quack
        Flyer fx = qx.wrap!Flyer;   // Quack -> Flyer
        // structural downcast (two steps)
        Quack qy = fx.unwrap!Quack; // Flyer -> Quack
        Human hy = qy.unwrap!Human; // Quack -> Human
        assert(hy is h1);
        // structural downcast (one step)
        Human hz = fx.unwrap!Human; // Flyer -> Human
        assert(hz is h1); // FAIL

    The failure was on the direct step from a Flyer straight back to a Human. The Flyer object did not wrap a Human directly; it wrapped a Quack. But the change to use Structural!T doesn't know anything about Quack. This meant that checking whether the Flyer could be cast to a Structural!Human would fail, because it was not one. Templates in D are not covariant in their arguments, so the Flyer was also not a Structural!Object, even though for all intents and purposes it is.
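For readers who want to try the scenario, here is a self-contained sketch against the stock std.typecons.wrap and unwrap in Phobos, not the modified version discussed in this post. The `Quack`, `Flyer` and `Human` definitions are assumptions modelled on the Phobos documentation example, since the snippet above does not show them. With stock Phobos the one-step downcast is expected to succeed; it is the Structural!T change that broke it.

```d
import std.typecons : wrap, unwrap;

// Assumed definitions, modelled on the std.typecons.wrap documentation example.
interface Quack
{
    int quack();
    @property int height();
}

interface Flyer
{
    @property int height();
}

class Human
{
    int quack() { return 2; }
    @property int height() { return 20; }
}

unittest
{
    Human h1 = new Human();
    Quack qx = h1.wrap!Quack;   // Human -> Quack
    Flyer fx = qx.wrap!Flyer;   // Quack -> Flyer (wraps the Quack wrapper)
    assert(fx.height == 20);    // forwarded to Human.height

    // One-step downcast: the wrappers must lead back to the original object.
    Human hz = fx.unwrap!Human; // Flyer -> Human
    assert(hz is h1);
}
```

The example can be run with `rdmd -main -unittest`. The point to notice is that `fx` never saw a `Human` directly, so unwrapping has to follow the chain of wrappers back to the source object.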

    I believe I could obtain some classinfo and set up the calls needed to jump back to the original type. However, I can have the compiler do this for me instead. So I returned to the original object approach by making classes and interfaces use Structural!Object, leaving the specific type only for structs.

    This is a workable solution since a struct will never nest. If a struct is wrapped into multiple types, the second wrap will be wrapping the Impl class created by the first.

    On a similar note, I found a request by Andrei, Class!T, which I think is essentially addressed by this work.