February 25, 2015

String Replacement Driven by CSV Data

Sometimes it is worth writing up a script that is useful only once. One such case is generating a batch of scripts that differ only by some predefined set of data. Usually you'll want to write such a program so that the data drives the execution. However, this can be fairly challenging if the scripting language isn't really meant for such a task; SQL is one example. In that case it may be desirable to have a script connect to the database and build a transaction, but that may not be an option if security policy prevents installing or running the needed script and only allows someone to run SQL.

Thanks to D's feature set, building a simple replacement script generator isn't too challenging. And if you need more complicated text processing to get at specific data, I personally find D to have decent options, even if they tend to be a little more complicated: the text-processing tools emphasize a single pass, which makes multi-pass processing a little more work.
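To illustrate the multi-pass point, here is a minimal sketch. It assumes the records are small enough to hold in memory. The sample data, the `widestComment` helper, and the aligned output format are made up for illustration. Since `csvReader` hands back a single-pass range, the records are first copied into an array with `std.array.array` so they can be walked twice.

```d
import std.algorithm : max;
import std.array : array;
import std.csv : csvReader;
import std.stdio : writeln;
import std.string : leftJustify;

struct Data {
    string number;
    string comment;
}

// Hypothetical helper: first pass, find the length of the longest comment.
size_t widestComment(Data[] records) {
    size_t width = 0;
    foreach (r; records)
        width = max(width, r.comment.length);
    return width;
}

void main() {
    auto text = "23,\"fishes that eat\"\n42,eagles";

    // Store the records so they can be traversed more than once.
    Data[] records = csvReader!Data(text).array;

    auto width = widestComment(records);

    // Second pass: print each comment padded to the common width.
    foreach (r; records)
        writeln(r.comment.leftJustify(width), " | ", r.number);
}
```

The extra work is the buffering step: once the records are in an array, any number of passes is possible, at the cost of holding all of the data at once.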

Here is a program that generates output from a simple string template, driven by CSV data.

dlang
import std.algorithm;
import std.csv;
import std.exception;
import std.file;
import std.path;
import std.range;
import std.stdio;

immutable replacementString =
`Some data [ReplaceThis] and [ReplaceThat]`;

auto replacementData =
`23,"fishes that eat"
42,eagles`;

struct Data {
    string number;
    string comment;
}

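void main(string[] args) {
    // With -version=FileInput, read the CSV from the file named on the
    // command line; this local shadows the built-in sample data above.
    version(FileInput) {
        enforce(args.length == 2,
          "Usage: ./" ~ args[0].baseName ~ " FileName");
        auto replacementData = readText(args[1]);
    }
    // Each CSV row becomes a Data value; print the filled-in template.
    foreach(record; csvReader!Data(replacementData))
        writeln(replacementString.replaceWithData(record));
}

// Substitute both placeholders in str with the fields of d.
string replaceWithData(string str, Data d) {
    return str.replace("[ReplaceThis]", d.number)
        .replace("[ReplaceThat]", d.comment);
}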

The script can be run with rdmd scriptName.d. I also provided a FileInput version that loads the data from a file instead: rdmd -version=FileInput scriptName.d dataFile.csv.
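The same pattern carries over to the SQL case mentioned at the start. The following is only a sketch under assumptions: the `inventory` table, its `id` and `note` columns, the placeholder names, and the `sqlQuote` and `fillTemplate` helpers are all hypothetical. Doubling single quotes is a bare-minimum escape, not a complete defence for untrusted input.

```d
import std.array : replace;
import std.conv : to;
import std.csv : csvReader;
import std.stdio : writeln;

// Hypothetical template; table and column names are made up.
immutable sqlTemplate =
`UPDATE inventory SET note = '[Comment]' WHERE id = [Id];`;

struct Row {
    int id;          // csvReader converts the field, rejecting non-numbers
    string comment;
}

// Double any single quote so the value stays inside the SQL string literal.
string sqlQuote(string s) {
    return s.replace("'", "''");
}

// Fill both placeholders of tmpl from one CSV row.
string fillTemplate(string tmpl, Row r) {
    return tmpl.replace("[Id]", r.id.to!string)
        .replace("[Comment]", sqlQuote(r.comment));
}

void main() {
    auto data =
`23,"fishes that eat"
42,eagle's nest`;

    // One generated statement per CSV row.
    foreach (row; csvReader!Row(data))
        writeln(sqlTemplate.fillTemplate(row));
}
```

Declaring `id` as an `int` lets the CSV reader reject rows whose first field is not a number, so only the free-text field needs quoting.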

Conclusion

This is not a claim that D is the most concise means of accomplishing this particular task, or that other languages such as Python, Ruby, or PHP aren't fully capable of performing the same task just as easily or more easily. What I'd like to get across is that such a simple task is pretty straightforward even in a language that provides machine-level access such as pointers.