Here’s the challenge: translate Cantabile into another language in one afternoon. Challenge Accepted!

In these early builds of Cantabile 3 I’m trying to address all the mundane tasks that often get left until the end, by which point they have turned into a lot of work.

One such area is language translation. When I received an enquiry on Saturday evening asking if Cantabile would be translated into other languages, it immediately struck me as something I should get on top of now. I hadn’t planned to tackle it yet, but I decided to spend my Sunday afternoon on it and see how far I could get.

As I don’t speak any languages other than English, I’m going to need to rely on professional translation services and/or volunteers to do the translations. For this to work it really needs to be done in a way that:

  • Is not too burdensome during development (so I can stay on top of it)
  • Is automated
  • Makes it as easy as possible for the translators

I also needed to pick a language for this first translation. Given the name “Cantabile” the choice was obvious: Italian.

Keeping Development Simple

Anyone who’s worked on software that needs to be translated will be familiar with the tiresome process that’s typically used… give each string an Id, define the Id in a file, add the Id and the associated string to a string table in another file and then where you actually need the string call some function to load it. Given how many strings are in a product like Cantabile that’s a lot of work.

(If you’re not familiar with the term ‘string’ in this context it’s what programmers call a piece of text — a string of characters).

Rather than use an Id for each string I decided to use the string itself, and wrote a C# extension method that does the translation. All I need to do now is suffix every translatable string with .T(), e.g.:

"Insert Plugin".T()

That’s easy enough… I just need to remember to do it while I’m writing the code.
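
For readers who haven't seen the technique, here is a minimal sketch of what such an extension method could look like. It is not Cantabile's actual code: the Translation class, its dictionary and the Load() method are hypothetical names chosen for illustration, and the sketch assumes the translations have already been read into memory.

```csharp
using System.Collections.Generic;

// Hypothetical sketch: the class name, the lookup table and Load() are
// illustrative assumptions, not Cantabile's real implementation.
public static class Translation
{
    // Maps English source text to the text of the active language.
    private static Dictionary<string, string> _table = new Dictionary<string, string>();

    // Replace the active translation table (e.g. after reading a language file).
    public static void Load(IDictionary<string, string> table)
    {
        _table = new Dictionary<string, string>(table);
    }

    // Extension method, so a literal can be written as "Insert Plugin".T().
    // Falls back to the English text when there is no usable translation.
    public static string T(this string english)
    {
        if (english == null)
            return null;

        return _table.TryGetValue(english, out var translated) && !string.IsNullOrEmpty(translated)
            ? translated
            : english;
    }
}
```

With a table that maps "Instruments && Effects" to "Strumenti ed effetti", the call "Instruments && Effects".T() would return the Italian text, while any string without an entry comes back unchanged. Falling back to English means a missing translation never breaks the UI.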

Retrofitting the Existing Code Base

While adding .T() is simple enough, it’s not something I’d been doing so far. To knock this over quickly I wrote a simple program that scanned the entire Cantabile code base and listed every string it found. It was then just a matter of picking out the ones that should be translated and adding the suffix.

Of about 3,000 strings almost 900 need to be translated.

Automating the Process

Because Cantabile is under active development the set of strings it contains is going to be constantly changing. One thing I didn’t want to have to do is manually update a list of these strings.

So I wrote another program that scans the code base, this time looking only for strings with a .T() suffix, and writes them out to a JSON file.

This generated file looks something like this:

{
    "Column Headers":
    {
        "contexts": [
            ".\\Cantabile\\Controls\\Composited\\ColumnHeaders.cs",
        ],
    },
    "Instruments && Effects":
    {
        "contexts": [
            ".\\Cantabile\\Controls\\Composited\\ContentPanel.cs",
        ],
    },
    "Level Meter":
    {
        "contexts": [
            ".\\Cantabile\\Controls\\Composited\\LevelMeter.cs",
        ],
    },
    "MIDI Activity Indicator":
    {
        "contexts": [
            ".\\Cantabile\\Controls\\Composited\\MidiActivityIndicator.cs",
        ],
    },

You’ll notice the English term (e.g. “Column Headers”) along with some context information, i.e. the name of the C# source file (or files) where the string was found. This is not critical to the process, but it gives translators some extra information about where and how the string is used.
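
A scanner of this kind can be sketched in a few lines. The version below is an assumption-laden illustration rather than the actual tool: StringExtractor and its methods are hypothetical names, it uses a regular expression instead of a real C# parser (so verbatim @"..." strings are not handled and escape sequences are kept as written), and it uses System.Text.Json, which may not be the JSON library Cantabile uses.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text.Encodings.Web;
using System.Text.Json;
using System.Text.Json.Nodes;
using System.Text.RegularExpressions;

// Hypothetical sketch of a .T() string extractor; all names are illustrative.
public static class StringExtractor
{
    // A regular (non-verbatim) string literal immediately followed by .T().
    private static readonly Regex Translatable =
        new Regex(@"""((?:[^""\\\r\n]|\\.)*)""\s*\.T\(\)", RegexOptions.Compiled);

    // Yields (path, source text) pairs for every .cs file under 'root'.
    public static IEnumerable<KeyValuePair<string, string>> ReadSources(string root)
    {
        foreach (var path in Directory.EnumerateFiles(root, "*.cs", SearchOption.AllDirectories))
            yield return new KeyValuePair<string, string>(path, File.ReadAllText(path));
    }

    // Maps each translatable string to the set of files it was found in.
    public static SortedDictionary<string, SortedSet<string>> Extract(
        IEnumerable<KeyValuePair<string, string>> files)
    {
        var found = new SortedDictionary<string, SortedSet<string>>(StringComparer.Ordinal);
        foreach (var file in files)
        {
            foreach (Match m in Translatable.Matches(file.Value))
            {
                var text = m.Groups[1].Value;
                if (!found.TryGetValue(text, out var contexts))
                    found[text] = contexts = new SortedSet<string>(StringComparer.Ordinal);
                contexts.Add(file.Key);
            }
        }
        return found;
    }

    // Writes the result in the "string -> { contexts: [...] }" shape shown above.
    public static string ToJson(SortedDictionary<string, SortedSet<string>> found)
    {
        var root = new JsonObject();
        foreach (var entry in found)
        {
            var contexts = new JsonArray();
            foreach (var path in entry.Value)
                contexts.Add(JsonValue.Create(path));
            root[entry.Key] = new JsonObject { ["contexts"] = contexts };
        }

        // The relaxed encoder keeps characters such as '&' readable in the output.
        var options = new JsonSerializerOptions
        {
            WriteIndented = true,
            Encoder = JavaScriptEncoder.UnsafeRelaxedJsonEscaping
        };
        return root.ToJsonString(options);
    }
}
```

A build step could then call StringExtractor.ToJson(StringExtractor.Extract(StringExtractor.ReadSources("."))) and write the result to disk. Sorting the keys and contexts keeps the generated file stable between runs, which makes diffs easier to review.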

I now had a list of all the strings that need translating. Time to actually translate them.

Machine Translation

Machine translation is far from perfect but it’s a really good start that I’m sure will reduce the workload for anyone working on a translation. I read up about Google Translate and found they have a programmatic web API. It’s cheap and it’s fast. Time for another utility program.

This program reads the JSON file above, passes each English string to Google Translate and updates the file with the translated text. It now looks like this:

"Column Headers": 
{
    "contexts": 
    [
        ".\\Cantabile\\Controls\\Composited\\ColumnHeaders.cs"
    ],
    "translation": "Le intestazioni di colonna",
    "machine": true
},
"Instruments && Effects": 
{
    "contexts": 
    [
        ".\\Cantabile\\Controls\\Composited\\ContentPanel.cs"
    ],
    "translation": "Strumenti ed effetti",
    "machine": true
},

You’ll notice the translated text has been added as well as an additional setting “machine”: true indicating this was a machine translation that needs to be reviewed by a human translator (who can then delete this setting).
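
The update step might look roughly like the sketch below. To avoid guessing at the details of the Google Translate API, the call to the translation service is represented by a caller-supplied delegate. MachineTranslator, Parse and FillMissing are hypothetical names, and System.Text.Json is an assumption about the JSON library.

```csharp
using System;
using System.Text.Json;
using System.Text.Json.Nodes;

// Hypothetical sketch of the machine-translation pass; all names are illustrative.
public static class MachineTranslator
{
    // Parses a translation file, tolerating trailing commas like those in the
    // generated listing.
    public static JsonObject Parse(string json)
    {
        var options = new JsonDocumentOptions { AllowTrailingCommas = true };
        return JsonNode.Parse(json, null, options)?.AsObject() ?? new JsonObject();
    }

    // Adds "translation" and "machine": true to every entry that has no
    // translation yet. 'translate' stands in for a real translation service.
    // Returns the number of entries that were filled in.
    public static int FillMissing(JsonObject file, Func<string, string> translate)
    {
        int added = 0;
        foreach (var entry in file)
        {
            if (entry.Value is JsonObject item && !item.ContainsKey("translation"))
            {
                item["translation"] = translate(entry.Key);
                item["machine"] = true; // flag for review by a human translator
                added++;
            }
        }
        return added;
    }
}
```

Entries that already carry a translation are left untouched, so anything a human translator has reviewed is never overwritten by a later machine pass.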

I now had a complete but flawed translation. Even I know “Circa” isn’t the correct translation for “About” in this context, but that’s something I need to leave to the translators.

Merging

The final piece of the puzzle was a tool that can take two JSON files — the machine generated listing of all strings found in the code base and an existing translation — merge them together and then automatically translate the new strings.

Now when I do a build any new strings will be automatically extracted, translated, and marked for human review.
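
One way to picture the merge is the sketch below. It makes a few assumptions that the description above doesn't spell out: strings that no longer appear in the code are dropped, contexts are always taken from the fresh scan, and the translation service is again a caller-supplied delegate. TranslationMerger is a hypothetical name.

```csharp
using System;
using System.Text.Json.Nodes;

// Hypothetical sketch of the merge step; all names are illustrative.
public static class TranslationMerger
{
    // 'extracted' is the freshly generated listing, 'existing' is the current
    // translation file, and 'translate' stands in for the translation service.
    public static JsonObject Merge(JsonObject extracted, JsonObject existing,
                                   Func<string, string> translate)
    {
        var merged = new JsonObject();
        foreach (var entry in extracted)
        {
            var item = new JsonObject();

            // Contexts come from the fresh scan. A node can only have one parent,
            // so copy it by round-tripping through its JSON text.
            if (entry.Value is JsonObject fresh && fresh["contexts"] is JsonNode contexts)
                item["contexts"] = JsonNode.Parse(contexts.ToJsonString());

            if (existing[entry.Key] is JsonObject old && old["translation"] is JsonNode translation)
            {
                // Keep the existing translation and its review status.
                item["translation"] = translation.GetValue<string>();
                if (old["machine"] is JsonNode machine && machine.GetValue<bool>())
                    item["machine"] = true;
            }
            else
            {
                // New string: machine translate it and mark it for human review.
                item["translation"] = translate(entry.Key);
                item["machine"] = true;
            }

            merged[entry.Key] = item;
        }
        return merged;
    }
}
```

Because the loop is driven by the extracted listing, the merged file always mirrors the current code base, and human-reviewed translations survive every rebuild.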

Challenge Met!

I’m pretty happy with this for a hectic afternoon’s worth of work. As mentioned, it’s far from perfect and needs some real translators to work on it. For some of the major languages I will eventually pay for professional translation. In the meantime, if you’d like to contribute to a translation, please get in touch.