RFC: Proposal on improving how we share translations with upstream

This is a followup to ConfigCamp where we discussed how to improve the
sharing of downstream translations with the community. Here is a quick
summary of the issue:

  • Red Hat employs a group of professional translators who translate
    projects for communities which Red Hat participates in. This
    professional translation team would like to use Zanata, and I cannot
    change that.
  • The Foreman community is using Transifex. We discussed changing from
    Transifex to Zanata, but we felt that it was not the correct move for
    the community.
  • The initial attempt at sharing translations [1] was flawed
    because it relied on pull requests to the code base, which interfered
    with the translation review process. The net result is that some
    translations did not make it upstream.
  • Some translations did make it through the old process, and we noticed
    that only about 4% of strings which Red Hat Translators changed from
    upstream were rejected.

</summary>

While in Ghent, Claer suggested that we look at an approach where we use
an API [2] to mutate individual strings. I have a prototype script
working, and so I would like to summarize the process we came up with in
Ghent:

  1. On a recurring basis, I will run a script which loads up newly
    translated strings from Zanata into Transifex only if there is no
    translation in Transifex. The new translations will be marked as not
    reviewed so that any translation teams which use this flag in their
    workflow will be unaffected. Already translated strings in Transifex
    will not be changed.

  2. After every Satellite release, I will generate a summary listing of
    translations which have been changed by Red Hat translators. I will post
    these to the dev and users lists. I will ask for folks to review the
    changes over a one-to-two-week period. After any changes have been made, I
    will run a script and load those changed strings into Transifex. Again,
    the new translations will be marked as not reviewed so that any
    translation teams which use this flag in their workflow will be unaffected.
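
As a rough illustration of the selection logic in the two steps above, here
is a minimal sketch. It is not the prototype script: the names Upload,
new_strings and changed_strings are hypothetical, and it assumes each side
has already been exported to a plain dict mapping a source string to its
translation (empty string meaning "untranslated"). The actual calls to
Zanata and Transifex are deliberately left out.

```python
from typing import Dict, List, NamedTuple


class Upload(NamedTuple):
    """One string to push to Transifex (hypothetical record type)."""
    msgid: str
    translation: str
    reviewed: bool  # always False here, so reviewers can find these strings


def new_strings(zanata: Dict[str, str], transifex: Dict[str, str]) -> List[Upload]:
    """Step 1: Zanata translations for strings Transifex has not translated."""
    return [
        Upload(msgid, text, reviewed=False)
        for msgid, text in sorted(zanata.items())
        if text and not transifex.get(msgid)
    ]


def changed_strings(zanata: Dict[str, str], transifex: Dict[str, str]) -> List[Upload]:
    """Step 2: strings translated on both sides where the Zanata text differs."""
    return [
        Upload(msgid, text, reviewed=False)
        for msgid, text in sorted(zanata.items())
        if text and transifex.get(msgid) and transifex[msgid] != text
    ]


if __name__ == "__main__":
    zanata = {"Host": "Hôte", "Save": "Enregistrer", "Cancel": ""}
    transifex = {"Host": "Hote", "Save": ""}
    print(new_strings(zanata, transifex))      # candidates for the recurring run
    print(changed_strings(zanata, transifex))  # candidates for the review summary
```

The step 1 list would be uploaded on each recurring run, while the step 2
list is what would go into the summary posted for review before anything
is loaded.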

We believe this is a better process than before because:

a) It uses Transifex as the main workflow, instead of PO files.
b) It ensures that new strings are loaded upstream very often.
c) Although initial estimates show only a 4% error rate, there are two
chances to catch them: (1) during the initial review cycle and (2) by
looking for unreviewed strings in Transifex.

Can you please respond with a +1/-1? If -1, please let me know the
issues you see in the process so that we can address them.

Thanks!

– bk

[1] https://groups.google.com/forum/#!topic/foreman-users/mmx3HMihvt0
[2] https://github.com/jakul/python-transifex

+1 Sounds like a great system


Be sure to catch cases where a string has been changed both by you and
in Transifex in parallel. In that case I'd suggest not overwriting it
again and instead review the updated string in Transifex.
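
One way to detect that case is a three-way comparison against a baseline.
The sketch below is only an illustration under assumptions: it presumes a
snapshot of the Transifex translations was kept from the moment the strings
were pulled downstream (the "baseline"), and the function name
split_changes and the dict-based inputs are hypothetical.

```python
from typing import Dict, List, Tuple


def split_changes(
    baseline: Dict[str, str],
    zanata: Dict[str, str],
    transifex: Dict[str, str],
) -> Tuple[Dict[str, str], List[str]]:
    """Return (safe_to_upload, needs_manual_review)."""
    safe: Dict[str, str] = {}
    conflicts: List[str] = []
    for msgid, ours in zanata.items():
        base = baseline.get(msgid, "")
        theirs = transifex.get(msgid, "")
        if ours == base or ours == theirs:
            continue  # nothing changed downstream, or both sides already agree
        if theirs != base:
            conflicts.append(msgid)  # changed on both sides: do not overwrite
        else:
            safe[msgid] = ours  # only the downstream side changed
    return safe, sorted(conflicts)
```

Strings in the second list would be left alone in Transifex and reviewed
there by hand.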

On 24/02/16 22:28, Bryan Kearney wrote:
> 2) After every Satellite release, I will generate a summary listing of
> translations which have been changed by Red Hat translators. I will post
> these in the dev and users list. I will ask for folks to review the
> changes over an 1-2 two week period. After any changes have been made, I
> will run a script and load those changed strings into Transifex. Again,
> the new translations will be marked as not reviewed so that any
> translations which are using this flag in their workflow will be unaffected.


Dominic Cleal
dominic@cleal.org

4% before review sounds like an acceptable compromise for the quantity of
extra translations this would bring to the community. +1

Note also that the 4% was measured over a long time frame. If the empty strings
are uploaded often, this error rate should go down.

Big +1 for writing the script :)

Claer

On Wed, Feb 24 2016 at 57:23, Greg Sutcliffe wrote:

> 4% before review sounds like an acceptable compromise for the quantity of
> extra translations this would bring to the community. +1

Just checking, does anyone have concerns with this approach?

Thanks!

– bk

On 02/25/2016 03:35 AM, Claer wrote:
> On Wed, Feb 24 2016 at 57:23, Greg Sutcliffe wrote:
>
>> 4% *before* review sounds like an acceptable compromise for the quantity of
>> extra translations this would bring to the community. +1
>
> Note also that, 4% was with a big time frame. If the empty strings are uploaded
> often, this error ratio will go down.
>
> Big +1 for writing the script :)
>
> Claer