GUIDES Localization Automation
- Contents
- 1. Introduction
- 2. Localization Service Methods
- 3. Configuring Projects
- 3.1. Config Options
- 4. Localization Service Workflows
- 5. Advanced Topics
- 5.1. Performing Methods for a Filtered Set of Languages
- 5.2. Non-translatable Strings
- 5.3. Metrics Reports
- 5.4. Translation Audit Reports
- 5.5. Language Diff Reports
- 5.6. Usage Reports
- 5.7. Initializing Resource Strings
- 5.8. Different Classes of Files
1. Introduction
UIZE provides a system for automating various processes relating to the localization of a codebase.
1.1. Localization as a Service
UIZE exposes localization automation processes through a service, whose interface is defined in the Uize.Services.Loc abstract class.
The Uize.Services.Loc service module defines the following localization service methods...
- metrics - generates a metrics report for the primary language resource strings of a project
- export - gathers strings from codebase resource files and exports them to consolidated language resources files, for each of the project's supported languages
- import - distributes strings from the consolidated language resources files to codebase resource files, for each of the project's supported languages
- exportJobs - exports translation job files from the consolidated language resources files, for each of the project's translatable languages
- importJobs - imports translated strings from translation job files and merges them into the consolidated language resources files, for each of the project's translatable languages
- usage - generates a usage report that contains details about usage of resource strings throughout a project, including information about unreferenced resource strings
- extract - extracts strings from the project's codebase and generates codebase resource files for the project's primary language (may not be supported for a project)
1.1.1. Localization Service Adapter
As a convenience, UIZE provides an adapter base class for the Uize.Services.Loc service that can be applied, with some protected method overrides and configuration, to a variety of different types of projects.
While it is possible to implement an adapter for the localization service from scratch, many different types of projects have enough in common that it is beneficial to share as much of the adapter implementation as possible. This is where the Uize.Services.LocAdapter module comes in. This module can be subclassed to create adapters for different types of projects with minimal project-type-specific code.
1.2. Accessed Through a Build Script
The various localization service methods can be accessed using the Uize.Build.Loc build script.
1.2.1. Running the Build Script
The localization build script can be run in NodeJS using the following command...
SYNTAX
node [pathToUize]build.js Uize.Build.Loc method=[methodName] project=[projectName]
1.2.1.1. Parameters
All localization service methods support the following set of common parameters...
1.2.1.1.1. project
The project parameter is used to specify the project for which the specified localization service method should be executed.
The value specified for the project parameter should match one of the keys in the moduleConfigs ['Uize.Build.Loc'].projects object of the uize-config.json file.
1.2.1.1.1.1. Executing a Method For All Projects
To execute a localization service method for all projects listed in the config, one can either omit the project parameter or one can specify the special "*" wildcard value.
EXAMPLES
node [pathToUize]build.js Uize.Build.Loc method=[methodName]
node [pathToUize]build.js Uize.Build.Loc method=[methodName] project=*
1.2.1.1.2. method
The method parameter is used to specify the localization service method that should be executed for the specified project.
The value of the method parameter should be the name of any of the localization service methods supported by the localization service, such as...
- metrics
- export
- import
- exportJobs
- importJobs
- usage
- extract
1.2.1.1.3. console
The console parameter is used to specify the amount of information that should be logged to the console while the localization service method is being executed.
The console parameter supports the following possible values...
- silent - no information will be logged to the console
- summary - only a summary will be logged to the console once execution of the localization service method is complete
- verbose - information will be logged to the console for every step that is performed while the localization service method is being executed
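The common parameters above can be assembled programmatically when the build script is driven from another Node.js script. The following is a minimal sketch, not part of UIZE: the buildLocArgs helper and its option names are hypothetical, and it assumes that the console parameter is passed in the same name=value form as method and project.

```javascript
// Hypothetical helper (not part of UIZE): assembles the argument list for the
// Uize.Build.Loc build script from a plain options object.
function buildLocArgs ({pathToUize = '', method, project, console: consoleMode}) {
  if (!method) throw new Error ('the method parameter is required');
  const args = [pathToUize + 'build.js', 'Uize.Build.Loc', 'method=' + method];
  // Omitting project (or passing "*") runs the method for all configured projects.
  if (project) args.push ('project=' + project);
  // Assumption: console is passed as name=value, like the other parameters.
  if (consoleMode) args.push ('console=' + consoleMode);
  return args;
}

// Print the full command line for one invocation.
console.log (
  ['node'].concat (buildLocArgs ({method:'export', project:'Uize', console:'summary'})).join (' ')
);
```

The returned list is suitable for passing to child_process.spawnSync ('node', args), which avoids shell quoting issues with the "*" wildcard.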
2. Localization Service Methods
The Uize.Services.Loc service module defines a number of different localization service methods.
2.1. metrics
Generates a metrics report for a project using the metrics method of the localization service.
SYNTAX
node [pathToUize]build.js Uize.Build.Loc method=metrics project=[projectName]
2.1.1. Generates a Metrics Report File
The metrics method analyzes the resource strings for the project's primary language and generates a JSON format metrics report file with the path...
[workingFolder]/metrics/[primaryLanguage].json
Using this scheme, if the value of the workingFolder config option is 'loc', and the value of the primaryLanguage config option is 'en-US', then the path for the metrics file would be...
loc/metrics/en-US.json
2.1.2. Generates Strings Info Files
In addition to generating the strings metrics JSON file, the metrics method also generates JSON and CSV format strings info files for the primary language.
These files are output to the following paths...
[workingFolder]/strings-info/[primaryLanguage].csv
[workingFolder]/strings-info/[primaryLanguage].json
Using this scheme, if the value of the workingFolder config option is 'loc', and the value of the primaryLanguage config option is 'en-US', then the paths for the strings info files would be...
loc/strings-info/en-US.csv
loc/strings-info/en-US.json
2.2. export
Exports the resource strings from the codebase resource files of a project to the master resource files.
SYNTAX
node [pathToUize]build.js Uize.Build.Loc method=export project=[projectName]
The master resource files are output to the following paths...
[workingFolder]/[primaryLanguage].json
[workingFolder]/[pseudoLocale].json
[workingFolder]/[translatableLanguage1].json
[workingFolder]/[translatableLanguage2].json
...
[workingFolder]/[translatableLanguageN].json
EXAMPLE
loc/en-US.json
loc/en-ZZ.json
loc/en-GB.json
loc/fr-FR.json
loc/de-DE.json
In the above example, the workingFolder config option is set to 'loc', the primaryLanguage config option is set to 'en-US', the pseudoLocale config option is set to 'en-ZZ', and the configured translatable languages are 'en-GB' (English for Great Britain), 'fr-FR' (French), and 'de-DE' (German).
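The path scheme for the master resource files can be expressed as a small function. This is only a sketch of the naming scheme described above: the masterResourcePaths name and its argument shape are assumptions, not part of the Uize.Build.Loc API.

```javascript
// Sketch only: derives the master resource file paths from a project's config.
// The function name and argument shape are illustrative assumptions.
function masterResourcePaths (workingFolder, project) {
  const locales = [project.primaryLanguage, project.pseudoLocale].concat (
    // The primary language is never treated as a translatable language.
    project.languages.filter (language => language !== project.primaryLanguage)
  );
  return locales.map (locale => workingFolder + '/' + locale + '.json');
}

console.log (
  masterResourcePaths ('loc', {
    primaryLanguage:'en-US',
    pseudoLocale:'en-ZZ',
    languages:['en-GB', 'fr-FR', 'de-DE']
  }).join ('\n')
);
```

With the configuration values from the example above, this prints the same five paths that are listed there.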
2.2.1. Parameters
The export method supports the following parameters...
2.2.1.1. initNonTranslatable
By default, the value for each non-translatable string is initialized to the corresponding value of the string from the primary language, but only if the non-translatable string is currently blank.
If this behavior is not suitable, one of the other available initialization modes can be specified using the initNonTranslatable parameter. The following initialization modes are supported...
- primary-if-blank (default) - initializes the values of only blank non-translatable strings to the corresponding values of the strings from the primary language
- primary - initializes the values of all non-translatable strings to the corresponding values of the strings from the primary language, regardless of their current values
- blank - initializes the values of all non-translatable strings to blank (empty string), regardless of their current values
- never - leaves the values of all non-translatable strings as is, keeping their current values
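The four initialization modes can be summarized in code. The sketch below is illustrative only: the initNonTranslatableValue function and its signature are assumptions made for this example and do not reflect how the export method is actually implemented.

```javascript
// Sketch of the four initNonTranslatable modes, applied to a single
// non-translatable string. Name and signature are illustrative assumptions.
function initNonTranslatableValue (mode, currentValue, primaryValue) {
  switch (mode || 'primary-if-blank') {
    case 'primary-if-blank':
      // Only fill in the primary language value when nothing is there yet.
      return currentValue == null || currentValue === '' ? primaryValue : currentValue;
    case 'primary':
      return primaryValue;
    case 'blank':
      return '';
    case 'never':
      return currentValue;
    default:
      throw new Error ('unsupported initNonTranslatable mode: ' + mode);
  }
}

console.log (initNonTranslatableValue ('primary-if-blank', '', 'https://example.com'));
```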
2.2.1.2. languages
An optional, comma-separated list of one or more of the project's translatable languages for which the export operation should be performed.
For more detailed information, consult the section Performing Methods for a Filtered Set of Languages.
2.3. import
Imports the resource strings from the master resource files of a project back into the codebase resource files.
SYNTAX
node [pathToUize]build.js Uize.Build.Loc method=import project=[projectName]
Executing the import method will have the effect of overwriting all the codebase resource files for the pseudo-locale and the translatable languages so that their contents reflect the contents of the master resource files. The import method will also create new codebase resource files, as necessary, if they did not previously exist.
2.3.1. Parameters
The import method supports the following parameters...
2.3.1.1. languages
An optional, comma-separated list of one or more of the project's translatable languages for which the import operation should be performed.
For more detailed information, consult the section Performing Methods for a Filtered Set of Languages.
2.4. exportJobs
Exports resource strings for translation from the master resource files of a project to translation job files.
SYNTAX
node [pathToUize]build.js Uize.Build.Loc method=exportJobs project=[projectName]
2.4.1. Parameters
The exportJobs method supports the following parameters...
2.4.1.1. filter
The exportJobs method supports the ability to filter the strings that are exported to the translation job files, using the optional filter parameter.
SYNTAX
node [pathToUize]build.js Uize.Build.Loc method=exportJobs project=[projectName] filter=[filterName]
2.4.1.1.1. filterName
The following filters are supported for the exportJobs method...
- missing (default) - only the translatable strings for which translations are missing for a translatable language will be exported in the translation job file for the language
- translated - only the translatable strings for which translations exist for a translatable language will be exported in the translation job file for the language
- all - all translatable strings for a translatable language will be exported in the translation job file for the language, whether they have existing translations or not
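The effect of the three filters can be illustrated with a small function that selects the strings for one language's job file. This is a sketch under stated assumptions: the function name, the flat key-to-value string maps, and the treatment of a blank or absent translation as "missing" are all illustrative choices, not the actual exportJobs implementation.

```javascript
// Sketch only: selects which translatable strings go into the translation job
// for one language. Treating a blank or absent translation as "missing" is an
// assumption made for this example.
function filterStringsForJob (primaryStrings, translations, filter = 'missing') {
  if (!['missing', 'translated', 'all'].includes (filter)) {
    throw new Error ('unsupported filter: ' + filter);
  }
  const job = {};
  for (const key of Object.keys (primaryStrings)) {
    const isTranslated = Boolean (translations [key]);
    if (
      filter === 'all' ||
      (filter === 'missing' && !isTranslated) ||
      (filter === 'translated' && isTranslated)
    ) {
      job [key] = primaryStrings [key];
    }
  }
  return job;
}

const primary = {TITLE:'Settings', SAVE:'Save'};
const french = {TITLE:'Paramètres', SAVE:''};
console.log (filterStringsForJob (primary, french));
```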
2.4.1.2. languages
An optional, comma-separated list of one or more of the project's translatable languages for which the exportJobs operation should be performed.
For more detailed information, consult the section Performing Methods for a Filtered Set of Languages.
2.4.2. Only Translatable Languages
The exportJobs method generates translation job files for only the translatable languages configured for the project.
In particular, this means that translation job files are never generated for the primary language or pseudo-locale configured for the project.
2.4.3. Only Translatable Strings
The translation job files generated by the exportJobs method contain only translatable strings.
All non-translatable strings will be excluded from the translation job files. For more detailed information, consult the section Non-translatable Strings.
2.4.4. File Format
The exportJobs method will write translation job files in the translation job file format that is configured for the project.
NOTES
- before executing the exportJobs method, make sure that the master resource files are up-to-date by executing the export method, if necessary
2.5. importJobs
Imports translated resource strings from translation job files into the master resource files of a project.
SYNTAX
node [pathToUize]build.js Uize.Build.Loc method=importJobs project=[projectName]
2.5.1. Parameters
The importJobs method supports the following parameters...
2.5.1.1. languages
An optional, comma-separated list of one or more of the project's translatable languages for which the importJobs operation should be performed.
For more detailed information, consult the section Performing Methods for a Filtered Set of Languages.
2.6. auditTranslations
Performs an audit of the resource strings in the master resource files for all the translatable languages of the project, to detect potential issues of inconsistent translations and identical translations.
EXAMPLE
node [pathToUize]build.js Uize.Build.Loc method=auditTranslations project=[projectName]
The auditTranslations method analyzes the resource strings for each of the translatable languages of the project, comparing them to the corresponding values from the primary language to detect potential issues of inconsistent translations and identical translations, and then writes a .json report file for each translatable language with a path of the form...
[workingFolder]/[projectName]/translation-audit/[translatableLanguage].json
2.6.1. Parameters
The auditTranslations method supports the following parameters...
2.6.1.1. languages
An optional, comma-separated list of one or more of the project's translatable languages for which the auditTranslations operation should be performed.
For more detailed information, consult the section Performing Methods for a Filtered Set of Languages.
For a more detailed discussion and for information on the contents of the reports, consult the section Translation Audit Reports.
NOTES
- before executing the auditTranslations method, make sure that the master resource files are up-to-date by executing the export method, if necessary
2.7. diffLanguages
Performs a diff between the resource strings in the master resource files of the two specified languages.
EXAMPLE
node [pathToUize]build.js Uize.Build.Loc method=diffLanguages project=... languageA=... languageB=...
The diffLanguages method loads the resource strings from the master resource files for each of the two specified languages, performs a diff comparison of the resource strings for the two languages, and then writes .json and .csv report files containing details regarding the differences, with paths of the form...
[workingFolder]/[projectName]/language-diffs/[languageA]-vs-[languageB].json
[workingFolder]/[projectName]/language-diffs/[languageA]-vs-[languageB].csv
2.7.1. Parameters
The diffLanguages method supports the following parameters...
2.7.1.1. languageA
A locale code string, specifying the first of the two languages whose resource strings should be diffed.
The default value for this parameter, if not specified, is the locale code of the primary language configured for the project.
2.7.1.2. languageB
A locale code string, specifying the second of the two languages whose resource strings should be diffed.
The default value for this parameter, if not specified, is the locale code of the primary language configured for the project.
For a more detailed discussion and for information on the contents of the reports, consult the section Language Diff Reports.
NOTES
- before executing the diffLanguages method, make sure that the master resource files are up-to-date by executing the export method, if necessary
2.8. usage
Generates a usage report file for the project, containing information about resource string references in the project's code, along with an assessment of unreferenced strings.
SYNTAX
node [pathToUize]build.js Uize.Build.Loc method=usage project=[projectName]
The usage method generates a JSON format usage report file with the path...
[workingFolder]/metrics/usage-report.json
In order for the usage method to work reliably, the localization adapter subclass for the project must implement the getReferencingCodeFiles and getReferencesFromCodeFile instance methods.
NOTES
- see also the extract localization service method
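The core idea behind the assessment of unreferenced strings can be sketched as a comparison between the keys defined in the resource files and the keys referenced from code. The function below is purely illustrative: its name, inputs, and the shape of the returned object are assumptions, and it does not reproduce the actual format of the usage report or the signatures of the adapter methods.

```javascript
// Sketch only: given the keys defined in the primary language resource files
// and the keys referenced from code, reports which strings are unreferenced.
// The result shape is an assumption, not the actual usage report format.
function summarizeUsage (definedKeys, referencedKeys) {
  const referenced = new Set (referencedKeys);
  return {
    referenced: definedKeys.filter (key => referenced.has (key)),
    unreferenced: definedKeys.filter (key => !referenced.has (key))
  };
}

console.log (summarizeUsage (['TITLE', 'SAVE', 'LEGACY_HINT'], ['SAVE', 'TITLE', 'SAVE']));
```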
2.9. extract
Extracts resource strings from references inside the code files of a project and generates one or more codebase resource files.
SYNTAX
node [pathToUize]build.js Uize.Build.Loc method=extract project=[projectName]
This method can be useful for projects where developers typically create resource strings by first referencing them in translation calls in the code, rather than by first creating entries in resource files.
The extract method is not implemented by the localization service adapter base class: the base class's version of this method just throws an error. Therefore, the localization adapter subclass for the project must implement this method. The getReferencingCodeFiles and getReferencesFromCodeFile instance methods that need to be implemented by an adapter subclass in order to support the usage method can be used in one's implementation of the extract method.
NOTES
- see also the usage localization service method
3. Configuring Projects
The behavior of the Uize.Build.Loc build script is configurable to support one or more projects of potentially different types.
Config options for the Uize.Build.Loc build script should be placed inside the uize-config.json config file, under the path moduleConfigs ['Uize.Build.Loc'], as illustrated by the example below...
EXAMPLE
{
  // other config options
  // other config options
  // other config options
  // other config options
  moduleConfigs:{
    'Uize.Build.Loc':{
      workingFolder:'loc',
      projects:{
        Uize:{
          serviceAdapter:'Uize.Services.LocAdapter.Uize',
          rootFolderPath:'site-source/js',
          languages:[
            'en-US',
            'de-DE',
            'fr-FR',
            'ja-JP',
            'nl-NL',
            'ru-RU',
            'zh-CN'
          ],
          primaryLanguage:'en-US',
          pseudoLocale:'en-ZZ'
        }
      }
    }
  }
}
The above example shows the configuration of the Uize.Build.Loc build script for the UIZE project.
3.1. Config Options
The Uize.Build.Loc build script supports a number of configuration options.
3.1.1. Working Folder
The workingFolder config option specifies the path, relative to the current working directory, to the working folder for the Uize.Build.Loc build script.
The working folder is used to store the output of the various localization service methods.
3.1.2. Per Project Configuration Options
The localization service supports a number of per project configuration options.
3.1.2.1. serviceAdapter
The service adapter can be configured for a project by specifying a module name for the serviceAdapter config option.
EXAMPLE
serviceAdapter:'MyNamespace.LocServiceAdapters.MobileAppAndroid'
Typically, each project has its own localization service adapter subclass, but it is possible for multiple projects to share the same service adapter module if they are all of the same type but differ in other aspects of their configuration.
3.1.2.2. rootFolderPath
The rootFolderPath config option lets you specify the path to the root folder under which all the project's codebase resource files can be found.
EXAMPLE
rootFolderPath:'~/git-repos/my-project/res'
If a relative path is specified for the rootFolderPath option, the path should be relative to the current directory at the time that the localization service scripts are run.
3.1.2.3. languages
The languages config option lets you specify the translatable languages of the project, for which translation should be managed by the localization service.
It is recommended that the value specified for the languages option be an array of BCP 47 locale codes, containing both the language and region codes.
EXAMPLE
languages:[
  'de-DE', // German
  'fr-FR', // French
  'ja-JP', // Japanese
  'nl-NL', // Dutch
  'ru-RU', // Russian
  'zh-CN' // Chinese
]
The languages list may redundantly contain the primary language that is configured for the project in the separate primaryLanguage config option, but it is not required and it is recommended to omit the primary language from this list. In any event, the primary language is not considered to be a translatable language.
3.1.2.4. brandLanguages
Lets you configure the translatable languages supported per brand.
It is not uncommon, in situations where a project supports multiple brands, for those brands to be associated with markets in different regions, with different regional languages for those regions. In projects that have brand-specific resource strings, the brandLanguages config option lets you configure different sets of translatable languages on a per brand basis.
EXAMPLE
brandLanguages:{
  fooBrand:[
    'de-DE', // German
    'fr-FR', // French
    'en-GB' // English (Great Britain)
  ],
  barBrand:[
    'es-MX', // Spanish (Mexico)
    'pt-BR' // Portuguese (Brazil)
  ]
}
In the above example, the brand "fooBrand" supports three languages, while brand "barBrand" supports two languages. There is no overlap in the languages supported by these two brands. Resource strings that are specific to "fooBrand" are not translated to Spanish or Portuguese, while resource strings that are specific to "barBrand" are not translated to German, French, or British English. Resource strings that are not brand-specific and that are, therefore, common to these two brands will be translated for the superset of all the brand languages.
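The "superset of all the brand languages" that applies to brand-neutral strings can be computed as a simple union. The following sketch is illustrative; the allBrandLanguages function is a hypothetical name and is not part of the localization service.

```javascript
// Sketch only: computes the union of translatable languages across all brands,
// which is the set that brand-neutral strings would be translated for.
function allBrandLanguages (brandLanguages) {
  const superset = new Set ();
  for (const brand of Object.keys (brandLanguages)) {
    for (const language of brandLanguages [brand]) superset.add (language);
  }
  return Array.from (superset);
}

console.log (
  allBrandLanguages ({
    fooBrand:['de-DE', 'fr-FR', 'en-GB'],
    barBrand:['es-MX', 'pt-BR']
  })
);
```

Using a Set means that a language shared by several brands appears only once in the result.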
3.1.2.5. primaryLanguage
Lets you specify the primary language for the project.
The primary language is the language that will be used as the source of truth, and the resource strings from the primary language resource files will drive the translation process. Resource strings for the configured translatable languages of the project will be produced by translating the corresponding values of the strings from the primary language resource files.
EXAMPLE
primaryLanguage:'en-US' // US English
It is fairly common for the primary language to be a regional variant of English (such as US English), but this is not a strict requirement. It is recommended that the value specified for the primaryLanguage option be a BCP 47 locale code, containing both the language and region codes.
3.1.2.6. pseudoLocale
Lets you specify the locale code to be used for the pseudo-locale.
It is recommended that the value specified for the pseudoLocale option be a BCP 47 locale code, containing both the language and region codes. It is further recommended that specifically the "en-ZZ" locale code be used for the pseudo-locale, since this locale code exists in a range of BCP 47 locale codes that are not assigned to specific languages and are free to be used for proprietary purposes such as pseudo-localization.
EXAMPLE
pseudoLocale:'en-ZZ'
3.1.2.7. pseudoLocalization
Lets you specify configuration options for the pseudo-localization process that is performed when executing the export localization service method.
EXAMPLE
pseudoLocalization:{ expansion:1.15, wrapper:'[]' }
Generally, it is not necessary to explicitly configure the pseudo-localization process, since the default configuration should suit most projects. If any pseudo-localization option is to be configured differently for specific projects, it is likely to be the expansion factor. The most suitable expansion factor to use will be influenced by the set of translatable languages that the project needs to support, since different languages introduce different amounts of expansion during translation.
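To make the expansion and wrapper options more concrete, here is a toy pseudo-localizer. It is a sketch under stated assumptions, not the UIZE implementation: the accent map, the use of underscores as expansion padding, and the reading of wrapper as an opening character followed by a closing character are all inferred from the "AdvancedMode" example in the Non-translatable Strings section of this guide. The default values used here simply mirror the example configuration.

```javascript
// Sketch only: a toy pseudo-localizer. The accent map, underscore padding, and
// the interpretation of wrapper as "opening + closing character" are
// assumptions, not the actual UIZE behavior.
const accentMap = {
  A:'\u00C5', // Å
  M:'\u1E40', // Ṁ
  a:'\u00E5', // å
  c:'\u00E7', // ç
  d:'\u00F0', // ð
  e:'\u00E9', // é
  n:'\u00F1', // ñ
  o:'\u00F6', // ö
  v:'\u1E7D'  // ṽ
};

function pseudoLocalize (value, {expansion = 1.15, wrapper = '[]'} = {}) {
  // Swap each mapped character for its accented look-alike.
  const accented = Array.from (value, char => accentMap [char] || char).join ('');
  // Pad with underscores until the text reaches the expanded target length.
  const padLength = Math.max (0, Math.ceil (value.length * expansion) - value.length);
  return wrapper.charAt (0) + accented + '_'.repeat (padLength) + wrapper.charAt (1);
}

console.log (pseudoLocalize ('AdvancedMode', {expansion:1.15, wrapper:'[]'}));
```

A larger expansion factor produces more padding, which helps reveal layouts that would break for languages whose translations run longer than the primary language text.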
3.1.2.8. resourceFileWhitespace
Lets you configure the whitespace options that should be used by the import localization service method when it writes codebase resource files.
EXAMPLE
resourceFileWhitespace:{ indentChars:' ', linebreakChars:'\n' }
By default, resource files are serialized using a single tab character for indentation and a single new line character for linebreaks. If these defaults are not suitable for the project's codebase, then the whitespace options can be specified explicitly using the resourceFileWhitespace config option.
3.1.2.8.1. Whitespace Options
The value specified for the resourceFileWhitespace option should be an object containing any of the following supported properties...
3.1.2.8.1.1. indentChars
A string, specifying the character(s) that should be used to denote one level of indentation.
The default value for this property, if not specified, is a single tab character.
3.1.2.8.1.2. linebreakChars
A string, specifying the character(s) that should be used to denote a linebreak.
The default value for this property, if not specified, is a single new line character.
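The effect of the two whitespace options can be demonstrated with a short serialization sketch. It assumes, purely for illustration, that the codebase resource files are JSON; the actual file format is determined by the project's service adapter, and the serializeResourceFile function is a hypothetical name.

```javascript
// Sketch only, assuming JSON resource files: applies the indentChars and
// linebreakChars options during serialization. Real resource file formats
// depend on the project's service adapter.
function serializeResourceFile (strings, {indentChars = '\t', linebreakChars = '\n'} = {}) {
  // JSON.stringify emits '\n' only as structural linebreaks (newlines inside
  // string values are escaped), so swapping them afterwards is safe. Note that
  // JSON.stringify uses at most the first 10 characters of the indent string.
  return JSON.stringify (strings, null, indentChars).split ('\n').join (linebreakChars);
}

console.log (
  serializeResourceFile ({dialog:{title:'Settings'}}, {indentChars:'  ', linebreakChars:'\n'})
);
```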
3.1.2.9. translationJobFileFormat
The format of translation job files written by the exportJobs method can be configured using the translationJobFileFormat config option.
The translationJobFileFormat config option supports the following possible values...
- csv - translation job files will be exported in CSV (Comma-separated Values) format
- xliff - translation job files will be exported in XLIFF (XML Localization Interchange File Format)
4. Localization Service Workflows
4.1. Performing Translation
In order to have the resource strings for a project be translated, one must first export the resource strings for translation and then import the translated resource strings.
4.1.1. Export the Resource Strings for Translation
Exporting the resource strings for translation for a project involves the following steps...
1. Execute the export localization service method to update the .json master resource files to reflect the current state of the codebase resource files.
2. Execute the exportJobs localization service method to generate updated translation job files.
Once the translation job files have been generated to reflect the current state of the project's codebase and the strings that actually need translation, these files can be sent to translators for translation.
4.1.2. Import the Translated Resource Strings
Once the translation job files have been processed by the translators and the strings have been translated, the translations can be imported back into the project's codebase with the following steps...
1. Replace the translation job files inside the jobs folder with the updated files delivered by the translators.
2. Execute the importJobs localization service method to update the .json master resource files with the translations contained inside the translation job files.
3. Execute the import localization service method to update the codebase resource files of the project with the strings from the updated .json master resource files.
Once the codebase resource files have been updated, a diff sanity check in one's source control tool should verify that the translations have made their way into the resource files in the project's codebase.
4.2. Changes to Resource Strings
4.2.1. Deleting Resource Strings
At times, it will be necessary to delete one or more resource strings because a component has been modified.
Consider that the application may support multiple different languages and there may be translations in the resource files for different languages for the resource strings you wish to delete. With the Loc service, it is not necessary to remove the resource strings from the translated resource files - they will be repaired if you follow these steps...
1. Delete the resource strings in the primary language resource files. Be sure to remove all brand-specific overrides, if any exist, from the primary language resource files.
2. Execute the export method of the Loc service to update the .json master resource files.
3. Execute the import method of the Loc service to re-generate the resource files for the supported languages of the project.
Performing the export followed by the import will have the effect of repairing the resource files in the codebase so that the removed resource strings will no longer exist in the resource files for the supported languages.
4.2.2. Renaming Resource Strings
At times, it will be necessary to rename one or more resource strings because a component has been modified.
Consider that the application may support multiple different languages and there may be translations in the resource files for different languages for the resource strings you wish to rename. The Loc service does not provide any special handling for such situations and you will need to rename the resource strings in the resource files for the primary language and all other supported languages.
4.3. Localization Loop
              -------> 4. translate jobs ------->   -------> 9. translate jobs ------->   ----------->
              ∧                                 |   ∧                                 |   ∧
              |                                 |   |                                 |   |
              | 3. export             5. import |   | 8. export            10. import |   | 13. export
              |    jobs                  jobs   |   |    jobs                  jobs   |   |     jobs
              |                                 ∨   |                                 ∨   |
    |-----------------------------------------------------------------------------------------------|
    |                                                                                               |
    |                                     MASTER RESOURCE FILES                                     |
    |                                                                                               |
    |-----------------------------------------------------------------------------------------------|
          ∧   |                                 ∧   |                                 ∧   |
          |   |                                 |   |                                 |   |
1. export |   | 2. import             6. export |   | 7. import            11. export |   | 12. import
          |   |                                 |   |                                 |   |
          |   ∨                                 |   ∨                                 |   ∨
|-------------------------------------------------------------------------------------------------------|
|         |   |                                 |   |                                 |   |             |
|         | A |   ------> 4. develop ------->   | B |   ------> 9. develop ------->   | C |             |
|         |   |                                 |   |                                 |   |             |
|-------------------------------------------------------------------------------------------------------|
1. export
2. import
3. export jobs
4. translate jobs and develop
5. import jobs
6. export
7. import
8. repeat from step 3
5. Advanced Topics
5.1. Performing Methods for a Filtered Set of Languages
Several of the localization service methods support the optional languages parameter, which lets you execute the methods for a filtered set of the project's translatable languages.
EXAMPLE
node [pathToUize]build.js Uize.Build.Loc method=export project=Foo languages=fr-FR,fr-CA
In the above example, the export method's operation will be performed for just the "fr-FR" (French French) and "fr-CA" (Canadian French) languages, even though the "Foo" project may support a larger set of translatable languages.
The following methods support the optional languages parameter...
- export
- exportJobs
- import
- importJobs
- auditTranslations
When the optional languages parameter is not specified, or an empty value is specified, when executing any of the above methods, then the method's operation will be performed for all the translatable languages configured for the project.
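The way the languages parameter narrows the set of languages can be sketched as follows. The resolveLanguages function is a hypothetical helper, and silently ignoring locale codes that are not configured for the project is an assumption of this sketch rather than documented behavior.

```javascript
// Sketch only: resolves the languages parameter against the project's
// configured translatable languages. Ignoring unconfigured locale codes is an
// assumption made for this example.
function resolveLanguages (languagesParam, configuredLanguages) {
  // No value (or an empty value) means all configured translatable languages.
  if (!languagesParam) return configuredLanguages.slice ();
  const requested = new Set (
    languagesParam.split (',').map (code => code.trim ()).filter (Boolean)
  );
  return configuredLanguages.filter (language => requested.has (language));
}

console.log (resolveLanguages ('fr-FR,fr-CA', ['de-DE', 'fr-FR', 'fr-CA', 'ja-JP']));
```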
5.2. Non-translatable Strings
Non-translatable strings are strings that should never be pseudo-localized and should never be sent to translators for translation.
Examples of non-translatable resource strings would be...
- URLs
- media asset IDs
- dimension values
- color values
- support e-mail addresses or phone numbers
- item codes
5.2.1. Reasons for Treating Some Strings as Non-translatable
There are good reasons to identify certain strings as being non-translatable and then handle such strings differently in the automation process.
5.2.1.1. A Waste of Money
For one thing, it is a waste of money to send non-translatable strings to translators.
Translators will likely not know what to do with such strings, and determining how to handle them may involve some costly back-and-forth communication between the project managers on both sides of the process. And if they actually translate such strings by accident, then they will charge for that translation work.
5.2.1.2. Breakage from Accidental Translation
Accidental translation of strings that should not be translated can result in breakage of the application.
For example, a resource string may be used for storing a color value that should be different per language. Technologies such as CSS support named color values, such as "red" or "green". If the intention is to use the color red for one language and the color green for another language, it is not helpful if the actual text "red" is translated to some other language by the translators. Translating "red" to "rouge" in French would break the application in that the translated color value would not be recognized by CSS.
Instead, such a string should be treated as a non-translatable string, and it would then be the responsibility of a developer to manually change the value in the resource files of the translatable languages, under the direction of a product manager who makes the decision what colors best suit different language audiences.
5.2.1.3. Breakage from Pseudo-localization
Accidental pseudo-localization of some non-translatable strings (such as IDs or URLs) may break the pseudo-localized version of the application and hinder localization QA testing of the pseudo-localized version.
For example, if a resource string was being used to store a code like "AdvancedMode"
, where this code is being used by the application logic and is intended to be different per language, then pseudo-localizing the value from the primary language to "[ÅðṽåñçéðṀöðé__]"
would break the application as it would not understand this value.
5.2.2. Non-translatable Strings Vary by Project
Different types of projects will likely have different varieties of non-translatable strings that it is convenient for the project to store in codebase resource files.
Therefore, it is up to each project to determine which strings are translatable and which strings are not. This is accomplished by implementing an override to the new isTranslatableString
method in the localization service adapter.
5.2.2.1. isTranslatableString
Implementing support for non-translatable strings in a project involves providing an implementation for the isTranslatableString
instance method in the localization adapter subclass for the project.
The implementation in the adapter base class just returns true
, so all resource strings are considered translatable by default.
An implementation for the isTranslatableString
method should expect to receive a single argument, being a string info object that describes a resource string. The string info object will contain a key
property that specifies the resource string's key name, along with a value
property that specifies the value of the resource string for the primary language.
The method implementation should use the information in the string info object to determine a boolean return value, indicating whether or not the string is translatable. The implementation can use either or both of the key
and value
properties to reach its determination, whichever is best suited to the nature of the resource strings in the project.
EXAMPLE
function (stringInfo) {
  return (
    !/_(ID|EMAIL)$/.test (stringInfo.key) && // key does not mark an ID or an e-mail address
    !/^https?:\/\//.test (stringInfo.value)  // value is not a URL
  );
}
In the above example implementation for the isTranslatableString
method, the key of the resource string is tested first, to see if it ends with "_ID" or "_EMAIL". If the key does not match, the method then tests whether the value of the resource string starts with "http://" or "https://". The string is considered translatable only if neither test matches.
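To see how the example implementation classifies strings, it can be run against a few string info objects. The following is a minimal sketch: the function body is the example from above, while the sample keys and values are hypothetical and chosen only for illustration.

```javascript
// The example isTranslatableString implementation from above.
function isTranslatableString (stringInfo) {
  return (
    !/_(ID|EMAIL)$/.test (stringInfo.key) && // key does not mark an ID or an e-mail address
    !/^https?:\/\//.test (stringInfo.value)  // value is not a URL
  );
}

// Hypothetical string info objects (only the key and value properties are shown).
var sampleStrings = [
  {key:'WELCOME_MESSAGE', value:'Welcome back!'},
  {key:'SUPPORT_EMAIL', value:'support@example.com'},
  {key:'HELP_LINK', value:'https://example.com/help'}
];

// Pair each key with the result of the test.
var results = sampleStrings.map (function (stringInfo) {
  return {key:stringInfo.key, isTranslatable:isTranslatableString (stringInfo)};
});

console.log (results);
```

In this sketch only the first string is reported as translatable: the second is caught by its key suffix and the third by its URL value.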
5.2.2.2. Establish a Convention for Non-translatable Strings
While it is possible to write complex matching logic to test whether or not resource strings should be translatable, it is recommended that projects establish a simple and robust convention.
For example, a project could follow the convention that all non-translatable resource strings are indicated with a specific suffix or prefix in their key names. For instance, all non-translatable resource strings could have a "$" character appended to their key names.
EXAMPLE
function (stringInfo) {
  // string is translatable if its key doesn't end with "$"
  return stringInfo.key.slice (-1) != '$';
}
Following such a convention would provide the following benefits...
it becomes clear from the codebase resource files exactly which strings are not to be translated | |
the isTranslatableString test is simple, robust, and highly performant |
|
it is easy to add new non-translatable strings of different types to a project later, without needing to update the implementation of the isTranslatableString method |
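The benefit of such a convention can be illustrated with a short sketch that applies the "$" suffix test to a set of resource strings. The resource keys and values below are hypothetical; only the test itself comes from the example above.

```javascript
// The convention-based test: a key ending in "$" marks a non-translatable string.
function isTranslatableString (stringInfo) {
  return stringInfo.key.slice (-1) != '$';
}

// Hypothetical resource strings; keys ending in "$" follow the convention.
var strings = {
  title:'Account Settings',
  'accentColor$':'red',
  'supportPhone$':'+1 555 0100',
  saveButton:'Save'
};

// Collect the keys that the convention marks as non-translatable.
var nonTranslatableKeys = Object.keys (strings).filter (function (key) {
  return !isTranslatableString ({key:key, value:strings [key]});
});

console.log (nonTranslatableKeys);
```

Adding a new kind of non-translatable string (a phone number, in this sketch) only requires naming its key according to the convention. The test itself stays unchanged.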
5.2.2.3. Surveying Non-translatable Strings
To verify that your implementation of the isTranslatableString
method for a project is doing the right thing, you can survey the strings that are determined to be non-translatable by using the metrics
localization service method.
When you execute the metrics
method, one of its by-products will be the strings info files that are written to the "strings-info" folder. The strings info is obtained from the resource strings in the primary language resource files for the project and is written to two different format files: one JSON format file, and one CSV format file.
.json - The JSON format file contains an array of string info objects for all of the strings of the project, where each string info object contains properties such as path, value, isTranslatable, etc., and where the values for some properties (such as the metrics property) are sub-objects. |
|
.csv - The CSV format file contains a table of rows for all of the strings of the project, where each row contains a number of columns that represent a flattened version of the string info object for a string. |
Both strings info files contain the same information, expressed slightly differently. The CSV file is particularly easy to "consume" and can be loaded up in any spreadsheet editor (such as Google Sheets) that correctly supports the CSV format's ability to have quoted field values that contain line breaks (some of your resource strings may contain line break characters).
Once loaded into a spreadsheet tool, the Translatable
column can be used to sort or filter the strings info rows based upon whether or not the strings are considered translatable. For non-translatable strings, the value in the Translatable
column will be false
. The spreadsheet can then be used to verify that there are no non-translatable strings being missed by the isTranslatableString
method, and there are no translatable strings that are being incorrectly caught by it.
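The same survey can also be done in a script using the JSON format strings info. The following is a rough sketch that works on an already-parsed array. The sample entries are hypothetical, the path property is assumed to be a string path array, and only the path, value, and isTranslatable properties mentioned above are used.

```javascript
// Hypothetical excerpt of a JSON format strings info file, already parsed into an array.
var stringsInfo = [
  {path:['strings.js', 'login', 'title'], value:'Sign In', isTranslatable:true},
  {path:['strings.js', 'login', 'helpUrl'], value:'https://example.com/help', isTranslatable:false},
  {path:['strings.js', 'theme', 'accentColor'], value:'red', isTranslatable:false}
];

// Returns one line of text per non-translatable string, for quick review.
function surveyNonTranslatable (stringsInfo) {
  return stringsInfo
    .filter (function (stringInfo) {return stringInfo.isTranslatable === false})
    .map (function (stringInfo) {return stringInfo.path.join (' > ') + ' = ' + stringInfo.value})
  ;
}

var survey = surveyNonTranslatable (stringsInfo);
console.log (survey.join ('\n'));
```

Reviewing the resulting lines makes it easy to spot strings that were caught by mistake. Inverting the filter lists the strings that are still considered translatable.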
5.2.3. Handling of Non-translatable Strings
Non-translatable strings are handled according to the following rules...
5.2.3.1. Never Pseudo-localized
Non-translatable strings are never pseudo-localized when resource strings for the pseudo-locale are generated by the export
localization service method.
Instead, the values from the primary language will be used as is.
5.2.3.2. Never Exported for Translation
Non-translatable strings are never exported to translation job files by the exportJobs
localization service method.
This means that translators never have to deal with these strings, and there's never a possibility that translators will accidentally translate them, resulting in charges for wasted effort or even problems for the application.
5.2.3.3. Initialized to Primary Language Values
Whenever a non-translatable string is added or modified for the primary language of a project, the values for the string for the translatable languages are initialized to the value from the primary language.
In this way, the values for the string in the resource files for the translatable languages don't need to be manually initialized by developers.
5.2.3.4. Manually Modified Values are Respected
While the values for non-translatable strings are initialized to primary language values, the values for the strings in the translatable languages can subsequently be manually modified by developers.
Non-empty values for non-translatable strings in the resource files for the translatable languages of a project are respected throughout the localization automation process.
5.2.4. Initializing Values of Non-translatable Strings
The values of the non-translatable strings of a project can be initialized by executing the export
command and specifying one of the supported initialization modes using the initNonTranslatable
parameter.
node [pathToUize]build.js Uize.Build.Loc method=export project=[projectName] initNonTranslatable=[initMode]
The non-translatable strings initialized in this way will then be updated in the codebase resource files during the next import
operation.
5.3. Metrics Reports
A metrics report can be produced for a project by using the metrics
localization service method.
EXAMPLE
node [pathToUize]build.js Uize.Build.Loc method=metrics project=[projectName]
The metrics
method analyzes the primary language resource strings for a project and writes a .json
file to the following path...
[workingFolder]/[projectName]/metrics/[primaryLanguage].json
The metrics report is a JSON file, so the data contained in it can be easily loaded by other scripts or an application and used to present a visualization of the metrics or to estimate cost for translation.
5.3.1. Metrics Report Contents
5.3.1.1. Total Word Count
The total number of translatable words in all primary language resource files for the project is represented by the wordCount
property.
The value of the wordCount
property is only an estimate and the accuracy of the value depends on how well the project is defined to understand the difference between translatable words and non-translatable text.
The value of this property is useful when estimating the full translation cost for a project, since translators often charge for their services based upon word count and a cost per word.
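The cost estimate described above can be sketched as follows. This is an illustration only: the report excerpt, the per-word rate, and the number of languages are hypothetical, and wordCount is assumed to be a top-level property of the parsed metrics report.

```javascript
// Hypothetical excerpt of a parsed metrics report.
var metricsReport = {wordCount:12500, charCount:68000};

// Estimates the cost of translating the project into a number of languages,
// given a per-word rate supplied by the caller.
function estimateTranslationCost (metricsReport, costPerWord, totalLanguages) {
  return metricsReport.wordCount * costPerWord * totalLanguages;
}

var estimate = estimateTranslationCost (metricsReport, 0.25, 3);
console.log (estimate);
```

Because wordCount is itself an estimate, the result should be treated as a rough figure for budgeting rather than a quote.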
5.3.1.2. Total Character Count
The total character count for all the translatable words in all primary language resource files for the project is represented by the charCount
property.
The accuracy of the value of this property depends on the accuracy of the wordCount
property, since this property is the sum of the character count for all translatable words. This value can be useful when estimating the full translation cost for a project, since translators may base cost estimates for their services partially upon total character count for all translatable words.
5.3.1.3. Total Resource Files
The total number of resource files for the language is represented by the resourceFiles
property.
The value of the resourceFiles
property reflects the total number of codebase resource files for the language and not the total for all languages, so this value doesn't increase as you add support for new languages.
5.3.1.3.1. Brand-specific Resource Files
The total number of brand-specific resource files for the language is represented by the brandSpecific
property of the resourceFiles
object.
What constitutes a brand-specific resource file will vary by project and depend on the project definition. Typically, brand-specific resource files provide overrides for certain resource strings that need to be different for specific brands, so the number of brand-specific resource files is usually smaller than the total number of resource files. Moreover, different brands may have a differing number of brand-specific resource files.
5.3.2. Strings Info
.
5.4. Translation Audit Reports
A translation audit report can be produced for a project by using the auditTranslations
localization service method.
EXAMPLE
node [pathToUize]build.js Uize.Build.Loc method=auditTranslations project=[projectName]
The auditTranslations
method analyzes the resource strings for each of the translatable languages of the project, comparing them to the corresponding values from the primary language to detect potential issues of inconsistent translations and identical translations, and then writes a .json
report file for each translatable language with a path of the form...
[workingFolder]/[projectName]/translation-audit/[translatableLanguage].json
5.4.1. Not a Guarantee of Problems
It is important to note that the issues listed in the translation audit reports are not a guarantee of real problems in translation.
For instance, there can be valid reasons for translations of any given primary language text to be inconsistent for a translatable language. For example, a single word that is used for different resource strings can have a different meaning based upon context. In one context, the word may be used as a verb, while in another context the same word may be used as a noun. In the translatable language, the verb and noun forms may be different, while in the primary language they may be the same. This is the case with an English word like "call".
There can also be valid reasons for translations of any given primary language text to be identical for a translatable language. This will, naturally, be the case for many resource strings when translating from US English to British English, but it can also be the case for specific English words whose usage has been adopted in other languages. Such is the case with a word like "OK" (typically used in the UI in OK buttons), which is considered acceptable (and even preferable) in French language UI.
Generally, the longer the source text from the primary language, the more likely that an issue listed in the translation audit report is a real problem. The reports are provided as a useful guide to potential issues, and these potential issues should be assessed and investigated further, if necessary.
5.4.2. Translation Audit Report Contents
The translation audit report object contains the following sections, describing different potential translation issues...
5.4.2.1. Identical Translations
An identical translation is a translation where the value of a resource string in a translatable language is identical to the corresponding value of that resource string in the primary language.
Identical translations can be an indication of errors made during the translation process, where the source text is accidentally used unaltered as the translation. Information on all the resource strings that have identical translations for a given translatable language can be found in the identicalTranslations
property in the translation audit report for the language. The value of the identicalTranslations
property is an array of the form...
identicalTranslations:[
  identicalTranslation0OBJ,
  identicalTranslation1OBJ,
  ...
  identicalTranslationNOBJ
]
Each element of the identicalTranslations
array is an object describing an instance of an identical translation, having the form...
{ path:stringPathARRAY, value:stringValueSTR }
The value of the path
property is a string path array that contains all the information necessary to identify the resource string in the project, starting with the resource file that contains the string as the first element, and ending with the string key as the last element. The value of the value
property is a string, specifying the value (i.e. text) of the string.
EXAMPLE
identicalTranslations:[
  {
    path:[ 'strings.xml', 'OkButton' ],
    value:'OK'
  },
  {
    path:[ 'strings.xml', 'FaxButton' ],
    value:'FAX'
  }
]
In the above example, two strings have translations that are identical to the source text from the primary language (English, in this case). This case can arise with a language like French, where it is acceptable to use "OK" and "FAX" in UI text. In fact, "OK" would even be considered preferable to using a stricter translation like "D'accord". So, in this case, the identical translations are not an indication of a real problem.
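Since longer source text is more likely to indicate a real problem, one way to triage the identicalTranslations array is to filter out short entries and review the longest ones first. The following is a minimal sketch. The third sample entry, the function name, and the word count threshold are hypothetical.

```javascript
// Hypothetical identicalTranslations array from a translation audit report.
var identicalTranslations = [
  {path:['strings.xml', 'OkButton'], value:'OK'},
  {path:['strings.xml', 'FaxButton'], value:'FAX'},
  {path:['strings.xml', 'WelcomeMessage'], value:'Welcome back to your account'}
];

// Returns the entries whose text has at least minWords words, longest text first.
function likelyProblems (identicalTranslations, minWords) {
  function wordCount (text) {return text.split (/\s+/).filter (Boolean).length}
  return identicalTranslations
    .filter (function (entry) {return wordCount (entry.value) >= minWords})
    .sort (function (a, b) {return b.value.length - a.value.length})
  ;
}

var toReview = likelyProblems (identicalTranslations, 3);
console.log (toReview);
```

With a threshold of three words, the "OK" and "FAX" entries are skipped and only the longer message remains for review.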
5.4.2.2. Inconsistent Translations
An inconsistent translation is a case in which some source text contained in resource strings of the primary language is translated in more than one way in the resource strings of a translatable language.
Inappropriate inconsistent translations can arise in larger projects where translation jobs are distributed amongst multiple translators who translate in parallel and don't leverage shared translation memory, or when translation of future additions to an existing project is handed off to a new translation vendor without providing them with a translation memory or previous translations. Inconsistencies can also arise through basic mistakes in the translation process.
Information on all inconsistent translations for a given translatable language can be found in the inconsistentTranslations
property in the translation audit report for the language. The value of the inconsistentTranslations
property is an array of the form...
inconsistentTranslations:[
  inconsistentTranslation0OBJ,
  inconsistentTranslation1OBJ,
  ...
  inconsistentTranslationNOBJ
]
Each element of the inconsistentTranslations
array is an object of the form...
{
  source:sourceTextSTR,
  translations:[
    translation0OBJ,
    translation1OBJ,
    ...
    translationNOBJ
  ]
}
The value of the source
property is a string, specifying the source text from the primary language. The value of the translations
property is an array of objects describing the different translations of the source text, where each object has the form...
{
  translation:translationTextSTR,
  strings:[
    stringPath0ARRAY,
    stringPath1ARRAY,
    ...
    stringPathNARRAY
  ]
}
The value of the translation
property is a string, specifying the translated text from the translatable language. The value of the strings
property is an array of string path arrays, identifying all the strings that have this translation for the translatable language. The value for each element of the strings
array is a string path array that contains all the information necessary to identify the resource string in the project, starting with the resource file that contains the string as the first element, and ending with the string key as the last element.
The structure of the inconsistentTranslations
array is illustrated by the following example, in which there are two different translations for the source text "Call".
EXAMPLE
inconsistentTranslations:[
  {
    source:'Call',
    translations:[
      {
        translation:'Appeler',
        strings:[
          [ 'strings.xml', 'CallButton' ],
          [ 'strings.xml', 'CallLink' ]
        ]
      },
      {
        translation:'Appel',
        strings:[
          [ 'strings.xml', 'CallHistoryLabel' ]
        ]
      }
    ]
  }
]
In this example, the inconsistent French translations are legitimate because the word "Call" is used as a verb in the first case (translated to "Appeler") and as a noun in the second (translated to "Appel").
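For review purposes, the nested inconsistentTranslations structure can be flattened into one row per affected string. The following sketch uses the example data from this section. The function name and the row layout are hypothetical choices made for illustration.

```javascript
// The inconsistentTranslations example data from this section.
var inconsistentTranslations = [
  {
    source:'Call',
    translations:[
      {translation:'Appeler', strings:[['strings.xml', 'CallButton'], ['strings.xml', 'CallLink']]},
      {translation:'Appel', strings:[['strings.xml', 'CallHistoryLabel']]}
    ]
  }
];

// Produces one row per affected string: [source text, translation, joined string path].
function flattenInconsistentTranslations (inconsistentTranslations) {
  var rows = [];
  inconsistentTranslations.forEach (function (entry) {
    entry.translations.forEach (function (translationInfo) {
      translationInfo.strings.forEach (function (stringPath) {
        rows.push ([entry.source, translationInfo.translation, stringPath.join ('/')]);
      });
    });
  });
  return rows;
}

var rows = flattenInconsistentTranslations (inconsistentTranslations);
console.log (rows);
```

The flattened rows can then be sorted or grouped by source text when deciding which inconsistencies merit a closer look.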
5.5. Language Diff Reports
A language diff report can be produced for a project by using the diffLanguages
localization service method.
EXAMPLE
node [pathToUize]build.js Uize.Build.Loc method=diffLanguages project=... languageA=... languageB=...
The diffLanguages
method loads the resource strings from the master resource files for each of the two specified languages, performs a diff comparison of the resource strings for the two languages, and then writes .json
and .csv
report files containing details regarding the differences, with paths of the form...
[workingFolder]/[projectName]/language-diffs/[languageA]-vs-[languageB].json
[workingFolder]/[projectName]/language-diffs/[languageA]-vs-[languageB].csv
5.5.1. Investigating Drift
Language diff reports may be useful in situations where you suspect that string values may have drifted in an undesirable way between two regional variants of a language, such as between US English (en-US) and British English (en-GB).
In one example, undesirable drift may occur if the process is not followed correctly and changes are made directly in the resource strings for one regional variant of a language and those changes are not replicated in other regional variants of that same language, particularly if the values were identical beforehand and no longer are after the change.
5.5.2. Difference Types
Language diff reports are very straightforward and describe just three types of differences...
'missing in A' - the string is missing or empty for language A, but present and non-empty for language B |
|
'missing in B' - the string is missing or empty for language B, but present and non-empty for language A |
|
'different' - the string is present and non-empty for both language A and language B, but the values differ |
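The three difference types can be expressed as a small classification function. This is a sketch of the rules as described above, not the code used by the diffLanguages method. The function name is hypothetical, and it is an assumption that strings which are identical, or missing or empty in both languages, are simply not reported.

```javascript
// Returns the difference type for a string, given its values in languages A and B,
// or null if there is no difference to report. A missing value may be passed as undefined.
function classifyDifference (valueInA, valueInB) {
  var
    hasA = valueInA != null && valueInA !== '',
    hasB = valueInB != null && valueInB !== ''
  ;
  if (!hasA && hasB) return 'missing in A';
  if (hasA && !hasB) return 'missing in B';
  if (hasA && hasB && valueInA !== valueInB) return 'different';
  return null; // identical, or missing/empty in both (assumed not to be reported)
}

console.log (classifyDifference ('color', 'colour'));
console.log (classifyDifference ('Click to choose a color.', ''));
```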
5.5.3. Language Diff Report Contents
When you execute the diffLanguages
method, two different format report files are written to the "language-diffs" folder: one JSON format file, and one CSV format file.
.json - The JSON format diff report contains an array of string diff objects for all strings that have differences, where each string diff object contains the properties key, path, valueInA, valueInB, and difference. |
|
.csv - The CSV format diff report contains a table of rows for all of the strings that have differences, where each row contains columns that represent a flattened version of the string diff object for a string. |
Both diff report files contain the same information, expressed slightly differently. The CSV file is particularly easy to "consume" and can be loaded up in any spreadsheet editor (such as Google Sheets) that correctly supports the CSV format's ability to have quoted field values that contain line breaks (some of your resource strings may contain line break characters). Once loaded into a spreadsheet tool, the spreadsheet can be sorted by the "Difference" column or filtered by the difference type you are interested in.
5.5.3.1. JSON Format Diff Report
The JSON format diff report is a serialized array of string diff objects, of the form...
[ stringDiff0OBJ, stringDiff1OBJ, ..., stringDiffNOBJ ]
Each element of the string diff objects array is a string diff object of the form...
{
  key:keySTR,
  path:pathARRAY,
  valueInA:valueInASTR,
  valueInB:valueInBSTR,
  difference:differenceSTR // 'different' | 'missing in A' | 'missing in B'
}
Each string diff object contains the following properties...
key - a string, specifying the key of the resource string (the last element of the string path array) |
|
path - an array, specifying the full path of the resource string, where the first element is the path of the resource file containing the string, and where the last element is the key of the resource string |
|
valueInA - a string, specifying the value of the resource string in language A, or an empty string if the string is missing in language A |
|
valueInB - a string, specifying the value of the resource string in language B, or an empty string if the string is missing in language B |
|
difference - a string, specifying the difference type for the resource string between language A and language B |
The structure of the JSON format language diff report is illustrated by the following example, in which there are two differences between the two languages compared in the diff...
EXAMPLE
[
  {
    key:'label',
    path:[ 'strings.js', 'colorPicker', 'label' ],
    valueInA:'color',
    valueInB:'colour',
    difference:'different'
  },
  {
    key:'tooltip',
    path:[ 'strings.js', 'colorPicker', 'tooltip' ],
    valueInA:'Click to choose a color.',
    valueInB:'',
    difference:'missing in B'
  }
]
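Because the JSON format diff report is a plain array, it is straightforward to summarize in a script. The following sketch counts the string diff objects by difference type, using the example data from this section. The function name is hypothetical.

```javascript
// The JSON format diff report example from this section, as a parsed array.
var diffReport = [
  {
    key:'label',
    path:['strings.js', 'colorPicker', 'label'],
    valueInA:'color',
    valueInB:'colour',
    difference:'different'
  },
  {
    key:'tooltip',
    path:['strings.js', 'colorPicker', 'tooltip'],
    valueInA:'Click to choose a color.',
    valueInB:'',
    difference:'missing in B'
  }
];

// Returns an object mapping each difference type to the number of strings that have it.
function countByDifference (diffReport) {
  var counts = {};
  diffReport.forEach (function (stringDiff) {
    counts [stringDiff.difference] = (counts [stringDiff.difference] || 0) + 1;
  });
  return counts;
}

var counts = countByDifference (diffReport);
console.log (counts);
```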
5.5.3.2. CSV Format Diff Report
The CSV format diff report is a table containing the following columns...
Resource File - the path of the resource file containing the resource string | |
Key - the key of the resource string | |
Path - a JSON serialization of the full path of the resource string | |
Difference - the difference type for the resource string between language A and language B | |
Value in A - the value of the resource string in language A, or an empty string if the string is missing in language A | |
Value in B - the value of the resource string in language B, or an empty string if the string is missing in language B |
The structure of the CSV format language diff report is illustrated by the following example, in which there are two differences between the two languages compared in the diff...
CSV EXAMPLE
Resource File,Key,Path,Difference,Value in A,Value in B
strings.js,label,"['strings.js','colorPicker','label']",different,color,colour
strings.js,tooltip,"['strings.js','colorPicker','tooltip']",missing in B,Click to choose a color.,
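If the CSV format report needs to be processed in a script rather than a spreadsheet, the parser must cope with quoted fields that contain commas or line breaks. The following is a minimal hand-written sketch of such a parser, applied to the example above. It is not part of UIZE, and the function name is hypothetical.

```javascript
// Parses CSV text into an array of rows, where each row is an array of field strings.
// Handles quoted fields that contain commas, line breaks, and doubled quotes ("").
function parseCsv (text) {
  var rows = [], row = [], field = '', inQuotes = false;
  for (var i = 0; i < text.length; i++) {
    var c = text.charAt (i);
    if (inQuotes) {
      if (c == '"') {
        if (text.charAt (i + 1) == '"') { // doubled quote is an escaped quote
          field += '"';
          i++;
        } else {
          inQuotes = false;
        }
      } else {
        field += c;
      }
    } else if (c == '"') {
      inQuotes = true;
    } else if (c == ',') {
      row.push (field);
      field = '';
    } else if (c == '\n' || c == '\r') {
      if (c == '\r' && text.charAt (i + 1) == '\n') i++; // treat CRLF as one line break
      row.push (field);
      rows.push (row);
      row = [];
      field = '';
    } else {
      field += c;
    }
  }
  if (field != '' || row.length) { // final row without a trailing line break
    row.push (field);
    rows.push (row);
  }
  return rows;
}

// The CSV example from this section.
var csvText = [
  'Resource File,Key,Path,Difference,Value in A,Value in B',
  'strings.js,label,"[\'strings.js\',\'colorPicker\',\'label\']",different,color,colour',
  'strings.js,tooltip,"[\'strings.js\',\'colorPicker\',\'tooltip\']",missing in B,Click to choose a color.,'
].join ('\n');

var rows = parseCsv (csvText);

// Skip the header row and keep the rows whose Difference column is 'missing in B'.
var missingInB = rows.slice (1).filter (function (row) {return row [3] == 'missing in B'});
console.log (missingInB);
```

Note that the Path field stays intact as a single field even though it contains commas, because it is quoted.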
5.6. Usage Reports
.
5.7. Initializing Resource Strings
The values of resource strings of a project can be initialized by executing the export
command and specifying value initialization options using the initValues
parameter.
node [pathToUize]build.js Uize.Build.Loc method=export project=[projectName] initValues=[initOptions]
The strings initialized in this way will then be updated in the codebase resource files during the next import
operation.
5.7.1. Value Initialization Options
Value initialization options are specified using the initValues
parameter, and these options determine which resource strings get initialized, when (under which conditions) they get initialized, and to what values they get initialized.
The value specified for the initValues
parameter should have the following syntax...
initValues=[initWhat],[initWhen],[initTo]
5.7.1.1. initWhat
The first portion of the initialization options specifies what resource strings should be initialized, and supports the following values...
none - None of the resource strings will be initialized (this value can be used to defeat the feature). |
|
all - All of the resource strings will be initialized (this includes translatable and non-translatable strings). |
|
translatable - Only the translatable strings will be initialized. |
|
non-translatable - Only the non-translatable strings will be initialized. |
5.7.1.2. initWhen
The second portion of the initialization options specifies when (under which conditions) the strings should be initialized, and supports the following values...
never - The resource strings will never be initialized (this value can be used to defeat the feature). |
|
always - The resource strings will always be initialized. |
|
blank - The resource strings will only be initialized when they are blank. |
5.7.1.3. initTo
The third portion of the initialization options specifies to what values the strings should be initialized, and supports the following values...
primary - The resource strings will be initialized to the values for the strings from the primary language. |
|
blank - The resource strings will be initialized to blank values (empty strings). |
5.7.1.4. Examples
In a simple example, the value initialization feature can be used to initialize values of non-translatable strings.
To initialize the values of the non-translatable strings that are blank to the values from the primary language, one can use the following value for the initValues
parameter...
initValues=non-translatable,blank,primary
In another example, the value initialization feature can be used to initialize all the strings for the en-GB
(British English) language to the values from the primary language (presumably US English in this example), using the following parameters for the export
method...
languages=en-GB initValues=all,always,primary
Note that we are using the languages
parameter and specifying the value en-GB
in order to perform this initialization during export for just the British English language (refer to the section Performing Methods for a Filtered Set of Languages).
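The semantics of the three portions of the initValues parameter can be sketched in code. The following is an illustration of the documented rules only, not the implementation used by the export method. The function names are hypothetical, and it is assumed that all three portions are always supplied.

```javascript
// The supported values for each portion, as documented above.
var supportedValues = {
  initWhat:['none', 'all', 'translatable', 'non-translatable'],
  initWhen:['never', 'always', 'blank'],
  initTo:['primary', 'blank']
};

// Parses a value like "non-translatable,blank,primary" into its three portions.
function parseInitValues (initValues) {
  var
    parts = initValues.split (','),
    options = {initWhat:parts [0], initWhen:parts [1], initTo:parts [2]}
  ;
  Object.keys (supportedValues).forEach (function (name) {
    if (supportedValues [name].indexOf (options [name]) < 0) {
      throw new Error ('Unsupported value for ' + name + ': ' + options [name]);
    }
  });
  return options;
}

// Returns the value a string should have after initialization.
function initializeValue (options, isTranslatable, currentValue, primaryValue) {
  var
    matchesWhat =
      options.initWhat == 'all' ||
      (options.initWhat == 'translatable' && isTranslatable) ||
      (options.initWhat == 'non-translatable' && !isTranslatable),
    matchesWhen =
      options.initWhen == 'always' ||
      (options.initWhen == 'blank' && currentValue === '')
  ;
  return matchesWhat && matchesWhen
    ? (options.initTo == 'primary' ? primaryValue : '')
    : currentValue
  ;
}

var options = parseInitValues ('non-translatable,blank,primary');
console.log (initializeValue (options, false, '', 'red'));
```

With these options, a blank non-translatable string takes the primary language value, while translatable strings and non-blank values are left untouched.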
5.8. Different Classes of Files
The localization automation scripts work with four main classes of files: codebase resource files, master resource files, translation job files, and various kinds of report and info files.
5.8.1. Codebase Resource Files
Codebase resource files are the files inside the codebase of a project that contain the resource strings for the project.
The file format of the codebase resource files can vary from project to project and will depend on the technology used for the project.
5.8.1.1. Common Resource File Formats
Over the years, many resource string file formats have been standardized for different technologies.
Some examples of common resource file formats include...
Android OS Strings Files (.xml) | |
iOS Strings Files (.strings) | |
GNU gettext PO / Portable Object Files (.po) | |
Java Properties files (.properties) | |
.NET Resource Files (.resx) | |
Ruby on Rails YAML String Files (.yml) |
5.8.1.2. Proprietary Resource File Formats
In addition to the common (and somewhat standardized) resource file formats, projects can decide to use proprietary file formats.
5.8.2. Master Resource Files
.
5.8.3. Translation Job Files
.
5.8.4. Report and Info Files
Certain of the localization service methods (such as the metrics
method, for example) produce report and info files.
5.8.4.1. Metrics Report Files
Metrics report files are generated by executing the metrics
localization service method.
5.8.4.2. Strings Info Files
Strings info files are generated by executing the metrics
localization service method.
5.8.4.3. Translation Audit Report File
Translation audit report files are generated by executing the auditTranslations
localization service method.
5.8.4.4. Language Diff Report File
Language diff report files are generated by executing the diffLanguages
localization service method.
5.8.4.5. Usage Report Files
Usage report files are generated by executing the usage
localization service method.