Technical Specifications

Lingotek XLIFF 2.0 Filter Configuration Options

These options let users control the behavior of the XLIFF 2.0 filter through the filter's config file.

Enhancements 

  1. If a unit has canResegment="yes" and the config option needsSegmentation.b=true is set, linguistic segmentation is applied to the unit's content as if it were a paragraph. XLIFF 2.0 assumes that all unit content is already segmented into sentences. In general, XLIFF 2.0 segmentation can be changed (by splitting or merging segments, for example), but the standard has no specific option for marking content as a paragraph.
     
  2. The config option simplifyTags.b=false directs the system not to simplify inline codes. Otherwise, leading and trailing codes are trimmed and any consecutive codes are merged to simplify the presentation to the translator.
     
  3. The config option mergeAsParagraph.b=true directs the system to discard any segmentation applied with needsSegmentation.b=true and to merge the unit back as one single segment. This option is useful when the file's original segmentation is needed rather than the segmentation introduced by Lingotek.
     
  4. The config option useCodeFinder.b=true enables rules, written as regular expressions, that convert matching text into inline codes and protect those codes from translation in the workbench. For example, emoji substrings can be supported using custom codefinder rules. On merge, the XLIFF wrapper elements are removed and only the original (protected) content remains.
     
  5. The config option maxValidation.b=true enables stricter XLIFF 2.0 schema checking and reports errors to the user, which helps ensure that files are compliant.
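
The protect-and-restore idea behind item 4 can be illustrated with a small Python sketch. This is not Lingotek's codefinder implementation or its rule syntax. The rule (a single Unicode range covering many emoji), the function names, and the use of a bare <ph/> placeholder as the inline code are all assumptions made for illustration.

```python
import re

# Hypothetical rule: any character in U+1F300..U+1FAFF (covers many emoji).
EMOJI_RULE = re.compile("[\U0001F300-\U0001FAFF]")

def protect(text):
    """Replace each rule match with a numbered placeholder code.

    Returns the protected text and the list of original substrings.
    """
    codes = []

    def stash(match):
        codes.append(match.group(0))
        return '<ph id="%d"/>' % len(codes)

    return EMOJI_RULE.sub(stash, text), codes

def restore(text, codes):
    """Put the original substrings back in place of the placeholders."""
    return re.sub(r'<ph id="(\d+)"/>',
                  lambda m: codes[int(m.group(1)) - 1],
                  text)

original = "Great job \U0001F389 team"
protected, codes = protect(original)
# The translator only sees the placeholder; the emoji returns on merge.
merged = restore(protected, codes)
```

In this sketch the translator would work on the protected text, and restore plays the role of the merge step that removes the wrapper and leaves only the original content.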


For highly technical or creative content (where a degree of transcreation is needed), it may be desirable to segment by paragraph rather than by sentence. We’ve added the ability to turn segmentation on or off for XLIFF 2.0 files via the FPRM filter. Additionally, we’ve added the option to turn segmentation on or off for a specific unit ID.

To segment a document by something other than the XLIFF 2.0 defaults:

  1. Enable paragraph segmentation within the FPRM filter.
  2. Enable paragraph segmentation within the document. This is controlled per unit within the document.

Set the FPRM Filter

Adjust Segmentation (via the FPRM Filter Config)


By default, Lingotek's XLIFF 2.0 filter configuration segments documents into sentences. To adjust segmentation, use the following variables (mergeAsParagraph.b and needsSegmentation.b) in the FPRM filter config file (defined and outlined below).

mergeAsParagraph.b

In the filter config, use mergeAsParagraph.b to specify how segments will be treated when the file is downloaded again.

  • True: The file will be merged back with its original segmentation.
  • False: The file will be merged back with the new segmentation specified by Lingotek.

needsSegmentation.b

In the filter config, use needsSegmentation.b to determine whether segmentation can be adjusted on the XLIFF 2.0 file. (XLIFF by nature already has its segments/text units defined, so the default behavior of the XLIFF 2.0 filter is NOT to resegment.)

  • True: Further segmentation can be enabled on the XLIFF file.
  • False: Further segmentation is not enabled on the XLIFF file.


Warning: A bilingual XLIFF CANNOT have the filter config set to needsSegmentation.b=true (this will cause it to error out).



Choose to segment by paragraph, sentence, or phrase.

Use the instructions below to adjust the FPRM filter config.

Tip: When uploading a document needing custom segmentation, apply the newly created FPRM filter config.

  1. To segment on paragraphs, add:
    1. mergeAsParagraph.b=true
    2. needsSegmentation.b=true 

       
  2. To segment on sentences, leave the FPRM filter's default settings.
     
  3. To segment on phrases (i.e. something shorter than a sentence), add:
    1. mergeAsParagraph.b=false
    2. needsSegmentation.b=true
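
The three choices above can be summarized as a lookup from segmentation level to the lines to add to the FPRM filter config. The Python sketch below is only an illustration: the helper name fprm_lines and the assumption that the settings are written as plain key=value lines are not taken from Lingotek's tooling.

```python
# Settings to add for each segmentation level, as described in the steps above.
# An empty dict means "leave the filter's default settings".
SEGMENTATION_SETTINGS = {
    "paragraph": {"mergeAsParagraph.b": True, "needsSegmentation.b": True},
    "sentence": {},
    "phrase": {"mergeAsParagraph.b": False, "needsSegmentation.b": True},
}

def fprm_lines(level):
    """Return the key=value lines to add for a level (hypothetical helper)."""
    settings = SEGMENTATION_SETTINGS[level]
    return ["%s=%s" % (key, str(value).lower()) for key, value in settings.items()]

for level in ("paragraph", "sentence", "phrase"):
    print(level, fprm_lines(level))
```

Per the warning above, neither setting with needsSegmentation.b=true should be applied to a bilingual XLIFF.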


Set the Document

After applying the appropriate FPRM filter, open the XLIFF 2.0 document and adjust its segmentation. By default, the file will segment on sentences. If there is a paragraph that should not be segmented (i.e. its contents should be a single segment), you can set the canResegment attribute on its unit.


canResegment

To combine all sentences within a unit into a single segment, open the unit and set canResegment="no". This will bundle the entire paragraph into a single segment.
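
For files with many units, the attribute can also be set with a script rather than by hand. The following Python sketch uses the standard library's xml.etree.ElementTree. The sample document, the unit IDs, and the function name lock_unit_segmentation are made up for illustration. The namespace is the XLIFF 2.0 core namespace.

```python
import xml.etree.ElementTree as ET

XLIFF_NS = "urn:oasis:names:tc:xliff:document:2.0"

def lock_unit_segmentation(xliff_text, unit_id):
    """Return the XLIFF text with canResegment="no" set on the given unit."""
    # Keep the default namespace unprefixed when serializing.
    ET.register_namespace("", XLIFF_NS)
    root = ET.fromstring(xliff_text)
    for unit in root.iter("{%s}unit" % XLIFF_NS):
        if unit.get("id") == unit_id:
            unit.set("canResegment", "no")
    return ET.tostring(root, encoding="unicode")

# Made-up sample file with two units.
SAMPLE = """<xliff xmlns="urn:oasis:names:tc:xliff:document:2.0" version="2.0" srcLang="en">
  <file id="f1">
    <unit id="u1">
      <segment><source>First sentence. Second sentence.</source></segment>
    </unit>
    <unit id="u2">
      <segment><source>Another paragraph.</source></segment>
    </unit>
  </file>
</xliff>"""

updated = lock_unit_segmentation(SAMPLE, "u1")
print(updated)
```

Only unit u1 is changed here. Unit u2 keeps the default behavior and can still be resegmented when needsSegmentation.b=true is set.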

 

