TMS Enhancements

We’ve completely rewritten Lingotek’s fuzzy match algorithm to provide a better representation of TM leveraging and the amount of work linguists will have to do if they use the TM matches returned. Below are some additional details on the algorithm and its implementation:

The base score is now calculated by running a Levenshtein distance on the segment text (i.e., the base score is no longer a text match percentage, as before)
Penalties are subsequently deducted from the base score to determine the match percentage
It is now a centralized algorithm throughout TMS and Workbench:
Workbench TM Hits
Document Analysis
Project Analysis
Pre-fill TM (Actions dropdown)
Workflow Pre-fill
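
To make the two steps above concrete, here is a minimal Python sketch of a Levenshtein-based score with penalties deducted afterwards. It is an illustration only: the normalization by the longer segment's length, the function names, and the single `penalties` value are assumptions, not Lingotek's actual formula.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance: minimum number of insertions, deletions and substitutions."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def match_percentage(source: str, tm_source: str, penalties: float = 0.0) -> float:
    """Base score from the edit distance, then penalties deducted.

    Assumption: the distance is normalized by the longer segment's length.
    """
    longest = max(len(source), len(tm_source))
    if longest == 0:
        base = 100.0
    else:
        base = 100.0 * (1 - levenshtein(source, tm_source) / longest)
    return max(0.0, base - penalties)
```

For example, `match_percentage("abcd", "abce", penalties=5)` takes a base score of 75 (one edit over four characters) and deducts the hypothetical 5-point penalty.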

Exact 100%: The highest quality match currently in the Lingotek system. It is a 100% textual match, plus identical formatting between the document and TM source segments (either both have exactly the same formatting OR neither has any formatting at all).
Syntax 100%: A 100% textual (syntax) match, but the document and TM source segments have formatting differences. Format tags need to be applied.
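
The distinction between the two 100% match types can be sketched in Python as follows. The `Segment` class and its `tags` field are hypothetical stand-ins for however formatting is actually stored; only the decision rule comes from the definitions above.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Segment:
    text: str
    tags: Tuple[str, ...] = ()  # hypothetical representation of format tags


def classify_full_match(doc: Segment, tm: Segment) -> str:
    """Label a document/TM source pair according to the definitions above."""
    if doc.text != tm.text:
        return "not a 100% match"
    if doc.tags == tm.tags:  # same formatting, or neither has any
        return "Exact 100%"
    return "Syntax 100%"     # same text, but format tags still need applying
```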

In the pre-fill configuration dialog, there is a new option allowing the user to set the pre-fill minimum to Syntax 100% matches. You can also lock pre-filled Syntax 100% matches so the translator/reviewer cannot edit the translation, but simply needs to apply tags.

We’ve simplified the TM Analysis report to reduce confusion and improve readability and the overall user experience. The new report uses a consistent color scheme throughout TMS and the Workbench. The report includes two views of the data: Condensed and Detailed. The Condensed view summarizes the results into four groupings: Exact Match, High Fuzzy, Low Fuzzy and No Match. These groupings can be configured by a Community Administrator. The Detailed view includes all match types as a full report.

The Pre-fill Report has also been redesigned to be consistent with the Analysis Report, including the same color scheme and usability enhancements. With these enhancements, the Pre-fill Report now displays the results of a full analysis, plus a “Total Pre-filled” row. In addition, when running a manual “Pre-fill TM” action on multiple documents (or an entire project), we now roll up the pre-fill data for each document in the set. This roll-up report is shown on the project’s “TM Statistics” tab and is currently only available for manual pre-fill actions (not yet available through a workflow pre-fill).
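
Conceptually, the roll-up sums the per-document results into one set of totals. The Python sketch below shows the idea; the data layout (a dictionary of counts per document) and the file names are made up for illustration and do not reflect how TMS stores the data.

```python
from collections import Counter
from typing import Dict


def roll_up(per_document: Dict[str, Dict[str, int]]) -> Dict[str, int]:
    """Sum per-document counts into a single set of totals."""
    total = Counter()
    for counts in per_document.values():
        total.update(counts)  # Counter.update adds counts together
    return dict(total)


# Hypothetical counts for two documents in the same pre-fill action.
example = {
    "a.docx": {"Exact Match": 10, "No Match": 5},
    "b.docx": {"Exact Match": 3, "High Fuzzy": 7},
}
totals = roll_up(example)
```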

When exporting the new TM Analysis and Pre-fill reports, the selected view (Condensed or Detailed) will be exported. Users can choose between CSV, XLS, and XLSX formats for download. The first worksheet in the download is a summary page displaying the analysis results. Subsequent worksheets are included for each language-pair for the documents analyzed. The language-pair specific worksheets display the analysis results for each individual document that was analyzed.

The new Analysis and Pre-fill reports include two views of the data: Condensed and Detailed. A Community Administrator can customize the Condensed report groupings by changing how the match percentages map to the groupings. They can also change the names of the groupings themselves. This configuration can be done by navigating to: Community -> Customizations -> Translation Memory.
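
As a rough illustration of what this configuration controls, the Python sketch below maps a match percentage to a Condensed grouping. The threshold values and the `group_for` function are assumptions chosen for the example; the actual defaults are set in the Community customization page.

```python
# (grouping name, minimum match percentage), ordered from highest to lowest.
# These thresholds are hypothetical, not the product defaults.
DEFAULT_GROUPS = [
    ("Exact Match", 100),
    ("High Fuzzy", 85),
    ("Low Fuzzy", 50),
    ("No Match", 0),
]


def group_for(percentage, groups=DEFAULT_GROUPS):
    """Return the first grouping whose minimum the percentage reaches."""
    for name, minimum in groups:
        if percentage >= minimum:
            return name
    return groups[-1][0]  # anything below every minimum goes to the last group
```

Renaming a grouping or moving a boundary then amounts to passing a different `groups` list, for example `group_for(80, [("Perfect", 100), ("Fuzzy", 75), ("New", 0)])`.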

The Community Customization page has also been redesigned to provide a better user experience by grouping similar settings into tabs.

Workbench Enhancements

We’ve redesigned the way we display TM hits in the Workbench, showing the user the difference between the document source and the matching TM source segment. The diff highlighting shows the user the TM source text they would need to delete (red strikethrough) and add (green underline) if they were to use that TM hit. This is a significant improvement over the previous highlighting of matching text in the Workbench TM hits.
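
The idea behind the diff view can be approximated with Python's standard `difflib` module. This is only a sketch: it compares whole words and uses `[-…-]` and `{+…+}` markers in place of the red strikethrough and green underline, and it is not the Workbench's actual implementation.

```python
from difflib import SequenceMatcher


def diff_tm_hit(tm_source: str, doc_source: str) -> str:
    """Mark the words to delete from, and add to, the TM source."""
    tm_words, doc_words = tm_source.split(), doc_source.split()
    matcher = SequenceMatcher(a=tm_words, b=doc_words, autojunk=False)
    parts = []
    for op, a1, a2, b1, b2 in matcher.get_opcodes():
        if op == "equal":
            parts.extend(tm_words[a1:a2])
        if op in ("delete", "replace"):   # text to remove from the TM source
            parts.append("[-" + " ".join(tm_words[a1:a2]) + "-]")
        if op in ("insert", "replace"):   # text to add from the document source
            parts.append("{+" + " ".join(doc_words[b1:b2]) + "+}")
    return " ".join(parts)


print(diff_tm_hit("Click the red button", "Click the green button"))
# prints: Click the [-red-] {+green+} button
```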

Along with our new fuzzy match algorithm, we’ve also built a new Concordance Search algorithm, improving the relevancy of the search results. Concordance results are displayed in descending order, from the longest common sub-segment matches down to the least common.
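
One way to picture this ordering is to rank TM segments by the length of the longest run of consecutive words they share with the search text. The Python sketch below does exactly that with `difflib`. The word-level, case-insensitive comparison and both function names are assumptions, since the actual Concordance Search algorithm is not described here.

```python
from difflib import SequenceMatcher
from typing import List


def longest_common_run(query: str, candidate: str) -> int:
    """Length, in words, of the longest run shared by the two texts."""
    q, c = query.lower().split(), candidate.lower().split()
    matcher = SequenceMatcher(a=q, b=c, autojunk=False)
    return matcher.find_longest_match(0, len(q), 0, len(c)).size


def concordance_search(query: str, tm_segments: List[str]) -> List[str]:
    """Return segments sharing at least one word run, longest run first."""
    scored = [(longest_common_run(query, s), s) for s in tm_segments]
    hits = [(size, s) for size, s in scored if size > 0]
    hits.sort(key=lambda pair: pair[0], reverse=True)  # stable sort keeps ties in input order
    return [s for _, s in hits]
```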

Solved an intermittent issue where Workflows weren’t saving after configuration
Exact 100% matches with special characters are now being returned correctly
User is now prompted to save a workflow when navigating away from the phases tab
Firefox now correctly displays the scrollbar in the In-Context Review preview window
