01-2015
TMS Enhancements
New Fuzzy TM Match Algorithm
We’ve completely rewritten Lingotek’s fuzzy match algorithm to provide a better representation of TM leveraging and the amount of work linguists will have to do if they use the TM matches returned. Below are some additional details on the algorithm and its implementation:
The base score is now calculated from the Levenshtein distance on the segment text (i.e., it is no longer based on a text match percentage as before)
Penalties are subsequently deducted from the base score to determine the match percentage
It is now a centralized algorithm throughout TMS and Workbench:
Workbench TM Hits
Document Analysis
Project Analysis
Pre-fill TM (Actions dropdown)
Workflow Pre-fill
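To illustrate the scoring described above, here is a minimal sketch of a Levenshtein-based match percentage. The normalization by the longer segment's length and the single `penalties` value are assumptions made for illustration; these notes do not specify how Lingotek converts the distance into a percentage or which penalties are deducted.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance counting insertions, deletions and substitutions."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution (or match)
            ))
        previous = current
    return previous[-1]


def match_percentage(doc_text: str, tm_text: str, penalties: float = 0.0) -> float:
    """Base score from edit distance, minus penalties (assumed formula)."""
    longest = max(len(doc_text), len(tm_text))
    if longest == 0:
        base = 100.0  # two empty segments are treated as identical
    else:
        base = 100.0 * (1 - levenshtein(doc_text, tm_text) / longest)
    return max(0.0, base - penalties)
```

Under these assumptions, "The dog is brown." against "The cat is brown." differs by three substitutions over 17 characters, giving a base score of roughly 82% before any penalties.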
Exact Matches split into Exact 100% and Syntax 100%
Exact 100%: The highest quality match currently in the Lingotek system. It is a 100% textual match, plus identical formatting between the document and TM source segments (either both have exactly the same formatting OR neither has any formatting at all).
Syntax 100%: A 100% textual (syntax) match, but the document and TM source segments have formatting differences. Format tags need to be applied.
Examples:
Document Source | TM Source | Match Type |
The dog is brown. | The dog is brown. | Exact 100% |
The <b>dog</b> is brown. | The <b>dog</b> is brown. | Exact 100% |
The <b>dog</b> is brown. | The dog is brown. | Syntax 100% |
The <b>dog</b> is brown. | <b>The dog is brown.</b> | Syntax 100% |
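The distinction in the examples above can be sketched as follows. This is an illustrative approximation, not Lingotek's implementation: it assumes formatting is expressed as inline tags and treats two segments as identically formatted when the same tags appear at the same positions in the plain text. The names `split_segment` and `classify_full_match` are hypothetical.

```python
import re

# Assumed tag syntax: anything that looks like an inline markup tag.
TAG_PATTERN = re.compile(r"</?[A-Za-z][^>]*>")


def split_segment(segment: str):
    """Return (plain_text, tags), where tags is a list of (offset, tag)."""
    plain_parts = []
    tags = []
    plain_length = 0
    position = 0
    for match in TAG_PATTERN.finditer(segment):
        text = segment[position:match.start()]
        plain_parts.append(text)
        plain_length += len(text)
        tags.append((plain_length, match.group(0)))  # offset in plain text
        position = match.end()
    plain_parts.append(segment[position:])
    return "".join(plain_parts), tags


def classify_full_match(doc_source: str, tm_source: str):
    """Classify a 100% textual match; return None if the text differs."""
    doc_text, doc_tags = split_segment(doc_source)
    tm_text, tm_tags = split_segment(tm_source)
    if doc_text != tm_text:
        return None  # not a 100% textual match
    return "Exact 100%" if doc_tags == tm_tags else "Syntax 100%"
```

With this sketch, two untagged or identically tagged segments classify as Exact 100%, while segments whose text matches but whose tags differ in presence or position classify as Syntax 100%.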
New Pre-fill Configuration Option for Syntax 100% Matches
In the pre-fill configuration dialog, there is a new option allowing the user to set the pre-fill minimum to Syntax 100% matches. You can also lock pre-filled Syntax 100% matches so the translator/reviewer cannot edit the translation, but simply needs to apply tags.
Redesigned TM Analysis Report
We’ve simplified the TM Analysis report to reduce confusion and improve readability and the overall user experience. The new report uses a color scheme that is consistent throughout TMS and the Workbench. The report includes two views of the data: Condensed and Detailed. The Condensed view summarizes the results into four groupings: Exact Match, High Fuzzy, Low Fuzzy and No Match. These groupings can be configured by a Community Administrator. The Detailed view includes all match types as a full report.
Redesigned Pre-fill Report
The Pre-fill Report has also been redesigned to be consistent with the Analysis Report, including the same color scheme and usability enhancements. The Pre-fill Report now displays the results of a full analysis, plus a “Total Pre-filled” row. In addition, when a manual “Pre-fill TM” action is run on multiple documents (or an entire project), the pre-fill data for each document in the set is now rolled up into a single report. This roll-up report is shown on the project’s “TM Statistics” tab and is currently only available for manual pre-fill actions (not yet available through a workflow pre-fill).
Redesigned TM Analysis and Pre-fill Report Exports
When exporting the new TM Analysis and Pre-fill reports, the selected view (Condensed or Detailed) will be exported. Users can choose between CSV, XLS, and XLSX formats for download. The first worksheet in the download is a summary page displaying the analysis results. Subsequent worksheets are included for each language-pair for the documents analyzed. The language-pair specific worksheets display the analysis results for each individual document that was analyzed.
Configurable TM Analysis Report Groupings
The new Analysis and Pre-fill reports include two views of the data: Condensed and Detailed. A Community Administrator can customize the Condensed report groupings by changing how the match percentages map to the groupings. They can also change the names of the groupings themselves. This configuration can be done by navigating to: Community -> Customizations -> Translation Memory.
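As a rough illustration of how match percentages could map to the Condensed groupings, consider the sketch below. The threshold values (100, 85, 50, 0) and the function names are hypothetical; the actual defaults and the configuration format used in TMS are not described in these notes.

```python
# Hypothetical configuration: (minimum match percentage, grouping name),
# ordered from the highest threshold to the lowest.
DEFAULT_GROUPINGS = [
    (100, "Exact Match"),
    (85, "High Fuzzy"),
    (50, "Low Fuzzy"),
    (0, "No Match"),
]


def grouping_for(match_percentage, groupings=DEFAULT_GROUPINGS):
    """Return the first grouping whose minimum the percentage reaches."""
    for minimum, name in groupings:
        if match_percentage >= minimum:
            return name
    raise ValueError(f"no grouping covers {match_percentage}%")


def condensed_report(word_counts_by_percentage, groupings=DEFAULT_GROUPINGS):
    """Sum word counts per grouping to produce a Condensed-style summary."""
    totals = {name: 0 for _, name in groupings}
    for percentage, words in word_counts_by_percentage.items():
        totals[grouping_for(percentage, groupings)] += words
    return totals
```

Renaming a grouping or moving a threshold then only requires changing the configuration list, which mirrors the kind of customization a Community Administrator can make.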
Redesigned Community Customization Page
The Community Customization page has also been redesigned to provide a better user experience by grouping similar settings into tabs.
Workbench Enhancements
Diff Highlighting on TM Hits in the Workbench
We’ve redesigned the way we display TM hits in the Workbench, showing the user the difference between the document source and the matching TM source segment. The diff highlighting shows the user the TM source text they would need to delete (red strikethrough) and add (green underline) if they were to use that TM hit. This is a significant improvement over the previous highlighting of matching text in the Workbench TM hits.
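The idea behind the diff highlighting can be sketched with Python's standard `difflib` module. This is only an illustration under assumptions: it compares whole words rather than characters, and it uses `[-...-]` for text to delete and `{+...+}` for text to add in place of the red and green styling. The function name `diff_tm_hit` is hypothetical.

```python
from difflib import SequenceMatcher


def diff_tm_hit(tm_source: str, doc_source: str) -> str:
    """Mark the edits needed to turn the TM source into the document source."""
    tm_words = tm_source.split()
    doc_words = doc_source.split()
    matcher = SequenceMatcher(a=tm_words, b=doc_words, autojunk=False)
    pieces = []
    for op, a1, a2, b1, b2 in matcher.get_opcodes():
        if op == "equal":
            pieces.extend(tm_words[a1:a2])
        if op in ("delete", "replace"):
            # TM text the linguist would remove (shown as strikethrough)
            pieces.append("[-" + " ".join(tm_words[a1:a2]) + "-]")
        if op in ("insert", "replace"):
            # Document text the linguist would add (shown as underline)
            pieces.append("{+" + " ".join(doc_words[b1:b2]) + "+}")
    return " ".join(pieces)


print(diff_tm_hit("The dog is brown.", "The cat is brown."))
# prints: The [-dog-] {+cat+} is brown.
```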
New Concordance Search Matching Algorithm
Along with our new fuzzy match algorithm, we’ve also built a new Concordance Search algorithm, improving the relevancy of the search results. Concordance results are displayed in descending order of match length, from the longest common sub-segment matches down to the shortest.
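A simplified way to picture this ordering is to rank TM segments by the length of the longest run of consecutive words they share with the search text. The sketch below makes that assumption explicit. The word-level, case-insensitive tokenization and the function names are illustrative choices, not a description of the actual Concordance Search implementation.

```python
import re
from difflib import SequenceMatcher


def longest_common_run(query: str, segment: str) -> int:
    """Length, in words, of the longest shared run of consecutive words."""
    query_words = re.findall(r"\w+", query.lower())
    segment_words = re.findall(r"\w+", segment.lower())
    matcher = SequenceMatcher(a=query_words, b=segment_words, autojunk=False)
    match = matcher.find_longest_match(0, len(query_words), 0, len(segment_words))
    return match.size


def concordance_search(query: str, tm_segments):
    """Return segments sharing at least one word, longest shared run first."""
    scored = [(longest_common_run(query, seg), seg) for seg in tm_segments]
    hits = [(size, seg) for size, seg in scored if size > 0]
    hits.sort(key=lambda hit: hit[0], reverse=True)  # stable for ties
    return [seg for _, seg in hits]
```

For example, searching for "the dog is brown" would rank a segment containing that whole phrase ahead of one that only shares "dog is brown", and segments with no words in common would be dropped.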
Bugs
TMS
Solved an intermittent issue where Workflows weren’t saving after configuration
Exact 100% matches with special characters are now being returned correctly
User is now prompted to save a workflow when navigating away from the phases tab
Firefox now correctly displays the scrollbar in the In-Context Review preview window
Workbench
Workbench now correctly displays TM hits from multiple vaults