TransPerfect Legal Solutions  
   
 

Multilingual OCR – Optical Character Recognition

At TLS, we provide state-of-the-art Optical Character Recognition – OCR – technology, which allows us to convert image and paper documents into editable digital text (such as one would find in a Word document). This process is very useful for the kind of large-scale document reviews that often accompany international litigation. Combined with our unparalleled language expertise, this technology allows TLS to perform optical character recognition on documents in over 100 languages. Moreover, all of our technology is fully Unicode compliant, which means we are able to OCR a wide range of non-Latin scripts such as Arabic, Bengali, and Chinese (Traditional/Simplified).
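To illustrate why Unicode-compliant output matters, here is a minimal Python sketch that tallies which scripts appear in a piece of OCR'd text. It is not TLS's tooling; the function name `script_counts` is hypothetical, and taking the first word of each character's Unicode name is only a rough stand-in for the formal Unicode Script property.

```python
import unicodedata
from collections import Counter


def script_counts(text: str) -> Counter:
    """Count letters per script, keyed by the first word of the Unicode name.

    This is an approximation: "LATIN", "ARABIC", "BENGALI", and "CJK" are
    name prefixes, not values of the official Unicode Script property.
    """
    counts = Counter()
    for ch in text:
        if not ch.isalpha():
            continue  # skip digits, punctuation, spaces, combining marks
        name = unicodedata.name(ch, "")
        if name:
            counts[name.split()[0]] += 1
    return counts


if __name__ == "__main__":
    # Hypothetical OCR output mixing four scripts.
    sample = "Contract عقد ক 合同"
    for script, count in script_counts(sample).most_common():
        print(script, count)
```

A tally like this could be used to route each document to a reviewer who reads the dominant script, though a production system would more likely rely on a dedicated language-identification step.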

TLS can then import those results directly into our online platform, Case Interactive, or any other hosting platform, to begin an online review by the client or our certified linguists. Because TLS provides an index function, clients can perform keyword searches in the target language without running a time-consuming paper review. Our multilingual OCR technology also generates considerable savings in translation and staffing costs. By hosting the review process online and in the native language of the reviewer, the client can ensure faster, more accurate reviewing and a significant decrease in the number of personnel who would ordinarily be required for a paper review.
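The keyword search described above can be pictured as an inverted index over the OCR'd text. The following Python sketch is illustrative only and does not describe how Case Interactive works internally; the function names and document IDs are hypothetical. It normalizes tokens with NFKC and case folding so that searches match regardless of case or compatibility forms.

```python
import re
import unicodedata
from collections import defaultdict


def normalize(token: str) -> str:
    """Fold case and Unicode compatibility forms so variants compare equal."""
    return unicodedata.normalize("NFKC", token).casefold()


def build_index(docs: dict[str, str]) -> dict[str, set[str]]:
    """Map each normalized word to the set of document IDs containing it."""
    index: dict[str, set[str]] = defaultdict(set)
    for doc_id, text in docs.items():
        # \w+ is Unicode-aware for Python str patterns.
        for token in re.findall(r"\w+", text):
            index[normalize(token)].add(doc_id)
    return index


def search(index: dict[str, set[str]], keyword: str) -> list[str]:
    """Return the sorted IDs of documents containing the keyword."""
    return sorted(index.get(normalize(keyword), set()))


if __name__ == "__main__":
    # Hypothetical OCR results keyed by made-up document IDs.
    docs = {
        "DOC-001": "Der Vertrag wurde in Berlin unterzeichnet.",
        "DOC-002": "Le contrat a été signé à Paris.",
        "DOC-003": "Vertrag und Anlage",
    }
    index = build_index(docs)
    print(search(index, "vertrag"))
```

Splitting on word characters works for space-delimited languages; scripts written without spaces between words, such as Chinese or Japanese, would need a dedicated word segmenter before indexing.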

Multilingual OCR for Machine Translation
The Speedy Solution for Multilingual Data Discovery

When you need to translate extremely high volumes of text with a very short turnaround, machine translation – MT – may be the answer. By combining state-of-the-art optical character recognition – OCR – and translation memory – TM – technologies with the industry's best legal and linguistic experts, TLS can help you translate your documents quickly and without a significant reduction in quality.

How it Works

Using advanced OCR technology, we can convert high volumes of hard-copy or handwritten documents into machine-readable text and process them in conjunction with live data files to help you pinpoint where to spend your translation dollars most effectively. Then, taking into account the quality and complexity of the original documents/data as well as the end use of the translation, we will provide you with a Source Material Analysis report proposing the best-fit solution for your project. With this report, your organization has the opportunity to verify – before the process begins – that the strategy and file formats will suit your needs and fit within your discovery systems or other database-backed applications for seamless import and integration of translated data.

Once methodology and file formats have been determined, all text undergoes our robust machine translation process, which may be supplemented with varying levels of human review.

Diagram:

  1. Documents are collected in electronic and hard copy formats
  2. Front-end Source Material Analysis and Source Sampling help determine the ideal project strategy
  3. Optical Character Recognition (OCR) software is used on non-live text documents to create electronic files where possible
  4. All documents are run against existing translation memory – TM – for matching segments
  5. Remaining text is translated through advanced machine translation – MT – software
  6. If requested, reviewers comb through translated text, correcting basic spelling, grammar, and flow
  7. Final files are delivered to the client in electronic format
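Steps 4 and 5 above can be sketched in Python as a translation memory lookup with a machine translation fallback. This is a simplified illustration under stated assumptions, not TLS's actual process: the TM is modeled as an exact-match dictionary, the sentence splitter is naive, and `placeholder_mt` is a hypothetical stand-in for a real MT engine.

```python
import re
from typing import Callable


def split_segments(text: str) -> list[str]:
    """Naively split text into sentence-like segments at ., ! or ?"""
    parts = re.split(r"(?<=[.!?])\s+", text)
    return [p.strip() for p in parts if p.strip()]


def translate_document(
    text: str,
    tm: dict[str, str],
    mt: Callable[[str], str],
) -> list[tuple[str, str, str]]:
    """Return (source, translation, origin) per segment.

    Segments found in the translation memory reuse the stored translation
    (origin "TM"); all remaining segments go to the MT callable (origin "MT").
    """
    results = []
    for segment in split_segments(text):
        if segment in tm:
            results.append((segment, tm[segment], "TM"))
        else:
            results.append((segment, mt(segment), "MT"))
    return results


def placeholder_mt(segment: str) -> str:
    """Hypothetical MT engine: tags the text instead of translating it."""
    return f"[MT] {segment}"


if __name__ == "__main__":
    tm = {"Sehr geehrte Damen und Herren.": "Dear Sir or Madam."}
    text = "Sehr geehrte Damen und Herren. Der Vertrag endet im Mai."
    for source, target, origin in translate_document(text, tm, placeholder_mt):
        print(origin, "|", source, "->", target)
```

Recording the origin of each segment makes it straightforward to direct the optional human review in step 6 toward MT output, since TM matches were already translated previously. Real translation memories typically also support fuzzy matches, which this exact-match sketch leaves out.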

Our MT process supports the following languages:

  • Arabic
  • Dutch
  • English
  • French
  • German
  • Italian
  • Japanese
  • Korean
  • Polish
  • Portuguese
  • Russian
  • Simplified Chinese
  • Spanish
  • Swedish
  • Traditional Chinese

Why Machine Translation?

Ever since the Federal Rules of Civil Procedure were amended to regulate the preservation of ESI (Electronically Stored Information), the increased burden to preserve electronic information has led to an excess of data that must be collected during discovery. Using traditional human translation for all ESI can be time-consuming and extremely costly. With MT technology, you can do a quick first pass at the documents, providing your reviewers with enough information to obtain a general overview of the data without spending extra time and money on exact translations of documents that will never make it to court.

While MT technology is a valuable tool, it is not the best approach for every project, which is why TLS will work with you to determine the best possible solution on a case-by-case basis. One alternative to machine translation is an on-site document review: TLS can provide a linguist who works on-site with your case team to conduct a review of all documents in the target language, eliminating non-essential or irrelevant documents on the spot and allowing you to keep all data in-house.

 