IRIS PDF OCR plug-in trouble
Thread poster: Pavel Slama
Pavel Slama  Identity Verified
United Kingdom
Local time: 19:52
Member (2014)
English to Czech
+ ...
Aug 23, 2017

Good afternoon. I was quite excited about the new OCR feature, with which Trados’s OCR will support my languages, such as Czech.

However, I am probably doing something wrong: I’ve installed the plug-in and enabled it in Options, but attempts to OCR a Czech document via Trados still result in illegible gibberish, as though the software had not recognized the language of the document (and perhaps defaulted to English).

I first installed the plug-in, but when I subsequently ticked it in Options, a message said that I should download & install it.

Thanks for any advice.


 
CafeTran Training (X)
Netherlands
Local time: 20:52
Words glued together? Aug 23, 2017

Pavel Slama wrote:

Good afternoon. I was quite excited about the new OCR feature


I watched the video and I was wondering: are these words like "Itis rowthe mostLiked etc." really glued together?

Screen Shot 2017-08-23 at 18.43.06

If so, I'd say it's a rather poor result of Iris' OCR.


 
Pavel Slama  Identity Verified
United Kingdom
Local time: 19:52
Member (2014)
English to Czech
+ ...
TOPIC STARTER
Czech example Aug 23, 2017

OK, so to be more specific, I’ll give a very straightforward example.

Original:
Capture

Google Docs built-in OCR (0 mistakes in this paragraph):
Capture2

Trados with IRIS OCR:
Capture3

But I’m still hoping there may be a human factor on my part.


 
José Henrique Lamensdorf  Identity Verified
Brazil
Local time: 16:52
English to Portuguese
+ ...
In memoriam
Butting in... Aug 23, 2017

Though my late parents were Polish, I don't speak any of it. Nor Czech, if that matters.

However I see that the ž (CZ) was OCR'd as ż (PL).
Is there any chance your program was set up for Polish (too)?

I had a similar experience with an ancient OCR program (can't recall its name), where ó (PT) was OCR'd as 6, until I realized that it was still set for EN, in spite of my insistent setting for PT.


 
RWS Community
United Kingdom
Local time: 20:52
English
You watched the video? Aug 23, 2017

CafeTran Training wrote:

If so, I'd say it's a rather poor result of Iris' OCR.


If that was IRIS I'd agree with you


 
Pavel Slama  Identity Verified
United Kingdom
Local time: 19:52
Member (2014)
English to Czech
+ ...
TOPIC STARTER
That’s what I’m wondering, whether my setup’s right Aug 23, 2017

José Henrique Lamensdorf wrote:
... any chance your program was set up for Polish (too)?


It’s not my programme, it’s the brand new plug-in made by SDL themselves, I believe. It is completely possible the setup is not right, and that's why I’m asking for help. I used it from a project set up as CS>EN.

Otherwise, well done, José, for recognizing Polish characters where Czech ones should be.


 
RWS Community
United Kingdom
Local time: 20:52
English
I copied your image... Aug 23, 2017

Pavel Slama wrote:

OK, so to be more specific, I’ll give a very straightforward example.

Original:
Capture

But I’m still hoping there may be a human factor on my part.


... with a screen capture and saved as a PDF. Then opened the PDF in Studio using IRIS. Doesn't look as bad as your test to me. Are you sure you used IRIS?

https://www.dropbox.com/s/byi088wuew3wcfq/cz_iris.jpg?dl=0

Regards

Paul


[Edited at 2017-08-23 22:36 GMT]


 

