https://www.proz.com/forum/office_applications/247145-how_to_get_rid_of_junk_ocr_character_leftover_in_word.html

How to get rid of junk OCR character leftover in Word
Thread poster: Susan Welsh
Susan Welsh
United States
Russian => English
Apr 16, 2013

I have converted a PDF to Word using ABBYY FineReader, and wherever there was a hyphen at a line ending, the Word version has put in a junk character that I cannot search and replace to get rid of. It looks like a horizontal line with a short vertical line hanging down from the back of it -- like an L rotated 90 degrees clockwise. I have copied it into my Find field, but Word can't find it.

There are hundreds of these things in this rather long document, and I would really like to get a clean text to make translating easier.

Any suggestions?

Thanks in advance!


 
Kevin Fulton
United States
German => English
Look under special characters Apr 16, 2013

If I recall correctly, this character is the optional hyphen, which you can search for in Word with the code ^-.
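
For anyone who wants to confirm which invisible character is involved outside Word, here is a minimal Python sketch. It assumes the text has been saved or pasted as plain Unicode text and that the optional hyphens came through as U+00AD (SOFT HYPHEN); that is an assumption, since Word may represent them differently inside a .doc/.docx file. The function name `unusual_characters` and the sample string are made up for illustration.

```python
import unicodedata
from collections import Counter

def unusual_characters(text):
    """Count format (Cf) and control (Cc) characters, ignoring newlines and tabs."""
    counts = Counter(
        ch for ch in text
        if unicodedata.category(ch) in ("Cf", "Cc") and ch not in "\n\r\t"
    )
    # Label each character with its code point and Unicode name (if it has one).
    return {
        f"U+{ord(ch):04X} {unicodedata.name(ch, 'unnamed control')}": n
        for ch, n in counts.items()
    }

# Hypothetical sample: two soft hyphens hidden inside ordinary words.
sample = "docu\u00adment with a hy\u00adphen\n"
report = unusual_characters(sample)
print(report)
```

If the report lists something other than U+00AD, the junk character is probably not an optional hyphen and the Find code above may not match it.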

 
Sam Pinson
United States
ProZ.com member since 2011
Russian => English
Optional hyphens can be replaced in Word Apr 16, 2013

Hi, Susan.

Please see my blog article on how to replace these "optional hyphens":
http://pinsonlingo.com/blog/2011/05/27/tag-char-namesoftbreakhyphen-removed/


 
LEXpert
United States
ProZ.com member since 2008
Croatian => English
Easy! Apr 16, 2013

This is very common in multi-column articles.
Open Word's Find and Replace dialog.
Under Find, click the "More >>" button.
Place the cursor in the Find box, and from the Special drop-down menu select "Optional Hyphen".
Leave the Replace box blank.
Click Replace All.


That's it.
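
If the text ever needs cleaning outside Word, the same replace-with-nothing idea can be sketched in Python. This is only a sketch under an assumption: that the document has been exported to plain text and the optional hyphens appear there as U+00AD (SOFT HYPHEN). The helper name `strip_soft_hyphens` and the sample string are hypothetical.

```python
SOFT_HYPHEN = "\u00ad"  # Unicode SOFT HYPHEN (assumed form of the optional hyphen in exported text)

def strip_soft_hyphens(text: str) -> str:
    """Remove every soft hyphen; ordinary hyphens are left untouched."""
    return text.replace(SOFT_HYPHEN, "")

# Hypothetical sample with soft hyphens left over from line endings.
sample = "trans\u00adlation of a multi\u00adcolumn article"
cleaned = strip_soft_hyphens(sample)
print(cleaned)
```

Regular hyphens such as the one in "well-known" are a different character, so they survive the cleanup, just as they do with Replace All in Word.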


 
esperantisto
ProZ.com member since 2006
English => Russian
Site localizer
Better take care of it in FR Apr 16, 2013

In FineReader, go to Tools → Options → 4. Save → Format Settings → RTF/DOC/Word XML, tick Remove Optional Hyphens, and then re-export your document.

[Edited at 2013-04-16 07:57 GMT]


 
Susan Welsh
United States
Russian => English
Thread poster
Thanks! Apr 16, 2013

I used Rudolf's solution, and it worked like a charm. (I didn't want to go back to FR, because I had already done some formatting work on the Word file, like moving footnotes around.)

Thanks to all.


 

