What procedure do you use to convert? Calibre or Amazon's auto convert?
I've found Calibre gives good results but sometimes leaves page numbers and footers in, whereas Amazon's isn't as good but usually removes footers and the like.
Well, I did some Google searching and stumbled upon using regular expressions in Calibre. I've been able to remove the footers and mistakes.
Thanks for this post - it motivated me to get to know Calibre better. :-)
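For anyone wondering what that kind of regex cleanup looks like, here's a rough sketch in Python rather than in Calibre's own search-and-replace box. The two patterns and the name `strip_footers` are just my assumptions about what typical PDF footers look like (a bare page number on its own line, or "Page N of M"); you'd need to adjust them to your own book.

```python
import re

# Hypothetical footer patterns; adjust these to match your own PDF's output.
FOOTER_PATTERNS = [
    # A line containing nothing but a 1-4 digit page number.
    re.compile(r"^\s*\d{1,4}\s*$"),
    # A line like "Page 12" or "Page 12 of 340" (case-insensitive).
    re.compile(r"^\s*page\s+\d+(\s+of\s+\d+)?\s*$", re.IGNORECASE),
]


def strip_footers(text: str) -> str:
    """Return text with every line that matches a footer pattern removed."""
    kept = [
        line
        for line in text.splitlines()
        if not any(pattern.match(line) for pattern in FOOTER_PATTERNS)
    ]
    return "\n".join(kept)


if __name__ == "__main__":
    sample = "It was a dark night.\n12\nPage 13 of 200\nThe end."
    print(strip_footers(sample))
```

One caveat: the first pattern also removes any genuine line that is only a number (a year on its own line, say), so check the result before converting.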
I actually run it through Mobipocket Creator. If there is text data embedded in the PDF then that works, but you do have to manually remove unwanted material like headers, footers, and page numbers from many of the pages.
What you do is load the PDF into Mobipocket Creator, then cancel and close the program completely. What you'll have left over is a "New Folder" in your specified location. In that folder you'll find a raw HTML file that should contain all the text extracted from the book, which you can edit in Word.
Once you're happy with the file you're working on in Word, just save it somewhere else as an HTML file, drop it into Calibre, and convert it into a MOBI file as your final step. That is the best way I have found to create a flawless MOBI from a PDF. The formatting is good, and you've got total control.
If it doesn't have embedded text, then I run the PDF through ABBYY FineReader and then load the text into Word. I hope that helps. I wasn't trying to be super complicated, it just came out that way.
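If you'd rather script the last step than drag the file into Calibre, here's a minimal sketch that calls Calibre's `ebook-convert` command-line tool from Python. It assumes Calibre is installed and `ebook-convert` is on your PATH; the function names and the `book.html` / `book.mobi` file names are made up for the example.

```python
import shutil
import subprocess


def build_convert_command(html_path: str, mobi_path: str) -> list[str]:
    """Build the ebook-convert call: input file first, then output file."""
    return ["ebook-convert", str(html_path), str(mobi_path)]


def convert_html_to_mobi(html_path: str, mobi_path: str) -> None:
    """Run the conversion, failing loudly if Calibre's CLI is unavailable."""
    if shutil.which("ebook-convert") is None:
        raise RuntimeError("ebook-convert not found; is Calibre installed and on PATH?")
    # check=True raises CalledProcessError if the conversion fails.
    subprocess.run(build_convert_command(html_path, mobi_path), check=True)


if __name__ == "__main__":
    # Hypothetical file names; point these at your cleaned-up HTML file.
    convert_html_to_mobi("book.html", "book.mobi")
```

`ebook-convert` picks the output format from the extension of the second file name, so the same call with a different extension should give you a different format.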
Great tips. Really helpful.
The only problem for me is that I'm on a Mac, and Mobipocket Creator is Windows-only. :-(
And it's too bad that Calibre (at least my version) doesn't have an option for converting PDF into HTML.
You could try this and let me know how it works out:
http://labnol.blogspot.com/2005/12/convert-doc-xls-ppt-rtf-pdf-to-html.html