What is the easiest way to convert a physical book to digital?

jonuno@lemmy.ml to Asklemmy@lemmy.ml – 30 points –

Title. I got hold of a couple of books (one text, the other images) that I would like to make available to others, and they don't exist in digital format. Photos to PDF? Or something that converts images to text?


First you need to get images of the pages. Better-quality images make the OCR step work better. Then run image-to-text (OCR) on them.

Make a frame to hold the camera at the right height above the page. Good lighting.

It's not quick unless you have the hardware.

Another option is to send the book to a place that will scan it for you. Google for options.

Our university library has a high-performance book scanner. It costs about 1 cent per 10 pages, and it's so fast that it might be worth checking whether yours has one too. You can then extract the text from the images with tesseract. There are also ready-made tools that will do this for you and make the PDF searchable.
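For the tesseract step, here is a rough sketch of how it could be scripted. It assumes the `tesseract` command-line tool is installed and that the scans are PNG files in one folder; the folder name, the file naming and the helper functions are made up for illustration. Passing `pdf` as the last argument asks tesseract for a searchable PDF instead of plain text.

```python
import shutil
import subprocess
from pathlib import Path


def tesseract_pdf_command(image: Path, out_base: Path, lang: str = "eng") -> list[str]:
    """Build the command that OCRs one page image into a searchable PDF.

    Tesseract adds the ".pdf" extension to out_base on its own.
    """
    return ["tesseract", str(image), str(out_base), "-l", lang, "pdf"]


def ocr_folder(scan_dir: Path, lang: str = "eng") -> list[Path]:
    """OCR every PNG in scan_dir in filename order, producing one PDF per page."""
    if shutil.which("tesseract") is None:
        raise RuntimeError("tesseract is not installed or not on PATH")
    pdfs = []
    for image in sorted(scan_dir.glob("*.png")):
        out_base = image.with_suffix("")  # page_001.png -> page_001
        subprocess.run(tesseract_pdf_command(image, out_base, lang), check=True)
        pdfs.append(Path(str(out_base) + ".pdf"))
    return pdfs


if __name__ == "__main__":
    # "scans" is a hypothetical folder holding page_001.png, page_002.png, ...
    for pdf in ocr_folder(Path("scans")):
        print("wrote", pdf)
```

This gives one PDF per page, so you would still need to merge them into a single file with whatever PDF tool you prefer.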

How does it flip between pages?

You flip by hand, but it goes very fast. It's like a table: you lay the book open (text side up), it scans, you flip, it scans. You don't have to open or close anything. Just flip about once every second. So for a book with 120 pages you would need about 60 seconds.
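To put numbers on that, here is a small back-of-the-envelope sketch. It assumes each capture covers a double page (two pages per flip) at one flip per second, and it uses the roughly 1 cent per 10 pages price mentioned above. Whether the library bills proportionally or rounds up is an assumption on my part.

```python
import math


def scan_seconds(pages: int, seconds_per_flip: float = 1.0, pages_per_capture: int = 2) -> float:
    """Estimated scan time: one capture per open double page."""
    captures = math.ceil(pages / pages_per_capture)
    return captures * seconds_per_flip


def scan_cost_cents(pages: int, cents_per_10_pages: float = 1.0) -> float:
    """Estimated cost, assuming the price is billed proportionally (an assumption)."""
    return pages / 10 * cents_per_10_pages


print(scan_seconds(120))     # 60.0 seconds for a 120-page book
print(scan_cost_cents(120))  # 12.0 cents
```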

Usually I think of a scanner as being like a photocopier; this sounds more like it takes a picture?

Well a photocopier takes a picture too 😉

You know what I mean. So it's more of a camera than a scanner?

Well, it still has that bright light that advances across the page. But the sensor is about 1 m above the table. Honestly, I don't know if this makes it more of a camera or a scanner.

My old uni library has two of those too. It's basically a fixed camera, but the table you set the book on has a sort of negative nook that lets you level out the book. So no matter which page you are on, the pages of the book will be at the same level.

I don't know how you're going to get a hold of the text from the images. But I do know that if you're trying to create a book file, PDFs are not the answer. EPUBs are far better, and an open standard. I recommend creating them using the Calibre EPUB editor.

EPUBs are better because they were designed specifically for books. They're reflowable (meaning the pages aren't fixed-size, and can therefore be read on devices of all sizes), whereas PDFs have fixed content and are very difficult to read on small screens like phones and e-readers, requiring zooming just to see the text. Also, EPUBs aren't very difficult to create. You just have to know how XML works. An EPUB is basically just a zipped directory containing markup files.
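To show what "a zipped directory containing markup files" looks like in practice, here is a minimal sketch that writes a one-chapter EPUB 3 using only the Python standard library. The file names under `OEBPS/`, the `write_minimal_epub` helper and the sample text are my own choices for illustration, and I have not run the result through a validator, so treat it as a starting point rather than a finished tool. In a real project the chapter text would come from your OCR output.

```python
import uuid
import zipfile
from datetime import datetime, timezone
from xml.sax.saxutils import escape

CONTAINER_XML = """<?xml version="1.0" encoding="UTF-8"?>
<container version="1.0" xmlns="urn:oasis:names:tc:opendocument:xmlns:container">
  <rootfiles>
    <rootfile full-path="OEBPS/content.opf" media-type="application/oebps-package+xml"/>
  </rootfiles>
</container>
"""

OPF_TEMPLATE = """<?xml version="1.0" encoding="UTF-8"?>
<package xmlns="http://www.idpf.org/2007/opf" version="3.0" unique-identifier="book-id">
  <metadata xmlns:dc="http://purl.org/dc/elements/1.1/">
    <dc:identifier id="book-id">{identifier}</dc:identifier>
    <dc:title>{title}</dc:title>
    <dc:language>en</dc:language>
    <meta property="dcterms:modified">{modified}</meta>
  </metadata>
  <manifest>
    <item id="nav" href="nav.xhtml" media-type="application/xhtml+xml" properties="nav"/>
    <item id="ch1" href="chapter1.xhtml" media-type="application/xhtml+xml"/>
  </manifest>
  <spine>
    <itemref idref="ch1"/>
  </spine>
</package>
"""

NAV_XHTML = """<?xml version="1.0" encoding="UTF-8"?>
<html xmlns="http://www.w3.org/1999/xhtml" xmlns:epub="http://www.idpf.org/2007/ops">
  <head><title>Contents</title></head>
  <body>
    <nav epub:type="toc">
      <h1>Contents</h1>
      <ol><li><a href="chapter1.xhtml">Chapter 1</a></li></ol>
    </nav>
  </body>
</html>
"""

CHAPTER_TEMPLATE = """<?xml version="1.0" encoding="UTF-8"?>
<html xmlns="http://www.w3.org/1999/xhtml">
  <head><title>{title}</title></head>
  <body>
    <h1>{title}</h1>
{body}
  </body>
</html>
"""


def write_minimal_epub(target, title, paragraphs):
    """Write a one-chapter EPUB 3 to `target` (a path or a binary file object)."""
    modified = datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    opf = OPF_TEMPLATE.format(
        identifier=f"urn:uuid:{uuid.uuid4()}",
        title=escape(title),
        modified=modified,
    )
    body = "\n".join(f"    <p>{escape(p)}</p>" for p in paragraphs)
    chapter = CHAPTER_TEMPLATE.format(title=escape(title), body=body)

    with zipfile.ZipFile(target, "w", compression=zipfile.ZIP_DEFLATED) as epub:
        # The mimetype entry must be the first file in the archive and stored uncompressed.
        epub.writestr("mimetype", "application/epub+zip", compress_type=zipfile.ZIP_STORED)
        epub.writestr("META-INF/container.xml", CONTAINER_XML)
        epub.writestr("OEBPS/content.opf", opf)
        epub.writestr("OEBPS/nav.xhtml", NAV_XHTML)
        epub.writestr("OEBPS/chapter1.xhtml", chapter)


if __name__ == "__main__":
    # Hypothetical output name and sample text.
    write_minimal_epub("my_book.epub", "My Scanned Book", ["First paragraph.", "Second paragraph."])
```

The Calibre editor hides most of this, but it helps to know the layout when something breaks.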

When I wanted to quickly scan two books to my Kindle, I used vFlat. It wanted money, but somehow I was able to scan both books without paying anything. I positioned my phone so that it could see the double page from above and then set the app to take a picture every X seconds (about 5, I think). Then I just flipped a page and it took a picture of the double page, created two PDF pages from it, fixed the alignment and perspective, removed fingers from the corners, and so on. Pretty good experience. I tried several FOSS alternatives before, but sadly none was as good as this.
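I don't know how vFlat does it internally, but the "two pages from one photo" part is easy to approximate yourself. Here is a minimal sketch that only computes the crop rectangles for the left and right page, assuming the spine sits in the middle of the photo. The `split_double_page` helper and the `gutter` parameter are made up for illustration. The boxes use the (left, upper, right, lower) order that Pillow's `Image.crop` expects. It does none of the perspective fixing or finger removal.

```python
def split_double_page(width: int, height: int, gutter: int = 0):
    """Return crop boxes (left, upper, right, lower) for the left and right page.

    Assumes the spine is centered; `gutter` is how many pixels around the
    spine to drop in total (half from each page).
    """
    middle = width // 2
    half_gutter = gutter // 2
    left_box = (0, 0, middle - half_gutter, height)
    right_box = (middle + half_gutter, 0, width, height)
    return left_box, right_box


print(split_double_page(4000, 3000))
# ((0, 0, 2000, 3000), (2000, 0, 4000, 3000))
```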

There are book scanning services out there

There are also book scanning services out there

There are also book scanning services out there

wtf, it was only one when i clicked send lol

I heard there are book scanning services out there

Funnily enough I have heard there are book scanning services out there

If the books aren't too obscure, you might just be able to find an EPUB of them online. It's sort of a moral grey-area, but considering you already own the books I assume, you can very likely find them here.

I don't own the books; they are from the library, which does not have digital versions available. I would like them to be more portable for me, and to share them with others who can't access or afford them. I get a lot from zlibrary, so I'd like to contribute when I can too.

I converted several books to PDF using 1dollarscan. I think OCR was an option, but I just split the PDF and used Linux tools to OCR the resulting files that I wanted OCRed.

Edited to add:

I don’t own the books,

Oh. The scanning service above is destructive: the book is cut apart to be scanned, so it's not an option for library books.

There are some great guides at https://www.diybookscanner.org/ that you can check out. Some of them are a bit outdated, so I would recommend current software like tesseract; phone cameras should be good enough for general use.

I use Microsoft Office Lens for this kind of stuff. The photos you take come out aligned. They can be saved as PDF and loads of other formats.