Book Scanning & Post-processing Manual

—based on the Public Library overhead scanner

Public Library & Multimedia Institute

Publishing by the Hybrid Publishing Consortium 2014

http://consortium.io/

Publishing Information

Book Scanning & Post-Processing Manual

Based on Public Library Overhead Scanner

Written by: Tomislav Medak, Dubravka Sekulić with the help of An Mertens

Published by the Hybrid Publishing Consortium 2014

Web version 978-3-95796-500-4 http://bookscanner.consortium.io

ISBNs

Web 978-3-95796-500-4

Print 978-3-95796-501-1

eBook 978-3-95796-502-8

PDF 978-3-95796-503-5

Revision 0.1

Copyright the Multimedia Institute

Creative Commons Attribution-ShareAlike 3.0 Germany (CC BY-SA 3.0 DE) http://creativecommons.org/licenses/by-sa/3.0/de/deed.en Legend: This deed is used in the absence of an intellectual property framework that represents the authors' respective position on copyright.

The Hybrid Publishing Consortium is a project of the Hybrid Publishing Lab in collaboration with partners and associates. The Hybrid Publishing Lab is part of the Leuphana University of Lüneburg Innovation Incubator, financed by the European Regional Development Fund and co-funded by the German federal state of Lower Saxony.

Contact: mail@consortium.io | https://consortium.io | http://cdc.leuphana.com/structure/hybrid-publishing-lab/

Acknowledgements

Consortium multi-platform publishing using A-machine technology. Copy editing Simon Worthington and Christina Kral. Backend technology wrangling Johannes Amorosa. Template implementation by Creative Coop and Hybrid Publishing Consortium. Templates are published as CC BY-SA 3.0 DE and available at https://github.com/hybrid-publishing-lab/design-templates

Introduction

Book Scanning - From Paper Book to e-Book

Initial considerations when deciding on a scanning setup

Book scanning tends to be a fragile and demanding process. Many things can go wrong or produce results of varying quality from book to book or page to page, and resolving the issues that occur takes experience or technical skill. Cameras can fail to trigger, components can fail to communicate, files can get corrupted in transfer, the storage card may not get purged, focus can fail to lock, and lighting conditions can change. There is a trade-off between automation, which is prone to instability, and robustness, which is prone to becoming time-consuming.
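Two of the failure modes above, corrupted transfers and a storage card that was never purged, can be guarded against with a simple copy-and-verify step. The following is a minimal sketch in Python, not part of the Public Library tooling: the function names, the directory layout and the choice of SHA-256 checksums are all assumptions made for illustration. It copies every file from the card to an archive folder, compares checksums, and deletes a file from the card only once its copy has been verified.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def transfer_and_verify(card_dir: Path, archive_dir: Path, purge: bool = False) -> list:
    """Copy each file in card_dir to archive_dir and verify the copy.

    Returns the list of source files whose copy did NOT match.
    A source file is deleted only if purge is True and its copy verified.
    (Hypothetical helper: names and behaviour are assumptions.)
    """
    archive_dir.mkdir(parents=True, exist_ok=True)
    failed = []
    for source in sorted(p for p in card_dir.iterdir() if p.is_file()):
        target = archive_dir / source.name
        shutil.copy2(source, target)  # copies content and timestamps
        if sha256_of(source) == sha256_of(target):
            if purge:
                source.unlink()  # safe to remove: the copy is verified
        else:
            failed.append(source)  # keep the original for a retry
    return failed
```

Calling `transfer_and_verify(Path("/media/card/DCIM"), Path("scans/book-001"), purge=True)` (both paths are placeholders) would leave on the card only those files whose copies failed verification, so a non-empty return value is the signal to retry before the next scanning session.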

Your initial choice of book scanning setup will have to take those trade-offs into consideration. If your scanning community is confined to your hacklab, you won't be risking much if technological sophistication and integration fail to function smoothly. But if you're aiming at a broad community of users, with varying levels of technological skill and patience, you want to create as much time-saving automation as possible without sacrificing stability. Furthermore, if the time of your scanning community is limited, you might also want to divide some of the tasks between users according to their different skill levels.

This manual breaks down the process of digitization into a general description of the steps in the workflow leading from the printed book to a digital e-book. In a concrete situation, each of these steps can be addressed in various ways, depending on the scanning equipment, software, hacking skill and user skill available to your book scanning project. Several of those steps can be handled by a single piece of equipment or software, or you might need to use a number of them - your mileage will vary. Therefore, the manual will try to indicate the design choices you have when planning your workflow and should help you decide which design is the best fit for your situation.

Introducing book scanner designs

Book scanning starts with the capturing of digital image files on the scanning equipment. There are three principal types of book scanner designs:

Conventional flatbed scanners are widely available. However, they require the book to be spread wide open and pressed down against the platen in order to break the resistance of the binding and expose the inner margin of the text, which makes flatbed scanning the approach most destructive to the book, as well as imprecise and slow.

Therefore, book scanning projects across the globe have taken to custom-designing improvised setups or scanner rigs that are less destructive and better suited to fast turning and capturing of pages. Designs abound. Most include one or two digital cameras of lower or higher quality to capture the pages, a transparent V-shaped glass or Plexiglas platen to press the open book against a V-shaped cradle, and a light source. The go-to web resource to help you make an informed decision is the DIY book scanning community at http://diybookscanner.org. A good place to start is their intro (http://wiki.diybookscanner.org/) and scanner build list (http://wiki.diybookscanner.org/scanner-build-list).

Book scanners with a single camera are substantially cheaper, but the angle at which the pages are photographed distorts the page images, and the dewarping this requires can be difficult to get right in post-processing. Hence, in this introductory chapter we'll focus on two-camera designs in which the camera lens stands roughly parallel to the page. However, with a bit of adaptation these instructions can be used with any other setup.

The Public Library scanner

The practical focus of this manual is the scanner built for the Public Library project, designed by Voja Antonić (see Illustration 1). The Public Library scanner was built for immediate use by a wide community of users. Hence, the principal consideration in designing it was less sophistication and more robustness, ease of use and distributed post-processing.

The board designs can be found here: http://www.memoryoftheworld.org/blog/2012/10/28/our-beloved-bookscanner

The current iterations use two Canon EOS 1100D cameras with the Canon EF-S 18-55mm 1:3.5-5.6 IS kit lens. Cameras are auto-charging.