Today I’m going to type a little about the actual process of scanning a book or what we scanners refer to as “rawing” a book (however the hell you are supposed to spell it). By this I mean the part of the process in which I prepare the book for scanning and produce the raw images I will later be editing and processing into the files I share here.
First things first, get comfortable! Set up your scanner so that it is easy for you to operate and so that you have plenty of room to move about. I like to put my scanner on a piano bench adjacent to my desk, so that I never have to lean too far to operate it. I’ll be putting up pictures of my setup as I go here to make this all a bit more clear and because a post is much more fun with pictures.
The scanning stage of the process is really my favorite step. I sit down to scan, put on an album or maybe click on a ballgame or a documentary (nothing that requires my full visual attention) and just zone out for a bit. The mundane task of flipping pages lets me drift into a half-assed variety of meditation that I find very relaxing. This step goes most quickly if I avoid distractions. Surfing the web, reading scans, e-mailing, etc. can turn a 15-minute scan into an hours-long scan, so I try, not always successfully, to keep a steady pace going.
First, you will want to set up your scanning software to send your images to a dedicated folder on your computer. I’ve got mine named “scanner incoming,” and all of my various scanners send their images to this single folder. When I scan a magazine or a comic, I do not want to fuss with the images as they come in because this is very distracting and inefficient. I might check my results in the first few pages just to make sure I like what I’m getting, but beyond that I only check from time to time that I haven’t skipped any pages.
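If you would rather not count files by eye, that occasional skipped-page check can be scripted. Here is a minimal Python sketch, assuming every scan lands in one folder as one image file per page; the function names and the list of extensions are made up for illustration, so adjust them to whatever your scanner actually writes.

```python
from pathlib import Path

# Extensions treated as scans (an assumption; edit to match your scanner).
IMAGE_EXTENSIONS = {".tif", ".tiff", ".png", ".bmp", ".jpg", ".jpeg"}

def count_scans(folder):
    """Count the image files sitting directly inside `folder`."""
    folder = Path(folder)
    return sum(
        1
        for p in folder.iterdir()
        if p.is_file() and p.suffix.lower() in IMAGE_EXTENSIONS
    )

def pages_missing(folder, expected_pages):
    """Return how many scans are still missing (negative means extras)."""
    return expected_pages - count_scans(folder)
```

Calling something like pages_missing("scanner incoming", 132) partway through tells you how many pages are left. It only counts files, so it cannot tell you which page was skipped.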
What I want in my raw scans is an unadulterated image that looks exactly like the page it is coming from. Sure, you can set up your scanner software to alter the images as they come in, but this is often like using a hacksaw instead of a scalpel. Photoshop (or whatever image processing program you are using, more on this topic later) is a far superior tool for adjusting images than your scanner’s software, so I try to minimize alterations or automatic adjustments. For some scanners' software this will mean adjusting the pre-scan settings or auto-leveling. Most often this means tweaking down the brightness, contrast, and even saturation until the raw scan looks uniformly like what you see on the page itself. Sometimes it will even be necessary to adjust the white, black, or neutral in levels.
At this stage, you will want to pick a file format and dpi setting for your scans that you are comfortable with. The raw scans that you get directly from your scanner should be of the utmost quality, and the raw images we work with are much, much larger than the compressed images that go out in the final scan. Personally, I like to go with a lossless format and scan to .tif. Other scanners I know scan to .bmp or .png, but at the very least I suggest going with a .jpeg saved at the highest quality (lowest compression) setting, since a .jpeg is never truly uncompressed. While the size for a single image you get here might seem very large, I feel that it’s worth it, and an average computer will still slice and dice these images in the processing stage fairly quickly. I like to keep all of my raw scans in perpetuity in case I want to revisit a scan for a new or different edit and just to have an unadulterated archive around. Sure, this can take up a lot of room, but even a large scan might take up $.25 or $.50 worth of hard drive space, and, if I care enough to put a scarce or valuable book on my scanner that no other soul might ever care to scan again, this seems like a very small cost indeed. Other guys I know, though, will just toss their raws when they’ve gotten a final product, and there’s nothing wrong with that, especially for common publications. If you have the desire to revisit an old scan, often a fresh scan with a newer machine and up-to-the-minute techniques gets the best result.
As for dpi setting, there’s considerable debate on the optimum setting. Many professional or archival scanning guidelines I’ve seen suggest a 600 dpi setting, while there are many good scanners that get great results with settings as low as 200 dpi. 600 dpi most definitely is greater than the original printing setup of vintage (or even modern) material, but it also ensures that line work is captured accurately and that small fonts are perfectly legible. Of course, scanning at 600 dpi yields an image of a much greater pixel width than the screens that scans will be viewed upon. Some scanners feel that scanning in higher dpi leads to a “grittier” look than is achieved with a smaller setting or that such a high setting is overkill. They might be right. On the other hand, if you scan at too low a resolution, you are increasing the risk of moiré in the scan and can end up making small fonts illegible or making fine line work blocky. These days, I scan at 400 dpi (in the past I’ve scanned at 300 dpi, which is probably sufficient), which I find to be a nice compromise between speed and image quality. One other factor to consider here is that most OCR programs are going to work best with at least 300 dpi.
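If you want a feel for what a given dpi costs you in file size, the arithmetic is simple: pixels wide times pixels tall times bytes per pixel. Here is a minimal Python sketch, assuming a hypothetical 7 x 10 inch scan area and 24-bit color (3 bytes per pixel). The function name and page dimensions are only for illustration, and a real uncompressed .tif carries a little header overhead on top of this figure.

```python
def raw_scan_megabytes(width_in, height_in, dpi, bytes_per_pixel=3):
    """Estimate one uncompressed scan's size in megabytes (10**6 bytes)."""
    width_px = round(width_in * dpi)    # pixels across the scan area
    height_px = round(height_in * dpi)  # pixels down the scan area
    return width_px * height_px * bytes_per_pixel / 1_000_000

# Hypothetical 7 x 10 inch page in 24-bit color at several common settings.
for dpi in (200, 300, 400, 600):
    print(dpi, "dpi:", round(raw_scan_megabytes(7, 10, dpi), 1), "MB")
```

Under those assumptions the estimates come out to about 8.4, 18.9, 33.6, and 75.6 MB. Doubling the dpi quadruples the file size, which is worth keeping in mind when weighing 300 against 600.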
Moving on (geez I ramble), let’s continue to the scan itself. The number one enemy of a good scan is failing to get the page flat on the glass. Spine shadow is awfully annoying, and even the slightest curvature or waviness in the raw scan can make text illegible or distort artwork. When I told McCoy that I was doing a how-to, his first response was “tell em to pull the staples!!!” And while this no doubt makes most collectors cringe, it is indeed the best recipe for a good scan. But please do not let this stop you. If you have a rare book and are willing to scan it but do not want to destroy it, please do scan it even if you are not willing to take it apart; it’s much appreciated no matter what. Just make sure that you are using a heavy book or weights on the scanner lid to get your book as flat as possible. I’ve personally come to the conclusion that for most material, pulling the staples is the best way to go. In fact, for cheap pulps or squarebound magazines, sacrificing the book is really the logical choice. Even for valuable golden age comics, I find pulling the staples is the best choice. Besides yielding a flatter image and more page space to work with, it is easier on the book: folding a spine back and forth 36 times or pressing down on it with the scanner lid does more damage than you risk by snapping a staple or enlarging the staple holes. When I first started deconstructing books and magazines, I would sweat bullets in this stage, but now I really do enjoy it. And because I’m something of an anti-collector (paper is only a vessel – unless we are talking about my girlie pulp collection :D) this comes naturally to me.
I scanned a pulp last night (which will be showing up here on Thursday, hopefully!, in a series of scans McCoy and I are doing for Veterans Day) and took some pictures of the process to show that you can do this with minimal damage to your pulp if you are careful. Let me make a disclaimer here that this works better with pages that aren’t so brittle. A pulp can look great, but if the pages are brittle, it’s not going to fare well in this process. This is one reason I’m skeptical of much of the grading that goes on with comics and pulps. I’ve gotten mid or high grade pulps that look really nice, but if the paper is brittle, I’d rather just have a well-read, browning-but-supple beater copy.
So this particular pulp is probably an issue that I’d normally be far rougher with, but I plan on giving it to my granny when I go home for Xmas (as it was given to me by a fellow scanner to have my way with) and I was thinking of this post, so I went through the process of pulling staples and reassembling. But before you start to take anything apart, get some scans of the covers and spine, just in case they suffer damage in the debind process.
A nice, solid reading grade pulp and pretty much the condition I like for a scanning copy. The cover is complete with a little creasing and tattering about the edges. The spine is near complete. The pages are browning but supple. Of course, we often have to go with the most affordable copies we can find, but a copy like this is going to come out of the other side of the process about like you see it here. High grade pulps usually don’t fare as well. But don’t let me talk you out of it! By any means necessary, I say, paper degrades, glory is forever! :p
Cough, but on with the program. Pulling staples can be a bit delicate, but just go slow and take your time and the patient will be O.K. First I open up the back cover so that I can get to the backside of the staples, taking care not to bend the back cover over too much.
Next, I use a set of needle-nose micro pliers to carefully bend the staples straight.
Then I turn the magazine up so that the front and back cover are lying flat. Now I can pull those puppies.
This step can be tricky. Sometimes, the staples come right out. Other times, they take some coaxing. You can use the micro pliers to push carefully from the back side or to grab them from the front if you aren’t able to manipulate them out by hand. Go slow and be patient, you can do it!
Set aside the staples in a manner so that you remember exactly how they came out of the book. For comics in particular this is important so that the staples go back in easily. If the staples are rusty, this is a great chance to spray them with a little WD-40 and prevent further damage to the book. A caveat here on rusty staples: they sometimes snap! If a set of staples is really rusty on a pricey comic, you could skip the debind. If a staple snaps, I’ll usually replace it with an extra staple of a similar vintage…
Now, you can pull out all of the loose pages, leaving only the pages that are actually glued in at the spine.
Now the book is ready to scan. The pages still attached to the magazine are a little trickier and may not end up perfectly flat, but there will be no spine shadow encroaching on any text in the scan. I use a book placed atop a thicker backer board to assist in getting the page flat.
Scanning the loose pages will go very fast and the images will be perfectly flat. In the second pic, you can see how I have one set of pages on the left of the scanner and that as I scan them I have another pile on the right. When I get to the middle leaf of each section, I move the pile from the right over to where the pile from the left was and continue as before. BTW that white strip you see on my scanner bed can, on some models, help make the scanner’s auto-leveling more even. Some scanner software levels every page by picking the lightest and darkest points, so having some true white on the bed can aid in getting truer colors.
When you are done, it’s time to reassemble, though you’ll want to double check that you have all of the pages for your scan in hand. It is truly frustrating to put a book back together and realize that you’ve missed a couple of pages. When you put the sections back in the appropriate place in the book, you can use a backer board placed in the center of the section to help get the pages all the way to the spine. Re-inserting the staples can be a little tricky in a pulp, but with a little patience it’s easily done.
Here’s the old girl post-scan. Minus some pulp flakes here and there, she’s in exactly the same shape as when I started. I could almost say, in mock indignation, one of my granddad’s oft-used lines - “I never laid a hand on the broad!”:
The mention of pulp flakes reminds me, you will need to clean your scanner bed often when working with pulp, especially a brittle issue. I use a lens cloth from a camera store, but there are a variety of microfiber products out there that won’t scratch your glass. Keeping the glass clean is very important: one prominent but unnoticed smudge or hair can ruin a whole scan. Periodically, you will want to give your glass a more thorough cleaning with isopropyl alcohol. Be wary of using conventional glass cleaner because the ammonia can react with the coating on some scanner beds. More occasionally than that, you might need to clean the underside of your glass as well, as that damn pulp gets everywhere.
And I might as well post what one of these raw pages looks like. The raw 400 dpi .tif I get weighs in at 35.1 MB (the end result I share will probably be about 800 KB, so you can see the enormous size difference). I’ve got to shrink the thing way down just to get it hosted, but this pic will give you some idea of the color I’m after:
Once you have all of your raws for the issue you are working on, put them in a folder unto themselves. I use a renamer program to then tag each image with the page number and the issue it came out of. Some scanner software lets you do this pre-scan, but if you miss pages, have to rescan pages, etc., this is more trouble than it’s worth. Most renamer programs will let you name all of the files in a folder numerically, even when there are gaps in the sequence. Having each page named well is good archival policy and it prevents the possible loss of pages should many files with “image 0001” get thrown in together by accident.
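For anyone who would rather script this than hunt down a renamer program, here is a rough Python sketch of the same idea. It is only an illustration, not any particular renamer program: the function name, the "tag_p001.tif" naming pattern, and the example issue tag are all made up, and it assumes your scanner’s filenames sort in the order you scanned the pages.

```python
from pathlib import Path

def rename_raws(folder, issue_tag, extension=".tif", dry_run=True):
    """Rename scans in `folder` to '<issue_tag>_p001.tif', '_p002.tif', ...

    Files are numbered in sorted filename order, so gaps in the scanner's
    own numbering do not matter. Returns (old_name, new_name) pairs.
    With dry_run=True nothing is renamed; you only get the plan back.
    """
    folder = Path(folder)
    raws = sorted(
        p for p in folder.iterdir()
        if p.is_file() and p.suffix.lower() == extension
    )
    plan = [
        (p, p.with_name(f"{issue_tag}_p{n:03d}{extension}"))
        for n, p in enumerate(raws, start=1)
    ]
    # Refuse to overwrite anything that is already in the folder.
    for old, new in plan:
        if new != old and new.exists():
            raise FileExistsError(f"{new.name} already exists")
    if not dry_run:
        for old, new in plan:
            old.rename(new)
    return [(old.name, new.name) for old, new in plan]
```

Run it once with dry_run=True and read over the returned pairs before renaming for real, for example rename_raws("my raws folder", "Some_Pulp_1943-11"). If you had to rescan pages out of order, sort those out by hand first, since the script only trusts the filename order.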
And while I’m thinking about it, here are some links that are of use in the debinding process for newer glue-bound publications, thanks to these scanners for sharing their helpful tips:
I hope this has been a fairly coherent post. After four hours of typing away, I’m not so sure I’ve kept my focus - a common occurrence after my time at the keyboard, nyuk, nyuk.
I will continue with this series on how to scan very soon with an installment on image processing and some different editing options a scanner has to choose from, but tomorrow I'll interrupt this series with some WWII material that I’m scrambling to get done in honor of Veterans Day.