Scanning old documents to TIFF. Is it worth scanning in 48-bit color?

JakeGould

I am currently scanning old papers that have some notes on them, using an Epson V370 scanner. I want the output files to be in TIFF format, but I am not sure which bit depth to choose.

One of the papers has only a few notes in black ink and no other colors. I want to suck out the highest quality from that scanner, but is there a point in scanning white paper with black ink at such a high color depth such as the 48-bit that is the max on my scanner?

Also, if I have a paper with blue ink, will a higher color depth affect the quality?

6
Even for papers with blue ink, scanning in grayscale and adjusting the levels will give excellent readability. For black text, 16-bit grayscale is as good as 48-bit color, unless you want to preserve faint blue ruled lines on note paper. Resolution is far more important. Richie Frame 9 years ago 3
The only application I know of where this might be worthwhile was the scanning of historical documents (an author's letters), where high-bit-depth color scanning (and post-process filtering) could see through the brownish-black ink with which the literary executor had crossed out some items, revealing the bluish-black ink in which they were written. Ecnerwal 9 years ago 0

4 answers

10
fixer1234

To answer how many bits of color depth you need and how it affects your results, let me start with a quick explanation of what color depth actually is.

What is color depth?

Color depth describes how many shades of color will be stored. If an image has extremely fine gradations of a color, scanning and storing an extremely high number of colors means that those fine distinctions will be coded differently in the stored image and can be differentiated when you do image manipulation. Storing a lower number of bits means that some of those gradations will be stored as the same color, so they won't be differentiated.

You may have seen this effect trying to store a photo containing fine gradients into a lower bit format, like an 8 bit GIF, which stores only 256 unique colors. Instead of a continuous gradient, you see bands because multiple shades must be condensed into fewer available colors, producing color "steps", like the comparison below.

[Figure: a continuous gradient compared with the same gradient showing color bands]
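The banding effect can be reproduced numerically. The following is a minimal sketch, not taken from the answer: it assumes a single 8-bit channel and a hypothetical `quantize` helper that stores each value with only 3 bits, so that many neighboring shades collapse into the same stored value.

```python
def quantize(value, bits_in=8, bits_out=3):
    """Store a value with bits_out precision, then rescale it to the bits_in range."""
    levels_in = (1 << bits_in) - 1
    levels_out = (1 << bits_out) - 1
    step = round(value / levels_in * levels_out)   # nearest coarse level
    return round(step / levels_out * levels_in)    # rescaled for display

gradient = list(range(256))                  # smooth ramp, one shade per value
banded = [quantize(v) for v in gradient]     # same ramp stored with 3 bits

print(len(set(gradient)))  # distinct shades in the original ramp
print(len(set(banded)))    # distinct shades left after quantization
```

Each run of identical values in `banded` corresponds to one visible band in the image.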

How many bits are required?

The human eye can distinguish more than 256 shades of each primary color, but 256 shades are sufficient to render images in what appears to be photographic quality. That requires 8 bits for each primary color, or 24 bits in total. In combination, that's over 16 million colors. At 48 bits, or 16 bits per primary color, over 65,000 shades of each primary color can be differentiated. That is far beyond what the eye can distinguish.
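The figures above follow directly from powers of two. A small sketch of the arithmetic (the function names are illustrative only):

```python
def shades_per_channel(bits_per_channel):
    """Number of distinct levels one primary color can take."""
    return 2 ** bits_per_channel

def total_colors(bits_per_channel, channels=3):
    """Number of distinct colors across all channels combined."""
    return shades_per_channel(bits_per_channel) ** channels

for bpc in (8, 16):
    print(f"{bpc * 3}-bit: {shades_per_channel(bpc):,} shades per channel, "
          f"{total_colors(bpc):,} colors")
```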

48 bit color

So why bother with 48 bit color at all? Because it's useful for photographic work. Detail may be washed out in the brightest areas or hard for the eye to distinguish in the darkest areas. With image manipulation, these ranges can be stretched to put more distance between similar colors so this detail is better differentiated. However, that leaves holes in the color spectrum. Starting with 48 bits provides those in-between colors that would otherwise be missing.

When you stretch one range of colors, other colors get compressed, consolidating some colors. Other types of image manipulation cause similar loss of some of the color values. When you start out with 24 bits, the cumulative loss through successive processing steps can be a noticeable degradation. Starting with 48 bits, even substantial loss of colors still leaves far more than is required.
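To illustrate the "holes" left by stretching, here is a rough sketch under simplified assumptions: a single channel, a linear stretch of the darkest tenth of the range onto a full 8-bit output, and hypothetical helper names. It counts how many of the 256 output levels are actually used when the source has 8 versus 16 bits per channel.

```python
def stretch(value, low, high, max_out):
    """Linearly map [low, high] onto [0, max_out], clipping values outside the range."""
    clipped = min(max(value, low), high)
    return round((clipped - low) / (high - low) * max_out)

def distinct_after_stretch(bits):
    """Count distinct 8-bit output levels after stretching the darkest tenth of the range."""
    max_in = (1 << bits) - 1
    high = max_in // 10                       # shadows only
    outputs = {stretch(v, 0, high, 255) for v in range(high + 1)}
    return len(outputs)

print(distinct_after_stretch(8))   # source with 8 bits per channel
print(distinct_after_stretch(16))  # source with 16 bits per channel
```

With the 8-bit source only a small fraction of the output levels is populated, which shows up as gaps in the histogram; the 16-bit source has enough in-between values to fill every level.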

The result typically has to be reduced to 24 bits to display or print it normally. So even for photographic work, 48 bits is special-purpose.

Color depth vs. ability to scan colors

The scanner has specific optical properties and every scan is captured at the color depth that the hardware produces. That information is processed by software to produce an image at a specified color depth. So if your scanner is capable of 48 bit color, that's what's captured. If you want only 24 bits, some of the colors are consolidated.

However, at any color depth, every color on the page will be stored as something. The difference is that at higher color depth, you will be able to tell more of them apart. So, for example, a higher color depth doesn't let you capture blue better.

Scanning text

If you are talking about text, there is absolutely no benefit from using 48 bits. It will just give you huge files that are slow to work with. But some amount of color depth can be helpful in cleaning up the scan.

Use of color information for cleanup

Consider a fax. It works with 1 bit, which gives you black or white. So every color on the page must be represented by one or the other. That's accomplished by selecting a threshold darkness. Anything lighter becomes white; anything darker becomes black (essentially the same process is used to convert 48 bit colors to 24 bit colors). With a fax, the result is often a mess -- blocky letters, a smudge becomes a grainy black blob, a fold in the paper becomes a black line.
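A threshold conversion of this kind can be sketched in a few lines. The gray values below are made-up samples chosen for illustration, not measurements; 0 is black and 255 is white.

```python
def to_one_bit(gray_pixels, threshold=128):
    """Return 1 (white) or 0 (black) for each 8-bit gray value."""
    return [1 if p >= threshold else 0 for p in gray_pixels]

# Hypothetical 8-bit gray samples from a scanned page
samples = {"paper": 235, "crease": 140, "smudge": 110, "faded ink": 120, "solid ink": 30}

bits = to_one_bit(list(samples.values()))
for name, bit in zip(samples, bits):
    print(f"{name:10s} -> {'white' if bit else 'black'}")
```

In this sample the smudge is darker than the faded ink, so no single threshold keeps the ink while dropping the smudge: lowering the threshold to 115 removes the faded ink and still leaves the smudge black.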

That's because of what the scanner sees. The paper isn't pure white (and it might be yellowed in an uneven way). If there are any folds or wrinkles, you can see them because they introduce shading. The letters on the page aren't pure black, and often contain lighter areas. Dirt or smudges have darkness and color. Often, the darkest portions of artifacts are darker than the lightest portions of the content. This complicates trying to produce a clean scanned page.

Having some color information to work with allows you to use image manipulation tools to clean up the scan; to distinguish artifacts from content. After the artifacts have been removed, the scan can be made more readable by reducing the color depth. Forcing text to be dark and the background to be white more closely mimics what the original document looked like when it was freshly printed on white paper.

Bottom Line

Color depth won't improve your ability to capture colors, like blue, which don't scan as well as some other colors. However, it gives you the ability to improve the result. Scanning in 24 bit color is a good starting point if the originals are not pristine. Even if it was originally black ink on white paper, the color information will make it much easier to get rid of artifacts, which usually do have color.

Once you have removed the artifacts, the color information can be used to improve the appearance of the content. Blue ink that didn't scan well can be darkened without affecting colors that did scan well. Things like an embossed notary seal that might be barely visible can be darkened. Off-white paper can be whitened. Contrast between the content and the background can be improved.
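As a rough illustration of why color information helps here, the sketch below uses a simple heuristic that is an assumption of this example, not the answer's method: a pixel counts as blue ink when its blue channel clearly exceeds red and green. The pixel values and cutoffs are hypothetical.

```python
def gray(rgb):
    """Plain average of the three channels, as a crude grayscale value."""
    return sum(rgb) // 3

def is_bluish(rgb, margin=30):
    """Heuristic: the blue channel clearly exceeds both red and green."""
    r, g, b = rgb
    return b - max(r, g) >= margin

def clean_pixel(rgb, margin=30, paper_floor=160):
    """Darken bluish ink, whiten light areas, and leave everything else alone."""
    if is_bluish(rgb, margin):
        return (20, 20, 20)              # treat as ink and darken it
    if min(rgb) >= paper_floor:          # light area: paper or a faint stain
        return (255, 255, 255)
    return rgb                           # other content is left untouched

pale_blue_ink = (150, 160, 215)
faint_stain = (185, 175, 165)

print(gray(pale_blue_ink), gray(faint_stain))   # identical in grayscale
print(clean_pixel(pale_blue_ink), clean_pixel(faint_stain))
```

The two sample pixels have the same gray value, so a grayscale scan could not tell them apart, while the color scan lets the ink be darkened and the stain whitened.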

Once all of this is done, a much smaller range of colors can be used to represent the page. So 24 bit color can be reduced to 8 bit color (or less), or grayscale. This allows the finished result to be stored in a much smaller file while looking better than the original.

Low color depth trick

If you are working with text and want the end result to look like clean black text on white paper, there is a trick you can do using low color depth. You start with a substantially higher resolution than what is needed for the result, say 800 to 1200 dpi, and 24 bit color. Use the color information to remove artifacts, improve contrast, etc. until it is as good as you can get it. Then convert the image to 1 bit color (black and white).

This will force the cleaned image to black on white while the high resolution will capture fine detail in the content. Then down-sample to the desired resolution (typically 200 to 300 dpi). Down sampling will convert the file to grayscale or 24 bit color. If this is not automatic, select grayscale as the output.

This will have a similar effect to ClearType (sub-pixel rendering). Detail that would have been totally lost scanning at high contrast and normal resolution will be preserved in a few bits of grayscale. The file can be saved in something like a 4 bit grayscale, which will be a very small file with high quality results.
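The sequence of threshold, down-sample, and reduce to 4 bits can be sketched as follows. This is a simplified illustration that assumes plain block averaging for the down-sampling step (image editors typically offer several resampling methods) and image dimensions divisible by the factor; the pixel values are hypothetical.

```python
def threshold(image, cutoff=128):
    """Convert rows of 8-bit gray values to 1-bit rows: 0 = black, 1 = white."""
    return [[1 if p >= cutoff else 0 for p in row] for row in image]

def downsample(bitmap, factor):
    """Average each factor x factor block of 1-bit pixels into one 8-bit gray pixel."""
    out = []
    for y in range(0, len(bitmap), factor):
        row = []
        for x in range(0, len(bitmap[0]), factor):
            block = [bitmap[y + dy][x + dx]
                     for dy in range(factor) for dx in range(factor)]
            row.append(round(255 * sum(block) / len(block)))
        out.append(row)
    return out

def to_4bit(gray):
    """Quantize 8-bit gray values to 16 levels (0..15)."""
    return [[round(p * 15 / 255) for p in row] for row in gray]

# Hypothetical 4x4 high-resolution patch: the edge of a pen stroke crosses it diagonally.
patch = [
    [ 20,  30, 200, 230],
    [ 25, 190, 220, 235],
    [ 40, 210, 225, 240],
    [205, 215, 230, 240],
]
one_bit = threshold(patch)
print(downsample(one_bit, 2))            # 2x reduction, 8-bit gray
print(to_4bit(downsample(one_bit, 4)))   # 4x reduction, 4-bit gray
```

The intermediate gray values record how much of each output pixel was covered by ink, which is the detail a direct low-resolution black-and-white scan would have lost.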

One more factor: brightness is recorded linearly, but that is not how we perceive it. A 48-bit scan preserves detail in the darkest areas, where it would be lost in a 24-bit scan. This is relevant only if you are going to edit. Loren Pechtel 9 years ago 1
7
JakeGould

I want to suck out the highest quality from that scanner, but is there a point in scanning white paper with black ink at such a high color depth such as the 48-bit that is the max on my scanner?

The shorter answer about color depth, DPI and document scanning:

In short? There is 100% no valid reason to scan white paper with black ink at such a high color depth. 48-bit color depth is mainly used for high resolution photos or color documents and not text. The reason why many scanners offer 48-bit color depth is simply because they can nowadays. But unless you are processing or outputting an image that needs 16 bits of color data per channel, it’s overkill at best. 24-bit color depth is more than adequate for normal usage and—in fact—I am fairly confident you are reading this text on some display that is operating at a 24-bit color depth; 8 bits per RGB channel = 24-bit.

The longer answer about scanning documents:

As far as document scanning goes, color depth is exactly that: depth of color. DPI (dots per inch) is a whole other metric and that is what you should pay attention to.

I scan in tons of documents manually, and I treat the job as a multi-step process. These are the basic steps I use for black and white documents without color images:

  1. Initial Scan: I consider the initial scan of a document to be just the first, raw scan which is done to get the image into a digital format. I typically do this at 200-300 DPI and at standard RGB bit depth, which I believe is 24-bit. Not at a 48-bit color depth, which is truly overkill for the task of scanning in simple two-color documents.
  2. Scan Processing: After I scan in my pages I process them in Photoshop or Pixelmator. The goal when I do this is to adjust the image’s contrast so the white areas are really white and the black text/lines are really black and not grey.
  3. Convert to Grayscale: You can mostly do this during the whole scan processing phase, but I still consider it a separate stage. In Pixelmator you can convert the RGB to grayscale which will lower the final filesize tenfold. So a 40MB RGB scan now drops down to 4MB with little to no noticeable quality loss.
  4. Save Images: You’re choosing TIFF in your question, but after doing this type of production work for years, I can attest that saving that scanned TIFF image as a JPEG at 100% quality retains a similar overall quality and any “data loss” might technically be “real,” but visually imperceptible.
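To get a feel for how bit depth drives file size, the uncompressed pixel data can be estimated directly. This sketch assumes a US Letter page (8.5 x 11 inches) at 300 DPI, which is not stated in the answer, and ignores file headers and compression.

```python
def uncompressed_bytes(width_in, height_in, dpi, bits_per_pixel):
    """Raw pixel data size for a page, ignoring file headers and compression."""
    width_px = round(width_in * dpi)
    height_px = round(height_in * dpi)
    return width_px * height_px * bits_per_pixel // 8

for label, bpp in [("48-bit color", 48), ("24-bit color", 24),
                   ("8-bit gray", 8), ("1-bit", 1)]:
    size = uncompressed_bytes(8.5, 11, 300, bpp)
    print(f"{label:13s} {size / 1_000_000:6.1f} MB")
```

Uncompressed, going from 24-bit color to 8-bit grayscale cuts the pixel data to a third; any further reduction would come from compression, which this sketch does not model.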

Now if somehow the pages you have contain images the scan processing task—item 2 in the list—might need some more work. In decent photo editing programs you can select—via drawing boxes basically—only the text you want to adjust and then adjust the embedded images separately if need be.

From my workflow perspective if I have a 20 page document and 3 pages have images I need to adjust this way, I punch through the 17 other black and white pages first and then leave the more complicated things for last.

5
Psycogeek

On the other hand, there are situations where the scan itself at higher bits could be useful; e.g., when there is very little available contrast or range on the scanned materials.

In my experience, the usual consumer scanners do not have any real adjustments in the hardware, for adjusting color, contrast, sensitivity, or lighting. All the adjustments are done in software with the raw data that was acquired from the scan (DPI not included).

If the documents or pictures have really subtle differences, and major correction needs to be applied, starting with more bits (even bits you cannot see on the monitor) means you can do a lot of range “expanding” of the collected data.

If your pictures are way too dark, if your text on the page is washed out, if you have to use levels, contrast or brightness in huge amounts in software after the fact, then having more variations to begin with comes in handy. This can even include scanning in color for a B&W document.

While I agree with what is said in the other answers, my scanner (at least mine) does a "stock", "one size fits all" scan and lighting pass and depends heavily on software afterwards. So when I am working with really bad material, I switch to the higher bit depth (16 or 48) for the initial scan. I then feed that directly, without adjustments, into the photo program for processing instead of using the scanner software, and save out at normal bit depth after all the processing in the photo program is completed.

If you're really desperate to get something out of nothing, and have to do a lot of adjusting to achieve it, then worry about having way more possible levels to work with. For the normal items that utilize at least 50% of the data range possible, it is not necessary.

One last note: even after doing my best, manually tweaking it, having massive bit depth, it does not turn a pig’s ear into a silk purse.

1
phyrfox

As you reduce the bit depth, you'll begin to introduce artifacts as the system has to approximate color values. This will become noticeable at very low bit depths, such as 1- or 2-bit scans. For example, a 1-bit scan means every pixel must be pure black or pure white. A large crease in the paper might render as a black line across the image, and even the fibers in the paper could cause flecks of black to appear that are not visible to casual inspection.

In fact, that was an anti-copy device that was used on paper for checks, in the form of just barely noticeable striped lines that spelled VOID, which would appear as a solid black set of letters on low quality scans, such as original Xerox machines, due to aliasing. The moral here is that you want to stay away from the very low settings to avoid aliasing.

On the other hand, a 48-bit reproduction of the text is probably overkill. There's a point where throwing more sampling at something just doesn't have the same cost benefits. Dropping from 48-bit to 24-bit, for example, will probably be indistinguishable to casual observation, but will reduce the file size by half, and reduce the scan time per page significantly. I would recommend starting at 16-bit or 24-bit, and increasing the sampling only if you're not satisfied with a medium quality scan.

And does bit depth also matter where there is only a white sheet of paper and black ink? 9 years ago 0
Paper is almost never "pure black on pure white". You want a decent sampling level to avoid aliasing that causes artifacts. If you have ever looked at a page that was sent by fax, even a simple printout, you will see how the low scan settings used to save space produce dark spots and lines that were not visible on the original. Paper is made of fibers that vary slightly in color and thickness, and ink usually consists of dots that happen to sit very close together. phyrfox 9 years ago 1
@phyrfox "You want a decent sampling level to avoid aliasing that causes artifacts." Yes, but you are talking about DPI, and that has nothing to do with color depth. Scanning at a higher resolution and then down-sampling is a good strategy. Color depth plays 100% no part in what you are describing. And even if it did, the default color depth of scanners nowadays is 24-bit, which is full color anyway. JakeGould 9 years ago 0
