software:unpaper_test
Differences
This shows you the differences between two versions of the page.
software:unpaper_test [2009/05/14 01:13] – admin | software:unpaper_test [2009/08/22 06:45] – admin | ||
---|---|---|---|
Line 1: | Line 1: | ||
- | ====== Unpaper test ===== | + | ====== Unpaper test ====== |
In order to better understand the large number of options in unpaper, I've done several tests that show the differences between settings.\\ I scanned a lot of sheet music and processed the scans manually with another application to remove skew. All actions were done in 8-bit gray at a resolution of 300 dpi.\\ These tests should show which unpaper settings give a decent automatic conversion to black and white with despeckling and noise removal. | In order to better understand the large number of options in unpaper, I've done several tests that show the differences between settings.\\ I scanned a lot of sheet music and processed the scans manually with another application to remove skew. All actions were done in 8-bit gray at a resolution of 300 dpi.\\ These tests should show which unpaper settings give a decent automatic conversion to black and white with despeckling and noise removal. | ||
Line 83: | Line 83: | ||
===== Scripting with unpaper ===== | ===== Scripting with unpaper ===== | ||
+ | ==== gray2black ==== | ||
The following script automatically converts PDF files containing gray images to black-and-white images, using unpaper's noise-reduction and clean-up features: | The following script automatically converts PDF files containing gray images to black-and-white images, using unpaper's noise-reduction and clean-up features: | ||
<code bash> | <code bash> | ||
Line 101: | Line 102: | ||
# | # | ||
# Example: | # Example: | ||
- | # Process gray images in mozart.pdf: gray2black mozart.pdf -p 4 test.pdf | + | # Process gray images in mozart.pdf: gray2black mozart.pdf -b 0.23 test.pdf |
# | # | ||
# This script uses imagemagick to convert and center an image on an | # This script uses imagemagick to convert and center an image on an | ||
Line 132: | Line 133: | ||
# Only resize if absolutely necessary. Values for resizing are for left, right, top and bottom border. | # Only resize if absolutely necessary. Values for resizing are for left, right, top and bottom border. | ||
- | resize=0 | ||
hborder=33 | hborder=33 | ||
vborder=33 | vborder=33 | ||
lhorpix=$horpix | lhorpix=$horpix | ||
lverpix=$verpix | lverpix=$verpix | ||
+ | |||
+ | # Store input-file name temporarily, | ||
+ | arg1=$1 | ||
# Find number of parameters passed | # Find number of parameters passed | ||
Line 143: | Line 146: | ||
exit 1 | exit 1 | ||
fi | fi | ||
- | |||
- | # Store input-file name temporarily, | ||
- | arg1=$1 | ||
# Check OPTIONS | # Check OPTIONS | ||
Line 180: | Line 180: | ||
# Check whether the document needs resizing. Iterating through all pages makes sure that all pages get uniform resizing later. | # Check whether the document needs resizing. Iterating through all pages makes sure that all pages get uniform resizing later. | ||
- | for file in $(ls --sort=time -r / | + | resize=0 |
- | dimension=" | + | # Iterate through files. We need to sort files in a numerical order because pdftoppm generates files without a leading zero. |
+ | # However before the numbers, a ' | ||
+ | for filenum | ||
+ | dimension=" | ||
# width: " | # width: " | ||
# height: " | # height: " | ||
Line 193: | Line 196: | ||
fi | fi | ||
done | done | ||
- | |||
# Resize document if necessary | # Resize document if necessary | ||
if [ $resize -eq 1 ]; then | if [ $resize -eq 1 ]; then | ||
Line 213: | Line 215: | ||
echo Some images exceed the maximum size. Now scaling to " | echo Some images exceed the maximum size. Now scaling to " | ||
# iterate through all files and resize all with the same percentage, use 8 bits per pixel | # iterate through all files and resize all with the same percentage, use 8 bits per pixel | ||
- | for | + | for |
- | mogrify -depth 8 -resize " | + | mogrify -depth 8 -resize " |
- | echo resizing file: "$file" | + | echo resizing file: "/tmp/$ptmpf-$filenum" |
done | done | ||
fi | fi | ||
# apply unpaper onto each page | # apply unpaper onto each page | ||
- | for file in $(ls --sort=time -r / | + | for filenum |
- | # Parameter expansion: ${param%word} From the end removes | + | # Parameter expansion: ${param%word} From the end removes |
- | unpaper -b $b_threshold $filter $donot -t pbm $file ${file%.pgm}.pbm | + | unpaper -b $b_threshold $filter $donot -t pbm "/tmp/$ptmpf-$filenum" |
- | rm "$file" | + | rm "/ |
done | done | ||
# center pbm page on an a4 canvas, convert to pdf | # center pbm page on an a4 canvas, convert to pdf | ||
pdflst="" | pdflst="" | ||
- | for file in $(ls --sort=time -r / | + | for filenum |
- | convert -size $pdim xc:white miff:- | composite -density $pres -units PixelsPerInch -compose atop -gravity Center "$file" - miff:- | convert -monochrome -density 300 -units PixelsPerInch - ps:- | ps2pdf13 -sPAPERSIZE=a4 - "${file%.pbm}.pdf" | + | convert -size $pdim xc:white miff:- | composite -density $pres -units PixelsPerInch -compose atop -gravity Center "/ |
- | rm "$file" | + | rm "/ |
- | pdflst=" | + | pdflst=" |
done | done | ||
Line 241: | Line 243: | ||
</ | </ | ||
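The revised loops iterate over page numbers instead of relying on file order, and the script derives output names with `${param%word}`. The following is a minimal sketch of both ideas, not the script's actual code: the `page` prefix, the temporary directory and the three page numbers are hypothetical, and the `prefix-N.pgm` naming pattern is an assumption based on the comment about missing leading zeros.

```bash
#!/usr/bin/env bash
# Sketch only: the file names are hypothetical stand-ins for pdftoppm output.

workdir=$(mktemp -d)
prefix="page"   # hypothetical prefix; the original script uses its own temp name

# Create empty dummy pages named without leading zeros: page-1, page-2, page-10
for n in 1 2 10; do
    : > "$workdir/$prefix-$n.pgm"
done

# A plain glob expands lexically (1, 10, 2). Strip prefix and suffix so that
# only the page numbers remain, then sort those numerically.
nums=$(cd "$workdir" && printf '%s\n' "$prefix"-*.pgm \
        | sed "s/^$prefix-//; s/\.pgm\$//" | sort -n)

ordered=""
for num in $nums; do
    file="$workdir/$prefix-$num.pgm"
    # ${file%.pgm} removes the shortest match of ".pgm" from the end of $file
    target="${file%.pgm}.pbm"
    ordered="$ordered $(basename "$target")"
done

echo "processing order:$ordered"
rm -r "$workdir"
```

Sorting the extracted numbers with `sort -n` keeps page 10 after page 2, which matters because the pages are later joined into a single PDF in loop order.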
+ | ==== findblack ==== | ||
Find a suitable value for the black threshold with the following script: | Find a suitable value for the black threshold with the following script: | ||
<code bash> | <code bash> |
software/unpaper_test.txt · Last modified: 2015/04/22 21:51 by admin