This shows you the differences between two versions of the page.
software:pdfprocessing:makepdf [2011/08/22 22:27] – created admin | software:pdfprocessing:makepdf [2012/05/31 05:55] (current) – [Source code] admin | ||
---|---|---|---|
Line 11: | Line 11: | ||
Once all files have been downloaded and installed, finally download: | Once all files have been downloaded and installed, finally download: | ||
- | * The application MakePDF: [[http:// | + | * The latest |
- | * crc32 code: 0x97b86e76 | + | * crc32: |
+ | * md5: 0x040362f61d6c37c369370103727af90d | ||
===== Usage ===== | ===== Usage ===== | ||
Either use drag and drop with your pdf or tiff file(s) onto the exe, or add the file(s) as argument.\\ \\ Nb. It is possible to run several instances of the same exe file. There will be no conflicts between them. With this you are able to process several files simultaneously, | Either use drag and drop with your pdf or tiff file(s) onto the exe, or add the file(s) as argument.\\ \\ Nb. It is possible to run several instances of the same exe file. There will be no conflicts between them. With this you are able to process several files simultaneously, | ||
Line 20: | Line 21: | ||
  * To automatically add empty pages at the end of the document when the number of pages is not a multiple of 4, add **-q** to your program name. For example, if your exe file is called MakePDF -r600.exe, rename it to MakePDF -r600 -q.exe | * To automatically add empty pages at the end of the document when the number of pages is not a multiple of 4, add **-q** to your program name. For example, if your exe file is called MakePDF -r600.exe, rename it to MakePDF -r600 -q.exe | ||
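As an illustration of the **-q** padding rule, the number of blank pages needed to reach the next multiple of 4 can be computed as in the sketch below. The helper name ''_BlankPagesNeeded'' is hypothetical and is not taken from the MakePDF source; the program's own routine may work differently.

```autoit
; Hypothetical helper, NOT part of MakePDF: returns how many blank pages
; must be appended so that the page count becomes a multiple of 4.
Func _BlankPagesNeeded(Const $iPages)
	; Mod($iPages, 4) is the number of pages beyond the last full group of 4.
	; The outer Mod maps a full group (4 - 0 = 4) back to 0 blank pages.
	Return Mod(4 - Mod($iPages, 4), 4)
EndFunc

ConsoleWrite(_BlankPagesNeeded(10) & @CRLF) ; 10 pages: 2 blank pages give 12
ConsoleWrite(_BlankPagesNeeded(12) & @CRLF) ; 12 pages: already a multiple of 4, so 0
```

A 10 page document would therefore grow to 12 pages, while a 12 page document stays unchanged.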
+ | ==== Windows command line limitation workaround ==== | ||
+ | When selecting a large number of tif files, Windows may throw the following error: " | ||
+ | This error appears due to limitations in Microsoft Windows (2000, XP and later), which can handle only a limited number of characters on the command line.\\ \\ In order to convert a large number of files, select only the first and the last file of the range instead. The range should then be recognized automatically. | ||
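Range recognition relies on the page number embedded in each filename (NAME_nnn.tif, or NAME_nnn_1L.tif / NAME_nnn_2R.tif for ScanTailor split pages). The sketch below shows one way such a number could be extracted and turned into a sort key. The function name ''_PageKey'' and its regular expressions are assumptions made for illustration and are not copied from the MakePDF source; only the "multiply by 2, add 0 or 1" rule for 1L/2R follows the comments in the source code.

```autoit
; Hypothetical sketch, NOT MakePDF's actual code: derive a numeric sort key
; from a ScanTailor style filename. Returns -1 if the name does not match.
Func _PageKey(Const $sFile)
	; NAME_nnn_1L.tif / NAME_nnn_2R.tif: left and right half of a split page
	Local $aMatch = StringRegExp($sFile, '(?i)_(\d+)_(1L|2R)\.tif$', 1)
	If Not @error Then
		Local $iSide = 0 ; 1L keeps the even key, 2R gets the odd key
		If StringUpper($aMatch[1]) == "2R" Then $iSide = 1
		Return Number($aMatch[0]) * 2 + $iSide
	EndIf
	; NAME_nnn.tif: plain page number
	$aMatch = StringRegExp($sFile, '(?i)_(\d+)\.tif$', 1)
	If Not @error Then Return Number($aMatch[0])
	Return -1
EndFunc

ConsoleWrite(_PageKey("Cb_2.tif") & @CRLF)    ; 2
ConsoleWrite(_PageKey("Cb_10.tif") & @CRLF)   ; 10
ConsoleWrite(_PageKey("Cb_3_2R.tif") & @CRLF) ; 3 * 2 + 1 = 7
```

Sorting on this key places Cb_2.tif before Cb_10.tif, whereas a plain alphabetical sort would put Cb_10.tif first. That is why filenames without leading zeros need sequential sorting.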
===== History ===== | ===== History ===== | ||
+ | * Changes in v0.8e, 30th of May 2012 | ||
+ | * Cleaned up partly messy code. | ||
+ | * Added sequential file sorting, as opposed to alphabetical file sorting, which is useful for ranges of tif files whose filenames have no leading zeros. If a range of tif files is detected, sequential file sorting is enabled automatically. | ||
+ | * Workaround for the command line length limitation: if there are too many files to pass as arguments, drag and drop only the first and last file of a range of tif files. If the software recognizes a sequence between the selected files (including a possible ScanTailor 1L/2R format), it will take that sequence. There may be gaps in the sequence, as long as the base filenames match each other. Filenames should be in either the format NAME_nnn.tif or NAME_nnn_(1L|2R).tif, where nnn can be any number and NAME any name. | ||
+ | * A fix for pages which are portrait but turn into landscape after processing is still pending. | ||
* Changes in v0.8c, 22nd of August 2011 | * Changes in v0.8c, 22nd of August 2011 | ||
* Changed program name from FlattenANDCompactPDF_A4Size into the more descriptive MakePDF | * Changed program name from FlattenANDCompactPDF_A4Size into the more descriptive MakePDF | ||
Line 36: | Line 45: | ||
===== Source code ===== | ===== Source code ===== | ||
Please note: the code is provided as is. No warranty or support is given. | Please note: the code is provided as is. No warranty or support is given. | ||
- | < | + | < |
; Alternatively pdf files with multiple layers can be supplied, which will be | ; Alternatively pdf files with multiple layers can be supplied, which will be | ||
; converted to pdf files with a single layer and black and white color. | ; converted to pdf files with a single layer and black and white color. | ||
; Output size will be exactly a4 page(s) | ; Output size will be exactly a4 page(s) | ||
- | ; Copyright (C) Februari 2010, July & August 2011 by Marc Nijdam | + | ; Copyright (C) Februari 2010, July & August 2011, May 2012 by Marc Nijdam |
; | ; | ||
; This program is free software: you can redistribute it and/or modify | ; This program is free software: you can redistribute it and/or modify | ||
Line 69: | Line 78: | ||
#Include < | #Include < | ||
- | Global Const $ThisProgramVersion = "0.8d" ; Current release of this software | + | Global Const $ThisProgramVersion = "0.8e" ; Current release of this software |
- | Global Const $ThisProgramDate = "07-08-2011" ; Release date | + | Global Const $ThisProgramDate = "27-05-2012" ; Release date |
Global Const $ThisProgramName = " | Global Const $ThisProgramName = " | ||
Line 105: | Line 114: | ||
; Error constants | ; Error constants | ||
Global Const $_ERROR_MissingImageMagickObject = -1 | Global Const $_ERROR_MissingImageMagickObject = -1 | ||
- | Global Const $_ERROR_PdfInputFileIsNotValid | + | Global Const $_ERROR_InputFileIsNotValid |
Global Const $_ERROR_ShowHelp = -3 | Global Const $_ERROR_ShowHelp = -3 | ||
Global Const $_ERROR_PdfImageSizeTooLarge = -4 | Global Const $_ERROR_PdfImageSizeTooLarge = -4 | ||
Line 143: | Line 152: | ||
Global Const $QQ = '"' | Global Const $QQ = '"' | ||
Dim $szDrive, $szDir, $szFName, $szExt | Dim $szDrive, $szDir, $szFName, $szExt | ||
- | Dim $i , $j, $k | + | Dim $i, $j, $k |
Dim $Pdf_ListOfFiles2Proces = $tempname & " | Dim $Pdf_ListOfFiles2Proces = $tempname & " | ||
Dim $Pdf_tmp_FileName = $tempname & " | Dim $Pdf_tmp_FileName = $tempname & " | ||
Line 156: | Line 165: | ||
$nEdit = CreateGUI($ThisProgramName) ; Create window with info | $nEdit = CreateGUI($ThisProgramName) ; Create window with info | ||
Dim $TotalPages ; Counted pdf pages for each document | Dim $TotalPages ; Counted pdf pages for each document | ||
- | Dim $tmp | ||
Dim $GetFileName | Dim $GetFileName | ||
Dim $LandscapePages[1] ; Array which holds per page landscape or portrait rotation | Dim $LandscapePages[1] ; Array which holds per page landscape or portrait rotation | ||
Dim $PageFitsExact[1] ; Array which holds per page whether it is exactly a4 dimension | Dim $PageFitsExact[1] ; Array which holds per page whether it is exactly a4 dimension | ||
- | Dim $FileListArray = SortIfArrayContainsOnlyTiff(GetFileListArray($CmdLine[0])) | + | |
- | Const $isPDF = (GetExtension($FileListArray[1]) == $pdf) ; check if first (and thus all the rest as well) file is pdf file. True means all the supplied files are pdf. | + | Dim $FileListArray = GetFileListArray($CmdLine[0]) |
- | ; differentiate | + | ; Sort the supplied array of files. If it contains only tiff files and conforms to the ScanTailor pattern, sort sequentially, |
+ | ; otherwise alphabetically. Also check if first (and thus all the rest as well) file is pdf file. True | ||
+ | ; means all the supplied files are pdf. Differentiate | ||
+ | ; same code can't be simply used for either tiff or pdf files, since it contains many optimizations | ||
+ | ; specifically for pdf. | ||
+ | Const $IsPDF = (AnalyzeAndSortArray($FileListArray) == $PDF) | ||
If $isPDF Then | If $isPDF Then | ||
- | For $i = 1 to UBound($FileListArray)-1 | + | For $i = 0 to UBound($FileListArray)-1 |
$GetFileName = $FileListArray[$i] | $GetFileName = $FileListArray[$i] | ||
- | GUICtrlSetData($nEdit, | + | GUICtrlSetData($nEdit, |
GUICtrlSetData($nEdit, | GUICtrlSetData($nEdit, | ||
GUICtrlSetData($nEdit, | GUICtrlSetData($nEdit, | ||
Line 179: | Line 192: | ||
Else ; process below tiff files | Else ; process below tiff files | ||
GUICtrlSetData($nEdit, | GUICtrlSetData($nEdit, | ||
- | $TotalPages = UBound($FileListArray)-1 ; number of supplied tif files. | + | $TotalPages = UBound($FileListArray) ; number of supplied tif files. |
GUICtrlSetData($nEdit, | GUICtrlSetData($nEdit, | ||
ProcessPages($TotalPages, | ProcessPages($TotalPages, | ||
Line 193: | Line 206: | ||
; number of supplied files and check if this program is invoked through Scite. | ; number of supplied files and check if this program is invoked through Scite. | ||
Func GetFileListArray(Const $count) | Func GetFileListArray(Const $count) | ||
- | If @Compiled <> 0 Then | + | If (@Compiled <> 0) Then |
- | If $count == 0 Then ShowError($_ERROR_ShowHelp) ; number of parameters | + | If ($count == 0) Then |
- | Return | + | ShowError($_ERROR_ShowHelp) ; number of parameters |
+ | EndIf | ||
+ | Local $tmp = $CmdLine; Array for temporary storing a copy of $CmdLine | ||
+ | _ArrayDelete($tmp, | ||
+ | Return $tmp | ||
Else ; When invoking this program from within the Scite compiler, use some preset variables for which file to use | Else ; When invoking this program from within the Scite compiler, use some preset variables for which file to use | ||
Local $dbg=$tif ; Choose either $tif or $pdf | Local $dbg=$tif ; Choose either $tif or $pdf | ||
Line 213: | Line 230: | ||
EndIf | EndIf | ||
Return $FileNames ; return amount of files. | Return $FileNames ; return amount of files. | ||
+ | EndIf | ||
+ | EndFunc | ||
+ | |||
+ | ; get a list of files from a given directory | ||
+ | Func getFileList(ByRef $path) | ||
+ | Local $search = FileFindFirstFile($path & " | ||
+ | If ($search > -1) Then | ||
+ | Local $fileList[1] | ||
+ | Local $foundObject | ||
+ | Do | ||
+ | $foundObject = FileFindNextFile($search) | ||
+ | If @error Then ExitLoop | ||
+ | If (not isDirectory($path & " | ||
+ | Until False | ||
+ | FileClose($search) | ||
+ | _ArrayDelete($fileList, | ||
+ | sortSequential($fileList) ; files are sorted from low to high | ||
+ | return $fileList | ||
+ | EndIf | ||
+ | ShowError(" | ||
+ | EndFunc | ||
+ | |||
+ | ; Select from $fileList all files that lie sequentially | ||
+ | ; between $this[0] and $this[1] (both included) | ||
+ | ; and whose base part matches the base part of $this[0] | ||
+ | Func selectFilesFromList(ByRef $this, ByRef $fileList) | ||
+ | Local $inRange = False ; If true, then capture the files which are in sequence. | ||
+ | Local $tmp[1] ; Empty array with one element | ||
+ | Local $i = 0 | ||
+ | Do | ||
+ | If (TifFilenameBasePart(RemovePath($this[0])) == TifFilenameBasePart($fileList[$i])) Then | ||
+ | If ($fileList[$i] == RemovePath($this[0])) Then ; start of range | ||
+ | $inRange = True | ||
+ | EndIf | ||
+ | If ($inRange) Then | ||
+ | _ArrayAdd($tmp, | ||
+ | EndIf | ||
+ | If ($fileList[$i] == RemovePath($this[1])) Then ; end of range | ||
+ | $i = Ubound($fileList) | ||
+ | EndIf | ||
+ | EndIf | ||
+ | $i = $i + 1 | ||
+ | Until ($i >= UBound($fileList)) | ||
+ | If (UBound($tmp)> | ||
+ | _ArrayDelete($tmp, | ||
+ | return $tmp | ||
+ | Else | ||
+ | ShowError(" | ||
+ | exit | ||
EndIf | EndIf | ||
EndFunc | EndFunc | ||
Line 374: | Line 440: | ||
Else | Else | ||
GUICtrlSetData($nEdit, | GUICtrlSetData($nEdit, | ||
- | $GetFileName = $FileListArray[$j] | + | $GetFileName = $FileListArray[$j |
$FilePagesProperties = IM_Get_Dims($GetFileName) ; get dimensions from tiff file | $FilePagesProperties = IM_Get_Dims($GetFileName) ; get dimensions from tiff file | ||
FileCopy($GetFileName, | FileCopy($GetFileName, | ||
Line 416: | Line 482: | ||
Next | Next | ||
EndIf | EndIf | ||
+ | |||
If (($q) AND ($TotalPages > 2) AND (CalcEmptyPages($TotalPages)> | If (($q) AND ($TotalPages > 2) AND (CalcEmptyPages($TotalPages)> | ||
Line 431: | Line 498: | ||
FileCopy($Pdf_tmp_FileName, | FileCopy($Pdf_tmp_FileName, | ||
Else | Else | ||
- | FileCopy($Pdf_tmp_FileName, | + | FileCopy($Pdf_tmp_FileName, |
EndIf | EndIf | ||
FileDelete($Pdf_tmp_FileName) | FileDelete($Pdf_tmp_FileName) | ||
Line 710: | Line 777: | ||
_PathSplit(RemoveDoubleQuotes($OutputFile), | _PathSplit(RemoveDoubleQuotes($OutputFile), | ||
Return $szFName & $szExt | Return $szFName & $szExt | ||
+ | Else | ||
+ | ShowError(" | ||
+ | EndIf | ||
+ | EndFunc | ||
+ | |||
+ | ; Remove filename from path | ||
+ | Func getPath(Const $OutputFile) | ||
+ | If FileExists(RemoveDoubleQuotes($OutputFile)) Then | ||
+ | Local $szDrive, $szDir, $szFName, $szExt | ||
+ | _PathSplit(RemoveDoubleQuotes($OutputFile), | ||
+ | Return $szDrive & $szDir | ||
Else | Else | ||
ShowError(" | ShowError(" | ||
Line 810: | Line 888: | ||
; Check if first array item is a tiff file, if yes, check all other files for validity as well. If all are tiff files, sort them. | ; Check if first array item is a tiff file, if yes, check all other files for validity as well. If all are tiff files, sort them. | ||
; Sort tif files only on filename, without path | ; Sort tif files only on filename, without path | ||
- | Func SortIfArrayContainsOnlyTiff(Const $FileListArray) | + | ; ---> ByRef Array[] ; filenames including full path, unsorted and unchecked for filetype |
- | Local $testfull | + | ; <--- FileExtension ; Extension of series of files |
- | If $testfull | + | Func AnalyzeAndSortArray(ByRef $this) |
- | if IM_Get_Type($FileListArray[1]) == $pdf Then Return | + | Local $FilesArePDF |
+ | Local $FilesAreTIF = True ; True means all extensions are tif | ||
+ | Local $i = 0 ; index variable going through array elements. | ||
+ | |||
+ | Do ; Check all file extensions. A series of files should be either tif or pdf | ||
+ | If ($FilesAreTIF == True) Then | ||
+ | If (StringRegExp(StringLower($this[$i]),' | ||
+ | $FilesAreTIF = False | ||
+ | EndIf | ||
+ | EndIf | ||
+ | If ($FilesArePDF == True) Then | ||
+ | If (StringRegExp(StringLower($this[$i]),' | ||
+ | $FilesArePDF = False | ||
+ | EndIf | ||
+ | EndIf | ||
+ | $i = $i + 1 | ||
+ | Until ($i > UBound($this)-1) ; The file type is tested only by checking the extension | ||
+ | If ($FilesAreTIF) | ||
+ | If (UBound($this) == 2) Then | ||
+ | If (CompareTwoFiles($this)) | ||
+ | sortSequential($this) ; file with lowest index is first element | ||
+ | $this = selectFilesFromList($this, | ||
+ | GUICtrlSetData($nEdit, | ||
+ | return $TIF | ||
+ | EndIf | ||
+ | EndIf | ||
+ | |||
+ | If (AreFilesSequential($this)) Then | ||
+ | sortSequential($this) | ||
+ | GUICtrlSetData($nEdit, | ||
+ | Else | ||
+ | _ArraySort($this) | ||
+ | EndIf | ||
+ | Return $TIF | ||
+ | ElseIf ($FilesArePDF) Then | ||
+ | _ArraySort($this) | ||
+ | Return $PDF | ||
Else | Else | ||
- | if StringLeft(GetExtension($FileListArray[1]),4) == $pdf Then Return $FileListArray | + | ShowError(" |
+ | Return -1 | ||
+ | EndIf | ||
+ | EndFunc | ||
+ | |||
+ | ; Verify | ||
+ | Func AreFilesSequential(ByRef $this) | ||
+ | Local $i = 0 | ||
+ | Local $FileCriteria = True ; Criteria are true if: valid scantailor output filename format: ' | ||
+ | if (UBound($this) <= 1) Then | ||
+ | Return | ||
+ | Else | ||
+ | Local | ||
+ | Do | ||
+ | $pnumList[$i] = find_tif_PgNr($this[$i]) | ||
+ | If ($pnumList[$i] == -1) Then ; Check if fileformat criteria are valid for sorting sequentially | ||
+ | $FileCriteria = False | ||
+ | EndIf | ||
+ | $i = $i + 1 | ||
+ | Until (($i > UBound($this)-1) Or ($FileCriteria == False)) ; stop after last item or if wrong filename appears | ||
EndIf | EndIf | ||
- | Local $i = 1 | + | If ($FileCriteria) Then |
- | Local $ft | + | _Arraysort($pnumList) ; Check if all page numbers are really unique |
- | Local $temp1[UBound($FileListArray)-1][2] | + | Local $isDifferent = True |
- | Local $temp2[UBound($FileListArray)] | + | Do |
- | If $testfull | + | If ($pnumList[$i-1] == $pnumList[$i-2]) Then |
+ | $isDifferent = False | ||
+ | EndIf | ||
+ | $i = $i - 1 | ||
+ | Until (($i <= 1) Or ($isDifferent == False)) | ||
+ | Return $isDifferent ; true if all pagenumbers are unique and false if there are two pages with the same number | ||
+ | Else | ||
+ | Return False | ||
+ | EndIf | ||
+ | EndFunc | ||
+ | |||
+ | ; Analyze path, return True if it is a directory | ||
+ | Func isDirectory($this) | ||
+ | If (StringInStr(FileGetAttrib($this)," | ||
+ | | ||
+ | EndFunc | ||
+ | |||
+ | ; Function to extract the number which comes after the underscore | ||
+ | ; in a string and is immediately followed by a dot and the tif filetype. | ||
+ | ; Filenames with a trailing _1L or _2R do not fit this pattern. | ||
+ | ; Solution: if a filename with a trailing _1L or _2R is recognized, the page number is multiplied by 2, and 0 or 1 is added, depending on 1L or 2R. | ||
+ | ; " | ||
+ | Func find_tif_PgNr(Const $tmp) | ||
+ | If (StringRegExp(StringLower($tmp),' | ||
+ | Return Number(StringRegExpReplace($tmp, | ||
+ | ElseIf (StringRegExp(StringLower($tmp),' | ||
+ | Return Number(StringRegExpReplace($tmp, ' | ||
+ | Else | ||
+ | Return -1 | ||
+ | EndIf | ||
+ | EndFunc | ||
+ | |||
+ | ; This function will sort a one dimensional array which has | ||
+ | ; the following properties: | ||
+ | ; in a sequential manner. (In contrast to alphabetically) | ||
+ | ; It will return the same array. | ||
+ | Func sortSequential(ByRef $this) | ||
+ | Local $i ; Index for iterating through array elements | ||
+ | Local $SequentiallySorted | ||
Do | Do | ||
- | If $testfull Then | + | $SequentiallySorted = True |
- | GUICtrlSetData($nEdit, | + | For $i=0 to UBound($this)-2 |
- | $ft = IM_Get_Type($FileListArray[$i]) ; Comprehensive test | + | if (find_tif_PgNr($this[$i+1]) < find_tif_PgNr($this[$i])) Then |
- | GUICtrlSetData($nEdit, | + | _ArraySwap($this[$i+1], $this[$i]) |
- | Else | + | $SequentiallySorted=False |
- | $ft = StringLeft(GetExtension($FileListArray[$i]),4) ; Either select this line, or the following to check for tif validity. | + | EndIf |
- | EndIf | + | Next |
- | If $ft <> $tif Then ShowError("Not all files are tif files. Please supply either only pdf or only tif files." | + | Until |
- | $temp1[$i-1][0] = RemovePath($FileListArray[$i]) | + | EndFunc |
- | $temp1[$i-1][1] = $FileListArray[$i] | + | |
- | $i = $i + 1 | + | ; Compare an array with two elements, each containing a filename and verify if the non-changing part of the name is the same. |
- | Until | + | ; This is useful to check if two files are part of a sequence. |
- | _ArraySort($temp1) | + | Func CompareTwoFiles(ByRef $this) |
- | ; $temp2[0] = UBound($temp1) | + | Return (TifFilenameBasePart($this[0]) == TifFilenameBasePart($this[1])) |
- | For $i=1 to UBound($temp1) | + | EndFunc |
- | $temp2[$i] = $temp1[$i-1][1] | + | |
- | Next | + | ; Extract from a filename the non-changing part. For example: |
- | Return | + | ; Cb_1.tif ----> Cb_.tif |
+ | ; Cb_1_1L.tif ----> Cb__.tif | ||
+ | Func TifFilenameBasePart(ByRef | ||
+ | If (StringRegExp(StringLower($this),' | ||
+ | Return StringRegExpReplace($this, '(.*_)[\d]+(.*)', '$1$2') | ||
+ | ElseIf (StringRegExp(StringLower($this),' | ||
+ | Return StringRegExpReplace($this, '(.*_)[\d]+(_)(1L|2R)(.*)', | ||
+ | Else | ||
+ | Return | ||
+ | EndIf | ||
EndFunc | EndFunc | ||
Line 997: | Line 1177: | ||
; | ; | ||
; | ; | ||
- | Case $_ERROR_PdfInputFileIsNotValid | + | Case $_ERROR_InputFileIsNotValid |
- | MsgBox(0," | + | MsgBox(0," |
- | " | + | " |
Case $_ERROR_ShowHelp | Case $_ERROR_ShowHelp | ||
MsgBox(0," | MsgBox(0," | ||
Line 1015: | Line 1195: | ||
" | " | ||
" | " | ||
+ | " | ||
+ | " | ||
+ | " | ||
+ | " | ||
" | " | ||
Case $_ERROR_PdfImageSizeTooLarge | Case $_ERROR_PdfImageSizeTooLarge |