OL Learn

Merging pdfs based on name of pdf

Hi - is it possible to take one PDF from a folder and, based on its name, look in another folder and merge it with the matching PDF? For example, I have 500 PDFs in folder A and 1500 PDFs in folder B. I have a Folder Capture to pick up the PDF from folder A and then a script to get the name of the PDF. That is where I get stuck, because I am not sure how to set it up to look in folder B for that same name and then merge.

Thanks

Hi,

EDIT: This does not work. See my post below for a working solution.

How about a simple Workflow like the one pictured below? A Folder Capture grabs from A, followed by a Run Script for getting the file names in case they require scripting because A and B use different file naming conventions. (The Set Job Infos and Variables plugin was used in my test and sets %9 to the original filename without extension, aka %O. This assumes file names from A/B are identical. If not, use your script.)

Then branch out and send the file to the Merged folder using a File Name such as “Merged %9.pdf”. Next comes another Folder Capture that grabs files from folder B, gets the name using a Set Job Infos plugin as before, and concatenates the file with the one of the same name in the Merged folder, using the same File Name format as above. Be sure to enable Concatenate Files in the final Send to Folder plugin and remove the “\014” separator string.

Obviously in my simple sample I’m assuming file names are identical from folders A/B.

[Screenshot: the Workflow process]

(FYI, in my simple test, where I assume all files are named the same, I did not actually need the Set Job Infos plugins to get the name. %O would already have held it without setting %9 to that value.)

Regards,
S

Sharne,
Does a simple file concatenate work to merge PDFs? It didn’t use to in PP7, but I’m prepared to be impressed.

OK - I see where you are going with this. I set my workflow up like yours. The file names are similar rather than identical, so I have to use a script. No problem. When I got to the second Folder Capture, it kept looping through folder B and adding PDFs to the Merged folder. What I need is for it to merge/concatenate the two PDFs that match, then go back to folder A, grab another PDF and move it to the Merged folder, then go to folder B, find the matching PDF, move it to the Merged folder, merge/concatenate, and so on.

Thanks

Hi Stuart,

When using ready-made PDFs, yes, a concatenate works fine. IIRC it did when I used PP7 too.

Regards,
S

Hi ahaddad,

I simply tested it quickly in debug mode in Workflow. I can check it again in the morning, unless someone already has a solution. I’m sure there are a few ways this can be done.

EDIT: I know how to ‘fix’ it. Will get to it in the morning for you asap.

Regards,
S

ok - great.
Thank You !!

Hi ahaddad,

Looking at this I think it best if you supply some PDF names from folder A and B that you would like to merge. I need to see that before I can be sure I’m going in the right direction.

Regards,
S

Folder A would have PDFs named after plan numbers, which are always 6 digits. For example: 123456.pdf, 564789.pdf, 965482.pdf and so on. Folder B would have PDFs like 404a5_123456.pdf, 404a5_564789.pdf, 965482.pdf. Folder B would have more PDFs than folder A, so not every PDF in folder B would match one in folder A.
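As a minimal sketch of the naming rule those examples suggest (an assumption drawn only from the sample names above), the plan number can be taken as whatever follows the last underscore once the extension is removed. PlanNumberOf is a hypothetical helper name, not part of the Workflow scripting API.

```vbscript
' Hypothetical helper, not part of the Workflow API.
' Returns the plan number embedded in a PDF file name:
'   "123456.pdf"       -> "123456"
'   "404a5_123456.pdf" -> "123456"
Function PlanNumberOf(fileName)
    Dim baseName, dotPos, underscorePos

    baseName = fileName

    ' Drop the extension, if there is one
    dotPos = InStrRev(baseName, ".")
    If dotPos > 0 Then baseName = Left(baseName, dotPos - 1)

    ' Keep only what follows the last underscore (the whole name if none)
    underscorePos = InStrRev(baseName, "_")
    If underscorePos > 0 Then baseName = Mid(baseName, underscorePos + 1)

    PlanNumberOf = baseName
End Function
```

Two files would then be treated as a pair when PlanNumberOf returns the same value for both names.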

thanks

Hi,

Sorry for taking so long. Had a busy work day. I thought of doing it like this: put the files that will be added to in folder B first. After that, copy files to folder A; these will be picked up and concatenated to the matching files in folder B. (Folder A triggers the process to start.)

Test on a smaller batch first.

This is the process.
[Screenshot: the process]

Here is what the other plugins are set to.

Be sure to add a local variable called fileName to your process. And add this script.

dim objXMLDoc
dim NodeList
dim fileAName, fileBName

fileAName = watch.expandstring("%9")

'watch.log fileAName,2
Set objXMLDoc = CreateObject("Microsoft.XMLDOM")
'Load current workflow data into XML object
objXMLDoc.load(Watch.GetJobFilename)

'Grab XML nodes
Set NodeList = objXMLDoc.documentElement.selectNodes("folder/file")

'Loop through XML
for i = 0 to (NodeList.length-1)
    set ChildNodes = NodeList.item(i).childNodes

    fileBName = ChildNodes.item(0).text
    'watch.log "fileBName is " & fileBName,2
    if InStr(fileBName, fileAName) <> 0 then
        watch.setvariable "fileName", fileBName
        exit for
    end if
next

This should get you going in the right direction.

Regards,
S


DISREGARD: I didn’t add the local variable.

Thank you!!! I have it all set up and am testing it. I am getting an error in the script saying - [0006] W3602 : Error 0 on line 22, column 9: Script.NamedItemWatch: Variable does not exist.

This is ABSOLUTELY perfect !!! All I had to do was add another Folder Capture at the beginning so that folder A would empty out. I was overthinking the process.

Thank you so much!!!

Glad to help. Apologies that I winged it before understanding your needs. I try to help out here when I can.

Regards,
S

Hi ahaddad,

I modified the script because I spotted a potential error. If folder A has a file that does not match any file in folder B, the file in question will be concatenated to the previously concatenated file. This will cause mismatches on your side. So I added a boolean variable to the script that checks whether this occurs and, if so, moves the file to folder B under its original name.

dim objXMLDoc
dim NodeList, ChildNodes
dim fileAName, fileBName, found, i

'%9 is expected to hold the folder A file name without its extension
fileAName = watch.expandstring("%9")
found = false

'watch.log fileAName,2
Set objXMLDoc = CreateObject("Microsoft.XMLDOM")
'Load synchronously so the document is fully parsed before it is queried
objXMLDoc.async = false
'Load current workflow data into XML object
objXMLDoc.load(Watch.GetJobFilename)

'Grab XML nodes
Set NodeList = objXMLDoc.documentElement.selectNodes("folder/file")

'Loop through XML
for i = 0 to (NodeList.length-1)
    set ChildNodes = NodeList.item(i).childNodes

    'The first child of each file node is read as the folder B file name
    fileBName = ChildNodes.item(0).text
    'watch.log "fileBName is " & fileBName,2
    if InStr(fileBName, fileAName) <> 0 then
        watch.setvariable "fileName", fileBName
        found = true
        exit for
    end if
next

'No match in folder B: fall back to the folder A file's own name
if not found then
    watch.setvariable "fileName", fileAName & ".pdf"
end if
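
One thing worth noting about the script above: InStr is a plain substring test, so it reports a match whenever the plan number appears anywhere in the folder B name. With fixed 6-digit names that is probably fine, but if a stricter test is ever wanted, a rough sketch using the VBScript RegExp object could look like the one below. IsMatchForPlan is a hypothetical helper name, and the pattern assumes folder B names are the plan number plus ".pdf", optionally preceded by a prefix ending in an underscore.

```vbscript
' Hypothetical helper, not part of the Workflow API.
' True when fileBName is planNumber & ".pdf", optionally preceded by a
' prefix that ends in an underscore (for example "404a5_").
Function IsMatchForPlan(fileBName, planNumber)
    Dim re
    Set re = New RegExp
    re.IgnoreCase = True
    ' Assumes planNumber contains digits only, so it needs no regex escaping
    re.Pattern = "^(.*_)?" & planNumber & "\.pdf$"
    IsMatchForPlan = re.Test(fileBName)
End Function
```

It could stand in for the InStr condition inside the loop, for example: if IsMatchForPlan(fileBName, fileAName) then.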

Regards,
S


Great catch!!! - I didn’t think of that either, because in this particular example all the PDFs in folder A had a match. But that wouldn’t always be the case.

Thank you so much!!!