OL Learn

Splitting files based on metadata and page count

I have a workflow (screenshot below) that I know could be optimised, and wanted to see what suggestions people might have. I’ve obfuscated anything that might identify the brand it relates to.

It creates PDFs one record at a time. It checks the metadata to see if it belongs to Part A or Part B. It then saves the file into a folder (concatenating). If Part B, it splits the first page into a separate file and the other pages into a second file.

It works fine, although slowly, and the resulting PDFs are not optimised. I could generate Part A and Part B separately, but I still need to split the first page of each record’s document into a “B1” file and the rest into “B2”. The documents are variable length, so the split has to use the page number, not a detection area.

Any thoughts?

You’ve got the right idea in your title. You can do the splitting by metadata or page grouping directly in your output/job creation and get rid of everything after your Create Output step. The Create Output step would take care of making your multiple PDFs and storing them in the appropriate folder.

I’ll see if I can locate a concrete example for you.

I was able to do this easily enough when splitting the two types (Part A and B) using output/job creation presets. The difficulty was taking the first page of each Part B document and putting those first pages in a separate file from the rest. Any particular guidance around that, if you can’t find a concrete example, would be appreciated.

@AlbertsN Any chance you’ve found anything?

No, your last post indicated you’d already figured out what I was going to look for as an example (separation-based output). Now you’re trying to split individual pages from one of those sets.

If those pages themselves are not somehow directly related to records of their own, you can’t do the separation in Connect. The Separation settings are only going to allow you to split down to the record level, not section or page. For that, you’re going to have to do it directly in the Workflow.
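In case it helps, the page-number logic for that Workflow-side split might look roughly like the sketch below. It is written in plain Python purely to illustrate the arithmetic and is not Workflow script code. The names `split_first_pages` and `page_counts` are made up for the example, and it assumes you can get each record’s page count (for instance from the metadata) for the concatenated Part B file.

```python
from typing import List, Tuple


def split_first_pages(page_counts: List[int]) -> Tuple[List[int], List[int]]:
    """Return (b1_pages, b2_pages) as 0-based page indices into the
    concatenated Part B file.

    page_counts is a hypothetical input: the number of pages in each
    record's document, in the order the records were concatenated.
    """
    b1_pages: List[int] = []  # first page of every record -> "B1"
    b2_pages: List[int] = []  # all remaining pages -> "B2"
    start = 0  # index of the current record's first page in the full file

    for count in page_counts:
        if count < 1:
            raise ValueError("each record must have at least one page")
        b1_pages.append(start)
        # A one-page record contributes nothing to B2 (empty range).
        b2_pages.extend(range(start + 1, start + count))
        start += count

    return b1_pages, b2_pages


if __name__ == "__main__":
    # Three records of 3, 1 and 4 pages.
    b1, b2 = split_first_pages([3, 1, 4])
    print("B1 pages:", b1)
    print("B2 pages:", b2)
```

The two index lists would then be handed to whatever PDF splitting step you use in the Workflow process. Working from page counts rather than a detection area keeps it independent of what is printed on the pages.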