Upland OL User community

Custom Datamapper record boundary based on metadata

Hello everyone.

We have PDF documents with embedded metadata at the document level. We need to set the record boundary not necessarily at each document level in the embedded metadata, but only when the values of a couple of metadata fields satisfy certain conditions.

For instance, I have a PDF whose metadata contains 10 document-level entries. The datamapper correctly shows 10 documents if I set the record boundary on the Document level, but this does not match our real document boundaries. There are two metadata fields we want to use to set the record boundary: ClientID and pageCount.

So, if the first document in my metadata is ClientID 101 and has 10 pages, it should form two separate records: one with 6 pages and the other with 4 pages.

If the second document level in my metadata is ClientID 102 and has 3 pages, it should form a single record.

If the third (invoice with 3 pages) and fourth (credit note with 2 pages) documents in my metadata are both for ClientID 103 and their combined page count does not exceed 6, then they should form a single record in the datamapper.

We cannot have more than 6 PDF pages in a single record for the same ClientID. This is so that each record fits in a C6 envelope when printed duplex.

Can this be done with the datamapper?


Just for clarity, are you referring to PlanetPress metadata?

Ignoring the metadata for a moment, I believe this is going to require a scripted approach.

You’ll have to read in the ClientID for each page, then set a boundary every 6 pages or whenever a new ClientID is found.

So 101 would get split after page 6, and the remaining 4 pages would form a second record. 102 is a new ID, so the counter resets and counts up to 3. 103 is another new ID, so it resets again and counts to 3. The next document is also 103, so the count continues (to 5) since it’s not a new ID.
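
To make the counting logic concrete, here is a minimal standalone sketch in Python. It is not DataMapper code and uses no OL Connect API; a real boundary script in the datamapper would be written in JavaScript and would read ClientID and pageCount from the metadata. The function name split_records and the input list are hypothetical, chosen only to mirror the example in the question.

```python
MAX_PAGES = 6  # limit from the question: at most 6 PDF pages per record


def split_records(documents, max_pages=MAX_PAGES):
    """Group pages into records.

    `documents` is a list of (client_id, page_count) tuples, one per
    document-level metadata entry, in file order (hypothetical input shape).
    Returns a list of (client_id, pages_in_record) tuples.
    """
    records = []
    current_id = None
    pages_in_record = 0

    for client_id, page_count in documents:
        # Walk page by page, the way a boundary script would.
        for _ in range(page_count):
            # Start a new record on a new ClientID or when the record is full.
            if client_id != current_id or pages_in_record == max_pages:
                records.append([client_id, 0])
                current_id = client_id
                pages_in_record = 0
            pages_in_record += 1
            records[-1][1] = pages_in_record

    return [(cid, pages) for cid, pages in records]


if __name__ == "__main__":
    # The four documents described in the question.
    docs = [("101", 10), ("102", 3), ("103", 3), ("103", 2)]
    for client_id, pages in split_records(docs):
        print(client_id, pages)
```

Note that this counts pages, not documents, so two consecutive documents for the same ClientID totalling more than 6 pages would be split in the middle of the second document. If documents must stay whole, the condition would need to check the incoming document's pageCount before adding it.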

There are some examples here that might help get you started on this.