OL Learn

Enrich offline then get data back

Don’t know how to begin…

The extracted DataMapper output (as CSV) should be enriched by an external process, then sent back to the same or another workflow.

Reading a PDF, extracting some data, sorting, adding some data, and then “stamping” some of that new data into those documents.


A document (PDF) consisting of several addresses stored in extracted fields.
External (manually created) geo-coordinates for each document ID, along with a geo-map image (and/or just plain text).

The next process should insert that image (or some text) into the original PDF.

How can I reference the original metadata in a subsequent process?


Not sure I correctly understand your request, but it would seem like a fairly simple procedure:

  • Make sure that in your initial DataMapper configuration, you have already created two fields (e.g. “geo_coordinates” and “geo_map”). Those fields will be empty in all records since the original data doesn’t contain those values, but that’s OK since you will be adding the values through Workflow.
  • In Workflow, after the data is extracted, use a script to store the geo coordinates and the name of the image file into those two fields in the metadata (this script obviously depends on where/how that information is stored externally)
  • When executing the Create Content task, make sure to tick the “Update Records from metadata” option, which will ensure that the values you added to the metadata are stored back in the database’s original records.

This is a very high-level view of how to achieve what you want, but it should hopefully get you going.
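The lookup inside that metadata-update script can be sketched in JavaScript (the language Workflow scripts are commonly written in). Note this is only a sketch of the generic lookup logic: the CSV layout, the `;` delimiter, and the “docid” key column are all assumptions about how your external data is delivered, and the field names mirror the “geo_coordinates” / “geo_map” fields suggested above. Writing the returned values into the metadata itself is done with Workflow’s metadata API, which is not shown here.

```javascript
// Sketch only: assumes the external process delivers a CSV keyed on a
// document ID with the (hypothetical) layout "docid;geo_coordinates;geo_map".
// Builds a lookup object so the Workflow script can fetch the values
// for each record and copy them into the two metadata fields.
function buildGeoLookup(csvText) {
  var lookup = {};
  var lines = csvText.split(/\r?\n/);
  for (var i = 1; i < lines.length; i++) {   // i = 1: skip the header row
    if (!lines[i]) continue;                 // ignore blank lines
    var cols = lines[i].split(";");
    lookup[cols[0]] = { geo_coordinates: cols[1], geo_map: cols[2] };
  }
  return lookup;
}

// Example with two (made-up) records:
var csv = "docid;geo_coordinates;geo_map\n" +
          "INV-001;48.137,11.575;maps/inv-001.png\n" +
          "INV-002;52.520,13.405;maps/inv-002.png";
var geo = buildGeoLookup(csv);
// geo["INV-001"].geo_coordinates is "48.137,11.575"
```

In the real script you would loop over the documents in the job’s metadata, take each document’s ID, and assign `geo[id].geo_coordinates` and `geo[id].geo_map` to the corresponding metadata fields before Create Content runs.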

Phil, thx for your reply, seems that I’ll owe you more than one beer…

Currently I’m working on:

  1. First step already done (that was my intention to do it that way)
  2. Extraction of data is done in the DataMapper, saving those values (IDs, addresses, and 2 empty fields) to an external data source along with the metadata
  3. Updating those empty fields in the external data source with the appropriate values
  4. In another workflow: re-reading the original file with its metadata from step 2 and the updated data source
  5. Updating the metadata with values from the data source (in a Workflow script)
  6. Doing the rest (Job Creation, Output Creation)
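Step 3, the external enrichment, can be any process you like. As a minimal illustration, assuming the export from step 2 is a semicolon-delimited CSV with the two empty columns at the end (the layout “id;address;geo_coordinates;geo_map” is hypothetical), filling the empty fields could look like:

```javascript
// Hypothetical external enrichment: fill the two empty columns of the
// exported CSV from a map of document ID -> geo data, leaving rows
// without a match untouched.
function enrichCsv(csvText, coords) {
  var lines = csvText.split(/\r?\n/);
  var out = [lines[0]];                  // keep the header row unchanged
  for (var i = 1; i < lines.length; i++) {
    if (!lines[i]) continue;             // ignore blank lines
    var cols = lines[i].split(";");
    var c = coords[cols[0]];             // look up by document ID
    if (c) {
      cols[2] = c.coordinates;           // was empty in the export
      cols[3] = c.map;                   // was empty in the export
    }
    out.push(cols.join(";"));
  }
  return out.join("\n");
}

// Example: one exported record with its empty fields filled in
var exported = "id;address;geo_coordinates;geo_map\n" +
               "D1;Main St 1;;";
var enriched = enrichCsv(exported, { D1: { coordinates: "48.1,11.5", map: "d1.png" } });
// enriched second line: "D1;Main St 1;48.1,11.5;d1.png"
```

The enriched file is then what the second workflow (steps 4–5) reads alongside the original metadata to update the two fields.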