
PDF/A-3 and e-Invoicing

Original Author: Manuel Polling

e-Invoicing standards are emerging across the globe, are seeing more use, and are increasingly being required. A number of these require PDF/A-3 with an XML attachment. Connect 2019.1 comes with features for creating PDF/A-3 files and turning them into ZUGFeRD and Factur-X conforming e-invoices. PDF/A-3 is an archive format after all, so it has other applications besides e-invoicing, and those become possible as well. This article explains where the new functionality is located in OL Connect and how it can be used. It also briefly covers PDF/A-3 itself and its e-invoicing applications.

Why PDF/A-3

Anyone not yet familiar with PDF/A-3 might wonder what the benefits are beyond the older PDF/A-1, or regular PDF. These can be summarized as follows:

  • PDF/A is a standardized subset of PDF that focuses on archiving. A PDF/A file is intended to still be readable decades after it has been produced; for instance, by requiring all fonts to be embedded so there are no dependencies on other files.
  • PDF/A-3 allows transparency, OpenType fonts, and other things that were not yet allowed in PDF/A-1, because the A-3 version of the standard is based on a later version of PDF (as is PDF/A-2).
  • PDF/A-3 specifically allows embedding of any file as an attachment in a PDF/A file.

The ability to have a single file that is a human readable document (that will remain readable in the future), that can also contain a machine readable version, the original source, or any other related data, opens up interesting new possibilities. More detailed information about PDF/A-3 can be found below.

PDF/A-3 output in OL Connect

Connect 2019.1 lets you create PDF/A-3 conforming files from Output Creation. To embed files in these PDFs, a new task, “PDF/A-3 Attachments”, has been added to Workflow. Together, these features allow you to produce PDF/A-3 files for archiving, e-invoicing, or any other application where it is convenient to combine PDF with other file formats.

Having the PDF/A-3 creation in Output Creation, and the attachment capability in Workflow, also means that PDF/A-3 can be created with any OL Connect product: PrintShop Mail Connect, PlanetPress Connect, and PReS Connect. Adding attachments to these PDFs, however, is restricted to PlanetPress Connect and PReS Connect. The PDF/A-3 Attachments task also requires an OL Connect Image license.

Create PDF/A-3 conforming files
Creating a PDF/A-3 file with Connect 2019.1 is just as easy as creating a PDF/A-1 file. In the Output Preset Wizard, choose “Generic PDF”, press Next, and then choose PDF/A-3b in the drop-down of the PDF Options page.

Set PDF document information

Once on the PDF Options page, you’ll notice that it now also has controls to set the basic metadata for PDFs. When creating PDFs for archiving purposes, it can be important to have proper values for Title, Author, Description, and Keywords. This document information works for any kind of PDF that can be created with Output Creation. You can also use variables to set this information, including metadata from Job Creation.

Embed files in PDF/A-3 conforming files

Once you have a PDF/A-3, you can use Workflow to add one or more attachments to it with the new PDF/A-3 Attachments task (located in the Actions drawer of the Workflow Designer). The task will allow you to embed as many files as you like into your PDF/A-3 file, and it takes either the PDF, or one of the intended attachments as its input. The output will always be the modified PDF/A-3 with the attachments in it. Most settings in this task allow dynamic values, so it will be flexible to use.

This task also allows you to add the information needed to create a ZUGFeRD or Factur-X conforming PDF/A-3.

Using OL Connect's PDF/A-3 for e-Invoicing

ZUGFeRD 1.0, and Factur-X 1.0
Germany’s ZUGFeRD and the French Factur-X (which is part of Chorus Pro) are both essentially PDF/A-3 files with an XML file embedded in a specific way. Part of this is specifying metadata that makes it possible to identify the PDF/A-3 as an e-invoice according to those standards. The PDF/A-3 Attachments task has an additional tab (labeled e-Invoicing) that lets you simply choose the metadata for the standard you need, and set any configurable properties.

ZUGFeRD 2.0, and X-Rechnung

Both of these standards are identical to the French Factur-X standard, on which the French worked together with the Germans. So naturally, these are supported as well.

Other PDF-based e-invoicing standards

While the French and German standards are supported out of the box with OL Connect 2019.1, other similar standards are easy to add in future versions. Although the mechanism to add other standards is not documented, it should be no problem to add support for similar PDF-based e-Invoicing standards on a case by case basis. If you have a case like that, for instance a customer who wants to create Thai e-Tax conforming files, make sure to raise a ticket. Once an actual case has provided clarity on that particular standard, it will also be easier to reliably support it out of the box.

Create ZUGFeRD and Factur-X conforming files
When creating PDF/A-3 files conforming to Factur-X or ZUGFeRD, you have to pick the right extension schema on the e-Invoicing metadata tab, and make sure that the attachment name of your conforming XML file matches the DocumentFileName property of the metadata. The screenshots below show all these settings.
[Screenshots: pdfa3 4, pdfa3 5]

The last thing to set in the PDF/A-3 Attachments task is the conformance level of the attached XML. Each standard specifies a number of different levels a file can conform to. For ZUGFeRD 1.0, these are BASIC, COMFORT, and EXTENDED. Factur-X has a few more choices in conformance levels: MINIMUM, BASIC WL (“WL” is short for Without Lines), BASIC, EN 16931, and EXTENDED.
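The settings described above can be sanity-checked before a job runs. The Python sketch below is not part of OL Connect: the function name `check_einvoice_settings`, the dictionary layout, and the file names in the usage example are assumptions made for illustration. Only the conformance level names come from this article.

```python
# Conformance levels as listed in this article. The dictionary keys are
# labels chosen for this sketch, not identifiers used by OL Connect.
CONFORMANCE_LEVELS = {
    "ZUGFeRD 1.0": ("BASIC", "COMFORT", "EXTENDED"),
    "Factur-X 1.0": ("MINIMUM", "BASIC WL", "BASIC", "EN 16931", "EXTENDED"),
}


def check_einvoice_settings(standard, level, document_file_name, attachment_names):
    """Return a list of problems; an empty list means the settings look consistent."""
    problems = []
    levels = CONFORMANCE_LEVELS.get(standard)
    if levels is None:
        problems.append(f"unknown standard: {standard}")
    elif level not in levels:
        problems.append(f"{level!r} is not a {standard} conformance level")
    # The attachment name has to match the DocumentFileName metadata property.
    # This sketch assumes an exact, case-sensitive match.
    if document_file_name not in attachment_names:
        problems.append(
            f"no attachment is named {document_file_name!r} (DocumentFileName)"
        )
    return problems


if __name__ == "__main__":
    # Hypothetical file names, used only to exercise the check.
    print(check_einvoice_settings("ZUGFeRD 1.0", "COMFORT", "invoice.xml", ["invoice.xml"]))
    print(check_einvoice_settings("ZUGFeRD 1.0", "EN 16931", "invoice.xml", ["data.xml"]))
```

The first call reports no problems. The second reports two: EN 16931 is a Factur-X level rather than a ZUGFeRD 1.0 level, and no attachment carries the name given as DocumentFileName.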

How to create conforming XML files

With the ability to create PDF/A-3 files with Connect, and an easy way to embed XML files and turn the PDFs into e-invoicing compliant files, only one challenge remains: how to create a standards-conforming XML file. The answer is easy, although perhaps not that easy to execute: if you don’t get a conforming file to begin with, do your own transformation. When starting from the result of data mapping, this can be done with an XSLT transformation that is run from Workflow using the Open XSLT task.

Even though creating XSLT transformations may not be trivial, this does give us a solution for creating an e-invoice from any input file that can be handled by the Connect Data Mapper.

One may have hoped for an out of the box transformation capability, but unfortunately that is simply not feasible with limited resources: between ZUGFeRD, and Factur-X there are potentially already 8 different formats due to the different conformance levels (although ZUGFeRD’s EXTENDED level could be identical to Factur-X’s), and other standards are likely to be different again (due to local legislation, taxes, etc.). And then of course the input data is likely to be different for every customer.

What about the existing ZUGFeRD plugin?
Since we now have a way to support all ZUGFeRD conformance levels in a more generic way, this plugin is no longer needed. In cases where it is in use, there is no reason to change anything at this time. In new cases, the recommendation is to use the new approach. Commercially, this means that a customer would be offered the OL Connect Imaging module, instead of the ZUGFeRD plugin.

Sample Workflow processes
Putting all this functionality together requires a relatively simple Workflow process. Given that the new PDF/A-3 Attachments task can work with a PDF/A-3 job file, or an intended attachment, a Workflow process for creating a PDF/A-3 with attachments can take two shapes. Below, we have examples of both.

A ZUGFeRD invoice with a PDF job data file
This process creates a single ZUGFeRD conforming PDF/A-3 from an input file. The input for the process is a single record that will be data mapped. This also means that this process is a recipe for processing any data that can be processed by OL Connect’s Data Mapper.

  1. Pick up a file.
  2. The Data Mapper task returns a record and record set id that is needed for the subsequent Content Creation (step 8).
  3. The XML invoice data file is created first on a branch, so it will be available when the Output Creation is done.
  4. The record id is used to retrieve all record data as JSON.
  5. The JSON is converted to XML, so it can be processed with an XSLT transformation.
  6. The XML from the Connect Data Model is transformed into one that complies with a certain ZUGFeRD conformance level.
  7. The XML file is saved in Workflow’s temporary folder for this process, so it will get cleaned up automatically when the process is done.
  8. The All in One performs the Content Creation, Job Creation, and Output Creation, and results in a PDF/A-3 conforming output file.
  9. The intermediate XML file is embedded into the PDF/A-3 job file.
  10. The resulting ZUGFeRD file is saved.
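Steps 4 and 5 above turn the record's JSON into XML so that an XSLT transformation can process it. In Workflow this is handled by tasks in the process; the Python sketch below only illustrates the kind of conversion involved. The record layout (`fields`, `InvoiceNumber`, `lines`, and so on) and the element naming are assumptions made for this example, not the actual structure that Connect produces.

```python
import json
import xml.etree.ElementTree as ET


def json_to_xml(value, tag):
    """Recursively convert a parsed JSON value into an XML element named `tag`."""
    element = ET.Element(tag)
    if isinstance(value, dict):
        # Each key becomes a child element; keys must be valid XML names.
        for key, child in value.items():
            element.append(json_to_xml(child, key))
    elif isinstance(value, list):
        # Each list entry becomes a repeated <item> child element.
        for child in value:
            element.append(json_to_xml(child, "item"))
    elif value is not None:
        element.text = str(value)
    return element


if __name__ == "__main__":
    # Hypothetical record data; the field names are made up for this sketch.
    record_json = (
        '{"fields": {"InvoiceNumber": "INV-001", "Total": 119.0},'
        ' "lines": [{"Description": "Widget", "Amount": 100.0}]}'
    )
    root = json_to_xml(json.loads(record_json), "record")
    print(ET.tostring(root, encoding="unicode"))
```

The resulting XML mirrors the JSON structure one-to-one, which keeps the conversion generic and leaves all standard-specific mapping to the XSLT in step 6.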

 

A ZUGFeRD invoice with an attachment as the job data file
If the input file is already an XML file, it can be more convenient to create the PDF/A-3 in a branch on the side. This is probably especially true if the incoming XML is already completely conforming, or, in the exact opposite case, if the XML needs lots of processing while the PDF/A-3 creation is fairly simple, as it is in this example.

  1. Pick up an XML file.
  2. Set the PDF file name and location because it is needed both in the branch and the main steps.
  3. Create the PDF/A-3 file on a branch.
  4. Create the PDF/A-3 file. In this case the All in One can handle this for us in a single step.
  5. Temporarily save the PDF in Workflow’s temporary folder for this process, so it will get cleaned up automatically when the process is done.
  6. Transform the incoming XML into one that complies with a certain ZUGFeRD conformance level. As mentioned, this step may be redundant if the incoming XML is already conforming.
  7. Pick up the intermediate PDF/A-3 file and embed the XML job file, resulting in a conforming ZUGFeRD PDF.
  8. Save the resulting file.

About PDF/A-3

The PDF/A-3 archive standard is a successor of PDF/A-1, and PDF/A-2. They are all standardized in the ISO 19005 standard, as Part 1, 2, and 3 respectively. Both PDF/A-2, and A-3 are based on PDF 1.7 (a.k.a. ISO 32000-1), which means that these allow things like transparency effects and layers, and embedding of OpenType fonts, among others.

PDF/A-3 differs from PDF/A-2, in that it allows embedding of arbitrary file formats. This opens up applications that can have the best of both worlds:

  • Hybrid archiving, where, for instance, an MS Excel source file is embedded in the PDF that is to be archived; the Excel file is likely to offer the best viewing experience in the short term, while the PDF version provides a stable view of the same data in the long term, when there may no longer be a compatible Excel version available (anything can happen in 10, 20, or 30 years). Having one embedded in the other underlines their relation, and reduces the risk of losing that connection.
  • Combining a human readable version of a document with a machine readable version. This is why e-Invoicing standards like the German ZUGFeRD, and French Factur-X use PDF/A-3 as their file format. Both specify XML-based standards for invoicing data, which is then embedded in a PDF/A-3 file. The XML can be used for straight-through processing of the invoice, while the PDF provides a human readable version of the invoice.

When adding a file as an attachment to a PDF/A-3, it is required to specify the relationship between the PDF and the attachment. This relationship can be between the embedded file and the entire PDF document, but the file can also be related to a specific part or object in the PDF. Connect 2019.1 only supports relationships between the entire PDF document and the attachments. The relationship can be one of five possibilities:

Alternative

means that the attachment is an alternative representation of the PDF document itself. For instance, an XML version of the exact same invoice as shown by the PDF.

Data

is for files that contain the information to derive a visual representation in the PDF. For instance, a CSV with the detail lines of an invoice. Another example could be the data that was used to create a graph, although Connect 2019.1 will not let you associate that file with just that graph.

Source

is used when the embedded file is the source that the PDF was created from. In case of Connect content creation, you can debate whether the input data or the template itself would be the source. But if a PDF is created from Word, the Word file would be the source.

Supplement

is used when the file represents a supplemental representation of the PDF content that, for instance, may be easier to consume for some. A plain text version of an invoice might be a good example of this.

Unspecified

is a fallback to use when the relationship is not known or cannot be described using one of the other values.

Which relationship to use is mostly a question of semantics and user choice. It makes sense to use Source only if the PDF was actually created from that file. With Alternative, one would assume the file contains the same information, so it should be fine to use just the attachment for further processing.
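As an illustration, the five relationship values can be captured in a small enumeration with a helper that suggests one of them. This is a sketch under stated assumptions: the names `AttachmentRelationship` and `choose_relationship`, the yes/no questions, and the order in which they are checked are all invented here, and the final choice remains up to the user.

```python
from enum import Enum


class AttachmentRelationship(Enum):
    """The five relationship values described above."""
    ALTERNATIVE = "Alternative"
    DATA = "Data"
    SOURCE = "Source"
    SUPPLEMENT = "Supplement"
    UNSPECIFIED = "Unspecified"


def choose_relationship(is_origin_of_pdf=False, same_information=False,
                        feeds_visual_content=False, extra_representation=False):
    """Suggest a relationship; the precedence of the checks is an assumption."""
    if is_origin_of_pdf:
        # The PDF was actually created from this file.
        return AttachmentRelationship.SOURCE
    if same_information:
        # The file can stand in for the PDF, e.g. an XML version of the invoice.
        return AttachmentRelationship.ALTERNATIVE
    if feeds_visual_content:
        # The file holds data behind something shown in the PDF, e.g. a CSV.
        return AttachmentRelationship.DATA
    if extra_representation:
        # The file is an additional rendition, e.g. a plain text version.
        return AttachmentRelationship.SUPPLEMENT
    return AttachmentRelationship.UNSPECIFIED


if __name__ == "__main__":
    print(choose_relationship(same_information=True).value)
    print(choose_relationship().value)
```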

All PDF/A versions define different conformance levels. Level a requires the PDF to be structured with tags (i.e., Tagged PDF), and to have full Unicode information. Level b does not have these requirements, and is therefore easier to produce, especially when coming from print data. PDF/A-2 and A-3 also introduce conformance level u, which is level a without the tags (so only the Unicode information).
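The combination of part and conformance level is usually written as a single name, such as the PDF/A-3b option mentioned earlier. The Python sketch below splits such a name and rejects combinations that the description above rules out, for example level u for PDF/A-1. The function name `parse_pdfa_flavour` and the accepted spelling of the names are assumptions made for this example.

```python
import re

# Levels per PDF/A part as described above: level u exists only for parts 2 and 3.
ALLOWED_LEVELS = {1: {"a", "b"}, 2: {"a", "b", "u"}, 3: {"a", "b", "u"}}


def parse_pdfa_flavour(name):
    """Split a name such as 'PDF/A-3b' into (part, level); raise ValueError if invalid."""
    match = re.fullmatch(r"PDF/A-(\d)([a-z])", name.strip())
    if match is None:
        raise ValueError(f"not a PDF/A flavour name: {name!r}")
    part, level = int(match.group(1)), match.group(2)
    # Parts outside 1 to 3 are not covered here and are rejected as well.
    if level not in ALLOWED_LEVELS.get(part, set()):
        raise ValueError(f"PDF/A-{part} has no conformance level {level!r}")
    return part, level


if __name__ == "__main__":
    print(parse_pdfa_flavour("PDF/A-3b"))
    print(parse_pdfa_flavour("PDF/A-2u"))
```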

 

Sample resources

my-invoice is a sample PDF/A-3 file with XML data embedded.

Note: this file does not conform to any e-Invoicing standard.
