OL Learn

DataMapper cannot read PDF

Hi All,

I get errors when reading some PDF files. Is there any way to fix the problems below?
In both cases, I can open the PDF in a browser and copy its content to Notepad correctly.
For case 2, I cannot preview the PDF in the PDF viewer in Connect Designer.

Thanks.

case 1


case 2
I got the error “Unable to open the document 1 (DME000049)”.
The detailed error message is below:

ERROR [30 Nov 2019 12:29:48,876][main] com.objectiflune.datamining.ui.model.DataMiningModel.loadDocument(DataMiningModel.java:-1)[COMPONENT=Data Mapping][SOURCE=Internal] Unable to open the document 1 (DME000049)
java.lang.Exception: com.objectiflune.datamining.pdf.pdfengine.textextract.TextExtractorException: Error while retrieving character data (DME000165)
at com.objectiflune.datamining.ui.model.DataMiningModel.loadDocument(DataMiningModel.java)
at com.objectiflune.datamining.ui.model.DataMiningModel.setDocumentIndex(DataMiningModel.java)
at com.objectiflune.datamining.ui.model.DataMiningModel.updateDocumentCount(DataMiningModel.java)
at com.objectiflune.datamining.ui.model.f.run(f.java)
at org.eclipse.swt.widgets.RunnableLock.run(RunnableLock.java:37)
at org.eclipse.swt.widgets.Synchronizer.runAsyncMessages(Synchronizer.java:182)
at org.eclipse.swt.widgets.Display.runAsyncMessages(Display.java:4213)
at org.eclipse.swt.widgets.Display.readAndDispatch(Display.java:3820)
at org.eclipse.e4.ui.internal.workbench.swt.PartRenderingEngine$5.run(PartRenderingEngine.java:1150)
at org.eclipse.core.databinding.observable.Realm.runWithDefault(Realm.java:336)
at org.eclipse.e4.ui.internal.workbench.swt.PartRenderingEngine.run(PartRenderingEngine.java:1039)
at org.eclipse.e4.ui.internal.workbench.E4Workbench.createAndRunUI(E4Workbench.java:153)
at org.eclipse.ui.internal.Workbench.lambda$3(Workbench.java:680)
at org.eclipse.core.databinding.observable.Realm.runWithDefault(Realm.java:336)
at org.eclipse.ui.internal.Workbench.createAndRunWorkbench(Workbench.java:594)
at org.eclipse.ui.PlatformUI.createAndRunWorkbench(PlatformUI.java:148)
at com.objectiflune.application.Application.start(Application.java)
at org.eclipse.equinox.internal.app.EclipseAppHandle.run(EclipseAppHandle.java:196)
at org.eclipse.core.runtime.internal.adaptor.EclipseAppLauncher.runApplication(EclipseAppLauncher.java:134)
at org.eclipse.core.runtime.internal.adaptor.EclipseAppLauncher.start(EclipseAppLauncher.java:104)
at org.eclipse.core.runtime.adaptor.EclipseStarter.run(EclipseStarter.java:388)
at org.eclipse.core.runtime.adaptor.EclipseStarter.run(EclipseStarter.java:243)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.eclipse.equinox.launcher.Main.invokeFramework(Main.java:653)
at org.eclipse.equinox.launcher.Main.basicRun(Main.java:590)
at org.eclipse.equinox.launcher.Main.run(Main.java:1499)
at org.eclipse.equinox.launcher.Main.main(Main.java:1472)
Caused by: com.objectiflune.datamining.pdf.pdfengine.textextract.TextExtractorException: Error while retrieving character data (DME000165)
at com.objectiflune.datamining.pdf.pdfengine.textextract.internal.WeaverExtractorEngine.analyze(WeaverExtractorEngine.java)
at com.objectiflune.datamining.pdf.data.PDFDocumentData.analyzePage(PDFDocumentData.java)
at com.objectiflune.datamining.pdf.data.PDFDocumentData.reset(PDFDocumentData.java)
at com.objectiflune.datamining.pdf.data.PDFDocumentData.open(PDFDocumentData.java)
at com.objectiflune.datamining.Document.open(Document.java)
at com.objectiflune.datamining.ui.model.d.run(d.java)
at org.eclipse.jface.operation.ModalContext$ModalContextThread.run(ModalContext.java:119)
Caused by: com.datalogics.PDFL.AdobeLibException: The encoding (CMap) specified by a font is missing.
at com.adobe.apdfl.AdobeJNI.PDPage_acquireContent(Native Method)
at com.adobe.apdfl.PDPage.acquireContent(PDPage.java)
at com.objectiflune.apdfl.binding.Page.getContent(Page.java)
at nl.edmond.workflow.components.readpdf.ReaderPage.acceptContentVisitor(ReaderPage.java)
at nl.edmond.workflow.components.extracttext.textextraction.ExtractorImpl.getPage(ExtractorImpl.java)
at com.objectiflune.extraction.client.local.internal.WeaverExtractorLocal.executeExtraction(WeaverExtractorLocal.java)
at com.objectiflune.extraction.client.local.ExtractionLocalClient.a(ExtractionLocalClient.java)
at com.objectiflune.extraction.client.local.ExtractionLocalClient.getPage(ExtractionLocalClient.java)
… 7 more

From the error (the last “Caused by” line reports “The encoding (CMap) specified by a font is missing”), it seems there is an issue with a font inside the PDF. Adobe does OCR (optical character recognition), so if the font is not included or has a problematic encoding, it still lets you copy and paste the characters that you see on screen. PlanetPress Connect doesn’t; it relies on a mapping table (glyph vs. code) that usually comes with the font.

Also, it seems that your font has characters for which no equivalent can be found in that table.
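
To illustrate why a missing or incomplete glyph table stops text extraction even though the page still displays fine, here is a toy sketch in Python. It is not how PlanetPress Connect or the Adobe PDF Library is implemented; the function name, the exception, and the tables are all hypothetical and only model the idea of looking up each glyph code in the font's table.

```python
# Toy illustration only: every name here is hypothetical and does not
# correspond to PlanetPress Connect or Adobe PDF Library code.

class MissingMappingError(Exception):
    """Raised when a glyph code has no entry in the font's mapping table."""


def extract_text(glyph_codes, glyph_to_unicode):
    """Translate the glyph codes drawn on a page into text.

    glyph_codes: list of integer codes as they appear in the page content.
    glyph_to_unicode: dict mapping glyph code -> character, or None when
        the font ships no table at all.
    """
    if glyph_to_unicode is None:
        raise MissingMappingError("font has no glyph-to-character table")
    chars = []
    for code in glyph_codes:
        if code not in glyph_to_unicode:
            raise MissingMappingError(f"no character for glyph code {code:#06x}")
        chars.append(glyph_to_unicode[code])
    return "".join(chars)


# A font with a complete table extracts cleanly.
complete_table = {0x0001: "P", 0x0002: "D", 0x0003: "F"}
print(extract_text([0x0001, 0x0002, 0x0003], complete_table))

# The same codes with a table that lacks one glyph fail, even though a
# viewer could still draw the glyph shapes on screen.
partial_table = {0x0001: "P", 0x0002: "D"}
try:
    extract_text([0x0001, 0x0002, 0x0003], partial_table)
except MissingMappingError as err:
    print("extraction failed:", err)
```

In this model, drawing a glyph only needs its shape, while extracting text needs the table entry as well, which is why a PDF can look correct and still fail to load for data mapping.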

I suggest you open a technical support ticket through our website. This way, a technician will look at your PDF(s) and let you know if there is a workaround or not.