OL Learn

Extract data from pdf with script

I'm trying to extract data from a PDF with a JavaScript script.

Why do I get an error when closing the PDF?

Script:
var inputFile = Watch.GetVariable("inputFile");
var inputPDF = Watch.GetPDFEditObject();
inputPDF.Open(inputFile, false);
var metaData;
var pdfNumberOfPages = Watch.GetVariable("Pages");
for (var i = 0; i < pdfNumberOfPages; i++) {
  metaData = inputPDF.Pages(i).ExtractText2(0, 11.5208, 6.875, 11.6458);
  Watch.log("Page: " + i, 3);
  Watch.log("MetaData: " + metaData, 3);
}
inputPDF.Close();

Error:

[0005] W3602 : Error 0 on line 33, column 1: AlambicEdit.AlambicEditPDF.1: Unable to close PDF file: there are still 3 pages opened.

That’s a known issue with JScript’s garbage collection procedure.

To circumvent it, wrap your code inside a try...catch...finally structure and explicitly call the CollectGarbage() method, like so:

var inputFile = Watch.GetVariable("inputFile");
var inputPDF = Watch.GetPDFEditObject();
try {
  inputPDF.Open(inputFile, false);
  var metaData;
  var pdfNumberOfPages = Watch.GetVariable("Pages");
  for (var i = 0; i < pdfNumberOfPages; i++) {
    metaData = inputPDF.Pages(i).ExtractText2(0, 11.5208, 6.875, 11.6458);
    Watch.log("Page: " + i, 3);
    Watch.log("MetaData: " + metaData, 3);
  }
  // Force JScript to release the page objects before the PDF is closed
  CollectGarbage();
} catch (e) {
  // Re-throw so the Workflow process still sees the error
  throw e;
} finally {
  try {
    // Close the same object that was opened above
    inputPDF.Close();
  } catch (e) {
    // do nothing
  }
}
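
A variation on the same idea, offered as a sketch rather than a confirmed fix: keep each page object in a variable, clear that variable once the text has been read, and run the collector before closing. The helper name extractRegion and its coords parameter are hypothetical; only Pages, ExtractText2, Close and CollectGarbage come from the scripts above. That clearing the reference helps JScript release the page is an assumption.

```javascript
// Hypothetical helper: reads one region from every page of an already
// opened PDF object, then closes it. coords holds the four numbers passed
// to ExtractText2, in the same order as in the scripts above.
function extractRegion(pdf, pageCount, coords) {
  var results = [];
  var page = null;
  var total = Number(pageCount); // job variables may arrive as strings
  try {
    for (var i = 0; i < total; i++) {
      page = pdf.Pages(i);
      results.push(page.ExtractText2(coords[0], coords[1], coords[2], coords[3]));
      page = null; // drop the reference so the page can be collected
    }
  } finally {
    page = null;
    // CollectGarbage() only exists in JScript; skip it elsewhere
    if (typeof CollectGarbage === "function") {
      CollectGarbage();
    }
    pdf.Close();
  }
  return results;
}

// Possible use inside a Workflow script (assumes the same variables as above):
// var inputPDF = Watch.GetPDFEditObject();
// inputPDF.Open(Watch.GetVariable("inputFile"), false);
// var texts = extractRegion(inputPDF, Watch.GetVariable("Pages"),
//                           [0, 11.5208, 6.875, 11.6458]);
```

Because Close() sits in the finally block, it is attempted even if ExtractText2 throws, and any error still propagates to the caller.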

Hope that helps.

I just added the CollectGarbage() call and then it worked. Thanks.
