First: using self-replication might help, because each file is then processed independently and the memory used by its process is reclaimed as soon as that file is done. When all files are processed in a single process, memory may not be reclaimed until the very end of that process (i.e., after all 10 files have been processed).
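The same principle (one process per file) can be sketched in plain Python, independent of any particular tool. This is only an illustration under assumptions: the names `WORKER` and `process_each_in_own_process` are made up for the example, and the worker merely reads the file and reports its size as a stand-in for the real processing.

```python
import subprocess
import sys

# Hypothetical stand-in for the real per-file work: read the file, print its size.
WORKER = "import sys; print(len(open(sys.argv[1], 'rb').read()))"


def process_each_in_own_process(paths):
    """Run the worker once per file, each time in a fresh interpreter.

    When a child process exits, the operating system reclaims all of its
    memory, so nothing accumulates from one file to the next.
    """
    results = {}
    for path in paths:
        done = subprocess.run(
            [sys.executable, "-c", WORKER, path],
            capture_output=True,
            text=True,
            check=True,  # raise if the child fails
        )
        results[path] = int(done.stdout.strip())
    return results
```

The price is the start-up cost of one process per file, which is usually negligible next to parsing a large XML file.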
Second: you can use an advanced XSLT splitter to split the file after every 20 occurrences of your target element.
Granted, XSLT is no fun at all … unless you have some experience with it, but it is very powerful nonetheless. For instance, the following XSLT starts a new chunk after every 20 occurrences of the
CUSTOMER element:
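<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>
  <xsl:variable name="RecordsPerSplit" select="20" />
  <xsl:variable name="NodeToSplit" select="//MyFile/CUSTOMER" />
  <xsl:template match="/">
    <xsl:for-each select="$NodeToSplit[position() mod $RecordsPerSplit = 1]">
      <xsl:result-document href="chunk_{position()}.xml"><MyFile> <!-- assumed: file name and wrapper element, adjust to your data -->
        <xsl:for-each select=".|following-sibling::CUSTOMER[not(position() > $RecordsPerSplit -1)]" >
          <xsl:copy-of select="."/>
        </xsl:for-each>
      </MyFile></xsl:result-document>
    </xsl:for-each></xsl:template>
</xsl:stylesheet>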
It is composed of two loops: the outer loop starts a new chunk of data whenever (element_index modulo 20) equals 1, i.e., at elements 1, 21, 41, and so on.
The inner loop then takes that CUSTOMER node together with its following 19 CUSTOMER siblings and adds them to the chunk.
Obviously, depending on the structure of your own XML file, you will have to make changes to the
select statements in the XSLT code, but perhaps the above example can get you going.
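
If the two-loop logic is easier to follow outside XSLT, here is a minimal Python sketch of the same grouping. It is not a replacement for the stylesheet, just an illustration under assumptions: the function name `split_customers` is made up, the CUSTOMER elements are assumed to sit directly under the root element, and the whole document is loaded into memory.

```python
import xml.etree.ElementTree as ET

RECORDS_PER_SPLIT = 20  # same chunk size as $RecordsPerSplit in the stylesheet


def split_customers(xml_text, records_per_split=RECORDS_PER_SPLIT):
    """Return a list of root-like elements, each holding at most
    records_per_split CUSTOMER children, in document order."""
    root = ET.fromstring(xml_text)
    # Assumption: CUSTOMER elements are direct children of the root.
    customers = root.findall("CUSTOMER")
    chunks = []
    # Outer loop: every element whose 1-based position mod N equals 1
    # (0-based index 0, N, 2N, ...) starts a new chunk.
    for start in range(0, len(customers), records_per_split):
        chunk = ET.Element(root.tag)
        # Inner loop: the starting element plus up to N-1 following siblings.
        for customer in customers[start:start + records_per_split]:
            chunk.append(customer)
        chunks.append(chunk)
    return chunks


if __name__ == "__main__":
    sample = "<MyFile>" + "".join(
        "<CUSTOMER id='%d'/>" % i for i in range(1, 46)
    ) + "</MyFile>"
    for number, chunk in enumerate(split_customers(sample), start=1):
        print(number, len(chunk))  # chunk number and how many CUSTOMERs it holds
```

With 45 CUSTOMER elements this yields three chunks of 20, 20 and 5 elements; each chunk could then be written out with `ET.ElementTree(chunk).write(...)`.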