Replacing Text in a PDF File

hashmatkhattak · July 8, 2014, 1:51am

try
{
    // Load the XML file that holds the source/translation pairs
    java.io.File fXmlFile = new java.io.File("C:\\Users\\Sath Tech\\Desktop\\data.xml");

    // Open the PDF whose text should be replaced
    com.aspose.pdf.Document pdfDocument = new com.aspose.pdf.Document("C:\\Users\\Sath Tech\\Desktop\\Onepager.pdf");

    javax.xml.parsers.DocumentBuilderFactory dbFactory = javax.xml.parsers.DocumentBuilderFactory.newInstance();
    javax.xml.parsers.DocumentBuilder dBuilder = dbFactory.newDocumentBuilder();
    org.w3c.dom.Document doc = dBuilder.parse(fXmlFile);
    doc.getDocumentElement().normalize();

    org.w3c.dom.NodeList nList = doc.getElementsByTagName("fileSegments");

    for (int temp = 0; temp < nList.getLength(); temp++)
    {
        org.w3c.dom.Node nNode = nList.item(temp);
        //System.out.println("\nCurrent Element :" + nNode.getNodeName());

        if (nNode.getNodeType() == org.w3c.dom.Node.ELEMENT_NODE)
        {
            org.w3c.dom.Element eElement = (org.w3c.dom.Element)nNode;
            String source = eElement.getElementsByTagName("source").item(0).getTextContent();
            String translation = eElement.getElementsByTagName("translation").item(0).getTextContent();

            com.aspose.pdf.TextFragmentAbsorber textFragmentAbsorber = new com.aspose.pdf.TextFragmentAbsorber(source);

            // accept the absorber for all pages of the document
            pdfDocument.getPages().accept(textFragmentAbsorber);

            // get the extracted text fragments into a collection
            com.aspose.pdf.TextFragmentCollection textFragmentCollection = textFragmentAbsorber.getTextFragments();

            // replace the first occurrence, if one was found (the collection is 1-based)
            if (textFragmentCollection.size() > 0)
            {
                com.aspose.pdf.TextFragment textFragment = textFragmentCollection.get_Item(1);
                textFragment.setText(translation);
            }
        }
    }

    // save the updated PDF file once, after all replacements
    pdfDocument.save("C:\\Users\\Sath Tech\\Desktop\\Text_Updated.pdf");
}
catch (Exception ex)
{
    System.out.println(ex.getMessage());
}

tilal.ahmad · July 9, 2014, 12:48am

Hi Hashmat,

Thanks for your inquiry. After initial investigation, we have found that the source text in your XML file is formatted differently from the text in the PDF file, so the API does not find it. E.g.

XML source text (single line): RESUME LT COL ® MUSSARAT NAEEM
PDF text (two lines): RESUME
LT COL ® MUSSARAT NAEEM
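
To illustrate the mismatch outside the API, here is a minimal plain-Java sketch. It assumes the PDF text contains a line break where the XML has a space. The class name `LineBreakMismatchDemo` and the helper `whitespaceTolerant` are hypothetical and not part of Aspose.PDF. The sketch only shows why a literal comparison fails and how a whitespace-tolerant pattern would compare the two strings. Whether such a pattern helps inside the PDF search itself depends on how the text is split into fragments, which is not confirmed here.

```java
import java.util.regex.Pattern;

public class LineBreakMismatchDemo {

    // Hypothetical helper (not an Aspose.PDF call): builds a regex that matches
    // the words of `source` in order, separated by any run of whitespace,
    // including line breaks.
    static Pattern whitespaceTolerant(String source) {
        String[] words = source.trim().split("\\s+");
        StringBuilder regex = new StringBuilder();
        for (int i = 0; i < words.length; i++) {
            if (i > 0) {
                regex.append("\\s+");
            }
            // Quote each word so characters such as '.' or '(' are taken literally
            regex.append(Pattern.quote(words[i]));
        }
        return Pattern.compile(regex.toString());
    }

    public static void main(String[] args) {
        // \u00AE is the registered sign that appears in both texts
        String xmlSource = "RESUME LT COL \u00AE MUSSARAT NAEEM";    // single line, as in the XML
        String pdfText = "RESUME\nLT COL \u00AE MUSSARAT NAEEM";     // two lines, as in the PDF

        // Literal search fails because of the line break after "RESUME"
        System.out.println(pdfText.contains(xmlSource));                            // prints false

        // Whitespace-tolerant search treats the line break like a space
        System.out.println(whitespaceTolerant(xmlSource).matcher(pdfText).find());  // prints true
    }
}
```

Until the logged issue is resolved, one option on your side is to make the source text in the XML match the line layout of the PDF.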

We have logged the issue as PDFNEWJAVA-34312 in our issue tracking system for further investigation and resolution. We will keep you updated on its progress.

We are sorry for the inconvenience caused.

Best Regards,