Text replacement issue with automatic re-arrangement of page contents

Hi,
I’m trying to replace a short piece of text with a longer one and have the page contents re-arrange automatically. I’m facing two issues when implementing this with the Aspose library.

Issue #1: When I replace a short text with a longer text in the last row of a table, the complete new text is not displayed. It is cut off as if there were not enough space in the table. I expected the long text to wrap to the next line, but that does not happen inside the table.

Issue #2: This issue is sporadic. The short text is replaced with the longer text, but while the content is adjusted some of the text overlaps other text, which makes it difficult to read.

Below is my code snippet:

				// searchByRegExp holds the search pattern for the text to replace
				TextFragmentAbsorber textFragmentAbsorber = new TextFragmentAbsorber(searchByRegExp);
				TextReplaceOptions textReplaceOptions = textFragmentAbsorber.getTextReplaceOptions();
				if(textReplaceOptions != null){
					System.out.println("setting WholeWordsHyphenation");
					textReplaceOptions.setReplaceAdjustmentAction(TextReplaceOptions.ReplaceAdjustment.WholeWordsHyphenation);
				}

				textFragmentAbsorber.setTextReplaceOptions(textReplaceOptions);
				document.getPages().get_Item(Integer.parseInt(deidRequest.getPg())).accept(textFragmentAbsorber);
				// Replace each TextFragment
				Iterator<TextFragment> it = textFragmentAbsorber.getTextFragments().iterator();
				while (it.hasNext()) {
					TextFragment textFragment = it.next();
					LOGGER.info("Text Found >"+ searchByRegExp);
					textFragment.setText(deidRequest.getReplaceTerm());
				}

I would appreciate your help in finding a solution to these issues.

My reference Aspose document: Replace Text in PDF | Aspose.PDF for .NET
Topic: see the “Text Replacement should automatically re-arrange Page Contents” section

Regards,
Amit

Text Repalcement issue with Aspose.png (73.8 KB)


@amitkum,

Please send us your source PDF document and complete code with regular expressions. We will investigate your scenario in our environment, and share our findings with you.

Thanks, Imran, for your immediate response! This is a critical issue reported by our client, and it needs your attention so that it can be resolved at the earliest. The regex that I’m using is not well tested yet. For my initial testing I’m searching for the exact word and then trying to replace it with new text using the Aspose library.

As you requested, please find attached a copy of the PDF; the code snippet is below:

class TextReplacement {

public static void main(final String... args) throws Exception {
LOGGER.info("Started Run: ");
TestTextRedactionUsingAspose();

}

private static void TestTextRedactionUsingAspose() throws Exception {
com.aspose.pdf.Document document = new com.aspose.pdf.Document("H:/redaction/output/RedactTextv8.pdf");
String outUrl="H:/redaction/output/Test.pdf";
ArrayList<DEIDRequest> deidRequests = new ArrayList<DEIDRequest>();
TextRedaction1DLImpl rt = new TextRedaction1DLImpl();
DEIDRequest deidReq = new DEIDRequest();
deidReq.setMethod("ds_replace");
deidReq.setMatch("1");
deidReq.setText("medication");
deidReq.setPg("4");
deidReq.setReplaceTerm("testing medication");
deidRequests.add(deidReq);
rt.replaceText(document, deidRequests, outUrl, false);
}

@Override
public void replaceText(com.aspose.pdf.Document document, ArrayList<DEIDRequest> deidReqs, String outUrl,	boolean isProposal) throws Exception {
	FileInputStream fstream = new FileInputStream(		"H:\\deidProject\\deid_pdf_withAPDFL_04092018\\deid_pdf\\pdf_service_dl_aspose\\Aspose.Total.Java.lic");

	// Instantiate the License class
	License license = new License();

	// Set the license through the stream object
	license.setLicense(fstream);

	try {

		for (int i = 0; i < deidReqs.size(); i++) {
			DEIDRequest deidRequest = deidReqs.get(i);
			LOGGER.info(" ID: " + deidReqs.get(i).getId());
			LOGGER.info(" Size: " + deidReqs.size());
			LOGGER.info("markTextForRedaction: Match>" + deidReqs.get(i).getMethod() + " Text>"
					+ deidReqs.get(i).getText() + " PageNumber>" + deidReqs.get(i).getPg() + " Replace Text>"+ deidRequest.getReplaceTerm());

			if ("ds_replace".equals(deidReqs.get(i).getMethod())) {
				
				
				
			/*	TextReplaceOptions textReplaceOptions = textFragmentAbsorber.getTextReplaceOptions(); 

				if(textReplaceOptions != null)
				{

					textReplaceOptions.setReplaceAdjustmentAction(TextReplaceOptions.ReplaceAdjustment.WholeWordsHyphenation);

				}

				textFragmentAbsorber.setTextReplaceOptions(textReplaceOptions);*/

				String searchRegExp = deidRequest.getText();
				// Escape regex metacharacters in the search text (each one only once)
				searchRegExp = searchRegExp.replace("+", "\\+");
				searchRegExp = searchRegExp.replace("(", "\\(");
				searchRegExp = searchRegExp.replace(")", "\\)");
				// Let each space in the search text match any whitespace, including line breaks
				String regexForText = searchRegExp.replaceAll(" ", "(\\\\s*|\\\\n)");
		//		TextFragmentAbsorber textFragmentAbsorber = new TextFragmentAbsorber(searchRegExp);
				TextFragmentAbsorber textFragmentAbsorber = new TextFragmentAbsorber(regexForText);
				TextReplaceOptions textReplaceOptions = textFragmentAbsorber.getTextReplaceOptions();
				if(textReplaceOptions != null){
					System.out.println("setting WholeWordsHyphenation");
					textReplaceOptions.setReplaceAdjustmentAction(TextReplaceOptions.ReplaceAdjustment.WholeWordsHyphenation);
				}

				textFragmentAbsorber.setTextReplaceOptions(textReplaceOptions);
				document.getPages().get_Item(Integer.parseInt(deidRequest.getPg())).accept(textFragmentAbsorber);
				// Replace each TextFragment
				Iterator<TextFragment> it = textFragmentAbsorber.getTextFragments().iterator();
				while (it.hasNext()) {
					TextFragment textFragment = it.next();
					LOGGER.info("Text Found >"+ searchRegExp);
					textFragment.setText(deidRequest.getReplaceTerm());
				}

			}

		}

		if (isProposal) {
			// TODO
		} else {
			document.save(outUrl.replaceAll("%20", " "));
			document.close();
		}

		LOGGER.info("PDF Redaction Proposal Completed: ");
	} catch (Exception e) {
		// LOGGER.info("Error markTextForRedaction: Match>" + deidReqs.getMatch() + "
		// Text>" + deidReq.getText() + " PageNumber>"+deidReq.getPg() );
		LOGGER.info(e.getMessage(), e);
	}

}

}
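A side note on the pattern construction: the replaceText method above builds its search pattern by escaping a few metacharacters by hand and then replacing spaces with a whitespace group. The following is a minimal sketch of the same idea using only the JDK, so it can be checked without the PDF or the Aspose library. The class name SearchRegexBuilder and the method toWhitespaceTolerantRegex are hypothetical and are not part of the code above; whether the resulting string is treated as a regular expression by the absorber depends on the search options in use, which this sketch does not cover.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of the posted code or of any library.
public class SearchRegexBuilder {

    // Characters that have a special meaning in a regular expression.
    private static final Pattern META = Pattern.compile("[\\\\.\\[\\]{}()*+?^$|]");

    /**
     * Escapes every regex metacharacter in the phrase exactly once, then lets
     * each run of spaces match any amount of whitespace (including line breaks).
     */
    public static String toWhitespaceTolerantRegex(String phrase) {
        // Prefix each metacharacter with a single backslash.
        String escaped = META.matcher(phrase).replaceAll("\\\\$0");
        // \s* already covers \n, so no separate alternative is needed.
        return escaped.replaceAll(" +", Matcher.quoteReplacement("\\s*"));
    }

    public static void main(String[] args) {
        String regex = toWhitespaceTolerantRegex("dose (mg) + placebo");
        System.out.println(regex); // prints dose\s*\(mg\)\s*\+\s*placebo
        // The phrase is still found when a line break falls inside it.
        System.out.println(Pattern.compile(regex).matcher("dose (mg)\n+ placebo").find()); // prints true
    }
}
```

Escaping in a single pass avoids escaping a character twice, which would make the pattern look for a literal backslash in the page text. A plain word such as "medication" passes through unchanged.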

Thanks for your help!

MP-2_CSR_FINAL_15Sep11.pdf (1.7 MB)

@amitkum,

We are unable to run your code snippet in our environment because it uses unknown classes and objects (e.g. DEIDRequest and TextRedaction1DLImpl). Please review and simplify the code, and then send it to us. We await your response.

Here is a copy of my code.

package com.sample.pdf;

import java.io.FileInputStream;
import java.util.ArrayList;
import java.util.Iterator;

import org.apache.log4j.Logger;

import com.aspose.pdf.License;
import com.aspose.pdf.TextFragment;
import com.aspose.pdf.TextFragmentAbsorber;
import com.aspose.pdf.TextReplaceOptions;
import com.sample.pdf.json.DEIDRequest;

public class TextReplacementTest {

private static final Logger LOGGER = Logger.getLogger(TextReplacementTest.class.getName());

public static void main(final String[] args) throws Exception {

	TestTextRedactionUsingAspose();

}

	private static void TestTextRedactionUsingAspose() throws Exception {
	com.aspose.pdf.Document document = new com.aspose.pdf.Document("H:/redaction/output/RedactTextv8.pdf");
	String outUrl="H:/redaction/output/Test.pdf";
	ArrayList<DEIDRequest> deidRequests = new ArrayList<DEIDRequest>();
	TextReplacementTest rt = new TextReplacementTest();
	DEIDRequest deidReq = new DEIDRequest();
	deidReq.setMethod("ds_replace");
	deidReq.setMatch("1");
	deidReq.setText("medication");
	deidReq.setPg("4");
	deidReq.setReplaceTerm("testing medication");
	deidRequests.add(deidReq);
	rt.replaceText(document, deidRequests, outUrl, false);
	}

	
	public void replaceText(com.aspose.pdf.Document document, ArrayList<DEIDRequest> deidReqs, String outUrl,	boolean isProposal) throws Exception {
		FileInputStream fstream = new FileInputStream("H:\\deidProject\\deid_pdf_withAPDFL_04092018\\deid_pdf"
				+ "\\pdf_service_dl_aspose\\Aspose.Total.Java.lic");

		// Instantiate the License class
		License license = new License();

		// Set the license through the stream object
		license.setLicense(fstream);

		try {

			for (int i = 0; i < deidReqs.size(); i++) {
				DEIDRequest deidRequest = deidReqs.get(i);
				LOGGER.info(" ID: " + deidReqs.get(i).getId());
				LOGGER.info(" Size: " + deidReqs.size());
				LOGGER.info("markTextForRedaction: Match>" + deidReqs.get(i).getMethod() + " Text>"
						+ deidReqs.get(i).getText() + " PageNumber>" + deidReqs.get(i).getPg() + " Replace Text>"+ deidRequest.getReplaceTerm());

				if ("ds_replace".equals(deidReqs.get(i).getMethod())) {
					
					
					
				/*	TextReplaceOptions textReplaceOptions = textFragmentAbsorber.getTextReplaceOptions(); 

					if(textReplaceOptions != null)
					{

						textReplaceOptions.setReplaceAdjustmentAction(TextReplaceOptions.ReplaceAdjustment.WholeWordsHyphenation);

					}

					textFragmentAbsorber.setTextReplaceOptions(textReplaceOptions);*/

					String searchRegExp = deidRequest.getText();
					// Escape regex metacharacters in the search text (each one only once)
					searchRegExp = searchRegExp.replace("+", "\\+");
					searchRegExp = searchRegExp.replace("(", "\\(");
					searchRegExp = searchRegExp.replace(")", "\\)");
					// Let each space in the search text match any whitespace, including line breaks
					String regexForText = searchRegExp.replaceAll(" ", "(\\\\s*|\\\\n)");
			//		TextFragmentAbsorber textFragmentAbsorber = new TextFragmentAbsorber(searchRegExp);
					TextFragmentAbsorber textFragmentAbsorber = new TextFragmentAbsorber(regexForText);
					TextReplaceOptions textReplaceOptions = textFragmentAbsorber.getTextReplaceOptions();
					if(textReplaceOptions != null){
						System.out.println("setting WholeWordsHyphenation");
						textReplaceOptions.setReplaceAdjustmentAction(TextReplaceOptions.ReplaceAdjustment.WholeWordsHyphenation);
					}

					textFragmentAbsorber.setTextReplaceOptions(textReplaceOptions);
					document.getPages().get_Item(Integer.parseInt(deidRequest.getPg())).accept(textFragmentAbsorber);
					// Replace each TextFragment
					Iterator<TextFragment> it = textFragmentAbsorber.getTextFragments().iterator();
					while (it.hasNext()) {
						TextFragment textFragment = it.next();
						LOGGER.info("Text Found >"+ searchRegExp);
						textFragment.setText(deidRequest.getReplaceTerm());
					}

				}

			}

			if (isProposal) {
				// TODO
			} else {
				document.save(outUrl.replaceAll("%20", " "));
				document.close();
			}

		} catch (Exception e) {
			// LOGGER.info("Error markTextForRedaction: Match>" + deidReqs.getMatch() + "
			// Text>" + deidReq.getText() + " PageNumber>"+deidReq.getPg() );
			LOGGER.info(e.getMessage(), e);
		}

	}

}


package com.sample.pdf.json;

public class DEIDRequest {

public void setId(String id) {
	this.id = id;
}

private String id;


/** The replace term. */
private String replaceTerm="";

/** The text. */
private String text="";

/** The method. */
private String method="";



/** The match. */
private String match="";


/** The di type. */
private String diType ="";

	
/** The pg. */
private String pg ="";







/**
 * Gets the replace term.
 *
 * @return the replace term
 */
public String getReplaceTerm() {
	return replaceTerm;
}

/**
 * Sets the replace term.
 *
 * @param replaceTerm the new replace term
 */
public void setReplaceTerm(String replaceTerm) {
	this.replaceTerm = replaceTerm;
}

/**
 * Gets the text.
 *
 * @return the text
 */
public String getText() {
	return text;
}

/**
 * Sets the text.
 *
 * @param text the new text
 */
public void setText(String text) {
	this.text = text;
}

/**
 * Gets the method.
 *
 * @return the method
 */
public String getMethod() {
	return method;
}

/**
 * Sets the method.
 *
 * @param method the new method
 */
public void setMethod(String method) {
	this.method = method;
}


/**
 * Gets the match.
 *
 * @return the match
 */
public String getMatch() {
	return match;
}

/**
 * Sets the match.
 *
 * @param match the new match
 */
public void setMatch(String match) {
	this.match = match;
}

/**
 * Gets the di type.
 *
 * @return the di type
 */
public String getDiType() {
	return diType;
}

/**
 * Sets the di type.
 *
 * @param diType the new di type
 */
public void setDiType(String diType) {
	this.diType = diType;
}


/**
 * Gets the pg.
 *
 * @return the pg
 */
public String getPg() {
	return pg;
}

/**
 * Sets the pg.
 *
 * @param pg the new pg
 */
public void setPg(String pg) {
	this.pg = pg;
}

public String getId() {
	// TODO Auto-generated method stub
	return id;
}

}

Thanks!

@amitkum,

We managed to replicate the said issues in our environment. An investigation has been logged under the ticket ID PDFNET-44814 in our bug tracking system. We have linked your post to this ticket and will keep you informed regarding any available updates.


Hi Imran, glad to see that you were able to replicate the issue. Please keep us informed of its priority and progress, as this is the last remaining item preventing us from licensing (OEM) Aspose and implementing it as part of our application. Thanks.

@kalfast,

Sure, we will notify you once it is fixed. In addition, we recommend that clients post their critical issues (or ticket IDs) in the paid support forum. Please refer to this link: Aspose support options

Hi Imran, it’s been a few weeks since we last touched base on this item. Regarding paid support, would we be assured of having the issue resolved within a specific period of time? We need to plan our release schedule, and this is a priority item for our users. Thanks.

Imran, is this an issue for both the Java and the C++ versions of Aspose.PDF?

@kalfast

Thank you for getting back to us.

Please note that Paid Support tickets are resolved with higher priority, and sooner, than tickets logged under the Free Support model. At the moment, PDFNET-44814 is pending analysis because of previously logged tickets in the queue; free support tickets are scheduled on a first-come, first-served basis, and the resolution can take some more months.

Moreover, the Java and C++ versions of the Aspose.PDF API are ported from the .NET version, so we are afraid you cannot avoid or work around this issue by using another version of the API until it is resolved in Aspose.PDF for .NET.

Hi Imran, it’s been more than 4 months since we last touched base on this item. As we mentioned earlier, this is a very important feature for our tool, so we would like to make sure that Aspose can handle this capability before we purchase the license. We need to plan our release schedule, and this is a priority item for our client. Kindly share the status.

Thanks!

@amitkum

Thank you for getting back to us.

We are afraid that PDFNET-44814 has not been resolved yet because of previously logged and critical tickets. However, we have recorded your concerns, and the priority of this ticket has been escalated to the next level. We will schedule it soon and will let you know as soon as significant updates are available. We appreciate your patience and understanding in this regard.

The issues you have found earlier (filed as PDFNET-44814) have been fixed in Aspose.PDF for .NET 19.9.