Replace Checkboxes and Radio Buttons with Text in RTF Output

Hello,

I recently purchased the Aspose.Words product to convert PDF files to RTF documents. My PDF files contain checkboxes and radio buttons, and I would like the RTF output to reflect the same design but without using drawing objects.

Specifically, I’d like to replace checkboxes and radio buttons with plain text values. For example, if a PDF has two radio buttons for “Male” and “Female”, and “Female” is selected, I want the RTF output to look like this:

Male (Unselected) Female (Selected)

Is there a way to detect these drawing objects during parsing, extract their selected values, and then represent those values as plain text in the RTF?

Thanks

@AdibeSehha

Cause

To replace checkboxes and radio buttons with text in the RTF output, it is essential to detect these form elements during the parsing of the PDF document. The challenge lies in extracting the selected values and representing them as plain text without using drawing objects.

Solution

You can achieve this by iterating through the document’s elements and checking for checkbox and radio button controls. Once detected, replace them with the appropriate text representation based on their state (selected or unselected).

Code

Here’s a sample code snippet demonstrating how to replace checkboxes and radio buttons with text in Aspose.Words:

Document doc = new Document("input.pdf");
FindReplaceOptions options = new FindReplaceOptions();
options.setReplacingCallback(new IReplacingCallback() {
    public ReplaceAction Replacing(ReplacingArgs args) {
        if (args.getMatchNode() instanceof CheckBoxFormField) {
            CheckBoxFormField checkBox = (CheckBoxFormField) args.getMatchNode();
            args.setReplacement(checkBox.isChecked() ? "Checked" : "Unchecked");
        } else if (args.getMatchNode() instanceof OptionButtonFormField) {
            OptionButtonFormField radioButton = (OptionButtonFormField) args.getMatchNode();
            args.setReplacement(radioButton.isChecked() ? "Selected" : "Unselected");
        }
        return ReplaceAction.Replace;
    }
});

// Perform the replacement
doc.getRange().replace(new Regex("checkbox|radiobutton"), "", options);
doc.save("output.rtf");

Next Steps

  1. Test the code with your PDF documents to ensure it correctly identifies and replaces the checkboxes and radio buttons.
  2. Adjust the replacement text as needed to match your desired output format.
  3. If you encounter any issues, please reach out for further assistance.

I hope this solution meets your needs!

@AdibeSehha Could you please attach your input PDF and expected output RTF documents here for our reference? We will check the issue and provide you with more information. Note that PDF is a fixed-page format, and a checkbox or radio button can be represented in different ways: as a special glyph, as an image, as vector graphics, etc.

@alexey.noskov : Those are medical PDFs, so it won’t be possible to attach them here. But you can use any PDF with checkboxes and radio buttons (with a few selected values).

By the way, the code snippet above gives this error:

FindReplaceOptions does not contain a definition for setReplacingCallback

I am using Aspose.Words 25.5.0 with .NET Framework 4.8. I receive the PDFs as base64 strings from HL7 messages, so I convert each one to a byte[] and use that for the RTF conversion:

byte[] pdfBytes = Convert.FromBase64String(base64Pdf);

@AdibeSehha As I have mentioned, checkboxes and radio buttons in PDF documents can be represented differently. So please attach your sample input and expected output documents here for our reference. You can provide a PDF with dummy data, just as an example of your input document.

@alexey.noskov : Please find attached the actual PDF template (with dummy data) containing checkboxes and radio buttons.

Kindly share the .NET 4.8 code to replace checkboxes and radio buttons with their selected/unselected values.
Sample_Pdf.pdf (395.9 KB)

@AdibeSehha Thank you for the additional information. The controls in your input PDF document are represented as simple images, so they are represented the same way in the output MS Word document generated by Aspose.Words.

@alexey.noskov : Is there any way to detect those during parsing from PDF to RTF and replace them with their selected values, like:

Present (Unselected) Absent (Selected)

@AdibeSehha Unfortunately, there is no such method in Aspose.Words. You can try using a custom method to detect such images in the document generated by Aspose.Words and then replace them with other content. Image recognition is outside the scope of Aspose.Words.
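
For readers wondering what such a custom method might look like, here is a minimal sketch of one possible approach: fingerprinting image bytes and looking them up against known "checked" and "unchecked" reference images. It is written in Java to match the snippet earlier in the thread, uses only the standard library, and does not call Aspose.Words at all. The class name ControlImageClassifier and its methods are hypothetical. It assumes the control images come out byte-identical every time, which may not hold if the conversion re-encodes images; in that case a pixel-level comparison would be needed instead.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper, not part of Aspose.Words: maps known control images
// to replacement text by comparing SHA-256 fingerprints of the image bytes.
class ControlImageClassifier {
    private final Map<String, String> labelsByFingerprint = new HashMap<>();

    // Registers a reference image (e.g. a checked radio button) and its label.
    public void register(byte[] referenceImage, String label) {
        labelsByFingerprint.put(fingerprint(referenceImage), label);
    }

    // Returns the registered label, or null if the image is not a known control.
    public String classify(byte[] imageBytes) {
        return labelsByFingerprint.get(fingerprint(imageBytes));
    }

    // Hex-encoded SHA-256 digest of the given bytes.
    static String fingerprint(byte[] data) {
        try {
            byte[] digest = MessageDigest.getInstance("SHA-256").digest(data);
            StringBuilder hex = new StringBuilder();
            for (byte b : digest) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // SHA-256 is mandatory on every Java platform, so this is unexpected.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Placeholder bytes stand in for real image data extracted from a document.
        byte[] checked = "checked-image".getBytes(StandardCharsets.UTF_8);
        byte[] unchecked = "unchecked-image".getBytes(StandardCharsets.UTF_8);

        ControlImageClassifier classifier = new ControlImageClassifier();
        classifier.register(checked, "(Selected)");
        classifier.register(unchecked, "(Unselected)");

        // Prints: Present (Unselected) Absent (Selected)
        System.out.println("Present " + classifier.classify(unchecked.clone())
                + " Absent " + classifier.classify(checked.clone()));
    }
}
```

In practice you would register the bytes of the checked and unchecked images taken from a sample document, then pass in the bytes of each image found in the converted document and swap any image with a non-null label for that text. The same logic ports directly to C# for .NET Framework 4.8.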