Removing equations from a document: some equations are not removed

Hi team,

I am trying to remove the equations from a document before extracting its images. I am using the attached code to remove them, but some equations are not removed. Please let me know how to solve this (Aspose.Words for Java).

Equation.zip (611 Bytes)
Thanks & regards
priyanga G

@priyanga,

Thanks for your inquiry. Please ZIP and attach your input Word document here for testing. We will investigate the issue on our side and provide you more information.

Hi @tahir.manzoor

Thank you very much. I have enclosed the input document.
Equations.zip (429.9 KB)
The output folder contains images that still include equations.
output folder.zip (359.9 KB)

Thanks & regards ,
Priyanga G

@priyanga,

Thanks for sharing the details. We tested the scenario using the latest version, Aspose.Words for Java 17.7, with the shared code example, and all shapes (equations) were removed from the document. Please check the attached output document: 17.7.zip (58.0 KB). Please use Aspose.Words for Java 17.7.

Hi @tahir.manzoor,

Thank you very much. It's working.

However, once I integrated the equation-removing class with another class, the equations were no longer removed.

I have attached the code and document for reference: proc.zip (14.1 KB)
and Equations_Uplink.zip (429.9 KB)

Thanks & Regards ,
priyanga.G

@priyanga,

Thanks for your inquiry. As I understand it, you are using the PageSplitter utility to export each page into a separate document and then removing the equations from the output documents. We tested this scenario and could not reproduce the issue.

Your input document contains equations on page 3. Could you please share the problematic output document for page 3 along with expected output document for the same page? We will then provide you more information about your query along with code.

Hi @tahir.manzoor

Thank you very much for your timely help.

The requirement is that each image in the document is saved to its own document. I currently separate the images as follows: section A handles single images, section B inline images, section C table images, and section D all other images. Section D extracts both images and equations. Now I want to remove the equations.
The expected output folder is FeifeiShen-REVISE-2017.zip (1.5 MB)
The input document is FeifeiShen-REVISE-2017_input.zip (2.6 MB). Please help me resolve this.

Thanks & Regards,
priyanga G.

@priyanga,

Thanks for sharing the document. Unfortunately, your requirement is not clear enough. In your previous post, you shared a different input document. You want to remove the equations from the output document, yet your expected output document contains an equation.

The DSMT4 method removes the equations successfully. If you want to remove an equation along with its paragraph's text, please use the ShapeBase.ParentParagraph.Remove method.

Hi @tahir.manzoor

Thanks for understanding.

The requirement is to read the Word document and extract the images using the paragraph node. Each image is stored in its own document in a separate folder. I have separate methods to handle the extraction of single images, inline images and table images. Images that do not fall into those three categories are handled in section D of the source code. Section D extracts the images, but the equations are extracted along with them. Please let me know how to remove the equations permanently.

Thank you very much.
priyanga

@priyanga,

Thanks for sharing the details. You can use the following code snippet in the DSMT4 method to remove each equation, together with its parent paragraph, from the document. Hope this helps you.

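// Remove the whole paragraph that hosts an equation OLE object.
// getOleFormat() returns null for shapes that are not OLE objects.
if (shape.getOleFormat() != null
        && shape.getOleFormat().getProgId().startsWith("Equation"))
{
    shape.getParentParagraph().remove();
}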
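
The ProgId test in the check above can also be factored into a small helper so that it is easy to unit test without loading a document. The following is only a minimal sketch: the class EquationProgIds and its method isEquation are hypothetical names, not part of Aspose.Words, and treating the prefix as case-insensitive is an assumption. It matches ProgIds such as "Equation.DSMT4" (MathType) and "Equation.3" (Microsoft Equation 3.0).

```java
import java.util.Locale;

/**
 * Hypothetical helper (not part of Aspose.Words): decides whether an OLE
 * ProgId string denotes an embedded equation object.
 */
final class EquationProgIds {

    private EquationProgIds() {
        // Static utility class, not meant to be instantiated.
    }

    /**
     * Returns true when progId starts with "Equation", ignoring case and
     * surrounding whitespace (an assumption of this sketch).
     * A null ProgId is treated as "not an equation".
     */
    static boolean isEquation(String progId) {
        if (progId == null) {
            return false;
        }
        return progId.trim().toLowerCase(Locale.ROOT).startsWith("equation");
    }
}
```

The string returned by getProgId() can be passed to EquationProgIds.isEquation(...) in place of the inline startsWith("Equation") call; the decision to remove the shape or its parent paragraph stays in the calling code.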

Hi @tahir.manzoor

Thank you very much for the solution.

When I use the parent paragraph remove method in the DSMT4 method, it shows only the "extraction failed" message. I think it removes all the paragraph nodes. Please let me know how to solve this.

Thanks & regards,
priyanga G

@priyanga,

Please open your input document “FeifeiShen-REVISE-2017_input.docx” in MS Word. There are equations on pages 5 and 6. Please share the problematic and expected output documents for these pages. We have already requested these documents. If you cannot supply this information, we will not be able to investigate your issue.

Please manually create your expected Word documents for pages 5 and 6 using Microsoft Word, then ZIP and attach them here for our reference. We will then provide a code example according to your requirement.

Hi @tahir.manzoor

Thank you very much.

The input document is FeifeiShen-REVISE-2017_input.zip (2.6 MB)

The expected output is FeifeiShen-REVISE-2017_old.zip (1.5 MB)

regards
priyanga

@priyanga,

Thanks for sharing the details. Perhaps you are not using the DSMT4 method and the extract-contents code correctly. Please use the DSMT4 method to remove the equations (shapes). This method works without any issue on our end. After removing the equations, please use the same code shared with you earlier to extract the shapes.

If you are using an older version of Aspose.Words, we suggest you upgrade to the latest version, Aspose.Words for Java 17.8.

We tested the scenario using the following code example and could not reproduce the issue. Please check the output documents: output documents.zip (2.1 MB)

// Remove the equations first, using the DSMT4 method from the shared code.
DSMT4(MyDir + "FeifeiShen-REVISE-2017_input.docx");

/** SECTION D START **/
int i = 1;
Document interimdoc = new Document(MyDir + "FeifeiShen-REVISE-2017_input.docx");
NodeCollection shapes_otherimg = interimdoc.getChildNodes(NodeType.SHAPE, true);

for (Shape shape : (Iterable<Shape>) shapes_otherimg) {
    if (shape.hasImage() && shape.getParentParagraph().getNextSibling() != null
            && shape.getParentParagraph().getNextSibling().getNodeType() == NodeType.PARAGRAPH) {

        ArrayList nodes1 = ExtractContents.extractContent(shape.getParentParagraph(), shape.getParentParagraph(), true);

        ExtractContents.generateDocument(interimdoc, nodes1).save(MyDir + "output"+i+".docx");

        Paragraph fig = (Paragraph) shape.getParentParagraph();
        /**
         * REMOVAL OF NODE(START,END) FROM SOURCE WORD DOC START
         **/
        shape.getParentParagraph().insertBefore(new BookmarkStart(interimdoc, "Image_" + i), shape);
        fig.appendChild(new BookmarkEnd(interimdoc, "Image_" + i));
        i++;
        for (Bookmark bookmark : interimdoc.getRange().getBookmarks()) {
            if (bookmark.getName().startsWith("Image_")) {
                bookmark.setText("");
            }
        }

    }
}

Hi @tahir.manzoor

I am now able to get the exact output. Thanks a lot.

Thanks & regards,
priyanga G

@priyanga,

Thanks for your feedback. It is nice to hear that your issue has been solved. Please feel free to ask if you have any questions about Aspose.Words; we will be happy to help you.

Hi @tahir.manzoor,

We have a group shape in this document, but its images cannot be extracted; only an error message is shown.
We already include the call NodeCollection Gshapes = interimdoc1.getChildNodes( NodeType.GROUP_SHAPE, true); but the images still cannot be extracted. Please let me know how to resolve this.

The input document is Test.zip (431.2 KB)

The output is Output.zip (837.1 KB)

Thanks & regards,
priyanga G

@priyanga,

Thanks for your inquiry. Please use the following code example to export each group shape into a new document. Hope this helps you.

Document srcDoc = new Document(MyDir + "Test.doc");
NodeCollection groupShapes = srcDoc.getChildNodes(NodeType.GROUP_SHAPE, true);
int i = 1;
for (GroupShape groupShape : (Iterable<GroupShape>) groupShapes) {

    Document doc = new Document();
    NodeImporter imp = new NodeImporter(srcDoc, doc, ImportFormatMode.KEEP_SOURCE_FORMATTING);
    Node impNode = imp.importNode(groupShape, true);
    doc.getFirstSection().getBody().getFirstParagraph().appendChild(impNode);
    doc.save(MyDir + "output"+i+".docx");
    i++;
}

Hi @Tahir,

Thank you very much.

Thanks & regards,
priyanga G

@priyanga,

Please feel free to ask if you have any questions about Aspose.Words; we will be happy to help you.