Extract charts from word document

priyanga · June 2, 2017, 6:56am

Hi Team,

I need to extract charts from a Word document using the figure caption, and finally remove the figure caption.

tahir.manzoor · June 2, 2017, 11:15am

Hi Priya,

Thanks for your inquiry. We have already answered your query here in this post. Please follow that thread for further proceedings.

priyanga · June 4, 2017, 11:42pm

Hi Team,

I am able to extract the images using the following code, but not the chart objects. My query now is how to extract the chart objects from a Word document.

NodeCollection shapes = doc.getChildNodes(NodeType.SHAPE, true);

for (Shape shape : (Iterable<Shape>) shapes) {
    if (shape.hasImage() && shape.getParentParagraph().getNextSibling() != null
            && shape.getParentParagraph().getNextSibling().getNodeType() == NodeType.PARAGRAPH) {

        if (shape.getParentParagraph().getNextSibling().toString(SaveFormat.TEXT).startsWith("Fig")
                || shape.getParentParagraph().getNextSibling().toString(SaveFormat.TEXT).startsWith("Sch")) {
            caption = shape.getParentParagraph().getNextSibling().toString(SaveFormat.TEXT);
            name = null;
            // (the snippet ends here in the original post)

Thanks in advance,

kind regards,

priyanga G

priyanga · June 5, 2017, 4:12am

Hi Team,

Thank you for your quick response. I am able to extract the images using the following code, but not the chart objects. My query now is how to extract the chart objects from a Word document and store each chart in its own document, and finally remove the figure caption under each chart.

NodeCollection shapes = doc.getChildNodes(NodeType.SHAPE, true);

for (Shape shape : (Iterable<Shape>) shapes) {
    if (shape.hasImage() && shape.getParentParagraph().getNextSibling() != null
            && shape.getParentParagraph().getNextSibling().getNodeType() == NodeType.PARAGRAPH) {
        if (shape.getParentParagraph().getNextSibling().toString(SaveFormat.TEXT).startsWith("Fig")
                || shape.getParentParagraph().getNextSibling().toString(SaveFormat.TEXT).startsWith("Sch")) {
            caption = shape.getParentParagraph().getNextSibling().toString(SaveFormat.TEXT);
            name = null;
            // (the snippet ends here in the original post)

Thanks in advance,

kind regards,

priyanga G

tahir.manzoor · June 5, 2017, 5:10am

Hi Priya,

Thanks for your inquiry. Could you please share your input and expected output documents here for our reference? We will then share the code example according to your requirement.

priyanga · June 5, 2017, 5:43am

Hi Tahir,

Thank you for your quick reply. I have enclosed the input document and output files. I hope this helps.

Thank you

kind regards,

Priyanga.G

tahir.manzoor · June 5, 2017, 11:19am

Hi Priyanga,

Thanks for sharing the documents. Please use the following code example to achieve your requirements. Hope this helps you.

We also suggest reading about the Aspose.Words DOM.

Aspose.Words Document Object Model

Document doc = new Document(MyDir + "test+(4).docx");
DocumentBuilder builder = new DocumentBuilder(doc);
int i = 1;
NodeCollection shapes = doc.getChildNodes(NodeType.SHAPE, true);
for (Shape shape : (Iterable<Shape>) shapes)
{
    if(shape.hasChart())
    {
        Document dstDoc = new Document();

        NodeImporter importer = new NodeImporter(doc, dstDoc, ImportFormatMode.KEEP_SOURCE_FORMATTING);
        Node newNode = importer.importNode(shape, true);
        dstDoc.getFirstSection().getBody().getFirstParagraph().appendChild(newNode);
        dstDoc.save(MyDir + "output"+i+".docx");
        i++;
    }
}
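
The code above copies only the chart shape into each new document; the caption check from the question is not part of it. As a minimal sketch of that check, the plain-Java helper below tests whether a paragraph's text looks like a figure caption. The class name `CaptionMatcher` and the method `isCaption` are hypothetical and not part of Aspose.Words; the prefixes "Fig" and "Sch" come from the question's code, and the `trim()` call is an added assumption to tolerate surrounding whitespace and line breaks.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper (not an Aspose.Words class): decides whether the text
// of a paragraph looks like a figure or scheme caption.
final class CaptionMatcher {
    // Prefixes taken from the question's code: "Fig" and "Sch".
    private static final List<String> PREFIXES = Arrays.asList("Fig", "Sch");

    private CaptionMatcher() {}

    // Returns true when the trimmed text starts with one of the prefixes.
    // The comparison is case-sensitive, as in the question's startsWith calls.
    static boolean isCaption(String paragraphText) {
        if (paragraphText == null) {
            return false;
        }
        // Assumption: exported paragraph text may carry leading whitespace
        // or a trailing line break, so it is trimmed before matching.
        String text = paragraphText.trim();
        for (String prefix : PREFIXES) {
            if (text.startsWith(prefix)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println(isCaption("Figure 1. Monthly sales\r\n"));    // true
        System.out.println(isCaption("Scheme 2: Reaction pathway"));     // true
        System.out.println(isCaption("The results are shown below."));   // false
    }
}
```

Inside the loop, the text to test would presumably come from `shape.getParentParagraph().getNextSibling().toString(SaveFormat.TEXT)`, as in the question, after the same null and `NodeType.PARAGRAPH` checks. If the caption should also disappear from the source document, calling `remove()` on that sibling node is one option; this wiring is untested here.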

priyanga · June 6, 2017, 5:21am

Hi Team,

Thank you for your help. I am now able to extract chart objects from the document, but I am having some issues with a bar chart. The issue is "Y-axis text cut in the Master PDF", and the text also overlaps the images.

tahir.manzoor · June 6, 2017, 10:41am

Hi Priyanga,

Thanks for your inquiry. Could you please share your input document and code example to reproduce this issue at our end? We will investigate the issue on our side and provide you more information.

priyanga · June 6, 2017, 11:16pm

Hi team ,

Thank you for your help. I am now able to extract chart objects from the document, but I am having some issues with a bar chart. The issue in the bar chart is "Y-axis text cut in the Master PDF", and the text also overlaps the images. I have enclosed my document for reference. I am using the following code.

Document doc = new Document(MyDir + "test+(4).docx");
DocumentBuilder builder = new DocumentBuilder(doc);
int i = 1;
NodeCollection shapes = doc.getChildNodes(NodeType.SHAPE, true);
for (Shape shape : (Iterable<Shape>) shapes)
{
    if(shape.hasChart())
    {
        Document dstDoc = new Document();

        NodeImporter importer = new NodeImporter(doc, dstDoc, ImportFormatMode.KEEP_SOURCE_FORMATTING);
        Node newNode = importer.importNode(shape, true);
        dstDoc.getFirstSection().getBody().getFirstParagraph().appendChild(newNode);
        dstDoc.save(MyDir + "output"+i+".docx");
        i++;
    }
}

Thanks in advance,

Priyanga G

tahir.manzoor · June 7, 2017, 10:38am

Hi Priyanga,

Thanks for sharing the details. We have tested the scenario using the latest version of Aspose.Words for Java, 17.6, and could not reproduce the issue. Please upgrade to Aspose.Words for Java 17.6. We have attached the output PDF to this post for your reference.

priyanga · June 14, 2017, 5:53am

Hi Tahir,

Apologies for the late reply. The output looks good and it is working fine. Thank you to the Aspose team as well.

Thanks and kind regards,

Priyanga G

tahir.manzoor · June 14, 2017, 11:37am

Hi Priyanga,

Please feel free to ask if you have any questions about Aspose.Words; we will be happy to help you.

priyanga · June 15, 2017, 11:20pm

Hi Tahir,

I have successfully extracted the chart objects. Thank you for your kind help. My task is to extract the images between paragraphs. PNG, JPEG, and chart objects are now extracted and saved as PDF, but some of the images are not extracted and cannot be saved as PDF. I have enclosed the input and output documents.

Thanks & regards

priyanga G

tahir.manzoor · June 16, 2017, 6:21am

@priyanga,

Thanks for your inquiry. Please use the following modified code example to get the desired output. Hope this helps you.

Document doc = new Document(MyDir + "test+(14).docx");
DocumentBuilder builder = new DocumentBuilder(doc);
int i = 1;
NodeCollection shapes = doc.getChildNodes(NodeType.SHAPE, true);
for (Shape shape : (Iterable<Shape>) shapes)
{
    if(shape.hasChart() || shape.hasImage())
    {
        Document dstDoc = new Document();

        NodeImporter importer = new NodeImporter(doc, dstDoc, ImportFormatMode.KEEP_SOURCE_FORMATTING);
        Node newNode = importer.importNode(shape, true);
        dstDoc.getFirstSection().getBody().getFirstParagraph().appendChild(newNode);
        dstDoc.save(MyDir + "output"+i+".docx");
        i++;
    }
}

priyanga · June 28, 2017, 3:49am

Hi Tahir ,
I have confidence in your solutions. Sorry, but I need the solution in Java.

Thanks and kind regards,
Priyanga G

priyanga · June 28, 2017, 4:22am

Hi tahir,
Thank you very much for your solution. Sorry for the confusion. Please ignore my previous message.

Thanks & kind regards,
Priyanga G

tahir.manzoor · June 28, 2017, 7:46am

@priyanga,

Please feel free to ask if you have any questions about Aspose.Words; we will be happy to help you.