Convert Word to PDF containing Visio objects using Aspose.Words (Java)

I am using Aspose.Words for Java to convert Word documents to PDF, and I have run into two problems I would like to ask about.
If a Word document contains Visio objects, is there a way to get the page number of the Visio images in the document?
Alternatively, can the Visio object in the Word document be converted directly into a picture in the PDF instead of vector paths?
My file: visio.docx

@serendipity.zhq

Yes, you can use the LayoutCollector class to get the page on which a particular node is located:

Document doc = new Document("C:\\Temp\\in.docx");
LayoutCollector collector = new LayoutCollector(doc);

// Get the first shape in the document.
Shape shape = (Shape)doc.getChild(NodeType.SHAPE, 0, true);

// Check whether the shape is an embedded Visio object:
boolean isVisioObject = shape.getOleFormat() != null && shape.getOleFormat().getProgId().equals("Visio.Drawing.11");
System.out.println(isVisioObject);

// Get the 1-based index of the page where the shape is located:
System.out.println(collector.getStartPageIndex(shape));
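
The check above matches one specific ProgId string. As a minimal sketch, assuming other Visio versions register a ProgId with a different numeric suffix (for example "Visio.Drawing.15", which you should verify against your own documents), the comparison could be relaxed to a prefix match. The class and method names here are hypothetical helpers, not part of Aspose.Words:

```java
import java.util.Locale;

final class VisioProgId {
    // Hypothetical helper: treats any ProgId that starts with "Visio.Drawing"
    // as a Visio object and tolerates a null ProgId.
    static boolean isVisio(String progId) {
        return progId != null
                && progId.toLowerCase(Locale.ROOT).startsWith("visio.drawing");
    }

    public static void main(String[] args) {
        System.out.println(isVisio("Visio.Drawing.11")); // true
        System.out.println(isVisio("Visio.Drawing.15")); // true
        System.out.println(isVisio("Excel.Sheet.12"));   // false
        System.out.println(isVisio(null));               // false
    }
}
```

In the snippet above you would pass shape.getOleFormat().getProgId() to this helper once you have confirmed that getOleFormat() is not null.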

As for the second question, Aspose.Words renders the EMF image that serves as the visual representation of the embedded object, so the following code produces the correct output from your DOCX document:

Document doc = new Document("C:\\Temp\\in.docx");
doc.save("C:\\Temp\\out.pdf");

out.pdf (87.2 KB)

Thanks for your answer!
1. The first answer only gets the page number of the first or last page on which the Visio object appears. If several pages of the Word document contain Visio objects, can I get the page numbers of all those pages?
My file: visio.docx
2. I am already using this conversion method. Can it achieve the result shown in the following document, where the Visio object is converted into a picture?
My file: visio.pdf

@serendipity.zhq

  1. In an MS Word document, a shape cannot span several pages, so a Visio object is always on a single page. LayoutCollector provides methods to get the page index where a node starts and the page index where it ends, which matters for nodes such as tables that can span several pages. If you need the page numbers where Visio objects are placed in your document, you can use code like this:
Document doc = new Document("C:\\Temp\\in.docx");
LayoutCollector collector = new LayoutCollector(doc);

// Get all shapes.
Iterable<Shape> shapes = doc.getChildNodes(NodeType.SHAPE, true);
for (Shape s : shapes)
{
    // Check whether the shape is an embedded Visio object:
    boolean isVisioObject = s.getOleFormat() != null && s.getOleFormat().getProgId().equals("Visio.Drawing.11");
    if (isVisioObject)
    {
        System.out.println("Page " + collector.getStartPageIndex(s) + " contains Visio object.");
    }
}

  2. Sure, you can specify MetafileRenderingMode.BITMAP in PdfSaveOptions:
Document doc = new Document("C:\\Temp\\in.docx");
PdfSaveOptions opt = new PdfSaveOptions();
opt.getMetafileRenderingOptions().setRenderingMode(MetafileRenderingMode.BITMAP);
doc.save("C:\\Temp\\out.pdf", opt);
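
Returning to the first point: if you want a summary of pages rather than one printed line per shape, the page indexes from the loop can be collected and counted. This is only a sketch in plain Java. The class and method names are hypothetical, and the sample list stands in for the values that collector.getStartPageIndex(s) would return for each Visio shape:

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

final class VisioPageSummary {
    // Counts how many objects start on each page; TreeMap keeps pages sorted.
    static Map<Integer, Integer> countPerPage(List<Integer> pageIndexes) {
        Map<Integer, Integer> counts = new TreeMap<>();
        for (int page : pageIndexes) {
            counts.merge(page, 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        // Sample values standing in for collector.getStartPageIndex(s) results.
        List<Integer> pages = Arrays.asList(3, 1, 3, 7);
        System.out.println(countPerPage(pages)); // {1=1, 3=2, 7=1}
    }
}
```

The keys of the returned map are the distinct page numbers that contain at least one Visio object.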

Thanks for your reply, your answer solved my problem.
