Hi Team,
I am having trouble extracting images from a document. The document contains tables, and each table holds two images, each with its own figure caption. When I try to extract the images, both images end up in the same file. I need each image extracted separately. I have attached a sample input and the required output for your reference. Kindly provide a solution for this scenario; it is much needed.
Input: Input.zip (96.0 KB)
Output: output.zip (593.4 KB)
Thanks.
@jan.kathir,
You can use the following C# code, based on the Aspose.Words for .NET API. It extracts each image from a DOCX Word document and saves it to a separate PDF file, named after the caption text found in the same column of the next table row:
using Aspose.Words;
using Aspose.Words.Drawing;
using Aspose.Words.Tables;

Document doc = new Document(@"E:\Temp\input\List of figures .docx");
foreach (Shape shape in doc.GetChildNodes(NodeType.Shape, true))
{
    // Only handle shapes that sit inside a table cell.
    Cell cell = (Cell)shape.GetAncestor(NodeType.Cell);
    if (cell != null)
    {
        // The caption is read from the same column of the next row.
        int cellIdx = cell.ParentRow.Cells.IndexOf(cell);
        string fileName = ((Row)cell.ParentRow.NextSibling).Cells[cellIdx].ToString(SaveFormat.Text).Trim();
        // Copy the shape into a new, empty document and save it as its own PDF.
        DocumentBuilder builder = new DocumentBuilder();
        builder.InsertNode(builder.Document.ImportNode(shape, true));
        builder.Document.Save(@"E:\Temp\input\" + fileName + ".pdf");
    }
}
@awais.hafeez
Thanks for your reply.
I need this for Aspose.Words for Java, not the .NET API. Kindly provide the Java solution.
Thanks
@jan.kathir,
Please try using the following Aspose.Words for Java equivalent code:
import com.aspose.words.*;

Document doc = new Document("E:\\Temp\\input\\List of figures .docx");
for (Shape shape : (Iterable<Shape>) doc.getChildNodes(NodeType.SHAPE, true)) {
    // Only handle shapes that sit inside a table cell.
    Cell cell = (Cell) shape.getAncestor(NodeType.CELL);
    if (cell != null) {
        // The caption is read from the same column of the next row.
        int cellIdx = cell.getParentRow().getCells().indexOf(cell);
        String fileName = ((Row) cell.getParentRow().getNextSibling()).getCells().get(cellIdx).toString(SaveFormat.TEXT).trim();
        // Copy the shape into a new, empty document and save it as its own PDF.
        DocumentBuilder builder = new DocumentBuilder();
        builder.insertNode(builder.getDocument().importNode(shape, true));
        builder.getDocument().save("E:\\Temp\\input\\" + fileName + ".pdf");
    }
}
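Since fileName above is taken directly from the caption text, a caption such as "Figure 1: Overview" might contain characters that Windows does not allow in file names. The following is a minimal sketch of one way to clean the caption before saving. The class CaptionFileNames, the method toSafeFileName, and the fallback name are hypothetical helpers for illustration and are not part of Aspose.Words; whether your captions actually need this is an assumption.

```java
// Hypothetical helper, not part of Aspose.Words: makes caption text usable as a file name.
final class CaptionFileNames {
    private CaptionFileNames() {}

    static String toSafeFileName(String caption, String fallback) {
        if (caption == null) {
            return fallback;
        }
        String cleaned = caption
                // Collapse line breaks, tabs, and repeated spaces into one space.
                .replaceAll("\\s+", " ")
                // Replace characters that Windows rejects in file names, plus control characters.
                .replaceAll("[\\\\/:*?\"<>|\\p{Cntrl}]", "_")
                .trim()
                // Windows also rejects names that end with a dot or a space.
                .replaceAll("[. ]+$", "");
        // Fall back to a default name when nothing usable is left.
        return cleaned.isEmpty() ? fallback : cleaned;
    }
}
```

In the loop above, the save call could then use CaptionFileNames.toSafeFileName(fileName, "image") in place of fileName. Note that two captions with the same text would still map to the same file, so appending a counter may be worth considering.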
@awais.hafeez
Thank you, it’s working fine.