I want to export the tables in a Word document as image files (JPEG and PNG). I can get all the tables in the document, but I have not found how to save a table as an image file.
Hi there,
Thanks for your inquiry. You can convert each table found in a Word document to an individual image by using the following code snippet:
Document doc = new Document(@"C:\temp\tables.docx");
List<Image> images = new List<Image>();
NodeCollection tables = doc.GetChildNodes(NodeType.Table, true);
foreach (Table table in tables)
{
    images.Add(RenderNode(table, new ImageSaveOptions(SaveFormat.Png)));
}
The RenderNode method is attached to this post.
I hope this helps.
Thanks for your response, but I need this in Java. Please send me a solution for Aspose.Words for Java.
Hi Tahir Manzoor,
Can you please provide the same solution in Java?
Hi there,
Thanks for your inquiry. Please use the following Java code to convert a table node to an image. I hope this helps. Please let us know if you have any more queries.
Document doc = new Document(MyDir + "table.docx");
Table table = (Table) doc.getChild(NodeType.TABLE, 0, true);
render_Node(table, new ImageSaveOptions(SaveFormat.PNG));
public static void render_Node(Node node, ImageSaveOptions imageOptions) throws Exception
{
    if (node == null)
        throw new IllegalArgumentException("Node cannot be null");
    // If no image options are supplied, create default options.
    if (imageOptions == null)
        imageOptions = new ImageSaveOptions(SaveFormat.PNG);
    // Make the paper transparent so that the empty space around the node can be cropped later.
    imageOptions.setPaperColor(new Color(0, 0, 0, 0));
    // There is a bug which affects the cache of a cloned node. To avoid this we instead clone
    // the entire document including all nodes, find the matching node in the cloned document
    // and render that instead.
    Document doc = ((Document) node.getDocument()).deepClone();
    node = doc.getChild(NodeType.ANY, node.getDocument().getChildNodes(NodeType.ANY, true).indexOf(node), true);
    // Create a temporary shape to store the target node in. This shape will be rendered to retrieve
    // the rendered content of the node.
    Shape shape = new Shape(doc, ShapeType.TEXT_BOX);
    Section parentSection = (Section) node.getAncestor(NodeType.SECTION);
    // Assume that the node cannot be larger than the page in size.
    shape.setWidth(parentSection.getPageSetup().getPageWidth());
    shape.setHeight(parentSection.getPageSetup().getPageHeight());
    // We must make the shape and paper color transparent.
    shape.setFillColor(new Color(0, 0, 0, 0));
    // Don't draw a surrounding line on the shape.
    shape.setStroked(false);
    Node currentNode = node;
    // If the node contains block level nodes then just add a copy of these nodes to the shape.
    if (currentNode instanceof InlineStory || currentNode instanceof Story)
    {
        CompositeNode composite = (CompositeNode) currentNode;
        // The cast to Iterable<Node> is needed so that the loop variable can be typed as Node.
        for (Node childNode : (Iterable<Node>) composite.getChildNodes())
        {
            shape.appendChild(childNode.deepClone(true));
        }
    }
    else
    {
        // Move up through the DOM until we find a node which is suitable to insert into a Shape
        // (a node whose parent can contain paragraphs and tables, the same as a shape).
        // Each parent node is cloned on the way up so even a descendant node passed to this method can be rendered.
        // Since we are working with the actual nodes of the document we need to clone the target node into the temporary shape.
        while (!(currentNode.getParentNode() instanceof InlineStory
                || currentNode.getParentNode() instanceof Story
                || currentNode.getParentNode() instanceof ShapeBase
                || currentNode.getNodeType() == NodeType.PARAGRAPH))
        {
            CompositeNode parent = (CompositeNode) currentNode.getParentNode().deepClone(false);
            currentNode = currentNode.getParentNode();
            parent.appendChild(node.deepClone(true));
            node = parent; // Store this new node to be inserted into the shape.
        }
        // Add the node to the shape.
        shape.appendChild(node.deepClone(true));
    }
    // We must add the shape to the document tree to have it rendered.
    parentSection.getBody().getFirstParagraph().appendChild(shape);
    // Render the shape to a file, then read the file back so that it can be cropped.
    shape.getShapeRenderer().save(MyDir + "Out.png", imageOptions);
    BufferedImage renderedImage = ImageIO.read(new File(MyDir + "Out.png"));
    // Extract the actual content of the image by cropping transparent space around
    // the rendered shape.
    Rectangle cropRectangle = FindBoundingBoxAroundNode(renderedImage);
    BufferedImage out = renderedImage.getSubimage(cropRectangle.x, cropRectangle.y, cropRectangle.width, cropRectangle.height);
    // Overwrite the uncropped file with the cropped image.
    File outputfile = new File(MyDir + "Out.png");
    ImageIO.write(out, "png", outputfile);
}
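// Finds the minimum bounding box around non-transparent pixels in an image.
public static Rectangle FindBoundingBoxAroundNode(BufferedImage originalBitmap)
{
    Point min = new Point(Integer.MAX_VALUE, Integer.MAX_VALUE);
    Point max = new Point(Integer.MIN_VALUE, Integer.MIN_VALUE);
    // ARGB value of a fully transparent pixel, computed once instead of once per pixel.
    int transparent = new Color(0, 0, 0, 0).getRGB();
    for (int x = 0; x < originalBitmap.getWidth(); ++x)
    {
        for (int y = 0; y < originalBitmap.getHeight(); ++y)
        {
            int argb = originalBitmap.getRGB(x, y);
            // For each pixel that is not transparent, grow the bounding box to include it.
            if (argb != transparent)
            {
                min.x = Math.min(x, min.x);
                min.y = Math.min(y, min.y);
                max.x = Math.max(x, max.x);
                max.y = Math.max(y, max.y);
            }
        }
    }
    // Add one pixel to the width and height to avoid clipping.
    return new Rectangle(min.x, min.y, (max.x - min.x) + 1, (max.y - min.y) + 1);
}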
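The cropping step can also be tried on its own, without Aspose.Words. Below is a minimal sketch that uses only java.awt; the class name TransparentCrop and its method names are made up for illustration and are not part of any library. It differs from the method above in two ways that are my own assumptions about what is useful: it looks only at the alpha channel of each pixel, and it returns null for a fully transparent image, a case in which the loop above never updates min and max.

```java
import java.awt.Rectangle;
import java.awt.image.BufferedImage;

public class TransparentCrop {
    // Returns the smallest rectangle that contains every pixel with non-zero alpha,
    // or null if the whole image is transparent.
    public static Rectangle findContentBounds(BufferedImage image) {
        int minX = Integer.MAX_VALUE;
        int minY = Integer.MAX_VALUE;
        int maxX = -1;
        int maxY = -1;
        for (int y = 0; y < image.getHeight(); y++) {
            for (int x = 0; x < image.getWidth(); x++) {
                // The top 8 bits of the ARGB value hold the alpha channel.
                int alpha = image.getRGB(x, y) >>> 24;
                if (alpha != 0) {
                    minX = Math.min(minX, x);
                    minY = Math.min(minY, y);
                    maxX = Math.max(maxX, x);
                    maxY = Math.max(maxY, y);
                }
            }
        }
        if (maxX < 0) {
            return null; // No visible pixel was found.
        }
        return new Rectangle(minX, minY, maxX - minX + 1, maxY - minY + 1);
    }

    // Crops the image to its visible content; returns the image unchanged if it is empty.
    public static BufferedImage crop(BufferedImage image) {
        Rectangle r = findContentBounds(image);
        return r == null ? image : image.getSubimage(r.x, r.y, r.width, r.height);
    }

    public static void main(String[] args) {
        // A 10x8 transparent canvas with an opaque 3x2 block whose top-left corner is (4, 3).
        BufferedImage img = new BufferedImage(10, 8, BufferedImage.TYPE_INT_ARGB);
        for (int x = 4; x <= 6; x++) {
            for (int y = 3; y <= 4; y++) {
                img.setRGB(x, y, 0xFF336699);
            }
        }
        BufferedImage cropped = crop(img);
        System.out.println(findContentBounds(img));
        System.out.println(cropped.getWidth() + " x " + cropped.getHeight());
    }
}
```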
Thanks Tahir Manzoor
Hi there,
Thanks for your feedback. Please feel free to ask if you have any questions about Aspose.Words; we will be happy to help you.
Hi Tahir,
Is there any unique identifier for a table/picture/chart that stays the same and is not affected when components are added to or deleted from the document?
Hi there,
Thanks for your inquiry. It would be great if you could share some more detail about your query. What exactly do you want to achieve by using Aspose.Words? We will then provide you with more information on this, along with code.
I want to uniquely identify all tables/pictures/charts present in the Word file.
What my system does:
It receives a Word document, saves each component as an image file, and creates a link to that image file.
Problem:
Let's say there are 3 tables in the Word file. I generate 3 images for them and link the 3rd image to some entity. The next time, the user edits the document, removes the 2nd table, and adds a new table after the 3rd table. All 3 previous images are then deleted and 3 new images are created, and the link that referred to the 3rd image now refers to the 4th table (the newly created one).
So is there any ID, name, or other unique identifier with which I can identify a table, chart, or picture as a unique entity?
Hi,
Thanks for your inquiry and sorry for the delayed response.
First of all, please note that Aspose.Words is quite different from the Microsoft Word’s Object Model in that it represents the document as a tree of objects, more like an XML DOM tree. If you have worked with any XML DOM library, you will find it easy to understand and work with Aspose.Words. When you load a Word document into Aspose.Words, it builds its DOM, and all document elements and formatting are simply loaded into memory. Please read the following articles for more information on the DOM:
https://docs.aspose.com/words/java/aspose-words-document-object-model/
https://docs.aspose.com/words/java/logical-levels-of-nodes-in-a-document/
Tables/shapes are represented by collections. There is no unique ID for each table inside the Aspose.Words DOM. In your case, I suggest adding a unique bookmark inside each table (e.g. in the first cell of the table) to identify it.
Document doc = new Document("in.doc");
// Get bookmark
Bookmark bk = doc.getRange().getBookmarks().get("bookmark");
if (bk != null)
{
    // Get table, where bookmark is located.
    Node table = bk.getBookmarkStart().getAncestor(NodeType.TABLE);
    if (table != null)
    {
        // Work with table.
    }
}
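To illustrate why a bookmark name is a more stable key than a table's position, here is a small sketch in plain Java, with no Aspose.Words calls. It replays the scenario from the question (three tables, the second one removed, a new one appended) using lists of bookmark names. The names tbl_a to tbl_d, the class StableTableKeys, and the ".png" naming scheme are all hypothetical.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class StableTableKeys {
    // Builds a map from bookmark name to image file name.
    // The file name depends only on the bookmark name, never on the table's position.
    public static Map<String, String> imageFilesByBookmark(List<String> bookmarkNames) {
        Map<String, String> files = new LinkedHashMap<>();
        for (String name : bookmarkNames) {
            files.put(name, name + ".png");
        }
        return files;
    }

    public static void main(String[] args) {
        // Bookmark names of the tables, in document order, before and after the user's edit.
        List<String> before = Arrays.asList("tbl_a", "tbl_b", "tbl_c");
        List<String> after = Arrays.asList("tbl_a", "tbl_c", "tbl_d");

        // A link stored as "third table" (index 2) silently moves to a different table.
        System.out.println("By index 2: " + before.get(2) + " -> " + after.get(2));

        // A link stored as the bookmark name keeps pointing at the same table's image.
        System.out.println("By name tbl_c: "
                + imageFilesByBookmark(before).get("tbl_c") + " -> "
                + imageFilesByBookmark(after).get("tbl_c"));
    }
}
```

With this scheme, a link to a removed table (tbl_b here) simply has no entry after the edit, which the application can detect instead of pointing at the wrong image.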
This code works fine, but sometimes it produces an image with a black background.
I ran tests on different machines: sometimes I get the exact table image, and sometimes I get a black image in which only the table data is visible. Can you please look at this issue? I am using the latest version of Aspose.Words.
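One possible cause, offered as an assumption since no sample document was shared: the Java render_Node method above sets the paper color to fully transparent (0, 0, 0, 0) and never fills the background again, whereas the C# RenderNode in this thread clears the final bitmap with the saved paper color. Formats and viewers that ignore the alpha channel, JPEG in particular, tend to show such transparent pixels as black. The sketch below uses only java.awt to paint the cropped image onto an opaque background before saving; the class name FlattenAlpha and the method flatten are illustrative names, not library API.

```java
import java.awt.Color;
import java.awt.Graphics2D;
import java.awt.image.BufferedImage;

public class FlattenAlpha {
    // Paints the source image over a solid background and returns an opaque RGB image.
    public static BufferedImage flatten(BufferedImage source, Color background) {
        BufferedImage result = new BufferedImage(
                source.getWidth(), source.getHeight(), BufferedImage.TYPE_INT_RGB);
        Graphics2D g = result.createGraphics();
        try {
            // Fill the whole canvas first, then draw the (partly transparent) source on top.
            g.setColor(background);
            g.fillRect(0, 0, result.getWidth(), result.getHeight());
            g.drawImage(source, 0, 0, null);
        } finally {
            g.dispose();
        }
        return result;
    }

    public static void main(String[] args) {
        // A transparent 4x4 image with a single opaque red pixel at (1, 1).
        BufferedImage img = new BufferedImage(4, 4, BufferedImage.TYPE_INT_ARGB);
        img.setRGB(1, 1, 0xFFFF0000);
        BufferedImage flat = flatten(img, Color.WHITE);
        System.out.println(Integer.toHexString(flat.getRGB(0, 0)));
        System.out.println(Integer.toHexString(flat.getRGB(1, 1)));
    }
}
```

In render_Node, the cropped image could be passed through such a method just before ImageIO.write, for example ImageIO.write(FlattenAlpha.flatten(out, Color.WHITE), "jpg", outputfile) when a JPEG is wanted.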
Hi Ashish,
Thanks for your inquiry. It would be great if you could share the following details for testing purposes. I will investigate the issue on my side and provide you with more information.
- Please create a simple application (for example a Console Application Project) that helps us reproduce the same problem on our end and attach it here for testing.
- Please supply us with the input document that is causing the issue
- Please supply us with the output document showing the undesired behavior
- Please supply us with the expected document showing the desired behavior (You can create this document using Microsoft Word).
Hi,
Can you please attach the C# code file again? I can’t find the attachment.
Thanks for your inquiry. Please check the following code example. Hope this helps you.
// Renders any node in a document to an image using the specified image save options.
public static Image RenderNode(Node node, ImageSaveOptions imageOptions)
{
    // Run some argument checks.
    if (node == null)
        throw new ArgumentException("Node cannot be null");
    // If no image options are supplied, create default options.
    if (imageOptions == null)
        imageOptions = new ImageSaveOptions(SaveFormat.Png);
    // Store the paper color to be used on the final image and change to transparent.
    // This will cause any content around the rendered node to be removed later on.
    Color savePaperColor = imageOptions.PaperColor;
    imageOptions.PaperColor = Color.Transparent;
    // There is a bug which affects the cache of a cloned node. To avoid this we instead clone
    // the entire document including all nodes, find the matching node in the cloned document
    // and render that instead.
    Document doc = (Document)node.Document.Clone(true);
    node = doc.GetChild(NodeType.Any, node.Document.GetChildNodes(NodeType.Any, true).IndexOf(node), true);
    // Create a temporary shape to store the target node in. This shape will be rendered to retrieve
    // the rendered content of the node.
    Shape shape = new Shape(doc, ShapeType.TextBox);
    Section parentSection = (Section)node.GetAncestor(NodeType.Section);
    // Assume that the node cannot be larger than the page in size.
    shape.Width = parentSection.PageSetup.PageWidth;
    shape.Height = parentSection.PageSetup.PageHeight;
    // We must make the shape and paper color transparent.
    shape.FillColor = Color.Transparent;
    // Don't draw a surrounding line on the shape.
    shape.Stroked = false;
    Node currentNode = node;
    // If the node contains block level nodes then just add a copy of these nodes to the shape.
    if (currentNode is InlineStory || currentNode is Story)
    {
        CompositeNode composite = (CompositeNode)currentNode;
        foreach (Node childNode in composite.ChildNodes)
        {
            shape.AppendChild(childNode.Clone(true));
        }
    }
    else
    {
        // Move up through the DOM until we find a node which is suitable to insert into a Shape
        // (a node whose parent can contain paragraphs and tables, the same as a shape).
        // Each parent node is cloned on the way up so even a descendant node passed to this method can be rendered.
        // Since we are working with the actual nodes of the document we need to clone the target node into the temporary shape.
        while (!(currentNode.ParentNode is InlineStory || currentNode.ParentNode is Story ||
                 currentNode.ParentNode is ShapeBase || currentNode.NodeType == NodeType.Paragraph))
        {
            CompositeNode parent = (CompositeNode)currentNode.ParentNode.Clone(false);
            currentNode = currentNode.ParentNode;
            parent.AppendChild(node.Clone(true));
            node = parent; // Store this new node to be inserted into the shape.
        }
        // Add the node to the shape.
        shape.AppendChild(node.Clone(true));
    }
    // We must add the shape to the document tree to have it rendered.
    parentSection.Body.FirstParagraph.AppendChild(shape);
    // Render the shape to stream so we can take advantage of the effects of the ImageSaveOptions class.
    // Retrieve the rendered image and remove the shape from the document.
    MemoryStream stream = new MemoryStream();
    shape.GetShapeRenderer().Save(stream, imageOptions);
    shape.Remove();
    Bitmap croppedImage;
    // Load the image into a new bitmap.
    using (Bitmap renderedImage = new Bitmap(stream))
    {
        // Extract the actual content of the image by cropping transparent space around
        // the rendered shape.
        Rectangle cropRectangle = FindBoundingBoxAroundNode(renderedImage);
        croppedImage = new Bitmap(cropRectangle.Width, cropRectangle.Height);
        croppedImage.SetResolution(imageOptions.HorizontalResolution, imageOptions.VerticalResolution);
        // Create the final image with the proper background color.
        using (Graphics g = Graphics.FromImage(croppedImage))
        {
            g.Clear(savePaperColor);
            g.DrawImage(renderedImage, new Rectangle(0, 0, croppedImage.Width, croppedImage.Height), cropRectangle.X, cropRectangle.Y, cropRectangle.Width, cropRectangle.Height, GraphicsUnit.Pixel);
        }
    }
    return croppedImage;
}
/// <summary>
/// Finds the minimum bounding box around non-transparent pixels in a Bitmap.
/// </summary>
public static Rectangle FindBoundingBoxAroundNode(Bitmap originalBitmap)
{
    Point min = new Point(int.MaxValue, int.MaxValue);
    Point max = new Point(int.MinValue, int.MinValue);
    for (int x = 0; x < originalBitmap.Width; ++x)
    {
        for (int y = 0; y < originalBitmap.Height; ++y)
        {
            // Note that you can speed up this part of the algorithm by using LockBits and unsafe code instead of GetPixel.
            Color pixelColor = originalBitmap.GetPixel(x, y);
            // For each pixel that is not transparent calculate the bounding box around it.
            if (pixelColor.ToArgb() != Color.Empty.ToArgb())
            {
                min.X = Math.Min(x, min.X);
                min.Y = Math.Min(y, min.Y);
                max.X = Math.Max(x, max.X);
                max.Y = Math.Max(y, max.Y);
            }
        }
    }
    // Add one pixel to the width and height to avoid clipping.
    return new Rectangle(min.X, min.Y, (max.X - min.X) + 1, (max.Y - min.Y) + 1);
}
If I have a big table, how can I convert it into images?
Input: the attached file below is the input file.
WTR SUSTAINABLE INDEX CONSTITUENTS.docx (46.4 KB)
Output: the attached document below is the output I am getting.
Out1.jpeg (223.5 KB)
Problem: not all of the table data is rendered to images; I only get a single page as an image.
Requirement: get all of the table data as images, even if the table is large.
@aelum This is expected behavior. Aspose.Words lays out the document into pages when saving the document to an image. To get the whole big table as a single image, you should enlarge the page height using PageSetup.PageHeight. Note, however, that the maximum allowed page height in an MS Word document is 1584 points.
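To put the 1584-point limit into more familiar units, here is a small arithmetic sketch in plain Java. It relies only on the standard definition of a point as 1/72 inch; the class name PageHeightLimit and the resolution of 96 dpi are chosen for illustration and are not taken from Aspose.Words.

```java
public class PageHeightLimit {
    // Maximum page height mentioned in the reply above, in points.
    static final double MAX_PAGE_HEIGHT_POINTS = 1584.0;
    // One inch is 72 points.
    static final double POINTS_PER_INCH = 72.0;

    public static double pointsToInches(double points) {
        return points / POINTS_PER_INCH;
    }

    // Height in pixels of an image rendered at the given resolution (dots per inch).
    public static long pointsToPixels(double points, double dpi) {
        return Math.round(points / POINTS_PER_INCH * dpi);
    }

    public static void main(String[] args) {
        System.out.println(pointsToInches(MAX_PAGE_HEIGHT_POINTS));     // prints 22.0
        System.out.println(pointsToPixels(MAX_PAGE_HEIGHT_POINTS, 96)); // prints 2112
    }
}
```

So a table taller than 22 inches, minus any top and bottom margins, will not fit on a single page even at the maximum height, and it will still be split across several page images.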
I tried converting tables into images, and I would like to ask how to replace the original table nodes with the images. That way, I can convert all the tables in a document into image nodes instead of exporting the images.
@zhengkai You can use the following code to achieve this:
Document doc = new Document(@"C:\Temp\in.docx");
DocumentBuilder builder = new DocumentBuilder(doc);
// Get all top level tables (requires System.Linq).
List<Table> tables = doc.GetChildNodes(NodeType.Table, true).Cast<Table>()
    .Where(t => t.GetAncestor(NodeType.Table) == null).ToList();
// Render tables to images and replace.
foreach (Table table in tables)
{
    byte[] tableImage = RenderTable(table);
    // Insert a paragraph before the table.
    Paragraph p = new Paragraph(doc);
    table.ParentNode.InsertBefore(p, table);
    // Move the document builder to the created paragraph and insert the image.
    builder.MoveTo(p);
    builder.InsertImage(tableImage);
    // Remove the table.
    table.Remove();
}
doc.Save(@"C:\Temp\out.docx");
private static byte[] RenderTable(Table tableToRender)
{
    // Create an empty document with the same styles and import the table into it.
    Document oneTableDoc = (Document)tableToRender.Document.Clone(false);
    oneTableDoc.EnsureMinimum();
    oneTableDoc.FirstSection.Body.PrependChild(oneTableDoc.ImportNode(tableToRender, true, ImportFormatMode.UseDestinationStyles));
    // Set maximum allowed page height.
    oneTableDoc.FirstSection.PageSetup.PageHeight = 1584;
    LayoutCollector collector = new LayoutCollector(oneTableDoc);
    LayoutEnumerator enumerator = new LayoutEnumerator(oneTableDoc);
    Table table = oneTableDoc.FirstSection.Body.Tables[0];
    // Calculate table size.
    // For demonstration purposes, the example assumes the whole table is on the same page.
    enumerator.Current = collector.GetEntity(table.FirstRow.FirstCell.FirstParagraph);
    int startPageIndex = enumerator.PageIndex;
    // Move enumerator to a row.
    while (enumerator.Type != LayoutEntityType.Row)
        enumerator.MoveParent();
    double top = enumerator.Rectangle.Y;
    double left = enumerator.Rectangle.X;
    // Move enumerator to the last row.
    enumerator.Current = collector.GetEntity(table.LastRow.FirstCell.FirstParagraph);
    int endPageIndex = enumerator.PageIndex;
    // Move enumerator to a row.
    while (enumerator.Type != LayoutEntityType.Row)
        enumerator.MoveParent();
    double bottom = enumerator.Rectangle.Y + enumerator.Rectangle.Height;
    double right = enumerator.Rectangle.X + enumerator.Rectangle.Width;
    // Reset margins, shrinking the page by the same amount so the content area stays the same.
    PageSetup ps = oneTableDoc.FirstSection.PageSetup;
    ps.PageWidth = ps.PageWidth - ps.LeftMargin - ps.RightMargin;
    ps.LeftMargin = 0;
    ps.RightMargin = 0;
    ps.PageHeight = ps.PageHeight - ps.TopMargin - ps.BottomMargin;
    ps.TopMargin = 0;
    ps.BottomMargin = 0;
    // Set calculated width.
    ps.PageWidth = right - left;
    // Do not set page height if table spans several pages.
    if (startPageIndex == endPageIndex)
        ps.PageHeight = bottom - top;
    oneTableDoc.UpdatePageLayout();
    // Render table to image and return image bytes.
    using (MemoryStream tableImageStream = new MemoryStream())
    {
        oneTableDoc.Save(tableImageStream, SaveFormat.Png);
        return tableImageStream.ToArray();
    }
}
Thank you very much. Could you please provide the same solution in Java?