Hi,
We have an Aspose licence and are trying to convert Word files to HTML. Some of these files contain embedded Excel objects. The issue is that the embedded Excel sheets are converted to images, so I am not able to manipulate the content of those sheets in the converted HTML file.
What we need is some way to extract the embedded Excel sheets from a Word file. Can Aspose help with that?
The best scenario would be to convert the Excel object and have it saved within the converted HTML file itself. If that is not possible, can the Excel object be extracted and stored as a separate HTML file?
Thanks and Regards,
Nikhil
Hi Nikhil,
Thanks for your inquiry. Please try using the following code snippet:
Document doc = new Document(@"C:\test\in.docx");
// Get the collection of all shapes in the document
NodeCollection shapes = doc.GetChildNodes(NodeType.Shape, true);
int i = 0;
// Loop through all shapes
foreach (Shape shape in shapes)
{
    // Only shapes that hold an OLE object have a non-null OleFormat
    if (shape.OleFormat != null)
    {
        // Skip linked objects; only embedded objects carry their own data
        if (!shape.OleFormat.IsLink)
        {
            // Extract OLE Word object
            if (shape.OleFormat.ProgId == "Word.Document.12")
            {
                MemoryStream stream = new MemoryStream();
                shape.OleFormat.Save(stream);
                // Rewind the stream so it is read from the start
                stream.Position = 0;
                Document newDoc = new Document(stream);
                newDoc.Save(string.Format(@"C:\test\outEmbeded_{0}.html", i));
                i++;
            }
            // Extract OLE Excel object
            if (shape.OleFormat.ProgId == "Excel.Sheet.12")
            {
                // Here you can use the Aspose.Cells component
                // to convert MS Excel files to separate HTML files
            }
        }
    }
}
doc.Save(@"C:\test\out.html");
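For the Excel branch, a minimal sketch could look like the following. It assumes Aspose.Cells is referenced alongside Aspose.Words; the helper class `EmbeddedExcelExporter` and its `SaveAsHtml` method are hypothetical names used only for illustration, and the output path is up to you.

```csharp
using System.IO;
using Aspose.Words.Drawing;

static class EmbeddedExcelExporter
{
    // Hypothetical helper (not part of either Aspose API): writes the
    // embedded Excel OLE object of a shape to a standalone HTML file.
    public static void SaveAsHtml(Shape shape, string outputPath)
    {
        using (MemoryStream stream = new MemoryStream())
        {
            // Dump the raw OLE data (the embedded workbook) into memory
            shape.OleFormat.Save(stream);
            // Rewind so the workbook is read from the start of the stream
            stream.Position = 0;
            Aspose.Cells.Workbook workbook = new Aspose.Cells.Workbook(stream);
            workbook.Save(outputPath, Aspose.Cells.SaveFormat.Html);
        }
    }
}
```

Inside the `Excel.Sheet.12` branch you could then call something like `EmbeddedExcelExporter.SaveAsHtml(shape, string.Format(@"C:\test\outExcel_{0}.html", i));` and increment `i`.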
Moreover, you can download the latest Aspose.Cells release from the following link:
https://releases.aspose.com/cells/net/
I hope this helps.
Best Regards,
Hi Awais,
I tried using your code snippet, but I am getting an exception. Please find attached a screenshot of it. Do let me know how to work around this.
Thanks,
Nikhil
Hi Nikhil,
Thanks for your inquiry. Could you please attach your input Word document here for testing? I will investigate the issue on my side and provide you with more information.
Best Regards,
Hi Awais,
I have had a discussion with the Cells team and the issue has been resolved. Thanks.
Regards,
Nikhil
Hi Nikhil,
Thanks for your inquiry. It is great that you managed to resolve the problem on your side. Please let us know any time you have further queries.
Best Regards,
Hi Awais,
As suggested by the Cells team, here is the code I am using to add a link to the image of the Excel sheet embedded in the Word document.
int i = 0;
foreach (Aspose.Words.Drawing.Shape shape in shapes)
{
    if (null != shape.OleFormat && shape.OleFormat.ProgId.Equals("Excel.Sheet.12"))
    {
        // Save the embedded workbook to memory and rewind before loading it
        MemoryStream stream = new MemoryStream();
        shape.OleFormat.Save(stream);
        stream.Seek(0, SeekOrigin.Begin);
        Aspose.Cells.Workbook newExcelDoc = new Aspose.Cells.Workbook(stream);
        destinationFilePath = @"D:\Excel Embed Test\Output\" + Destinationfilename + "_" + i + ".html";
        newExcelDoc.Save(destinationFilePath);
        // Make the image in the converted HTML link to the extracted sheet
        shape.HRef = destinationFilePath;
        ++i;
    }
}
This works perfectly. The Excel object is still an image in the converted file, but since its href is set, clicking on it takes me to the extracted Excel content for modification.
However, the requirement is to replace the image with an anchor tag. Is that possible? Could you please provide a code snippet for this?
Thanks and regards,
Nikhil
Hi Nikhil,
Thanks for your inquiry. Could you please do the following:
- Attach your input Word document that contains the embedded Excel worksheets.
- Attach your current output HTML files that show images instead of hyperlinks.
- Share your target (expected) output HTML files.
We will investigate possible ways to achieve what you’re looking for and provide you with a code snippet.
Best Regards,
Hi Awais,
Please find attached the required documents. The expected output isn’t perfect, since I have replaced the image references manually. Put simply, instead of the image, I want that location in the HTML file to contain a link to the HTML file corresponding to that image.
Thanks and Regards,
Nikhil
Hi Nikhil,
Thanks for the additional information. I am working on your most recent queries and will get back to you soon.
Best Regards,
Hi Nikhil,
Thanks for your patience. Please use the following code snippet to achieve what you’re looking for:
Document doc = new Document(@"c:\test\Image+issue\Image issue\SampleExcelEmWord.docx");
DocumentBuilder builder = new DocumentBuilder(doc);
// Copy the shapes to an array so they can be removed safely while iterating
Node[] shapes = doc.GetChildNodes(NodeType.Shape, true).ToArray();
int i = 0;
foreach (Shape shape in shapes)
{
    if (shape.OleFormat != null)
    {
        if (!shape.OleFormat.IsLink)
        {
            if (shape.OleFormat.ProgId == "Excel.Sheet.12")
            {
                // Save the embedded workbook as a separate HTML file
                MemoryStream stream = new MemoryStream();
                shape.OleFormat.Save(stream);
                stream.Seek(0, SeekOrigin.Begin);
                Aspose.Cells.Workbook newExcelDoc = new Aspose.Cells.Workbook(stream);
                string destinationFilePath = @"C:\test\Image+issue\Image issue\Output\excel_" + i + ".html";
                newExcelDoc.Save(destinationFilePath);
                // Insert a hyperlink at the shape's position, then remove the shape
                builder.MoveTo(shape);
                builder.InsertHyperlink("excel_" + i + ".html", destinationFilePath, false);
                shape.Remove();
                ++i;
            }
        }
    }
}
doc.Save(@"C:\test\Image+issue\Image issue\outImg.html");
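Note that the hyperlinks above point to absolute local paths. If the output folder may be moved or published, a relative link might be preferable. Here is a minimal sketch of one way to compute it using only the .NET `System.Uri` class; the `LinkHelper` class and `ToRelativeLink` method are hypothetical names, not part of Aspose, and both arguments are assumed to be absolute paths.

```csharp
using System;

static class LinkHelper
{
    // Hypothetical helper: returns the link target expressed relative to
    // the HTML file that will contain the hyperlink.
    // Both arguments are assumed to be absolute file paths.
    public static string ToRelativeLink(string htmlFilePath, string targetFilePath)
    {
        Uri from = new Uri(htmlFilePath);
        Uri to = new Uri(targetFilePath);
        // MakeRelativeUri yields a URI-style relative reference
        // (forward slashes, escaped characters)
        return from.MakeRelativeUri(to).ToString();
    }
}
```

You could then pass `LinkHelper.ToRelativeLink(@"C:\test\Image+issue\Image issue\outImg.html", destinationFilePath)` as the second argument of `builder.InsertHyperlink` instead of `destinationFilePath`.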
I hope this helps.
Best Regards,