Free Support Forum - aspose.com

Read a docx file in Office Open XML (OOXML)

Hi
How can I read a docx file in Office Open XML (OOXML) format (without images), then manipulate the data (insert an element before/after an element), and then write the docx file? If you can give me an example of this, it would be really helpful.

Hi Rinku,


Thanks for your inquiry.

You can load the DOCX file into an Aspose.Words Document instance and then remove all images from that document with the following code snippet:

Document doc = new Document(@"C:\test\in.docx");

// isLive = false returns a snapshot, so nodes can be removed while iterating.
NodeCollection shapes = doc.GetChildNodes(NodeType.Shape, true, false);
foreach (Shape shape in shapes)
{
    if (shape.HasImage)
    {
        shape.Remove();
    }
}

Also, please see the following link to learn how to extract images from a document:
http://www.aspose.com/documentation/.net-components/aspose.words-for-.net/howto-extract-images-from-a-document.html

Once the document is loaded and all images are removed or extracted, you can insert elements as described in the following article:
http://www.aspose.com/documentation/.net-components/aspose.words-for-.net/inserting-document-elements.html

Finally, I suggest reading the following API page, which covers the available saving options:
http://www.aspose.com/documentation/.net-components/aspose.words-for-.net/aspose.words.document.save_overloads.html

Please let us know if you need more information. We are always glad to help.

Best Regards,

But I need to read the docx without any images from the beginning, so that no images are stored in memory (sometimes a user may upload large images in a template). Is this possible with Aspose?

Hello

Thanks for your request. I'm afraid there is no way to achieve this with Aspose.Words without loading the document into the Aspose.Words Document Object Model.
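If the concern is mainly large images in uploaded templates, one possible pre-check is to measure the image data before deciding whether to load the file at all. The sketch below is not Aspose.Words code: it uses only the JDK and relies on the fact that a DOCX file is a ZIP package. It assumes images are stored under word/media/, which is the convention Word follows but is not guaranteed for every producer. The class and method names are hypothetical.

```java
import java.io.IOException;
import java.nio.file.Path;
import java.util.Enumeration;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;

class DocxMediaSize {
    // Sums the uncompressed sizes of all entries under word/media/
    // (assumed image location). Only the ZIP directory is read;
    // the image bytes themselves are never decompressed.
    static long totalMediaBytes(Path docx) throws IOException {
        long total = 0;
        try (ZipFile zip = new ZipFile(docx.toFile())) {
            Enumeration<? extends ZipEntry> entries = zip.entries();
            while (entries.hasMoreElements()) {
                ZipEntry entry = entries.nextElement();
                if (!entry.isDirectory() && entry.getName().startsWith("word/media/")) {
                    long size = entry.getSize(); // -1 if the size is unknown
                    if (size > 0) {
                        total += size;
                    }
                }
            }
        }
        return total;
    }
}
```

A caller could compare the result against a limit of its own choosing and reject the upload, or warn the user, before constructing the Document.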

Best regards,

Hi there,


Thanks for your inquiry.

We have an interface where you can choose to load or skip images during document open: http://www.aspose.com/documentation/.net-components/aspose.words-for-.net/aspose.words.loading.iresourceloadingcallback.html.

However, I’m afraid this only takes effect when HTML-based formats are loaded, not OOXML documents. We will look into extending this behavior to all load formats so that you can skip loading images. I have linked your request to the appropriate issue, and we will inform you as soon as there are any developments.

I’m afraid that in the meantime you will need to use the workaround described by Awais.
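As a side note for anyone who only needs the raw document XML: a DOCX file is a ZIP package, and in files written by Word the main text usually lives in the word/document.xml part, separate from the images. The sketch below is not Aspose.Words code. It uses only the JDK (Java 9 or later for readAllBytes) and assumes that part name and UTF-8 encoding; strictly speaking, the main part is located through the package relationships, so treat the hard-coded name as an assumption. The class and method names are made up for illustration.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Path;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;

class DocxMainPartReader {
    // Returns the XML text of word/document.xml (assumed main part name).
    // No other entry, including anything under word/media/, is opened.
    static String readMainPart(Path docx) throws IOException {
        try (ZipFile zip = new ZipFile(docx.toFile())) {
            ZipEntry entry = zip.getEntry("word/document.xml");
            if (entry == null) {
                throw new IOException("word/document.xml not found in " + docx);
            }
            try (InputStream in = zip.getInputStream(entry)) {
                // Assumes UTF-8, which is what Word writes by default.
                return new String(in.readAllBytes(), StandardCharsets.UTF_8);
            }
        }
    }
}
```

This only gives access to the XML text; it does not produce an Aspose.Words Document, so it complements the workaround above and does not replace it.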

Thanks,

Hi
Thanks for your quick replies. So if I remove the images after loading, do I have to load all the images back when saving the document after processing, or will Aspose take care of that?

Hello

Thanks for your inquiry. If you remove all images from the document, they will be lost.

Please see the following link to learn more about Aspose.Words features:

http://www.aspose.com/documentation/.net-components/aspose.words-for-.net/feature-overview.html

Best regards,

Hi
All the links and examples you have given are for .NET. If you could give some Java examples, it would be a great help, because I am working with Aspose.Words for Java.

Thank You

Hello Rinku,

Thanks for your inquiry. The same applies to Java.

You can load the DOCX file into an Aspose.Words Document instance and then remove all images from that document with the following code snippet:

Document doc = new Document("C:\\test\\in.docx");

// isLive = false returns a snapshot, so removing nodes does not disturb the loop.
NodeCollection shapes = doc.getChildNodes(NodeType.SHAPE, true, false);
for (int i = 0; i < shapes.getCount(); i++)
{
    Shape shape = (Shape) shapes.get(i);
    if (shape.hasImage())
    {
        shape.remove();
    }
}

Also, please see the following link to learn how to extract images from a document:

http://www.aspose.com/documentation/java-components/aspose.words-for-java/howto-extract-images-from-a-document.html

Once the document is loaded and all images are removed or extracted, you can insert elements as described in the following article:

http://www.aspose.com/documentation/java-components/aspose.words-for-java/inserting-document-elements.html
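To illustrate the "insert an element before/after an element" idea on its own, here is a minimal sketch that uses only the JDK's org.w3c.dom API on a small XML fragment. It is not the Aspose.Words API, and the class and method names are hypothetical; it only shows the sibling-insertion pattern that the article above applies to document nodes.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

class SiblingInsert {
    // Places newNode immediately before ref under the same parent.
    static void insertBefore(Node newNode, Node ref) {
        ref.getParentNode().insertBefore(newNode, ref);
    }

    // Places newNode immediately after ref. DOM has no insertAfter, so we
    // insert before ref's next sibling; when ref is the last child the next
    // sibling is null, and insertBefore(newNode, null) appends at the end.
    static void insertAfter(Node newNode, Node ref) {
        ref.getParentNode().insertBefore(newNode, ref.getNextSibling());
    }

    // Parses an XML string into a namespace-aware DOM document.
    static Document parse(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        return factory.newDocumentBuilder().parse(new InputSource(new StringReader(xml)));
    }
}
```

For example, after parsing a fragment with a single child element, calling insertBefore and insertAfter with two newly created elements and that child as the reference leaves three children in the order: new element, original element, new element.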

Finally, I suggest reading the following API page, which covers the available saving options:

http://www.aspose.com/documentation/java-components/aspose.words-for-java/com/aspose/words/saveoptions.html

Please let us know if you need more information. We are always glad to help.

Best Regards,

The issues you have found earlier (filed as WORDSNET-5600) have been fixed in this .NET update and this Java update.


This message was posted using Notification2Forum from Downloads module by aspose.notifier.