Free Support Forum - aspose.com

Page-wise conversion of documents using Aspose.Words .Net COM Interop

Hi,

I'm using the Aspose.Words.Net component to convert docx documents to epub, pdf, rtf, etc. I am also converting docx to TIFF and JPEG formats.

I'm calling the component from VC++ via the COM Interop provided by Aspose.Words.Net.

I first open the document I want to convert by calling "open" on the ComHelper pointer with the name of the file. I then call "save" on the resulting Document pointer with the name, location and extension of the target file. If the target format is different from the source format, the file gets converted to the new format.

Can I do this kind of conversion page-by-page? Can I load the entire document by calling "open" and then only convert it to the target format one page at a time? The reason is, I am trying to avoid having to convert a docx of hundreds or thousands of pages if I only need one or a few pages converted to a new format.

Is this possible?

Thanks!

Hi Saif,


Thanks for your query. Please read the article: Use Aspose.Words for .NET via COM Interop

You can convert each page of an MS Word document to PDF or to an image by using PdfSaveOptions and ImageSaveOptions. Please check the following C# code example for your kind reference.

http://www.aspose.com/docs/display/wordsnet/How+to++Save+Document+as+a+Multipage+TIFF

//Convert each page to PNG file
Document doc = new Document(MyDir + "in.docx");

ImageSaveOptions options = new ImageSaveOptions(SaveFormat.Png);

// Render exactly one page per saved file.
options.PageCount = 1;

for (int pageIndex = 0; pageIndex < doc.PageCount; pageIndex++)
{
    // PageIndex is zero-based; the file name uses the 1-based page number.
    string outputFileName = MyDir + string.Format("{0}_{1}.png", "Test", pageIndex + 1);
    options.PageIndex = pageIndex;
    doc.Save(outputFileName, options);
}

//Convert each page to PDF file
Document doc = new Document(MyDir + "in.docx");

PdfSaveOptions options = new PdfSaveOptions();

// Write exactly one page into each PDF file.
options.PageCount = 1;

for (int pageIndex = 0; pageIndex < doc.PageCount; pageIndex++)
{
    // PageIndex is zero-based; the file name uses the 1-based page number.
    string outputFileName = MyDir + string.Format("{0}_{1}.pdf", "Test", pageIndex + 1);
    options.PageIndex = pageIndex;
    doc.Save(outputFileName, options);
}
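
If only one or a few pages are needed, the same two properties can select a range instead of looping over the whole document. The following is a minimal sketch, not part of Aspose.Words: PageWindow and FromRange are hypothetical names introduced here for illustration. It maps a 1-based, inclusive page range onto the zero-based PageIndex and PageCount values, assuming those properties behave as in the examples above.

```csharp
using System;

// Hypothetical helper (not an Aspose.Words type): converts a 1-based,
// inclusive page range into the zero-based PageIndex / PageCount pair
// that the save options in the examples above expect.
public sealed class PageWindow
{
    public int PageIndex { get; private set; }  // zero-based index of the first page
    public int PageCount { get; private set; }  // number of pages to render

    private PageWindow(int pageIndex, int pageCount)
    {
        PageIndex = pageIndex;
        PageCount = pageCount;
    }

    public static PageWindow FromRange(int firstPage, int lastPage, int totalPages)
    {
        if (totalPages < 1)
            throw new ArgumentOutOfRangeException("totalPages");
        if (firstPage < 1 || firstPage > totalPages)
            throw new ArgumentOutOfRangeException("firstPage");
        if (lastPage < firstPage || lastPage > totalPages)
            throw new ArgumentOutOfRangeException("lastPage");

        return new PageWindow(firstPage - 1, lastPage - firstPage + 1);
    }
}

// Intended use with the options shown above (assumption, left as comments
// so that this block compiles without the Aspose.Words assembly):
//   PageWindow w = PageWindow.FromRange(5, 7, doc.PageCount);
//   options.PageIndex = w.PageIndex;   // 4
//   options.PageCount = w.PageCount;   // 3
//   doc.Save(MyDir + "Test_5-7.pdf", options);
```

With a range of pages 5 to 7, a single Save call would then write those three pages into one output file, rather than one file per page.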


Regarding page-wise conversion of an MS Word document to Epub/RTF: there is no direct way to convert each page of a Doc/Docx file to Epub/RTF. However, you can use the PageFinder utility to extract each page and save it as Epub/RTF. Please find the PageFinder utility in the attachment. You can use this utility with the following C# code snippet.

// Requires: using System.Collections; using Aspose.Words;
// PageNumberFinder is assumed to come from the attached PageFinder utility.
Document doc = new Document(MyDir + "in.docx");

// Extract page 1 only and save it as RTF.
Document dstDoc = ExtractContentBetweenPages(doc, 1, 1);
dstDoc.Save(MyDir + "AsposeOut.rtf");

public Document ExtractContentBetweenPages(Document srcDoc, int fromPage, int toPage)
{
    // Set up the document which pages will be copied to. Remove the empty section.
    Document dstDoc = new Document();
    dstDoc.RemoveAllChildren();

    PageNumberFinder finder = new PageNumberFinder(srcDoc);

    // Split nodes which are found across pages.
    finder.SplitNodesAcrossPages(true);

    // Copy all content including headers and footers from the specified pages into the destination document.
    NodeImporter importer = new NodeImporter(srcDoc, dstDoc, ImportFormatMode.UseDestinationStyles);

    for (int page = fromPage; page <= toPage; page++)
    {
        // Retrieve only the current page's sections. Passing the full range
        // (fromPage, toPage) here would copy every page once per iteration.
        ArrayList pageSections = finder.RetrieveAllNodesOnPages(page, page, NodeType.Section);

        foreach (Section section in pageSections)
        {
            dstDoc.AppendChild(importer.ImportNode(section, true));
        }
    }

    return dstDoc;
}



You need to create a wrapper class to achieve your requirements. Please read the following forum link for your kind reference.

http://www.aspose.com/community/forums/permalink/411516/412991/showthread.aspx#412991

Tahir,

Thank you for your quick reply.

I have a follow up question:

Can I use this method to convert individual pages from documents with formats other than docx to formats like epub, PDF, TIFF, JPEG, docx, etc.?

Thanks,

Saif Faruqui

I just finished reading your reply.

So are PDF and image files the only ones I can convert to page-wise?

Also, is docx the only source format I can convert page-wise to PDF and image files? Are there other source formats I can convert?

Thanks,

Saif Faruqui

Tahir,

I just finished reading your reply.

So can I convert docx files, page-wise, to image formats like TIFF and JPEG? Are any other output formats, besides PDF and image formats, supported?

Can I, page-wise, convert other source formats, besides docx, to PDF and image formats? For example PDF, ePub, etc.

Thanks,

Saif Faruqui

Hi Saif,


sfar:
So are PDF and image files the only ones I can convert to page-wise?

Yes, only PDF and image files can be converted page-wise by using PdfSaveOptions and ImageSaveOptions.

sfar:
Also, is docx the only source format I can convert page-wise to PDF and image files? Are there other source formats I can convert?

Aspose.Words.Document loads many other formats, such as Doc, RTF and HTML. Please read LoadFormats of Aspose.Words.Document.

sfar:
So can I convert docx files, page-wise, to image formats like TIFF and JPEG? Are any other output formats, besides PDF and image formats, supported?

Yes, you can convert Docx to images like TIFF and JPEG page-wise. Please read LoadFormats of Aspose.Words.Document.

sfar:
Can I, page-wise, convert other source formats, besides docx, to PDF and image formats? For example PDF, ePub, etc.

Please read the following documentation links for your kind reference.


Please read the documentation of Aspose.Words for .NET and let us know if you have any more queries.

A question about Aspose.PDF for .Net:

I looked through the programmer's guide for Aspose.PDF for .Net and did not find any information about using it through COM Interop.

I am calling the Aspose libraries from C++ code.

Can I call Aspose.PDF for .Net via COM Interop?

Thanks,

Saif Faruqui

Hi Saif,


Thanks for your inquiry. Your question is related to the Aspose.PDF component. I am moving this thread to the Aspose.PDF forum, and my colleagues from the Aspose.PDF team will reply to you shortly.

Hi Saif,

Please check the reply to your query regarding Aspose.Pdf here.

Thank You & Best Regards,