Free Support Forum - aspose.com

OutOfMemoryException when calling SplitToPages

I am receiving an OutOfMemoryException when calling PdfFileEditor.SplitToPages after converting some Word documents to PDF. I am not sure where the issue lies right now.


I have attached the Word document and a snippet of the code I am using to reproduce the error. The Word file is only 6.73 MB, and once it is converted to PDF it is 8.48 MB.

Words.Document wordDoc = new Words.Document("input.docx");
MemoryStream asPdf = new MemoryStream();
wordDoc.Save(asPdf, Words.SaveFormat.Pdf);
Pdf.Facades.PdfFileEditor fileEditor = new Pdf.Facades.PdfFileEditor();
MemoryStream[] splitPages = fileEditor.SplitToPages(asPdf);

Attachments


input.docx : This is the file I am trying to convert to Pdf and split into pages.
output.pdf : This is a dump of the MemoryStream that is being fed into the SplitToPages function.

Hi John,


Thanks for contacting support.

I have tested the scenario and was able to reproduce the problem. I have logged this issue as PDFNEWNET-35085 in our issue tracking system. We will look further into the details and keep you updated on the status of the fix. Please bear with us while we investigate. We are sorry for the inconvenience.

Hi John,


Thanks for your patience.

We have investigated the issue reported earlier as PDFNEWNET-35085 and have been able to resolve it. Please note that SplitToPages is a very memory-demanding method because it returns the data of all resulting documents in memory.

The document you shared has 2000+ pages, which increases the memory requirement: the saved documents have a total size of ~1 GB. To avoid this problem, the following methods were implemented in PdfFileEditor:

  • SplitToPages(string inputFile, string outFileTemplate)
  • SplitToPages(Stream inputStream, string outFileTemplate)

These methods save every page of the document as a separate document. Because the files are saved sequentially and there is no need to hold all documents in memory, they demand less memory than the old method. The output template must contain the substring %NUM%, which is replaced with the corresponding page number.

For example, if the template is "page-%NUM%.pdf", the resulting files will be named page-1.pdf, page-2.pdf, and so on.
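
To make the naming rule concrete, here is a minimal sketch of how such a template expands into file names. It is not Aspose's implementation: ExpandTemplate is a hypothetical helper that assumes a plain string replacement of %NUM% with the 1-based page number. Throwing when the placeholder is missing is also an assumption, since the thread does not say how the library reacts in that case.

```csharp
using System;

static class TemplateDemo
{
    // Hypothetical helper (not part of Aspose.Pdf): replaces %NUM%
    // with the 1-based page number.
    public static string ExpandTemplate(string template, int pageNumber)
    {
        // The thread says the template must contain %NUM%; what the library
        // does when it is missing is not stated, so this sketch just throws.
        if (!template.Contains("%NUM%"))
            throw new ArgumentException("Template must contain %NUM%.", nameof(template));

        return template.Replace("%NUM%", pageNumber.ToString());
    }

    static void Main()
    {
        // Prints page-1.pdf, page-2.pdf, page-3.pdf (one per line).
        for (int page = 1; page <= 3; page++)
            Console.WriteLine(ExpandTemplate("page-%NUM%.pdf", page));
    }
}
```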

[C#]

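Aspose.Pdf.Facades.PdfFileEditor fileEditor = new Aspose.Pdf.Facades.PdfFileEditor();

// TestSettings appears to be a helper from the support team's test setup;
// substitute your own input and output paths.
fileEditor.SplitToPages(TestSettings.GetInputFile("35085.pdf"),
    TestSettings.GetOutputFile("/35085/Page-%NUM%.pdf"));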
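
Applied to the code from the original post, the stream overload could be used roughly as follows. This is a sketch under assumptions and has not been run against the attached file. The "pages" output folder and the template name are placeholders. Rewinding the stream before splitting is a precaution rather than something this thread states is required. It needs Aspose.Words plus a version of Aspose.Pdf for .NET that includes the fix.

```csharp
using System.IO;

class SplitFromStreamExample
{
    static void Main()
    {
        // Convert the Word document to PDF in memory, as in the original post.
        Aspose.Words.Document wordDoc = new Aspose.Words.Document("input.docx");

        using (MemoryStream asPdf = new MemoryStream())
        {
            wordDoc.Save(asPdf, Aspose.Words.SaveFormat.Pdf);

            // Assumption: rewind so the splitter reads from the start of the PDF.
            asPdf.Position = 0;

            // Placeholder output folder; each page is written straight to disk
            // instead of being returned as a MemoryStream.
            Directory.CreateDirectory("pages");

            Aspose.Pdf.Facades.PdfFileEditor fileEditor = new Aspose.Pdf.Facades.PdfFileEditor();
            fileEditor.SplitToPages(asPdf, Path.Combine("pages", "page-%NUM%.pdf"));
        }
    }
}
```

Only the single source PDF is held in memory here. The per-page documents go to disk one at a time, which is the point of the new overloads.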



You may also consider using the Document Object Model (DOM) of the Aspose.Pdf namespace to achieve the same result.

[C#]

Document doc = new Document("c:/pdftest/WordConversion (1).pdf");

// Pages are indexed starting from 1
for (int i = 1; i <= doc.Pages.Count; i++)
{
    // create a new document
    Document newDoc = new Document();

    // copy the page from the source document into the new document
    newDoc.Pages.Add(doc.Pages[i]);

    // save the single-page document to disk
    newDoc.Save("c:/pdftest/page-" + i + ".pdf");
}

The issues you have found earlier (filed as PDFNEWNET-35085) have been fixed in Aspose.Pdf for .NET 8.0.0.

