Hi,
I have a large XML document (90 MB). When I try to convert it to PDF using Aspose.Words (21.1.0.0), the conversion takes an extremely long time.
Here is my code:
Aspose.Words.Document Document = new Aspose.Words.Document(GetExtendedFilePath(InputFileName));
Document.FieldOptions.FieldUpdateCultureSource = FieldUpdateCultureSource.FieldCode;
using (FileStream CompressedFileStream = new FileStream(tempfile, FileMode.OpenOrCreate, FileAccess.ReadWrite, FileShare.None))
{
Document.Save(CompressedFileStream, Aspose.Words.SaveFormat.Pdf);
CompressedFileStream.Seek(0, System.IO.SeekOrigin.Begin);
OutputFileBytes = new byte[CompressedFileStream.Length];
CompressedFileStream.Read(OutputFileBytes, 0, (int)CompressedFileStream.Length);
}
The time is spent on the line Document.Save(CompressedFileStream, Aspose.Words.SaveFormat.Pdf);
Please let me know whether converting a document of this size is feasible.
You can find the document here:
https://ascertia0-my.sharepoint.com/:u:/g/personal/muhammad_hasnain_ascertia_com/EYkATGG-JaZAqzZ8x529BwcBGMzB3IEENlXO84W70QXnmA?e=4Mi1xo
@Wahaj_Khan Unfortunately, I have no access to the shared file. Could you please zip it and attach it here? We will check the issue and provide you with more information.
@alexey.noskov Unfortunately, the document is larger than 50 MB even after compression. Could you please try the following link?
https://ascertia0-my.sharepoint.com/:u:/g/personal/muhammad_hasnain_ascertia_com/EYkATGG-JaZAqzZ8x529BwcBGMzB3IEENlXO84W70QXnmA?e=cCmwNG
@Wahaj_Khan Unfortunately, I still cannot access the document. Could you please share it through Dropbox or Google Drive?
@alexey.noskov Sorry for the inconvenience, here is the Dropbox link.
https://www.dropbox.com/s/9xa7fd9fiw29gg7/Alexander%20Kafka-SEA%20ABC%2006.10.17-B.zip?dl=0
@Wahaj_Khan I have checked your document, and it is not an MS Word XML document. Aspose.Words detects it as a TXT document and loads it as plain text.
The syntax of your XML file looks like XPS or something similar. What application do you use to produce the attached XML document, and what is the expected output of the conversion? Do you expect to see the XML markup in the output document?
@alexey.noskov
This file was converted from PDF to XML using an online tool. Since its extension is XML, I expect our system to accept this file for upload.
As far as I know, Aspose offers two ways to convert XML to PDF: one uses Aspose.Words and the other uses Aspose.PDF. We use Aspose.Words because we need the XML itself to be shown in the output after conversion.
@Wahaj_Khan It is still not clear what your expected output is. Could you please attach a simple (small) input document and the expected output document here? Do you expect to see something like this in the output? image.png (27.1 KB)
The screenshot was taken from a DOCX document produced by Aspose.Words from your XML. That document is extremely large, about 25k pages.
@alexey.noskov Sorry for the inconvenience. Please find the attached document; it shows how we convert XML to PDF using Aspose.Words.dll.
Demo_test.pdf (34.3 KB)
Here is the code:
Aspose.Words.Document Document = new Aspose.Words.Document(GetExtendedFilePath(InputFileName));
Document.FieldOptions.FieldUpdateCultureSource = FieldUpdateCultureSource.FieldCode;
using (FileStream CompressedFileStream = new FileStream(tempfile, FileMode.OpenOrCreate, FileAccess.ReadWrite, FileShare.None))
{
Document.Save(CompressedFileStream, Aspose.Words.SaveFormat.Pdf);
CompressedFileStream.Seek(0, System.IO.SeekOrigin.Begin);
OutputFileBytes = new byte[CompressedFileStream.Length];
CompressedFileStream.Read(OutputFileBytes, 0, (int)CompressedFileStream.Length);
}
@Wahaj_Khan Thank you for the additional information. Actually, I can convert your document to PDF using Aspose.Words; it takes about 780 seconds (13 minutes) on my side. For comparison, converting the same document to PDF with MS Word took about 15 minutes, so the Aspose.Words result is comparable with MS Word.
The problem with your XML file is that it contains a resource in base64 format, which is represented by one very long string without line breaks. Such long strings are difficult to lay out. For example, the following code converts your XML file about 2 times faster:
// Use a larger page (A3, landscape) so that each line holds more text.
Document doc = new Document(@"C:\temp\Alexander Kafka-SEA ABC 06.10.17-B.xml");
doc.FirstSection.PageSetup.Orientation = Aspose.Words.Orientation.Landscape;
doc.FirstSection.PageSetup.PaperSize = Aspose.Words.PaperSize.A3;
doc.Save(@"C:\Temp\out.pdf");
As you can see, I have enlarged the page size so that the layout engine has to perform fewer line-break calculations.
Another option to make the conversion faster is to break up the long strings. For example, see the following code:
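// Requires: using System.IO; using System.Text.RegularExpressions; using Aspose.Words;
string veryLongString = File.ReadAllText(@"C:\temp\Alexander Kafka-SEA ABC 06.10.17-B.xml");
// Insert a line break after every run of 74 non-whitespace characters.
veryLongString = Regex.Replace(veryLongString, "(\\S{74})", "$1\r\n");
File.WriteAllText(@"C:\temp\modified.xml", veryLongString);
Document doc = new Document(@"C:\temp\modified.xml");
doc.Save(@"C:\Temp\out.pdf");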
With this code, preprocessing of your XML file takes about 2 seconds, document loading about 13 seconds, and rendering to PDF about 265 seconds (about 280 seconds in total).
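To make the effect of the preprocessing step easier to see on a small input, here is a minimal sketch of the same idea wrapped in a helper. The names LongRunBreaker and BreakLongRuns and the width parameter are hypothetical and used only for illustration; they are not part of Aspose.Words. Note that the inserted line breaks change the text itself, so this is only suitable when that is acceptable in the output.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper (not an Aspose.Words API): breaks up very long
// unbroken strings, such as base64 data, before the file is loaded.
public static class LongRunBreaker
{
    // Inserts "\r\n" after every run of `width` consecutive
    // non-whitespace characters. Shorter runs are left unchanged.
    public static string BreakLongRuns(string text, int width = 74)
    {
        if (text == null) throw new ArgumentNullException(nameof(text));
        if (width < 1) throw new ArgumentOutOfRangeException(nameof(width));

        // \S{width} matches exactly `width` non-whitespace characters;
        // "$&" in the replacement stands for the whole match.
        var pattern = new Regex(@"\S{" + width + "}");
        return pattern.Replace(text, "$&\r\n");
    }
}
```

For example, LongRunBreaker.BreakLongRuns("abcdefgh", 3) puts "abc", "def" and "gh" on three separate lines, while text that already contains whitespace at least every 3 characters is returned unchanged.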
@alexey.noskov
Thanks for the detailed reply. Much appreciated. I will check it and draw my conclusions on my side.
You may close this issue. We can conclude that XML in this format will take time to process.
@Wahaj_Khan Thank you for letting us know. Please feel free to ask in case of any other issues. We will be glad to help you.