Problems with file sizes on trial version of PDF and Words

Using Visual Studio 2013 with VB.net

I am currently evaluating the product and am keen to know whether the full version will produce the same file sizes after conversion as I am getting at the moment.

I have tried converting a number of PDF documents with file sizes of 30-50 KB, and the corresponding .docx files are coming out at nearly 500 KB each. I am thinking that this might be some form of built-in failsafe to stop people poaching the product; however, if this is the reality after purchase, the product won't be a real consideration for me.

I wondered if someone could possibly comment?

Hi Matthew,


Thanks for your interest in Aspose.Pdf.

We would appreciate it if you could share your sample input PDF and output DOCX here. We will look into it and guide you accordingly.

We are sorry for the inconvenience caused.

Best Regards,

Hi

I managed to sort out my original problem (hence the delay in replying); however, I am now stuck on a new conversion and wondered if you might be able to help.

I am currently attempting to convert from PDF to Word 2013. The PDF file is 9 KB but is being converted to a Word document of 500 KB. Unfortunately, we store millions of records, so this is causing some concern.

I am currently using the optimization code to reduce the PDF, as follows:

optimization.LinkDuplcateStreams = True
optimization.RemoveUnusedObjects = True
optimization.RemoveUnusedStreams = True
optimization.CompressImages = True
optimization.ImageQuality = 1
PDFDoc.OptimizeResources(optimization)

I wondered if you could highlight which parts of the conversion are causing problems, and if there are any remedies that you can think of.

Thanks.

Hi Matthew,


Thanks for contacting support.

The PDF file is 9kb but is being converted to a word document of 500kb

I have tested the scenario with the following code snippet using Aspose.Pdf for .NET 17.3.0, and the generated document file is 143 KB. I have attached the converted document file for your reference.

C#
var pdfDoc = new Document(new FileStream(dataDir + "ACoAAAK6S54BWFLYsu4Lx8Pr6cwJ4p212QMDdxo.pdf", FileMode.Open));
var saveOptions = new Aspose.Pdf.DocSaveOptions
{
    Mode = DocSaveOptions.RecognitionMode.Textbox,
    RelativeHorizontalProximity = 2.5f,
    RecognizeBullets = true,
    Format = DocSaveOptions.DocFormat.Doc
};

pdfDoc.Save(dataDir + "finaldoc12.doc", saveOptions);
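
Since the target is Word 2013, one variation that may be worth trying is saving to DOCX (a zipped container) in Flow recognition mode, then comparing the two file sizes directly. This is only a rough sketch and has not been run against the file in this thread: the file paths are hypothetical, and whether these settings actually reduce the output size for a given document is an assumption.

```csharp
using System;
using System.IO;
using Aspose.Pdf;

class PdfToDocxSizeCheck
{
    static void Main()
    {
        // Hypothetical paths, used only for illustration.
        string inputPath = "input.pdf";
        string outputPath = "output.docx";

        using (var pdfDoc = new Document(inputPath))
        {
            var saveOptions = new DocSaveOptions
            {
                // Assumption: the zipped DOCX container may come out smaller than binary DOC.
                Format = DocSaveOptions.DocFormat.DocX,
                // Assumption: Flow mode may produce fewer text boxes than Textbox mode.
                Mode = DocSaveOptions.RecognitionMode.Flow
            };
            pdfDoc.Save(outputPath, saveOptions);
        }

        // Compare the sizes on disk so the growth factor is explicit.
        long pdfBytes = new FileInfo(inputPath).Length;
        long docxBytes = new FileInfo(outputPath).Length;
        double ratio = (double)docxBytes / pdfBytes;
        Console.WriteLine("PDF: {0} bytes, DOCX: {1} bytes, ratio: {2:F1}x", pdfBytes, docxBytes, ratio);
    }
}
```

Printing the ratio for a handful of representative files gives a concrete figure to quote when reporting the problem.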

If you still face any issues, please let us know which Aspose.Pdf API version you are using and describe your environment as well. This will help us understand the problem exactly and address it accordingly.

We are sorry for the inconvenience.

Best Regards,

Hi,

Thanks for the response. Unfortunately, this is still huge compared to the original size.

As a temporary measure I have started using Interop.Word; Word itself can open the PDF and save it at roughly the same size.

Why does Aspose save the Word object at 150 KB, when opening in Word directly and saving as PDF only comes out at 20 KB? Unfortunately, these file sizes are far too large for me to consider using Aspose.PDF to convert work documents if it's adding so much extra to the files.

Is there any way that I can procedurally iterate through the PDF document and recreate the Word document that way? I am very keen to use Aspose in my project, as the objects are all in memory rather than requiring an actual instance of Word; however, this is making it very difficult.

Hi Matthew,


Thanks for sharing further details.

Unfortunately, this is still huge compared to the original size.

I have logged a ticket, PDFNET-42477, in our issue tracking system, noting that the resultant DOC file size should be roughly similar to that of the PDF file. We will look further into the details of this problem and will keep you posted on the status of the correction. Please be patient and allow us a little time.


We are sorry for this inconvenience.


Best Regards,

I wondered if you had any further information on this ticket as yet please?

Hi,

Thanks for your patience.

The earlier reported issue is still pending review and is not yet resolved. However, as soon as we have some definite updates regarding its resolution, we will let you know. Please be patient and allow us a little time.

Hi,

I wondered if there had been any further update?

Regards

Hi Matthew,


Thanks for your patience.

I am afraid the earlier reported issue is still pending review and is not yet resolved. However, the product team will investigate and fix it as per their development schedule, and as soon as we have some definite updates regarding its resolution, we will let you know. Please be patient and allow us a little time. We are sorry for this delay and inconvenience.