How to convert documents to pdf without saving to disk


#1

Hi,

I’m having trouble converting multiple types of documents (txt, doc, docx, xls, xlsx, bmp, jpeg, jpg, tif, csv) to PDF without saving to the hard drive. Is there a way to do this in memory? Do they have to be saved as an XmlDocument before converting to PDF? Also, do I need to use a different object for each type (Pdf.Document for Word docs)? Thanks for your help!
Krista


#2

Hi Krista,

You can use the Save methods of Aspose.Words.Document, Aspose.Cells.Workbook and Aspose.Pdf.Document to save your documents to streams. These classes can also load documents from streams, so you can load and save entirely in memory instead of using files on disk.

As for using Aspose.Pdf.Document to load Word documents, this is not possible. Aspose.Pdf and Aspose.Words are separate APIs that handle different document types, so you cannot load a Word document with Aspose.Pdf or a PDF document with Aspose.Words.
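
Since each format family goes through its own API, one way to organize the code is to route every input by file extension before converting. The following is only a minimal sketch using System.IO; `ConverterKind` and `ConverterPicker` are hypothetical names and not part of any Aspose library, and routing .csv to Aspose.Cells and handling .txt as its own case are assumptions to verify against your versions.

```csharp
using System;
using System.IO;

// Hypothetical names for illustration; none of these are Aspose types.
public enum ConverterKind { Words, Cells, Image, PlainText, Unknown }

public static class ConverterPicker
{
    // Maps a file name to the library family that would handle it.
    public static ConverterKind Pick(string fileName)
    {
        string ext = Path.GetExtension(fileName ?? string.Empty).ToLowerInvariant();
        switch (ext)
        {
            case ".doc":
            case ".docx":
                return ConverterKind.Words;      // load with Aspose.Words.Document
            case ".xls":
            case ".xlsx":
            case ".csv":                         // assumption: treated as a spreadsheet
                return ConverterKind.Cells;      // load with Aspose.Cells.Workbook
            case ".bmp":
            case ".jpeg":
            case ".jpg":
            case ".tif":
                return ConverterKind.Image;      // add the image to a PDF with Aspose.Pdf
            case ".txt":
                return ConverterKind.PlainText;  // assumption: handled separately
            default:
                return ConverterKind.Unknown;
        }
    }
}
```

Each branch would then call the matching stream-based load and save code for that API.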

Best Regards,


#3

Thanks! I have this working for Excel spreadsheets and Word documents, but not for .txt files or for images. It will convert to pdf format, but there is no data in it. These are my two methods.
    public byte[] convertDoctoPDF(Stream docToConvertStream)
    {
        //set Aspose license
        Aspose.Pdf.License license = new Aspose.Pdf.License();
        license.SetLicense("Aspose.Total.lic");
        //license.Embedded = true;

        // Wrap the input stream in a text reader.
        System.IO.TextReader txtReader = new StreamReader(docToConvertStream);

        //Aspose.Pdf.Document doc = new Aspose.Pdf.Document(docToConvertStream);
        Aspose.Pdf.Generator.Pdf doc = new Aspose.Pdf.Generator.Pdf();
        //Create a new section in the Pdf object
        Aspose.Pdf.Generator.Section sec1 = doc.Sections.Add();

        //Create a new text paragraph from the first line of the input and pass it to the constructor as argument
        Aspose.Pdf.Generator.Text t2 = new Aspose.Pdf.Generator.Text(txtReader.ReadLine());
        sec1.Paragraphs.Add(t2);

        MemoryStream outStream = new MemoryStream();

        doc.Save(outStream);

        // Convert the document to byte form.
        byte[] docBytes = outStream.ToArray();

        docToConvertStream.Close();
        outStream.Close();

        return docBytes;
    }

    public byte[] convertImagetoPDF(Stream imageToConvertStream)
    {
        //set Aspose license
        Aspose.Pdf.License license = new Aspose.Pdf.License();
        license.SetLicense("Aspose.Total.lic");
        //license.Embedded = true;

        //Instantiate Pdf instance by calling its empty constructor
        Aspose.Pdf.Generator.Pdf pdf1 = new Aspose.Pdf.Generator.Pdf();
        //Add a section into the pdf document
        Aspose.Pdf.Generator.Section sec = pdf1.Sections.Add();

        // Create a FileStream object to read the image file
        //FileStream fs = File.OpenRead(@"d:\pdftest\Aspose.jpg");
        // Read the image into Byte array
        byte[] data = new byte[imageToConvertStream.Length];
        imageToConvertStream.Read(data, 0, data.Length);

        // Create a MemoryStream object from image Byte array
        MemoryStream ms = new MemoryStream(data);
        //Create an image object in the section 
        Aspose.Pdf.Generator.Image imageht = new Aspose.Pdf.Generator.Image(sec);
        //Set the type of image using ImageFileType enumeration
        imageht.ImageInfo.ImageFileType = Aspose.Pdf.Generator.ImageFileType.Tiff;

        // Specify the image source as MemoryStream
        imageht.ImageInfo.ImageStream = ms;
        //Add image object into the Paragraphs collection of the section
        sec.Paragraphs.Add(imageht);

        //Save the Pdf
        // Create a new memory stream.
        MemoryStream outStream = new MemoryStream();
        pdf1.Save(outStream);
        // Close the MemoryStream Object
        ms.Close();

        // Convert the document to byte form.
        byte[] docBytes = outStream.ToArray();

        outStream.Close();

        return docBytes;
    }

#4

Hi Krista,


Thanks for sharing the details.

I have tested the scenario using Aspose.Pdf for .NET 11.1.0 in a Visual Studio 2010 project targeting .NET Framework 4.0 on Windows 7 (x64), and I was unable to reproduce any issue. In my testing, both code snippets executed properly and I was able to get the stream length at the end of the code. Note, however, that the approaches shared above are based on the legacy Aspose.Pdf.Generator namespace; we recommend using the new Document Object Model of the Aspose.Pdf namespace instead. For further details, please visit

[C#]

//Instantiate Pdf instance by calling its empty constructor
Aspose.Pdf.Generator.Pdf pdf1 = new Aspose.Pdf.Generator.Pdf();
//Add a section into the pdf document
Aspose.Pdf.Generator.Section sec = pdf1.Sections.Add();

// Create a FileStream object to read the image file
FileStream fs = File.OpenRead(@"c:/pdftest/10583850_539766389514997_2697150297945263029_n.jpg");
//Create an image object in the section
Aspose.Pdf.Generator.Image imageht = new Aspose.Pdf.Generator.Image(sec);
//Set the type of image using ImageFileType enumeration
imageht.ImageInfo.ImageFileType = Aspose.Pdf.Generator.ImageFileType.Tiff;
// Specify the image source as the FileStream
imageht.ImageInfo.ImageStream = fs;
//Add image object into the Paragraphs collection of the section
sec.Paragraphs.Add(imageht);

//Save the Pdf to a new memory stream
MemoryStream outStream = new MemoryStream();
pdf1.Save(outStream);

// Convert the document to byte form.
byte[] docBytes = outStream.ToArray();
Console.WriteLine(docBytes.Length);
outStream.Close();


#5

Hi,

I have a similar need: I want to take byte arrays of DOC files and convert them to PDF.
I tried the same approach. I am able to view the PDF, but the format and content are kind of weird.
I want to do the same using Aspose.Pdf. Please suggest a possible way to do it without saving the file on disk.


#6

@PatrickCook,

You can meet this requirement by using the following code of Aspose.Words for .NET.

using System.IO;
using Aspose.Words;

// Bytes of a DOCX file (read from disk here only to obtain sample input).
byte[] docxBytes = File.ReadAllBytes("D:\\temp\\Sample.docx");
MemoryStream srcStream = new MemoryStream(docxBytes);
// Load the entire document into memory from the byte-array stream.
Document doc = new Document(srcStream);
// You can close the stream now, it is no longer needed because the document is in memory.
srcStream.Close();

// Convert the document to PDF format and save to stream.
MemoryStream dstStream = new MemoryStream();
doc.Save(dstStream, SaveFormat.Pdf);
// Rewind the stream position back to zero so it is ready for the next reader.
dstStream.Position = 0;
byte[] pdfBytes = dstStream.ToArray();
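
If the source arrives as an arbitrary Stream rather than a file, keep in mind that a single Stream.Read call is not guaranteed to return all the requested bytes, and a stream that has already been read may be positioned at its end. Below is a minimal sketch, using only System.IO, of a helper that rewinds a seekable stream and copies it fully into a byte array. `StreamHelper` and `ReadFully` are hypothetical names, not part of any Aspose API.

```csharp
using System;
using System.IO;

// Hypothetical helper for illustration; not part of any Aspose API.
public static class StreamHelper
{
    // Copies the full contents of a stream into a byte array.
    public static byte[] ReadFully(Stream input)
    {
        if (input == null) throw new ArgumentNullException("input");

        // Rewind seekable streams so bytes that were already consumed are not skipped.
        if (input.CanSeek)
        {
            input.Position = 0;
        }

        using (MemoryStream buffer = new MemoryStream())
        {
            // CopyTo keeps reading until the source reports end of stream.
            input.CopyTo(buffer);
            return buffer.ToArray();
        }
    }
}
```

The returned array can be wrapped in a new MemoryStream and passed to the Document constructor in the same way as in the snippet above.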

#7

I tried this already, but Aspose.Pdf.Document throws the error “Startxref not found”.


#8

@PatrickCook,

Aspose.PDF for .NET does not support reading DOC or DOCX formats. When it is given a Word document, it cannot find the startxref marker it expects near the end of a PDF file, which is why the error is raised. You can convert Word documents with the Aspose.Words API as described in the previous post.