PDF Document Does Not Preserve Spaces when Converted to .txt

Hi,

We printed an RTF report to PDF using the Aspose.Words DLL. When we then convert this PDF report to a .txt file, the spaces are missing.

You can see the attached file.

Please help us.

Hi Vandana,

Thanks for your inquiry. You can convert your RTF directly to a text file using Aspose.Words. Please use the following code example to convert RTF to text format. Hope this helps you.

Document doc = new Document(MyDir + "in.rtf");
doc.Save(MyDir + "Out.txt", SaveFormat.Text);

If you do not want to convert RTF to a text file directly, please share the steps you are using to convert the PDF to text. We will then provide more information about your query.

We are using the Aspose.Words DLL to print a report in PDF format. After that, our clients convert this PDF report to a text file.

Steps:

  1. I converted an Aspose-generated PDF report to a text file online at http://www.convertmypdf.net/#
  2. The text file does not preserve spaces.

If we follow the same steps with a normal PDF report, the spaces come through properly. But we don’t want to use the conversion to text, since we are using the ActivePDF converter.

I have also tried converting the PDF file to DOCX using the Aspose.Pdf DLL and then converting the DOCX file to a text file using the Aspose.Words DLL. The text file does not come out in the correct format. You can find the output in the attached documents (30tmp.txt).

Hi Vandana,

Thanks for your inquiry. Could you please attach your input RTF document here for testing? We will investigate the issue on our side and provide you more information.

Hi,

Please find the input file in the attachment. For the below two steps I am using the attached input PDF file.

  1. If you convert it to a txt file online, the spaces are missing.
  2. If you convert it to a txt file using the Aspose DLLs, the spaces are there but the formatting is not correct.

Hi Vandana,

Thanks for your inquiry. As I understand it, you are converting RTF to PDF using Aspose.Words and then converting the output PDF to a text file online (http://www.convertmypdf.net/).

We need your input RTF for testing purposes. Unfortunately, it is difficult to say what the problem is without the input document.

If you are using Aspose.Pdf to convert the PDF to text, please let us know. We will move this thread to the Aspose.Pdf forum, and the Aspose.Pdf team will investigate the issue at their end.

Hi,

I am using the code below to print the RTF file to PDF using PDF Creator, and then converting the PDF to a text file. You can find the input RTF file in the attachments.

// Let the user pick a printer (here, the PDF Creator virtual printer).
System.Windows.Forms.PrintDialog printDialog = new System.Windows.Forms.PrintDialog();
Aspose.Words.Document doc = new Aspose.Words.Document("D:\\Convert_rtf_to_text_file\\rtf files\\30tmp.rtf");
var result = printDialog.ShowDialog();
// Print only if the dialog was confirmed.
if (result == DialogResult.OK)
    doc.Print(printDialog.PrinterSettings);

Please find all files in the attachments.
Input file is “30tmp.rtf”
Printed PDF file using the Aspose.Words DLL is “document_4.pdf”
Converted text file is “Converted text file.jpg”

I don’t want to use Aspose.Pdf. My concern is related only to Aspose.Words.

Hi Vandana,

Thanks for sharing the details. Please note that the Aspose.Words Print method uses System.Drawing and the standard .NET printing classes. Aspose.Words prints your document without any issue, so it seems that the issue is related to PDF Creator.

In your case, we suggest converting RTF to PDF directly using Aspose.Words, as shown below. Please use Out.pdf for the online conversion to text (http://www.convertmypdf.net/). We have attached the output PDF and the text file converted by http://www.convertmypdf.net/ to this post for your reference.

Document doc = new Document(MyDir + "30tmp.rtf");
doc.Save(MyDir + "Out.pdf");
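
For reference, the suggestion above can also be written as a complete console program. This is only a minimal sketch, assuming Aspose.Words for .NET is referenced by the project. The folder `C:\Temp\` is a hypothetical stand-in for MyDir, and the extra Out.txt export is an addition that is meant only as a baseline to compare against the online converter's result.

```csharp
using System.IO;
using Aspose.Words;

class RtfToPdfExample
{
    static void Main()
    {
        // Hypothetical folder; replace with the directory that holds 30tmp.rtf.
        string myDir = @"C:\Temp\";

        // Load the RTF report.
        Document doc = new Document(Path.Combine(myDir, "30tmp.rtf"));

        // Render to PDF directly, without going through a printer driver.
        doc.Save(Path.Combine(myDir, "Out.pdf"), SaveFormat.Pdf);

        // Optional baseline (an addition, not part of the original suggestion):
        // export plain text directly from the same document.
        doc.Save(Path.Combine(myDir, "Out.txt"), SaveFormat.Text);
    }
}
```

Saving with an explicit SaveFormat avoids relying on the file extension, and comparing Out.txt with the text produced from Out.pdf may help show at which step the spaces are lost.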