Input3MB.zip (156.4 KB)
This is a 3 MB text file that takes a full 5 min 31 sec to convert to PDF/A-3B format.
We read the stream into memory and then convert it to PDF/A-3B. We are using Aspose.PDF version 24.5.1.
We want to know why it takes so long, and what the exact processing time is at your end when you read the stream and then convert it to PDF/A-3B format.
It takes so long because you use a text file with 1 line and no delimiters.
Kindly provide a code-based solution. The file is just 3 MB and takes 6 min 45 sec to convert.
The final output is just 361 pages, and the consumer is annoyed when comparing the page count with the time taken for conversion.
I am sending both the input and output files here. Please look at both files and advise how to deal with a small 3 MB file that takes 6 min 45 sec to convert.
Input3MB.zip (156.4 KB)
341ab67e-c0cd-4ad8-9cea-79dc055b01c3 (3).pdf (852.2 KB)
Also, may I request that you share the conversion time at your end, please?
@Gayatri_Naik, please share your code to convert the text file to PDF/A.
As I wrote, your sample file is one line with ~3000000 chars. It means you have a very long line, and formatting this line takes a long time.
Processing time depends on parameters like page size, font size, line spacing, etc.
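One quick way to confirm that a file has this shape is to count its lines and measure the longest one before converting. The following is a minimal sketch using only the .NET base library; `LineStats` and `Measure` are names made up for illustration and are not part of Aspose.PDF.

```csharp
using System;
using System.IO;

public static class LineStats
{
    // Hypothetical helper: returns the number of lines and the length
    // of the longest line read from the given reader.
    public static (int LineCount, int MaxLineLength) Measure(TextReader reader)
    {
        int count = 0;
        int max = 0;
        // ReadLine returns null at end of input and strips the line terminator.
        while (reader.ReadLine() is string line)
        {
            count++;
            if (line.Length > max)
                max = line.Length;
        }
        return (count, max);
    }
}
```

For example, `using var reader = new StreamReader("Input3MB.txt"); var (count, longest) = LineStats.Measure(reader);` gives both numbers. A count of 1 together with a longest line in the millions of characters would match the situation described above.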
Below is an example of how to process text files differently. It helps you make a rough estimate of the processing time.
using System;
using System.Collections.Generic;
using System.IO;
using Aspose.Pdf;
using Aspose.Pdf.Text;

internal class Program
{
    static void Main()
    {
        var licensePath = @"Aspose.PDF.NET.lic";
        new Aspose.Pdf.License().SetLicense(licensePath);

        // Convert 1: the file is used as-is (one very long line)
        var watch = System.Diagnostics.Stopwatch.StartNew();
        Convert1();
        watch.Stop();
        Console.WriteLine($"Convert1: {watch.ElapsedMilliseconds / 1000} s");

        // Convert 2: fixed-width chunks of 80 characters
        watch.Restart();
        Convert2();
        watch.Stop();
        Console.WriteLine($"Convert2: {watch.ElapsedMilliseconds / 1000} s");

        // Convert 3 (font size: 8pt): split after each </Item>
        watch.Restart();
        Convert3(8);
        watch.Stop();
        Console.WriteLine($"Convert3 (8pt): {watch.ElapsedMilliseconds / 1000} s");

        // Convert 3 (font size: 12pt)
        watch.Restart();
        Convert3(12);
        watch.Stop();
        Console.WriteLine($"Convert3 (12pt): {watch.ElapsedMilliseconds / 1000} s");
    }

    private static void Convert1()
    {
        // Read the text file as an array of strings, one per line
        var lines = File.ReadAllLines("Input3MB.txt");
        // Instantiate a Document object by calling its empty constructor
        Document document = new();
        // Add a new page to the Pages collection of the Document
        Page page = document.Pages.Add();
        // Set left and right margins for better presentation
        page.PageInfo.Margin.Left = 20;
        page.PageInfo.Margin.Right = 10;
        page.PageInfo.DefaultTextState.Font = FontRepository.FindFont("Courier New");
        page.PageInfo.DefaultTextState.FontSize = 12;
        foreach (var line in lines)
        {
            TextFragment text = new(line);
            page.Paragraphs.Add(text);
        }
        // Convert to PDF/A-3B; the conversion log goes to a throwaway stream
        using var log = new MemoryStream();
        document.Convert(log, PdfFormat.PDF_A_3B, ConvertErrorAction.Delete);
        // Save the converted document
        document.Save("Sample-Document-01.pdf");
    }

    private static void Convert2()
    {
        // Read the whole text file as a single string
        var content = File.ReadAllText("Input3MB.txt");
        // Cut the content into fixed-width chunks of 80 characters
        int chunkSize = 80;
        List<string> lines = [];
        for (int i = 0; i < content.Length; i += chunkSize)
        {
            if (i + chunkSize > content.Length)
                lines.Add(content.Substring(i));
            else
                lines.Add(content.Substring(i, chunkSize));
        }
        // Instantiate a Document object by calling its empty constructor
        Document document = new();
        // Add a new page to the Pages collection of the Document
        Page page = document.Pages.Add();
        // Set left and right margins for better presentation
        page.PageInfo.Margin.Left = 20;
        page.PageInfo.Margin.Right = 10;
        page.PageInfo.DefaultTextState.Font = FontRepository.FindFont("Courier New");
        page.PageInfo.DefaultTextState.FontSize = 12;
        foreach (var line in lines)
        {
            TextFragment text = new(line);
            page.Paragraphs.Add(text);
        }
        // Convert to PDF/A-3B; the conversion log goes to a throwaway stream
        using var log = new MemoryStream();
        document.Convert(log, PdfFormat.PDF_A_3B, ConvertErrorAction.Delete);
        // Save under a separate name so Convert1's output is not overwritten
        document.Save("Sample-Document-02.pdf");
    }

    private static void Convert3(float fontSize)
    {
        // Read the whole text file as a single string
        var content = File.ReadAllText("Input3MB.txt");
        // Insert a line break after each closing </Item> tag, then split on it
        content = content.Replace("</Item>", $"</Item>{Environment.NewLine}");
        var lines = content.Split(Environment.NewLine);
        // Instantiate a Document object by calling its empty constructor
        Document document = new();
        // Add a new page to the Pages collection of the Document
        Page page = document.Pages.Add();
        // Set left and right margins for better presentation
        page.PageInfo.Margin.Left = 20;
        page.PageInfo.Margin.Right = 10;
        page.PageInfo.DefaultTextState.Font = FontRepository.FindFont("Courier New");
        page.PageInfo.DefaultTextState.FontSize = fontSize;
        foreach (var line in lines)
        {
            TextFragment text = new(line);
            page.Paragraphs.Add(text);
        }
        // Convert to PDF/A-3B; the conversion log goes to a throwaway stream
        using var log = new MemoryStream();
        document.Convert(log, PdfFormat.PDF_A_3B, ConvertErrorAction.Delete);
        // Save under a name that includes the font size so runs do not overwrite each other
        document.Save($"Sample-Document-03-{fontSize}pt.pdf");
    }
}
Hey, I know I can add a trace to see the time elapsed, but I want to know how long it takes at your end,
so that we can inform the consumer accordingly.
Could you please clarify what you would like to know and what your real issue is?
The issue is why this file takes so long, and how can I deal with such a situation in code? What is the time taken at your end?
I’m sorry, but I answered the question.
Your file is not a standard text file. Your file contains one long string, which is unusual. This line causes lengthy processing.
Also, I provided several ways to split your one-line text into multiple-line text.
Your second question is not related to the Aspose.PDF library.
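For readers who want a single pre-processing step, the two splitting ideas from this thread (break after a known delimiter, and cut into fixed-width chunks) can be combined. The sketch below uses only the .NET base library and assumes the input contains no line breaks of its own, as with the sample file described above. `TextSplitter`, `SplitForLayout`, and their parameters are hypothetical names chosen for illustration and are not part of Aspose.PDF.

```csharp
using System;
using System.Collections.Generic;

public static class TextSplitter
{
    // Hypothetical helper: breaks the text after every occurrence of `delimiter`
    // (if one is given), then cuts any piece longer than `maxLength` into
    // fixed-width chunks. No characters are dropped or added.
    public static List<string> SplitForLayout(string content, string delimiter, int maxLength)
    {
        if (content == null) throw new ArgumentNullException(nameof(content));
        if (maxLength <= 0) throw new ArgumentOutOfRangeException(nameof(maxLength));

        var result = new List<string>();
        int start = 0;
        while (start < content.Length)
        {
            // Find the end of the current piece: just past the next delimiter,
            // or the end of the text when no delimiter remains.
            int hit = string.IsNullOrEmpty(delimiter)
                ? -1
                : content.IndexOf(delimiter, start, StringComparison.Ordinal);
            int end = hit < 0 ? content.Length : hit + delimiter.Length;

            // Emit the piece, cut into chunks of at most maxLength characters.
            for (int i = start; i < end; i += maxLength)
                result.Add(content.Substring(i, Math.Min(maxLength, end - i)));

            start = end;
        }
        return result;
    }
}
```

A call such as `TextSplitter.SplitForLayout(content, "</Item>", 80)` returns short strings that can each be added as a `TextFragment`, in the same way as in the examples earlier in the thread. Whether this shortens conversion for a particular file still has to be measured, since the timing also depends on page size, font size, and line spacing.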