Cannot concatenate PDF files produced by a previous concatenation


#1

I am using Pdf.Kit version 1.1.2 to concatenate a few PDF files. When I then use the resulting file as input to concatenate a few more PDF files, I get the following error:

MerrillLynch.Bank.Application.DBTest.FrameworkTest.TestPdfFileMerger : System.IO.IOException : Rebuild failed: Value cannot be null.
Parameter name: args; Original message: PDF startxref not found.

at Aspose.Pdf.Kit.PdfFileEditor.Concatenate(Stream[] inputStreams, Stream outputStream)
at MerrillLynch.Bank.Application.DBTest.FrameworkTest.TestPdfFileMerger() in C:\home\proj\GBGArchitecture\GBTFramework1.1\GBTFrameworkExamples\DBTest\FrameWorkTest.vb:line 691

Any idea?

Thanks

JiaJin Zhuang


#2

Dear Zhuang,

According to the exception information, there may be an error in the previous result stream.
1. Make sure the previous result stream is not null.
2. Set the stream's position back to the beginning.
3. Try writing the output stream to a file on disk. If the stream cannot be written to a file correctly, the output stream is corrupted.
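
The three checks above could be sketched roughly as follows. This is only a minimal illustration using standard System.IO calls; the module name, the helper DumpForInspection and its parameter names are hypothetical and are not part of Aspose.Pdf.Kit.

```vb
Imports System
Imports System.IO

Module StreamChecks
    ' Hypothetical helper: checks a result stream, dumps it to disk for
    ' inspection, and leaves it rewound for the next reader.
    Function DumpForInspection(result As MemoryStream, filePath As String) As Long
        ' Check 1: the previous result must exist.
        If result Is Nothing Then Throw New ArgumentNullException("result")

        ' Check 3: write the bytes that were actually produced to a file.
        ' WriteTo copies the stream's contents (Length bytes, not Capacity).
        Using fs As New FileStream(filePath, FileMode.Create, FileAccess.Write)
            result.WriteTo(fs)
        End Using

        ' Check 2: rewind so the next consumer starts at the first byte.
        result.Position = 0

        Return New FileInfo(filePath).Length
    End Function

    Sub Main()
        Dim ms As New MemoryStream()
        Dim data As Byte() = {37, 80, 68, 70}   ' the ASCII bytes "%PDF"
        ms.Write(data, 0, data.Length)

        Dim written As Long = DumpForInspection(ms, Path.GetTempFileName())
        Console.WriteLine("Bytes on disk: " & written & ", stream position: " & ms.Position)
    End Sub
End Module
```

Opening the dumped file in a PDF viewer is then a quick way to tell whether the intermediate result is already damaged before it is passed to the next concatenation.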

If none of the above helps, could you please send me the code and the test files?

Best regards.


#3

Here is my test code and two original pdf files.

If you change “Const m As Integer = 2” to “Const m As Integer = 1”, you will see the file “UsageReport_TopPerf.pdf” is generated (concatenation just once).

Thanks

JiaJin Zhuang

==============================================================

<Test()> Public Sub TestPdfFileMerger()
    System.Console.WriteLine("TestPdfFileMerger Start: " & Now())

    Dim inStr As FileStream
    inStr = File.Open("…\UsageReport.pdf", FileMode.Open, FileAccess.Read, FileShare.Read)
    Dim in1() As Byte = Array.CreateInstance(GetType(Byte), inStr.Length)
    inStr.Read(in1, 0, inStr.Length)
    inStr.Close()

    inStr = File.Open("…\TopPerf.pdf", FileMode.Open, FileAccess.Read, FileShare.Read)
    Dim in2() As Byte = Array.CreateInstance(GetType(Byte), inStr.Length)
    inStr.Read(in2, 0, inStr.Length)
    inStr.Close()

    Const m As Integer = 2
    Const n As Integer = 2

    Dim inStreams As Stream() = Array.CreateInstance(GetType(System.IO.Stream), n)
    Dim inStreams2 As MemoryStream() = Array.CreateInstance(GetType(System.IO.MemoryStream), m)
    Dim pdfEditor As PdfFileEditor = New PdfFileEditor
    Dim i, j As Integer

    For j = 0 To m - 1
        For i = 0 To n - 1
            inStreams(i) = New MemoryStream(in1)
            i = i + 1
            inStreams(i) = New MemoryStream(in2)
        Next

        Dim tmpout As MemoryStream = New MemoryStream
        pdfEditor.Concatenate(inStreams, tmpout)

        For i = 0 To n - 1
            inStreams(i).Close()
        Next

        inStreams2(j) = New MemoryStream(tmpout.GetBuffer())
    Next

    Dim outStream As MemoryStream = New MemoryStream
    pdfEditor.Concatenate(inStreams2, outStream)

    Dim outBuf() As Byte = outStream.GetBuffer()
    Dim sRptSave As New System.IO.FileStream("…\UsageReport_TopPerf.pdf", IO.FileMode.Create, IO.FileAccess.Write)
    sRptSave.Write(outBuf, 0, outBuf.Length)
    sRptSave.Close()

    System.Console.WriteLine("TestPdfFileMerger End: " & Now())
    Return
End Sub


#4

Hi,
I tested the code above. The error is introduced when transferring the data from the previous stream to the new memory stream. To avoid the bug, I modified the code as follows:

Public Sub TestPdfFileMerger()
    System.Console.WriteLine("TestPdfFileMerger Start: " & Now())

    ' Read both source PDFs into byte arrays.
    Dim inStr As FileStream
    inStr = File.Open("..\UsageReport.pdf", FileMode.Open, FileAccess.Read, FileShare.Read)
    Dim in1() As Byte = Array.CreateInstance(GetType(Byte), inStr.Length)
    inStr.Read(in1, 0, inStr.Length)
    inStr.Close()

    inStr = File.Open("..\TopPerf.pdf", FileMode.Open, FileAccess.Read, FileShare.Read)
    Dim in2() As Byte = Array.CreateInstance(GetType(Byte), inStr.Length)
    inStr.Read(in2, 0, inStr.Length)
    inStr.Close()

    Const m As Integer = 2
    Const n As Integer = 2

    Dim inStreams As Stream() = Array.CreateInstance(GetType(System.IO.Stream), n)
    Dim inStreams2 As MemoryStream() = Array.CreateInstance(GetType(System.IO.MemoryStream), m)
    Dim pdfEditor As PdfFileEditor = New PdfFileEditor
    Dim i, j As Integer

    For j = 0 To m - 1
        ' First pass: concatenate the two source files into tmpout.
        For i = 0 To n - 1
            inStreams(i) = New MemoryStream(in1)
            i = i + 1
            inStreams(i) = New MemoryStream(in2)
        Next

        Dim tmpout As MemoryStream = New MemoryStream
        pdfEditor.Concatenate(inStreams, tmpout)

        For i = 0 To n - 1
            inStreams(i).Close()
        Next

        ' Use the following code instead of inStreams2(j) = New MemoryStream(tmpout.GetBuffer()).
        ' I cannot point out the exact difference between them; the details are probably
        ' in the Microsoft development documentation.
        inStreams2(j) = New MemoryStream
        tmpout.WriteTo(inStreams2(j))
        ' Rewind the copy so the next reader starts at the first byte
        ' (harmless if the library seeks to the start on its own).
        inStreams2(j).Position = 0
    Next

    ' Second pass: concatenate the intermediate results.
    Dim outStream As MemoryStream = New MemoryStream
    pdfEditor.Concatenate(inStreams2, outStream)

    ' ToArray copies only the bytes actually written; GetBuffer would also
    ' include the unused capacity of the internal buffer.
    Dim outBuf() As Byte = outStream.ToArray()
    Dim sRptSave As New System.IO.FileStream("..\UsageReport_TopPerf.pdf", IO.FileMode.Create, IO.FileAccess.Write)
    sRptSave.Write(outBuf, 0, outBuf.Length)
    sRptSave.Close()

    System.Console.WriteLine("TestPdfFileMerger End: " & Now())
    Return
End Sub
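
A likely explanation for why replacing GetBuffer() with WriteTo helps: MemoryStream.GetBuffer() returns the stream's whole internal buffer, including any unused capacity past Length, while WriteTo and ToArray() copy only the bytes that were actually written. A PDF copied through GetBuffer() can therefore carry padding bytes after its end-of-file marker, which would plausibly explain a parser failing to find startxref. The sketch below only demonstrates the .NET behaviour and does not involve Aspose.Pdf.Kit; the module and helper names are hypothetical.

```vb
Imports System
Imports System.IO
Imports System.Text

Module GetBufferDemo
    ' Hypothetical helper: copies exactly the bytes written to the stream.
    Function ExactCopy(source As MemoryStream) As Byte()
        Return source.ToArray()
    End Function

    ' Hypothetical helper: counts the unused bytes that GetBuffer exposes
    ' beyond the stream's logical Length.
    Function PaddingBytes(source As MemoryStream) As Long
        Return source.GetBuffer().Length - source.Length
    End Function

    Sub Main()
        Dim payload As Byte() = Encoding.ASCII.GetBytes("%%EOF")
        Dim ms As New MemoryStream()
        ms.Write(payload, 0, payload.Length)

        Console.WriteLine("Bytes written:    " & ms.Length)
        Console.WriteLine("ToArray length:   " & ExactCopy(ms).Length)
        Console.WriteLine("GetBuffer length: " & ms.GetBuffer().Length)
        Console.WriteLine("Trailing padding: " & PaddingBytes(ms))
    End Sub
End Module
```

When the raw buffer is needed for performance reasons, passing the length explicitly, as in New MemoryStream(buffer, 0, CInt(source.Length)), is another way to avoid picking up the padding.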


#5

That change resolved the issue. Thanks!

Do you consolidate duplicate images? It saves a lot of space.

JiaJin


#6

Hi,

What does consolidating the duplicate images mean? The kit is designed only to process PDF files; it cannot do anything with files in other formats.
But I am happy to answer any questions that come up while you use and test the kit.

Thanks for considering Aspose.


#7

What I meant is this: when you concatenate two or more PDF files into one, many images (such as a company logo) are the same on every page. Those images can be consolidated so that only one copy remains in the final PDF file.

If you open the concatenated file in Adobe Acrobat Standard (6 or above) and press “Save”, you will see that the size is reduced, possibly by a lot.
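
One common way to detect identical images is to hash each image's raw bytes and keep a single copy per distinct hash. The sketch below illustrates only that general idea on plain byte arrays. It is not how Acrobat or Aspose.Pdf.Kit works internally, it does not read or write PDF objects, and the module and function names are hypothetical.

```vb
Imports System
Imports System.Collections.Generic
Imports System.Security.Cryptography

Module ImageDedup
    ' Hypothetical helper: returns one copy of each distinct byte payload,
    ' keyed by the Base64 text of its SHA-256 hash.
    Function Consolidate(images As IEnumerable(Of Byte())) As Dictionary(Of String, Byte())
        Dim unique As New Dictionary(Of String, Byte())
        Using sha As SHA256 = SHA256.Create()
            For Each img As Byte() In images
                Dim key As String = Convert.ToBase64String(sha.ComputeHash(img))
                ' Keep the first copy; later identical payloads are dropped.
                If Not unique.ContainsKey(key) Then unique.Add(key, img)
            Next
        End Using
        Return unique
    End Function

    Sub Main()
        Dim logo As Byte() = {1, 2, 3, 4}
        Dim logoCopy As Byte() = {1, 2, 3, 4}   ' same content, different array
        Dim chart As Byte() = {9, 8, 7}

        Dim kept As Dictionary(Of String, Byte()) =
            Consolidate(New Byte()() {logo, logoCopy, chart})
        Console.WriteLine("Distinct images kept: " & kept.Count)
    End Sub
End Module
```

In a real PDF, every page that used a dropped copy would also need its reference redirected to the surviving image object, which this sketch does not attempt.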

Hope this clarifies.

Thanks

JiaJin Zhuang


#8

Thank you very much for your enthusiasm!

I believe it is a good idea and may be useful for some applications, but I am afraid the feature cannot be implemented in the short term.