Combining PDF files in SharePoint libraries

In a SharePoint site I have several lists that contain PDF files. After gathering related files, I want to merge them into a single PDF file. It seems PdfFileEditor can do this, but I am having trouble with the resulting file: I get an "Error processing page" message. I've pasted some code below, which I pieced together from the documentation and forum posts. I am using version 4.4.0 of Pdf.Kit. Can you tell me where I am going wrong? Thanks.

sAttID = myItem[sAttachmentIDField].ToString();
SPListItemCollection myFileAttachments = GetAttachmentList(strFormAttLib, sAttachmentIDField, sAttID); //this returns files related to spDestinationFile (which is a SPFile obtained prior to this snippet)
int i = myFileAttachments.Count;
if (i > 0)
{
    Stream[] pdfStreams = new Stream[i + 1];
    Stream fsOrig = spDestinationFile.OpenBinaryStream();
    pdfStreams[0] = fsOrig;
    int m = 1;
    foreach (SPListItem myAtt in myFileAttachments)
    {
        SPFile myAttFile = myAtt.File;
        Stream myAttStream = myAttFile.OpenBinaryStream();
        pdfStreams[m] = myAttStream;
        m = m + 1;
    }
    MemoryStream combStream = new MemoryStream();
    PdfFileEditor pdfEditor = new PdfFileEditor();
    pdfEditor.Concatenate(pdfStreams, combStream);
    spDestinationFile = destinationWeb.Files.Add(destinationFilePath, combStream, true);
    combStream.Close();
}

Hi Dan,

Please share the problematic sample PDF file with us so that we can test the issue at our end. We will update you accordingly.

We’re sorry for the inconvenience.
Regards,

I'm not sure the PDF files are the problem. I rolled Pdf.Kit back to 4.3.0.0 and updated the code a little, as shown below. The resulting file is OK. I've attached a few sample PDF files. In this code snippet I've also added code to put a blank page between each file, simply by using a PDF file that has one blank page. I'd like to add some text to that blank page but haven't figured out how to do that. If you can point me in the right direction, I'd appreciate it. Thanks.

SPListItemCollection myFileAttachments = GetAttachmentList(strFormAttLib, sAttachmentIDField, sAttID);
int i = myFileAttachments.Count;
if (i > 0)
{
    SPFile blankPage = getBlankPage(); // mySites.AllWebs[txtProjectSite.Text].GetFile("Documents/blankpage.pdf");
    Stream blankStream = blankPage.OpenBinaryStream();
    Stream[] pdfStreams = new Stream[2 * i + 1];
    Stream fsOrig = spDestinationFile.OpenBinaryStream();
    pdfStreams[0] = fsOrig;
    int m = 1;
    foreach (SPListItem myAtt in myFileAttachments)
    {
        SPFile myAttFile = myAtt.File;
        Stream bp = blankPage.OpenBinaryStream();
        //how do I add text to the Stream bp?
        pdfStreams[m] = bp;
        m = m + 1;
        Stream myAttStream = myAttFile.OpenBinaryStream();
        pdfStreams[m] = myAttStream;
        m = m + 1;
    }
    MemoryStream combStream = new MemoryStream();
    PdfFileEditor pdfEditor = new PdfFileEditor();
    pdfEditor.Concatenate(pdfStreams, combStream);
    byte[] finalArray = combStream.GetBuffer();
    spDestinationFile = destinationWeb.Files.Add(destinationFilePath, finalArray, null, true);
    combStream.Close();
}

Hi Dan,

I have reproduced the first problem at my end and logged it as PDFKITNET-16789 in our issue tracking system. Our team will look into this issue and you’ll be updated via this forum thread once it is resolved.

As for your second requirement, you can use the AddText method of the PdfFileMend class to add text to a particular page at a particular location.

I hope this helps. If you have any further questions, please do let us know.
Regards,


I am having trouble adding text to my blank page. The code I am trying to use is given below. However, the resulting combined PDF file is corrupt. If I just add the blank page without adding text, the resulting file is fine. Can you tell me where I have gone wrong? I have also attached the "blankpage.pdf" I am using.

thanks

SPListItemCollection myFileAttachments = GetAttachmentList(strFormAttLib, sAttachmentIDField, sAttID);
int i = myFileAttachments.Count;
if (i > 0)
{
    SPFile blankPage = getBlankPage(); // mySites.AllWebs[txtProjectSite.Text].GetFile("Documents/blankpage.pdf");
    Stream[] pdfStreams = new Stream[2 * i + 1];
    Stream fsOrig = spDestinationFile.OpenBinaryStream();
    pdfStreams[0] = fsOrig;
    int m = 1;
    foreach (SPListItem myAtt in myFileAttachments)
    {
        SPFile myAttFile = myAtt.File;
        string sAttachmentMessage = "Attachment Filename: " + myAtt.Name;
        FormattedText ft = new FormattedText(sAttachmentMessage, System.Drawing.Color.FromArgb(0, 200, 0), Aspose.Pdf.Kit.FontStyle.TimesRoman, EncodingType.Winansi, false, 12);
        Stream bp = blankPage.OpenBinaryStream();
        Stream bpOut = new MemoryStream();
        PdfFileMend mendor = new PdfFileMend(bp, bpOut);
        mendor.AddText(ft, 1, 50, 100, 100, 200);
        mendor.Close();
        pdfStreams[m] = bpOut;
        m = m + 1;
        Stream myAttStream = myAttFile.OpenBinaryStream();
        pdfStreams[m] = myAttStream;
        m = m + 1;
    }
    MemoryStream combStream = new MemoryStream();
    PdfFileEditor pdfEditor = new PdfFileEditor();
    pdfEditor.Concatenate(pdfStreams, combStream);
    byte[] finalArray = combStream.GetBuffer();
    spDestinationFile = destinationWeb.Files.Add(destinationFilePath, finalArray, null, true);
    combStream.Close();
}

Hi Dan,

Please download the hotfix attached to this post and try it at your end. I hope it resolves your issue; if you still find a problem, please let us know.

We’re sorry for the inconvenience.
Regards,

Thanks. I will try the update later this week. In my previous post I also asked for help with adding text to a blank page. Can you take a look at my code and point me in the right direction?

Thanks, Dan

Hi Dan,

I couldn’t find any issue caused by blankpage.pdf or the AddText feature; the problem might be due to the other concatenation issue that was fixed in the previous hotfix. Please try this fix at your end and see whether you’re still facing any problem.

We’re sorry for the inconvenience and look forward to helping you out.
Regards,

The issues you have found earlier (filed as 16789) have been fixed in this update.



I have updated to the latest version of Pdf.Kit (4.5) but am still having trouble adding text to a page. I have attached two files as examples. The file with “notext” in its name was created with just a blank page inserted between two PDF documents. For the other file, which appears corrupted, I added a blank page and attempted to add text to it.

My code snippet is below. The only difference in the code between the two files is highlighted (the section between the //**** comments).

Can you see where I am going wrong?

thanks

private void WebConvertAlternate()
{
StringBuilder sbProcess = new StringBuilder();
SPSite mySites = null;
SPWeb myWeb = null;
try
{
setMainNodeValue("/my:MainFields/my:ConversionMessage", "");
string strServerUrl = _strUri.ToString(); // "http://dev.moss.lab";
string sBP = readMainNode("/my:MainFields/my:BidPackage");
string strSite = _strSitePath.ToString() + "/" + sBP;
string strFormLib = readMainNode("/my:MainFields/my:SourceForms"); //eg, Field Orders
string strFormInfo = readMainNode("/my:MainFields/my:CurrentFormInfo");
string[] formLibParts = strFormInfo.Split('|'); // cmbFormLibrary.SelectedValue.ToString().Split('|');
string strFormAttLib = formLibParts[0]; //attachment list for form list
string strFormContentType = formLibParts[1]; //content type for converted file
string sSourceIDField = formLibParts[2]; //source list id field
string sAttachmentIDField = formLibParts[3]; //idfield for attachment libraries
string fieldInternalName = "";
string sAttID = "";
string destFolder = readMainNode("/my:MainFields/my:DestinationLibrary"); // txtDestinationLibrary.Text;

SPSecurity.RunWithElevatedPrivileges(delegate()
{
mySites = new SPSite(strServerUrl);
myWeb = mySites.OpenWeb(strSite);
});
SPList formList = myWeb.Lists[strFormLib];
SPListItemCollection items = formList.Items;
DocumentConverterServiceClient client = WebServiceConverterHelper.OpenService();
//** Set the various open options
OpenOptions openOptions = new OpenOptions();
openOptions.Password = "";
openOptions.AllowMacros = MacroSecurityOption.None;
openOptions.RefreshContent = true;
ConversionSettings conversionSettings = new ConversionSettings();
conversionSettings.Fidelity = ConversionFidelities.Full;
conversionSettings.Format = OutputFormat.PDF;
conversionSettings.Quality = ConversionQuality.OptimizeForPrint;
conversionSettings.Range = ConversionRange.VisibleDocuments;
conversionSettings.StartPage = 0;
conversionSettings.EndPage = 0;
conversionSettings.GenerateBookmarks = BookmarkGenerationOption.Automatic;
conversionSettings.PDFProfile = PDFProfile.PDF_1_5;

foreach (SPListItem myItem in items)
{
if (myItem.FileSystemObjectType == SPFileSystemObjectType.File)
{
SPFile myFile = myItem.File;
string sourceFileName = myFile.Name;
openOptions.OriginalFileName = Path.GetFileName(sourceFileName);
string sFileExtension = Path.GetExtension(sourceFileName).ToLower();
if (sFileExtension == ".xml")
{
openOptions.FileExtension = sFileExtension;
string destinationFileName = Path.GetFileNameWithoutExtension(sourceFileName) + ".pdf";
sbProcess.Append("Reading source file " + sourceFileName + ".");
sbProcess.AppendLine();
byte[] sourceFileArray = myFile.OpenBinary();
byte[] convertedFile = client.Convert(sourceFileArray, openOptions, conversionSettings);
SPFolder destinationFolder = myWeb.GetFolder(destFolder);
string destinationFilePath = string.Format("{0}/{1}", destinationFolder.Url, destinationFileName);
SPWeb destinationWeb = destinationFolder.ParentWeb;
SPFile spDestinationFile = destinationWeb.GetFile(destinationFilePath);
spDestinationFile = destinationWeb.Files.Add(destinationFilePath, convertedFile, null, true);
ArrayList aFields = GetMetaDataFields(strFormLib);
foreach (string aField in aFields)
{
string[] intNames = aField.Split('|');
string sourceIntName = intNames[0];
string destIntName = intNames[1];
spDestinationFile.Item[destIntName] = myFile.Item[sourceIntName];
}
spDestinationFile.Item["ContentType"] = strFormContentType;
spDestinationFile.Item["Title"] = destinationFileName;
spDestinationFile.Item["Bid_x0020_Package"] = sBP;
spDestinationFile.Item.Update();

if (sAttachmentIDField != "")
{
fieldInternalName = formList.Fields[sSourceIDField].InternalName;
sAttID = myItem[fieldInternalName].ToString();
SPListItemCollection myFileAttachments = GetAttachmentList(strFormAttLib, sAttachmentIDField, sAttID);
int i = myFileAttachments.Count;
if (i > 0)
{
SPFile blankPage = getBlankPage(); // mySites.AllWebs[txtProjectSite.Text].GetFile("Documents/blankpage.pdf");
Stream[] pdfStreams = new Stream[2 * i + 1];
Stream fsOrig = spDestinationFile.OpenBinaryStream();
pdfStreams[0] = fsOrig;
int m = 1;
foreach (SPListItem myAtt in myFileAttachments)
{
SPFile myAttFile = myAtt.File;
Stream bp = blankPage.OpenBinaryStream();

//**** code to add text to the blank page
string sAttachmentMessage = "Attachment Filename: " + myAtt.Name;
FormattedText ft = new FormattedText(sAttachmentMessage, System.Drawing.Color.FromArgb(0, 200, 0), Aspose.Pdf.Kit.FontStyle.TimesRoman, EncodingType.Winansi, false, 12);
Stream bpOut = new MemoryStream();
PdfFileMend mendor = new PdfFileMend(bp, bpOut);
mendor.AddText(ft, 1, 50, 100, 100, 200);
mendor.Close();
//****

pdfStreams[m] = bp;
m = m + 1;
Stream myAttStream = myAttFile.OpenBinaryStream();
pdfStreams[m] = myAttStream;
m = m + 1;
}
MemoryStream combStream = new MemoryStream();
PdfFileEditor pdfEditor = new PdfFileEditor();
pdfEditor.Concatenate(pdfStreams, combStream);
byte[] finalArray = combStream.GetBuffer();
spDestinationFile = destinationWeb.Files.Add(destinationFilePath, finalArray, null, true);
combStream.Close();
}
}
sbProcess.Append("Completed source file " + sourceFileName + ".");
sbProcess.AppendLine();
}
}
}
sbProcess.Append("Completed converting form files from " + strFormLib + " in " + sBP + ".");
sbProcess.AppendLine();
}
catch (FaultException ex)
{
sbProcess.Append("FaultException occurred: ExceptionType: " + ex.Detail.ExceptionType.ToString());
sbProcess.AppendLine();
}
catch (Exception ex)
{
sbProcess.Append(ex.Message.ToString());
sbProcess.AppendLine();
}
finally
{

string sBP = readMainNode("/my:MainFields/my:BidPackage");

string strFormLib = readMainNode("/my:MainFields/my:SourceForms");
sbProcess.Append("Completed converting form files from " + strFormLib + " in " + sBP + ".");
sbProcess.AppendLine();
setMainNodeValue("/my:MainFields/my:ConversionMessage", sbProcess.ToString());

if (myWeb != null)
{
myWeb.Dispose();
}
if (mySites != null)
{
mySites.Dispose();
}
}
}

Hi Dan,

We need to investigate this issue at our end. Please give us some time to look into the problem in detail. We will update you with the results.

We’re sorry for the inconvenience.
Regards,

Hi Dan,

I’m working on this issue, but I need a little help understanding it. As I currently understand, you are mainly trying to do two things using Aspose.Pdf.Kit:

1. Add some text to a blank PDF page (a single-page blank PDF file)
2. Concatenate three files, i.e. File1.pdf + blank.pdf + File2.pdf

Is that correct? I can see that you have shared the output PDF files, but in order to investigate the issue we need the original input files, i.e. the blank PDF file before adding text and the other two files that are being concatenated with this blank page. Please share these files so that we can reproduce the issue at our end.

We’re sorry for the inconvenience.
Regards,

Attached are 3 files.

bph ...noattachment.pdf is the parent PDF document. It is created from a SharePoint InfoPath form library using a software package from Muhimbi (something a colleague ordered when my initial attempts at using your aspose.form were not very successful; the Muhimbi package is specifically geared toward converting XML files in SharePoint InfoPath form libraries to PDF files).

blankpage.pdf is the blank page between the parent PDF document and its attachment.

BPH RFI 0002....pdf is the file being appended to the parent, following the blank page.

thanks for your assistance.

Dan

Hi Dan,

I have tested this issue using your sample input files but couldn’t reproduce any problem; I tried both file streams and memory streams. There are three things that might be causing the problem:

1. Make sure the DLL you’re using in your project is the latest, 4.5.0.
2. After adding text to the blank file, move the position of the output stream back to 0, i.e. bpOut.Position = 0.
3. If it still doesn’t work, make sure that the position of every input stream being concatenated is set to 0.
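
The stream-position advice in points 2 and 3 can be illustrated without Aspose at all. The following is a minimal sketch using only System.IO; the class name StreamPositionDemo, the helper CountRemainingBytes, and the placeholder payload are made up for illustration and are not part of the thread's code. It shows that a stream which has just been written to is positioned at its end, so a consumer reading from the current position sees nothing until the position is reset.

```csharp
using System;
using System.IO;
using System.Text;

public static class StreamPositionDemo
{
    // Reads from the stream's current position to its end and returns the byte count.
    public static int CountRemainingBytes(Stream s)
    {
        byte[] buffer = new byte[4096];
        int total = 0;
        int read;
        while ((read = s.Read(buffer, 0, buffer.Length)) > 0)
        {
            total += read;
        }
        return total;
    }

    public static void Main()
    {
        // Stand-in for the stream a PDF tool has just written its output to.
        MemoryStream output = new MemoryStream();
        byte[] payload = Encoding.ASCII.GetBytes("%PDF-1.4 placeholder content");
        output.Write(payload, 0, payload.Length);

        // The position is now at the end, so nothing is left to read.
        Console.WriteLine(CountRemainingBytes(output));

        // After rewinding, the full payload is readable again.
        output.Position = 0;
        Console.WriteLine(CountRemainingBytes(output) == payload.Length);
    }
}
```

The same reasoning would apply to any stream handed from one processing step to the next, which is presumably why resetting bpOut.Position is suggested before concatenation.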

I’m sure this is going to resolve your issue. If you still find the issue or have some more questions, please do let us know.
Regards,


OK. Setting the position to 0 fixed the error, in that the files are now appended correctly, but the text I am adding isn’t appearing. I’ve tried to use settings similar to those in your examples, but I must be missing something. Any ideas? Thanks.

Hi Dan,

I have attached a sample with this post. Can you please try it at your end using the latest version of Aspose.Pdf.Kit for .NET? Also, please compare it with your code. I hope this might help.

We’re sorry for the inconvenience. If you have any further questions, please do let us know.
Regards,

After studying your code for a while, it finally dawned on me that I had a typo in my code.

I had

pdfStreams[m] = bp;

when it should have been

pdfStreams[m] = bpOut;

Once I corrected the typo, it worked. Thanks for your assistance.
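
One way to make this kind of mix-up (passing the input stream on instead of the stamped output) harder to repeat is to wrap the step in a helper that only ever returns the output stream, already rewound. The sketch below is an illustration under assumptions, not code from the thread: StreamSteps, Run, and CopyAndMark are hypothetical names, and CopyAndMark is only a stand-in for the real PdfFileMend step so that the example runs with System.IO alone.

```csharp
using System;
using System.IO;

public static class StreamSteps
{
    // Runs one processing step and returns its output, rewound and ready to read.
    // The caller never needs to create or rewind the output stream itself.
    public static MemoryStream Run(Stream input, Action<Stream, Stream> step)
    {
        MemoryStream output = new MemoryStream();
        step(input, output);
        output.Position = 0;
        return output;
    }

    // Hypothetical stand-in for the text-stamping step:
    // copies the input and appends one marker byte.
    public static void CopyAndMark(Stream input, Stream output)
    {
        int b;
        while ((b = input.ReadByte()) != -1)
        {
            output.WriteByte((byte)b);
        }
        output.WriteByte(0xFF);
    }

    public static void Main()
    {
        MemoryStream blank = new MemoryStream(new byte[] { 1, 2, 3 });
        MemoryStream stamped = Run(blank, CopyAndMark);
        Console.WriteLine(stamped.Length);   // 4: three copied bytes plus the marker
        Console.WriteLine(stamped.Position); // 0: ready for the next consumer
    }
}
```

In the real code, the delegate passed to Run would wrap the PdfFileMend calls shown earlier in the thread, and the returned stream is what would go into the pdfStreams array.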

As a recap, my process converts records in a SharePoint InfoPath form library to PDF files. Some of the InfoPath forms have attachments, which have to be extracted and appended to the parent PDF file. I am using Pdf.Kit to merge the parent and attachment files, which, after your assistance, was successful.

The attachments were supposed to be PDF files, but I have now run into a situation in which the attachments are Word documents instead. Of course, the process I developed does not work when the attachment is a Word document: when I try to open the resulting file, I get a message from Acrobat saying there was an error opening the document. The attachments are extracted from the InfoPath forms as a byte[] array (base64Binary). I assume I must handle the array differently depending on whether it comes from a PDF file or a Word document. I am hoping the answer is obvious to you and that I can use one of your great tools to make it work. My current code snippet is below. Thanks.

Stream blankPageStream = blankPage.OpenBinaryStream(); //just a blank page to separate a parent from its attachments
blankPageStream.Position = 0;
string sAttachmentMessage = "Attachment Filename: " + getAttachmentFileName(myAtt.Attachment); // myAtt.Attachment is a byte[] array of the attachment. it is base64Binary
FormattedText ft = new FormattedText(sAttachmentMessage, System.Drawing.Color.Black, Aspose.Pdf.Kit.FontStyle.TimesRoman, EncodingType.Winansi, false, 10);
Stream bpOut = new MemoryStream(); //bpOut is just a stream to hold the blank page with the FormattedText added by the PDFFileMend
PdfFileMend mendor = new PdfFileMend(blankPageStream, bpOut);
int p = 1;
int lx = 50;
int ly = 400;
mendor.AddText(ft, p, lx, ly);
mendor.Close();
bpOut.Position = 0;
Stream fsOrig = spDestinationFile.OpenBinaryStream(); //this is the stream for the parent pdf file
MemoryStream concatStream = CombineStreams(fsOrig, bpOut); //this is the stream of the parent plus the blank page separator
MemoryStream msAtt = new MemoryStream(myAtt.Attachment);// this is a memory stream created from the attachment byte array
concatStream = CombineStreams(concatStream, msAtt); //this is the stream of the parent plus the blank page separator plus the attachment
byte[] finalArray = concatStream.GetBuffer(); //this is a byte array of the combined stream
spDestinationFile = destinationWeb.Files.Add(destinationFilePath, finalArray, null, true); //this writes the result to the sharepoint library
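
A side note on the GetBuffer() call in the snippet above, independent of the Word question: MemoryStream.GetBuffer() returns the stream's whole internal buffer, which can include unused bytes beyond the data actually written, whereas ToArray() returns only the written bytes. The sketch below uses only System.IO; the class name BufferDemo and the helper Fill are illustrative. Whether the extra bytes matter for a given PDF consumer is not something this thread establishes, so treat it as something to check rather than a confirmed cause of any problem here.

```csharp
using System;
using System.IO;

public static class BufferDemo
{
    // Creates a MemoryStream and writes n zero bytes into it.
    public static MemoryStream Fill(int n)
    {
        MemoryStream ms = new MemoryStream();
        byte[] data = new byte[n];
        ms.Write(data, 0, data.Length);
        return ms;
    }

    public static void Main()
    {
        MemoryStream ms = Fill(10);

        byte[] raw = ms.GetBuffer();  // entire internal buffer, may include unused capacity
        byte[] exact = ms.ToArray();  // only the bytes that were written

        Console.WriteLine(ms.Length);                  // 10
        Console.WriteLine(exact.Length);               // 10
        Console.WriteLine(raw.Length >= exact.Length); // True
    }
}
```

If the trailing bytes turn out to be a concern, ToArray() would be the drop-in alternative when building the byte array that is uploaded to the library.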

Hi Dan,

We’re looking into this issue at our end and you’ll be updated with the results accordingly.

We’re sorry for the inconvenience.
Regards,

Hi Dan,

As far as I understand, you’re trying to concatenate the attachments with the original PDF file. It works fine when the attachments are the PDF files, while it fails when these are the Word documents.

I have also noticed that you’re reading the attachments directly into a MemoryStream and then concatenating them as shown below with the help of two lines of code from your sample:

MemoryStream msAtt = new MemoryStream(myAtt.Attachment);// this is a memory stream created from the attachment byte array

concatStream = CombineStreams(concatStream, msAtt); //this is the stream of the parent plus the blank page separator plus the attachment

This works fine when the attachment (myAtt.Attachment) is a PDF file; however, you can’t concatenate a Word document to a PDF directly. You’ll first have to convert the Word attachment to PDF and then concatenate it with the original file. You can use Aspose.Words to convert the DOC file to PDF. If you have any trouble converting DOC to PDF, please post your query in the Aspose.Words forum.
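
To decide per attachment whether a conversion step is needed, one option is to look at the first bytes of the data. The sketch below is a hedged illustration using only the base class library: AttachmentSniffer and GuessFormat are hypothetical names, and it assumes the byte array holds the raw file content starting at offset 0. If the array still carries an InfoPath attachment header in front of the file data, that header would have to be skipped first. The signatures used are the PDF header "%PDF-", the OLE compound file signature used by legacy .doc files, and the ZIP local-file signature used by .docx files.

```csharp
using System;
using System.Text;

public static class AttachmentSniffer
{
    private static readonly byte[] PdfSignature = Encoding.ASCII.GetBytes("%PDF-");
    private static readonly byte[] OleSignature = { 0xD0, 0xCF, 0x11, 0xE0, 0xA1, 0xB1, 0x1A, 0xE1 };
    private static readonly byte[] ZipSignature = { 0x50, 0x4B, 0x03, 0x04 };

    // Returns "pdf", "ole" (legacy Office container such as .doc),
    // "zip" (container used by .docx among others), or "unknown".
    public static string GuessFormat(byte[] data)
    {
        if (StartsWith(data, PdfSignature)) return "pdf";
        if (StartsWith(data, OleSignature)) return "ole";
        if (StartsWith(data, ZipSignature)) return "zip";
        return "unknown";
    }

    private static bool StartsWith(byte[] data, byte[] prefix)
    {
        if (data == null || data.Length < prefix.Length) return false;
        for (int i = 0; i < prefix.Length; i++)
        {
            if (data[i] != prefix[i]) return false;
        }
        return true;
    }

    public static void Main()
    {
        byte[] pdfLike = Encoding.ASCII.GetBytes("%PDF-1.5 ...");
        byte[] docLike = { 0xD0, 0xCF, 0x11, 0xE0, 0xA1, 0xB1, 0x1A, 0xE1, 0x00 };
        Console.WriteLine(GuessFormat(pdfLike)); // pdf
        Console.WriteLine(GuessFormat(docLike)); // ole
    }
}
```

Anything reported as "ole" or "zip" would then go through a Word-to-PDF conversion before being concatenated, while "pdf" could be concatenated as before. Note that "ole" and "zip" only identify the container, not that the content is specifically a Word document.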

If you think I haven’t understood your requirement and the issue clearly, please elaborate a little bit.
Regards,