Hi,
Hi John,
Thank you for sharing the template files.
I'm unable to extract embedded documents from many of the PDF files, and the number of affected files in my test set is fairly high. I'm attaching a document (PDFWithFileAttachmentAnnotation) from which I couldn't extract the embedded documents. I'm using "EmbeddedFileCollection files = pdfDoc.EmbeddedFiles".
We were able to reproduce the issue you mentioned in an initial test. It has been registered in our issue tracking system with issue ID PDFNEWNET-30920. We will notify you via this forum thread of any update on this issue.
Another problem is that I'm getting an invalid cast exception when opening certain documents, such as the attached file (CataLyst_3Attch), which you can use for testing.
We were able to reproduce the issue you mentioned in an initial test using your shared template file. It has been registered in our issue tracking system with issue ID PDFNEWNET-30922. We will notify you via this forum thread of any update on this issue.
Sorry for the inconvenience caused,
Hi John,
Thanks for your patience. I am pleased to share that the issues reported earlier have been fixed, and the hotfix will be included in the upcoming release. Please wait for the new v6.3.0.
Please note that a PDF document doesn't have files which are embedded directly. There are 2 types of attachments (embedded files): 1) those which come through the PDF document catalog /Names << /EmbeddedFiles this >> entry, 2) those which come through page file attachment annotations. The following code gets all embedded files of type 1:
EmbeddedFileCollection embeddedFiles = pdfDocument.EmbeddedFiles;
foreach (FileSpecification fileSpecification in embeddedFiles)
{
    // get the attachment and write it to a file or stream
    byte[] fileContent = new byte[fileSpecification.Contents.Length];
    fileSpecification.Contents.Read(fileContent, 0, fileContent.Length);
    FileStream fileStream = new FileStream(@"d:\pdftest" + fileSpecification.Name,
        FileMode.Create);
    fileStream.Write(fileContent, 0, fileContent.Length);
    fileStream.Close();
}
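The loop above fills the buffer with a single Read call, which the Stream contract does not guarantee will return every byte, and it builds the target path by plain string concatenation. Below is a minimal sketch of a more defensive way to write an attachment stream to disk. AttachmentWriter and Save are hypothetical names introduced only for illustration and are not part of Aspose.Pdf; the sketch uses nothing beyond System.IO.

```csharp
using System;
using System.IO;

// Hypothetical helper for illustration only; not an Aspose.Pdf type.
public static class AttachmentWriter
{
    // Copies the whole stream into outputDirectory and returns the path written.
    public static string Save(Stream contents, string attachmentName, string outputDirectory)
    {
        // An embedded file name may carry a directory part; keep only the file name.
        string safeName = Path.GetFileName(attachmentName);
        if (string.IsNullOrEmpty(safeName))
            throw new ArgumentException("Attachment has no usable file name.", nameof(attachmentName));

        // Creates the directory if needed; does nothing when it already exists.
        Directory.CreateDirectory(outputDirectory);
        string targetPath = Path.Combine(outputDirectory, safeName);

        using (FileStream target = new FileStream(targetPath, FileMode.Create, FileAccess.Write))
        {
            // CopyTo keeps reading until the source is exhausted,
            // so a short read cannot truncate the output.
            contents.CopyTo(target);
        }
        return targetPath;
    }
}
```

Inside the loop this could be called as AttachmentWriter.Save(fileSpecification.Contents, fileSpecification.Name, @"d:\pdftest"), assuming Contents is exposed as a readable Stream, as the snippet above suggests.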
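The next lines extract all embedded files at once using PdfExtractor:
PdfExtractor extractor = new PdfExtractor();
extractor.BindPdf(@"PDFWithFileAttachmentAnnotation.pdf");
extractor.ExtractAttachment();
// saves the extracted attachments to the given path ("" here);
// the parameterless GetAttachment() overload returns them as in-memory streams instead
extractor.GetAttachment("");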
The next sample does exactly the same but accesses each embedded file separately:
PdfExtractor extractor = new PdfExtractor();
extractor.BindPdf(@"PDFWithFileAttachmentAnnotation.pdf");
foreach (string name in extractor.GetAttachNames())
{
    extractor.ExtractAttachment(name);
    extractor.GetAttachment("");
}
PdfExtractor can also extract embedded files into memory (see comments). Note that PdfExtractor extracts all types of embedded files. Now let's use the DOM. I know that this PDF document has only one page, so I rely on that fact in the next code snippet:
foreach (Annotation annotation in pdfDocument.Pages[1].Annotations)
    if (annotation is FileAttachmentAnnotation)
    {
        FileAttachmentAnnotation attachment = (annotation as FileAttachmentAnnotation);
        using (BinaryReader reader = new BinaryReader(attachment.File.Contents))
        using (BinaryWriter writer = new BinaryWriter(new FileStream(Path.GetFileName(attachment.File.Name), FileMode.Create)))
            writer.Write(reader.ReadBytes((int)attachment.File.Contents.Length));
    }
The issues you found earlier (filed as 30920; 30922) have been fixed in this update.