Hello,
I am using Aspose.Pdf to convert PDF files to the PDF/A standard. In the process, I noticed that when converting a PDF file with a newly embedded file (sample-input.pdf), the output file (sample-output.pdf) ignores the MIME type given in the FileSpecification and uses application/pdf instead.
sample-output.pdf (407,6 KB)
sample-input.pdf (403,9 KB)
Code used (C#):
using (var doc = new Document(input))
{
    var fileSpecification = new FileSpecification((new StreamReader(documentPath)).BaseStream, "factur-x.xml")
    {
        AFRelationship = AFRelationship.Alternative,
        MIMEType = "text/xml"
    };
    doc.EmbeddedFiles.Add(fileSpecification);
    var options = new PdfFormatConversionOptions(errorStream, PdfFormat.PDF_A_3B, ConvertErrorAction.None)
    {
        OptimizeFileSize = true,
        ExcludeFontsStrategy = RemoveFontsStrategy.SubsetFonts | RemoveFontsStrategy.RemoveDuplicatedFonts
    };
    doc.Convert(options);
    doc.Save(output);
}
If, before saving, I use doc.EmbeddedFiles[doc.EmbeddedFiles.Count].MIMEType = "text/xml"; then the MIME type is saved correctly.
Aspose output:
39 0 obj
<</Names[(factur-x.xml)41 0 R]>>
endobj
40 0 obj
<</Filter/FlateDecode/Length 1437/Type/EmbeddedFile/Subtype/application#2Fpdf/Params<</CreationDate(D:20250206152709+02'00')/ModDate(D:20250206152709+02'00')/Size 5562>>>>stream
With my workaround (setting MIMEType via EmbeddedFiles):
39 0 obj
<</Names[(factur-x.xml)41 0 R]>>
endobj
40 0 obj
<</Filter/FlateDecode/Length 1437/Type/EmbeddedFile/Subtype/text#2Fxml/Params<</CreationDate(D:20250206152709+02'00')/ModDate(D:20250206152709+02'00')/Size 5562>>>>stream
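For readers comparing the two object dumps: inside a PDF name, #2F is the hex escape for the "/" character, so application#2Fpdf reads as application/pdf and text#2Fxml as text/xml. A minimal sketch of decoding such a name when inspecting the output by hand follows. The PdfName class and its Decode method are hypothetical helper names used only for illustration; they are not part of Aspose.Pdf.

```csharp
using System;
using System.Globalization;
using System.Text;

// Hypothetical helper for illustration only; not part of Aspose.Pdf.
public static class PdfName
{
    // Replaces each "#XX" hex escape in a PDF name with the character it encodes.
    // Only intended for ASCII escapes such as #2F ("/").
    public static string Decode(string name)
    {
        var sb = new StringBuilder(name.Length);
        for (int i = 0; i < name.Length; i++)
        {
            // A valid escape is '#' followed by exactly two hex digits.
            if (name[i] == '#' && i + 2 < name.Length
                && byte.TryParse(name.Substring(i + 1, 2), NumberStyles.AllowHexSpecifier,
                                 CultureInfo.InvariantCulture, out byte code))
            {
                sb.Append((char)code);
                i += 2; // skip the two hex digits just consumed
            }
            else
            {
                sb.Append(name[i]);
            }
        }
        return sb.ToString();
    }
}
```

For example, PdfName.Decode("application#2Fpdf") returns "application/pdf".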
@daniel.castilla
When converting PDF files to the PDF/A standard using Aspose.Pdf, it appears that the MIME type of embedded files may not be preserved as expected. In your case, the MIME type specified in the FileSpecification is being overridden to application/pdf during the conversion process.
To ensure that the correct MIME type is retained, you can set the MIME type of the embedded file after adding it to the document but before saving it. Here’s how you can modify your code:
using (var doc = new Document(input))
{
    var fileSpecification = new FileSpecification((new StreamReader(documentPath)).BaseStream, "factur-x.xml")
    {
        AFRelationship = AFRelationship.Alternative,
        MIMEType = "text/xml" // Set the desired MIME type
    };
    doc.EmbeddedFiles.Add(fileSpecification);
    // Ensure the MIME type is set correctly before conversion.
    // The EmbeddedFiles indexer is 1-based, so Count addresses the last entry.
    doc.EmbeddedFiles[doc.EmbeddedFiles.Count].MIMEType = "text/xml";
    var options = new PdfFormatConversionOptions(errorStream, PdfFormat.PDF_A_3B, ConvertErrorAction.None)
    {
        OptimizeFileSize = true,
        ExcludeFontsStrategy = RemoveFontsStrategy.SubsetFonts | RemoveFontsStrategy.RemoveDuplicatedFonts
    };
    doc.Convert(options);
    doc.Save(output);
}
In this code, after adding the FileSpecification to the EmbeddedFiles collection, we explicitly set the MIME type again to ensure it is correctly saved in the output PDF/A file. This should resolve the issue of the MIME type being incorrectly set to application/pdf.
If you continue to experience issues, consider checking for any updates or bug fixes in the Aspose.Pdf library that may address this behavior.
It does not work: the MIME type is changed during conversion. I have to call doc.EmbeddedFiles[doc.EmbeddedFiles.Count].MIMEType = "text/xml"; AFTER converting to PDF/A to fix the issue. But that is only a hotfix on my side; Aspose should handle it correctly.
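For anyone hitting the same problem before PDFNET-59855 is resolved, the hotfix described above (re-applying the MIME type after Convert and before Save) might look roughly like the sketch below. The file paths and the MemoryStream used for the conversion log are placeholder assumptions, and the approach relies on the behavior reported in this thread rather than on documented guarantees.

```csharp
using System.IO;
using Aspose.Pdf;

// Placeholder paths (assumptions); replace with your own files.
string input = "sample-input.pdf";
string output = "sample-output.pdf";
string documentPath = "factur-x.xml";

using (var errorStream = new MemoryStream())        // receives the conversion log
using (var xmlStream = File.OpenRead(documentPath)) // stays open until the document is saved
using (var doc = new Document(input))
{
    var fileSpecification = new FileSpecification(xmlStream, "factur-x.xml")
    {
        AFRelationship = AFRelationship.Alternative,
        MIMEType = "text/xml"
    };
    doc.EmbeddedFiles.Add(fileSpecification);

    var options = new PdfFormatConversionOptions(errorStream, PdfFormat.PDF_A_3B, ConvertErrorAction.None);
    doc.Convert(options);

    // Workaround: the conversion was observed to reset the MIME type to
    // application/pdf, so set it again AFTER Convert and BEFORE Save.
    // The indexer is used 1-based here, as in the posts above.
    doc.EmbeddedFiles[doc.EmbeddedFiles.Count].MIMEType = "text/xml";

    doc.Save(output);
}
```

This assumes the newly added attachment is the last entry in EmbeddedFiles; if the document already contains other attachments, it may be safer to look the entry up by name.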
@daniel.castilla
We have opened the following new ticket(s) in our internal issue tracking system and will deliver their fixes according to the terms mentioned in Free Support Policies.
Issue ID(s): PDFNET-59855
You can obtain Paid Support Services if you need support on a priority basis, along with direct access to our Paid Support management team.