.NET: XMP metadata property value containing < and > characters is set with escaped/encoded characters &lt; and &gt;

We are attempting to use the Aspose.PDF .NET library to set XMP metadata in PDF and AI files. This generally works as expected. But when we set a value that contains < and > characters on an XMP metadata property (like RDF tags), then look directly at the XMP metadata packet within the file data (for example by opening it in Notepad++), we see that the value that’s been written has these characters escaped/encoded as &lt; and &gt;. This is not the case when using the Aspose.Imaging .NET or Aspose.PSD .NET libraries to set XMP metadata in JPEG/PNG/TIFF/GIF or PSD/PSB file formats.

Example:
We set the value of dc:creator as <rdf:Seq><rdf:li>2021</rdf:li></rdf:Seq>
In the XMP metadata packet, we see: <dc:creator>&lt;rdf:Seq&gt;&lt;rdf:li&gt;Aprimo 2022 (C)&lt;/rdf:li&gt;&lt;/rdf:Seq&gt;</dc:creator>
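
For comparison, this is the behaviour a generic XML writer shows whenever markup is handed to it as a plain string: the string is treated as character data, so the angle brackets are escaped. The sketch below uses System.Xml.Linq rather than Aspose.PDF. The class and method names are made up for illustration, and whether Aspose.PDF serializes string values the same way internally is only an assumption.

```csharp
using System.Xml.Linq;

// Illustration only: plain System.Xml.Linq, no Aspose.PDF calls.
public static class XmpEscapingSketch
{
    static readonly XNamespace Dc = "http://purl.org/dc/elements/1.1/";
    static readonly XNamespace Rdf = "http://www.w3.org/1999/02/22-rdf-syntax-ns#";

    // The RDF markup is passed as a string, so it becomes text content
    // and the writer escapes it (each "<" is written as "&lt;").
    public static string AsText()
    {
        var creator = new XElement(Dc + "creator",
            new XAttribute(XNamespace.Xmlns + "dc", Dc.NamespaceName),
            "<rdf:Seq><rdf:li>2021</rdf:li></rdf:Seq>");
        return creator.ToString(SaveOptions.DisableFormatting);
    }

    // The same structure is passed as child elements, so it is written
    // as real rdf:Seq / rdf:li tags.
    public static string AsElements()
    {
        var creator = new XElement(Dc + "creator",
            new XAttribute(XNamespace.Xmlns + "dc", Dc.NamespaceName),
            new XAttribute(XNamespace.Xmlns + "rdf", Rdf.NamespaceName),
            new XElement(Rdf + "Seq", new XElement(Rdf + "li", "2021")));
        return creator.ToString(SaveOptions.DisableFormatting);
    }
}
```

Calling XmpEscapingSketch.AsText() gives the escaped form described above, while XmpEscapingSketch.AsElements() keeps the rdf:Seq element intact.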

We thought it might be related to setting this RDF data as a string when perhaps it should be an XmpValue array:

pdf.Metadata.Add("dc:keywords", new XmpValue(new XmpValue[] { new XmpValue("<rdf:Seq><rdf:li>2021</rdf:li></rdf:Seq>") }));

But then we get the same result, except it’s enclosed by <rdf:Bag> </rdf:Bag> tags.

We could instead add only the string values from inside the RDF tags to the array, one element per value, which works like this:

pdf.Metadata.Add("dc:keywords", new XmpValue(new XmpValue[] { new XmpValue("2021"), new XmpValue("2022") }));

and results in this value in the XMP metadata property:

<dc:keywords>
<rdf:Bag>
<rdf:li>2021</rdf:li>
<rdf:li>2022</rdf:li>
</rdf:Bag>
</dc:keywords>

In this case, the < and > characters aren’t escaped/encoded. But then we don’t have any control over the enclosing tags of the inserted value: it’s always <rdf:Bag> </rdf:Bag>, whereas we want to be able to choose the RDF container (Bag/Alt/Seq, see https://stackoverflow.com/questions/29001433/how-rdfbag-rdfseq-and-rdfalt-is-different-while-using-them).
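
To make the desired result concrete, here is a rough sketch of the XMP fragment we would like to be able to produce, with the container chosen by the caller. It is built with System.Xml.Linq only; RdfContainerSketch and Build are hypothetical names and are not part of Aspose.PDF.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

// Illustration only: shows the target XML, not an Aspose.PDF API.
public static class RdfContainerSketch
{
    static readonly XNamespace Dc = "http://purl.org/dc/elements/1.1/";
    static readonly XNamespace Rdf = "http://www.w3.org/1999/02/22-rdf-syntax-ns#";

    // Hypothetical helper: wraps the items in the requested RDF container
    // ("Bag", "Seq" or "Alt") under a dc:<property> element.
    public static XElement Build(string property, string container, params string[] items)
    {
        if (container != "Bag" && container != "Seq" && container != "Alt")
            throw new ArgumentException("Expected Bag, Seq or Alt.", nameof(container));

        return new XElement(Dc + property,
            new XAttribute(XNamespace.Xmlns + "dc", Dc.NamespaceName),
            new XAttribute(XNamespace.Xmlns + "rdf", Rdf.NamespaceName),
            new XElement(Rdf + container,
                // One rdf:li element per value, in the order given.
                items.Select(item => new XElement(Rdf + "li", item))));
    }
}
```

For example, RdfContainerSketch.Build("keywords", "Seq", "2021", "2022") yields a dc:keywords element whose values sit inside rdf:Seq instead of rdf:Bag.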

So the question is, is there an issue with how we are using the Aspose.PDF library to set XMP metadata, or is this a bug?

A notable difference between Aspose.PDF and Aspose.Imaging / Aspose.PSD is that with the latter two libraries, we are using the XmpPacketWrapper object. Aspose.PDF doesn’t have an equivalent XmpPacketWrapper object that we have access to or can use in the same way to manipulate the file’s XMP metadata packet.

We are using Aspose.PDF version 21.12.0. Release notes for the 22.x releases do not indicate that this is an issue that has been resolved in later versions.

Our implementation to set XMP metadata in Aspose.PDF:

var pdf = new Aspose.Pdf.Document(localInputPath);

// Register the Dublin Core namespace if the "dc" prefix is not known yet.
// ("namespace" is a reserved word in C#, hence the name dcNamespace.)
var dcNamespace = pdf.Metadata.GetNamespaceUriByPrefix("dc");
if (String.IsNullOrEmpty(dcNamespace))
{
    pdf.Metadata.RegisterNamespaceUri("dc", "http://purl.org/dc/elements/1.1/");
}

// Overwrite dc:keywords if it exists, otherwise add it.
if (pdf.Metadata.ContainsKey("dc:keywords"))
{
    pdf.Metadata["dc:keywords"] = "<rdf:Seq><rdf:li>2021</rdf:li></rdf:Seq>";
}
else
{
    pdf.Metadata.Add("dc:keywords", "<rdf:Seq><rdf:li>2021</rdf:li></rdf:Seq>");
}

pdf.Save(localOutputPath);

@amytant

Can you please share a sample PDF document along with the output PDF generated at your end? We will test the scenario in our environment and address it accordingly.

We’re seeing this with every PDF or AI file we try to set XMP metadata on. Here are sample input and output PDF documents.
MultiColumn input.pdf (36.8 KB)
MultiColumn output.pdf (35.2 KB)

@amytant

We have tested the scenario in our environment using version 22.2 of the API and noticed a similar issue in the generated PDF output. Therefore, an issue has been logged in our issue tracking system as PDFNET-51477. We will look further into its details and keep you posted on the status of its rectification. Please be patient and spare us some time.

We are sorry for the inconvenience.
