XML-FO Validation issues

We’re currently migrating from an old version of Aspose.Pdf (11.6.0) to the current one, and are running into some issues with XML-FO binding.

Using the example application:

using (var xmlStream = new FileStream("Content.fo.xml", FileMode.Open))
using (var outputStream = new FileStream("Result.pdf", FileMode.Create))
{
    var loadOptions = new XslFoLoadOptions();
    var pdfDocument = new Document(xmlStream, loadOptions);
    
    // pdfDocument.Convert("ConversionResult.txt", PdfFormat.PDF_A_1A, ConvertErrorAction.Delete);
    pdfDocument.Save(outputStream);

    if (!pdfDocument.Validate("ValidationResult.txt", pdfDocument.PdfFormat))
    {
        throw new Exception("Could not validate PDF!");
    }
}

(Side topic: why do I need to convert the document before it is compliant? Shouldn't it be compliant out of the box? With the Convert() call commented out, as above, the validation fails.)

With the following XML-FO content:

<?xml version="1.0" encoding="utf-8"?>
<fo:root xmlns:fo="http://www.w3.org/1999/XSL/Format">
    <fo:layout-master-set>
        <fo:simple-page-master master-name="main-page-master" page-width="210mm" page-height="297mm" margin-top="4.2mm" margin-bottom="0mm" margin-left="20.1mm" margin-right="20.1mm">
        <fo:region-body region-name="xsl-region-body" margin-top="35.8mm" margin-bottom="35mm" />
        <fo:region-before region-name="xsl-region-before" extent="25mm" />
        <fo:region-after region-name="xsl-region-after" extent="25mm" />
        </fo:simple-page-master>
    </fo:layout-master-set>
    <fo:page-sequence master-reference="main-page-master">
        <fo:static-content flow-name="xsl-region-after">
        <fo:wrapper/>
        </fo:static-content>
        <fo:flow flow-name="xsl-region-before">
        <fo:wrapper/>
        </fo:flow>
        <fo:flow flow-name="xsl-region-body">
        <fo:wrapper>Hello world!</fo:wrapper>
        </fo:flow>
    </fo:page-sequence>
</fo:root>

Aspose throws an exception with two validation errors:

============================================
For "fo:page-sequence", only one "fo:flow" may be declared. Line(fo:flow)\Col(17)
---------------------------------
"fo:#PCDATA" is not a valid child of "wrapper"![ rule.wrapperInvalidChildForParent] Line(18)\Col(8)
============================================

First of all: what is the currently supported XML-FO spec used in Aspose.Pdf?

XML-FO 1.0 supports fo:wrapper with #PCDATA content (see "Formatting Objects" in the specification).

XML-FO 1.1 supports multiple fo:flow elements inside a fo:page-sequence (see "Extensible Stylesheet Language (XSL) Version 1.1").

@tobbentm

We were able to observe the error that you reported and have logged it as investigation ticket PDFNET-46996 in our issue tracking system. We will look into the details and share our feedback with you as soon as the investigation is complete. Please be patient and give us a little time.

Would you kindly create a separate topic for this inquiry and attach the respective source and output PDF documents? We will test the scenario in our environment and address it accordingly.

Great, please let me know if you need more extensive samples for the XML-FO content.

I’ve noticed several other discrepancies from the specification, but I assume you could run a full test yourself?

As for the side topic regarding validation, the example code is embedded in the original post. To explain a bit better: after creating a document (new Document()), the Document object has a “PdfFormat” property, which states which PDF specification the document should comply with. However, if you immediately Validate() the Document against that initial PdfFormat, the validation fails! How could Aspose possibly create an invalid Document out of the box?

@tobbentm

We will surely let you know in case we need further information from your side.

Thanks for the further elaboration.

The Document.Validate() method is used to validate PDF/A compliance. For performance reasons, the document is not converted to PDF/A on Save, which is why you should call the Convert() method to get a PDF/A-compliant document. However, we admit that the PdfFormat enum creates confusion, and we will fix its default value.

For this purpose, ticket PDFNET-46999 has been created in our issue tracking system and linked to this forum thread. You will receive a notification as soon as the ticket is resolved. Please give us a little time.
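
Until then, the order described above (Convert() first, then Save(), then Validate() against the same explicit format) can be sketched as follows. This is only a minimal sketch based on the code from the original post: the choice of PDF/A-1a as the target format and ConvertErrorAction.Delete are assumptions taken from the commented-out line there, and the file names are the ones used in that post.

```csharp
using System;
using System.IO;
using Aspose.Pdf;

class Program
{
    static void Main()
    {
        using (var xmlStream = new FileStream("Content.fo.xml", FileMode.Open, FileAccess.Read))
        using (var outputStream = new FileStream("Result.pdf", FileMode.Create))
        {
            var pdfDocument = new Document(xmlStream, new XslFoLoadOptions());

            // Assumption: PDF/A-1a is the desired target. Naming it explicitly
            // avoids relying on the initial value of pdfDocument.PdfFormat.
            var targetFormat = PdfFormat.PDF_A_1A;

            // Convert before saving; Save() alone does not produce PDF/A.
            bool converted = pdfDocument.Convert(
                "ConversionResult.txt", targetFormat, ConvertErrorAction.Delete);

            pdfDocument.Save(outputStream);

            // Validate against the same format that was passed to Convert().
            if (!converted || !pdfDocument.Validate("ValidationResult.txt", targetFormat))
            {
                throw new Exception("Could not validate PDF!");
            }
        }
    }
}
```

If either call returns false, the log files ConversionResult.txt and ValidationResult.txt should list the entries that could not be made compliant.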

The issues you have found earlier (filed as PDFNET-46999) have been fixed in Aspose.PDF for .NET 19.10.