I am trying to convert entire HTML pages to PDF, but I keep getting parsing errors. My code looks something like this:
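// Create the document and add a section to hold the content
Pdf pdf = new Pdf();
Section section = pdf.getSections().add();
// HTMLTextFromFile is a String holding the contents of the HTML page
Text HTMLText1 = new Text(HTMLTextFromFile);
// Ask the library to interpret the text as HTML markup
HTMLText1.setIsHtmlTagSupported(true);
section.getParagraphs().add(HTMLText1);
// Write the PDF to disk; both exceptions below are thrown from this call
String outFilePDF = "d:/pdftest/SamplePDF_HTMLTest.pdf";
pdf.save(outFilePDF);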
My HTML pages start with an xhtml1-transitional doctype:
...
The error I am getting is:
[Fatal Error] :1:16: A DOCTYPE is not allowed in content.
org.xml.sax.SAXParseException: A DOCTYPE is not allowed in content.
at com.sun.org.apache.xerces.internal.parsers.DOMParser.parse(DOMParser.java:264)
at com.sun.org.apache.xerces.internal.jaxp.DocumentBuilderImpl.parse(DocumentBuilderImpl.java:292)
at aspose.pdf.xml.h.a(SourceFile:428)
at aspose.pdf.xml.h.a(SourceFile:2388)
at aspose.pdf.xml.ao.a(SourceFile:441)
at aspose.pdf.xml.n.a(SourceFile:759)
at aspose.pdf.xml.P.a(SourceFile:105)
at aspose.pdf.xml.w.a(SourceFile:112)
at aspose.pdf.Pdf.save(SourceFile:1142)
When I remove the doctype I get the following error:
[Fatal Error] :1:1793: The entity name must immediately follow the '&' in the entity reference.
org.xml.sax.SAXParseException: The entity name must immediately follow the '&' in the entity reference.
at com.sun.org.apache.xerces.internal.parsers.DOMParser.parse(DOMParser.java:264)
at com.sun.org.apache.xerces.internal.jaxp.DocumentBuilderImpl.parse(DocumentBuilderImpl.java:292)
at aspose.pdf.xml.h.a(SourceFile:428)
at aspose.pdf.xml.h.a(SourceFile:2388)
at aspose.pdf.xml.ao.a(SourceFile:441)
at aspose.pdf.xml.n.a(SourceFile:759)
at aspose.pdf.xml.P.a(SourceFile:105)
at aspose.pdf.xml.w.a(SourceFile:112)
at aspose.pdf.Pdf.save(SourceFile:1142)
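Both traces pass through the JDK's DocumentBuilder, so the text seems to be handed to a strict XML parser at some point. Below is a minimal sketch that checks strings against that same kind of parser without involving Aspose at all. The class name XmlParseCheck, the method parseError and the sample strings are made up for illustration; in particular, the <Text> wrapper element is only an assumption about how the library might embed the HTML, not something I have confirmed.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

public class XmlParseCheck {

    /** Returns null if the text is well-formed XML, otherwise the parser's complaint. */
    public static String parseError(String xml) {
        try {
            DocumentBuilder builder =
                    DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // DefaultHandler rethrows fatal errors instead of printing "[Fatal Error]" to stderr
            builder.setErrorHandler(new DefaultHandler());
            builder.parse(new InputSource(new StringReader(xml)));
            return null;
        } catch (SAXParseException e) {
            // Same line:column: message layout as the console output above
            return e.getLineNumber() + ":" + e.getColumnNumber() + ": " + e.getMessage();
        } catch (Exception e) {
            return e.toString();
        }
    }

    public static void main(String[] args) {
        // Hypothetical inputs, not taken from my real pages
        String doctypeInsideElement = "<Text><!DOCTYPE html><html/></Text>";
        String bareAmpersand = "<p>Tom & Jerry</p>";
        String escapedAmpersand = "<p>Tom &amp; Jerry</p>";

        System.out.println(parseError(doctypeInsideElement));
        System.out.println(parseError(bareAmpersand));
        System.out.println(parseError(escapedAmpersand)); // null means it parsed cleanly
    }
}
```

If that reading is right, a doctype that ends up inside another element and any unescaped '&' in the page (for example in a URL query string) would each be rejected before the PDF is generated.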
A little help is appreciated. What am I doing wrong?