Ordered/unordered lists lose formatting when they break across multiple pages, and a table structure creates large gaps when text breaks across multiple pages

We are using Aspose.Total v22.2

Creating a PDF from an HTML string has an issue when an ordered list spans multiple pages. The list looks fine on the first page, but as soon as it continues onto the next page it disregards the alignment.

This is the layout that I’m getting after the PDF is created from the HTML
image.png (78.2 KB)

This is the HTML before creating the PDF
image.png (54.0 KB)

Here is the portion of the HTML for the ordered list in item 23.

<div style="margin-left:0em;" class="meeting-item" data-itemid="123858" data-hasattachments="True">
    <table class="item-table" style="padding-bottom:5px;">
        <tbody>
            <tr>
                <td class="attachment-cell">
                    <div class="pull-left attachment-icon-holder"><span class="glyphicon glyphicon-paperclip" title="Has Attachments" style="color:#aaa;"></span></div>
                </td>
                <td class="number-cell">
                    <div style="display: block; width: 100%;"><span style="font-size:12pt;"><span style="font-family:Arial,Helvetica,sans-serif;">(23)</span></span></div>
                </td>
                <td class="item-cell">
                    <div class="agenda-item" id="AgendaItem_123858">
                        <span style="display:block;">
                            <span style="display:block;width:100%;">
                                <span style="vertical-align:top;font-weight:bold;font-size:12pt;font-family:Arial,Helvetica,sans-serif;">13-1232-S2</span>
                            </span>
                            <span style="display:block;width:100%;padding-bottom:12pt;">
                                <span style="width:20%;display:inline-block;vertical-align:top;font-weight:bold;font-family:Arial!important;font-size:12pt!important;">CD 12</span>
                                <span style="display:inline-block;width:79%;vertical-align:top;font-family:Arial!important;font-size:12pt!important;">
                                    <span style="text-align:justify;font-size:12pt;font-family:Arial,Helvetica,sans-serif;">
                                        <span style="font-size:12pt;">
                                            <span style="font-family:Arial,Helvetica,sans-serif;">
                                                MOTION (LEE - KORETZ) relative to funding for any aspect of the efforts / operations of the Granada Hills Youth Recreation Center, Inc.
                                            </span>
                                        </span>
                                    </span>
                                </span>
                            </span>
                            <span style="display:block;width:100%;padding-bottom:12pt;">
                                <span style="display:inline-block;width:20%;vertical-align:top;">&nbsp;</span>
                                <span style="display:inline-block;width:79%;vertical-align:top;font-family:Arial!important;text-align:justify;font-size:12pt!important;">
                                    Recommendations for Council action:
                                    <br>
                                    <br>
                                    <ol>
                                        <li>RESOLVE that $20,000 in the Sunshine Canyon Community Amenities Trust Fund No. 699/14 be allocated and appropriated for any aspect of the efforts / operations of the Granada Hills Youth Recreation Center, Inc.<br>
                                            <br>
                                            &nbsp;
                                        </li>
                                        <li>DIRECT City Clerk to prepare and process the necessary document(s) with, and/or payment(s) to Granada Hills Youth Recreation Center, Inc., or any other agency or organization, as appropriate, in the above amount, from the above source, and for the above purposes, subject to the approval of the City Attorney as to form, if needed; and, if needed, Authorize the Councilmember of the Twelfth District to execute any such documents on behalf of the City.<br>
                                            <br>
                                            &nbsp;
                                        </li>
                                        <li>AUTHORIZE the City Clerk to make any technical corrections or clarifications to the above fund transfer instructions in order to effectuate the intent of this motion.</li>
                                    </ol>
                                </span>
                            </span>
                        </span>
                    </div>
                </td>
            </tr>
        </tbody>
    </table>
</div>

I have also tried accomplishing this with a table structure. The table keeps the alignment correct, but it causes another issue: the text does not stay together, as you can see in this output. The table structure looks much better, but it doesn’t work for us because the text does not flow from page to page and large gaps are left, like the ones you see here.
image.png (42.3 KB)

Hi,
There is a document with a tutorial on page setup for Aspose.Total. Please check it out at Fine-Tuning Converters.

You will find the examples on how to control page margins and alignment helpful.

I hope it helps.

One thing to note is that we are using the .NET version of Aspose.

We don’t actually do any page setup. We let the HTML configure the entire document, and we use:

public void AddPages(string pdfHtml)
        {
            // Encode PDF HTML to UTF8
            byte[] bytes = Encoding.UTF8.GetBytes(pdfHtml);
            pdfHtml = Encoding.UTF8.GetString(bytes);

            Document agendaDoc = new Document(new MemoryStream(Encoding.UTF8.GetBytes(pdfHtml)), htmlLoadOptions);
            ELSLogHelper.InsertInfoLog(_callContext, ELSLogHelper.AsposeLogMessage("Open"), MethodBase.GetCurrentMethod()?.Name, MethodBase.GetCurrentMethod().DeclaringType?.Name, Environment.StackTrace);
            MemoryStream stream = new MemoryStream();

            agendaDoc.Save(stream);
            ELSLogHelper.InsertInfoLog(_callContext, ELSLogHelper.AsposeLogMessage("Save"), MethodBase.GetCurrentMethod()?.Name, MethodBase.GetCurrentMethod().DeclaringType?.Name, Environment.StackTrace);

            mainDocument = new Document(stream);
            ELSLogHelper.InsertInfoLog(_callContext, ELSLogHelper.AsposeLogMessage("Open"), MethodBase.GetCurrentMethod()?.Name, MethodBase.GetCurrentMethod().DeclaringType?.Name, Environment.StackTrace);
            PdfFileEditor pfe = new PdfFileEditor();
            pfe.ResizeContents(mainDocument, PdfFileEditor.ContentsResizeParameters.PageResize(PageSize.PageLetter.Width, PageSize.PageLetter.Height));
            pfe.ResizeContents(agendaDoc, PdfFileEditor.ContentsResizeParameters.PageResize(PageSize.PageLetter.Width, PageSize.PageLetter.Height));
        }

This creates the entire PDF document for us. After the document is created, we use our own SaveFile method, which saves the document as a PDF and then places it in Azure.

private async Task SaveFile(CompiledMeetingDocumentFile file, string filename, int meetingId, bool isPdfConvert, IDocumentHelper document = null)
        {
            string filePath = string.Empty;
            try
            {
                // Creates a temp folder on the hosting environment if it doesn't already exist.
                var savePath = System.Web.Hosting.HostingEnvironment.MapPath($"~/temp/{meetingId}/");
                if (!Directory.Exists(savePath))
                {
                    if (string.IsNullOrEmpty(savePath))
                    {
                        throw new Exception("Could not create a valid file path in the current Hosting Environment.");
                    }
                    Directory.CreateDirectory(savePath);
                }
                filePath = Path.Combine(savePath, filename);

                string fileExt = file.CompileOutputType == CompileOutputTypes.Docx ? ".docx" : ".pdf";

                if (isPdfConvert)
                {
                    Aspose.Words.Document PdfDocument = new Document(filePath + ".docx");
                    ELSLogHelper.InsertInfoLog(_callContext, ELSLogHelper.AsposeLogMessage("Open"), MethodBase.GetCurrentMethod()?.Name, MethodBase.GetCurrentMethod().DeclaringType?.Name, Environment.StackTrace);
                    filePath += fileExt;
                    PdfDocument.Save(filePath, SaveFormat.Pdf);
                    ELSLogHelper.InsertInfoLog(_callContext, ELSLogHelper.AsposeLogMessage("Save"), MethodBase.GetCurrentMethod()?.Name, MethodBase.GetCurrentMethod().DeclaringType?.Name, Environment.StackTrace);
                }
                else
                {
                    filePath += fileExt;
                    document.Save(filePath);
                }

                string permFilePath = string.Format("Meetings/{0}/{1}{2}", meetingId, Path.GetFileNameWithoutExtension(filePath), fileExt);
                await _azureProvider.SaveAzureFileAsync(permFilePath, File.ReadAllBytes(filePath));

                file.FilePath = permFilePath;
                file.CompileFinishedDate = DateTime.UtcNow;
            }
            catch (Exception exception)
            {
                LogManager.Error($"Failed to save compiled file: meetingId: {meetingId} fileName: {filename}", exception);
                throw;
            }
            finally
            {
                if (File.Exists(filePath))
                    File.Delete(filePath);
            }
        }

Something in Aspose is causing the layout to break when it converts the HTML into a PDF.

I just downloaded the PDF and used the Aspose PDF to HTML online converter to convert the document back to HTML to see what might be happening, and found the following.

This screenshot shows the ordered list in the correct location on the page.
image.png (330.3 KB)

The following screenshot shows the subsequent list items in the incorrect location on the page.
image.png (324.9 KB)

As you can see in the screenshots, the left value changes from 17.1535em to 9.041em. Aspose is doing that all on its own during the HTML to PDF conversion.

This is looking like a bug in Aspose, and the more I dig into it, the more it appears to happen whenever text spans multiple pages.

I also just found that when text that is not in an ordered list breaks across pages, Aspose apparently puts the text wherever it wants. In the screenshot below, the left side is the HTML and the right side is the created PDF.
image.png (150.6 KB)

Here is the conversion back to HTML from the PDF using the online converter. You can see that going from PDF back to HTML reproduces the PDF layout as-is, but going from HTML to PDF is where it breaks.
image.png (52.0 KB)

@styler

To ensure a timely and accurate response, please attach the following resources here for testing:

  • Your input HTML document.
  • The output file that shows the undesired behavior.
  • The expected output file that shows the desired behavior.
  • A standalone console application (source code without compilation errors) that helps us reproduce the problem on our end.

As soon as you have these resources ready, we will start investigating your issue and provide you with more information. Thanks for your cooperation.

PS: To attach these resources, please zip and upload them.
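
A standalone console application of the kind requested above could look roughly like the following. This is only a minimal sketch that isolates the HTML to PDF step from the AddPages method earlier in the thread. It assumes the Aspose.Pdf package is referenced, and it uses a default HtmlLoadOptions because the configuration of the htmlLoadOptions field in the original code is not shown. The file names input.html and output.pdf are hypothetical placeholders, and license setup, logging, and the ResizeContents calls are left out.

```csharp
using System;
using System.IO;
using System.Text;
using Aspose.Pdf;

internal static class Program
{
    private static void Main(string[] args)
    {
        // Hypothetical default file names; pass the HTML that shows the problem.
        string inputPath = args.Length > 0 ? args[0] : "input.html";
        string outputPath = args.Length > 1 ? args[1] : "output.pdf";

        string html = File.ReadAllText(inputPath, Encoding.UTF8);

        // Assumption: default load options, so the HTML alone drives the layout.
        // The thread does not show how the real htmlLoadOptions is configured.
        var loadOptions = new HtmlLoadOptions();

        using (var htmlStream = new MemoryStream(Encoding.UTF8.GetBytes(html)))
        using (var pdf = new Document(htmlStream, loadOptions))
        {
            // Convert the HTML to PDF and write it next to the input.
            pdf.Save(outputPath);
            Console.WriteLine($"Wrote {outputPath} with {pdf.Pages.Count} page(s).");
        }
    }
}
```

Running it against the HTML for item 23, with enough preceding content to push the list across a page boundary, should give an input/output pair that can be zipped and attached along with the expected output.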