Word Section Break

CiaraMurphy · May 4, 2016, 7:35am

Hello,

I have an issue when converting Word documents to PDF. If there is a “Section Break (Continuous)” at the end of the Word document and I insert an image after this break, the image gets removed during the conversion. If the image is inserted before the section break, it’s fine. I have the following IF statement to strip out headers and footers. This seems to be the section of code that removes the image when it is inserted after the “Section Break (Continuous)”, but the code should only be stripping out the headers and footers:

// do we have more than one page? then we need to tidy our headers and footers
if (iNumberSections > 1)
{
    // are there headers and footers?
    if (objDoc.Sections[0].HeadersFooters.Count > 0)
    {
        // get the main header
        Aspose.Words.HeaderFooter x = objDoc.Sections[0].HeadersFooters[0];
        Aspose.Words.HeaderFooter BottomFooter = objDoc.Sections[0].HeadersFooters[1];

        // strip all the headers & footers from the document
        int iNumberDocSections = objDoc.Sections.Count;
        for (int i = iNumberDocSections - 1; i >= 0; i--)
        {
            int iNumberHeaderFooters = objDoc.Sections[i].HeadersFooters.Count;
            for (int j = iNumberHeaderFooters - 1; j >= 0; j--)
            {
                objDoc.Sections[i].HeadersFooters[j].Remove();
            }
        }

        // at this stage headers should be clear
        objDoc.UpdateFields();
        objDoc.AcceptAllRevisions();

        // insert our header and footer
        objDoc.Sections[0].HeadersFooters.Insert(0, x);
        objDoc.Sections[0].HeadersFooters.Insert(1, BottomFooter);

        // convert the document
        objDoc.Save(sTempLocation + sPDFName + ".pdf");
    }
    else
    {
        // no headers or footers
        objDoc.Save(sTempLocation + sPDFName + ".pdf");
    }
}
else
{
    // we’ve only one page no work to do here
    objDoc.Save(sTempLocation + sPDFName + ".pdf");
}
objDoc.Save(sTempLocation + sPDFName + ".pdf");
File.Delete(sCopiedFileName);
}

}

Does anyone know why this would be happening, or how to stop it?

Thanks

tahir.manzoor · May 5, 2016, 6:46am

Hi there,

Thanks for your inquiry. To ensure a timely and accurate response, please attach the following resources here for testing:

Your input Word document.
The output PDF that shows the undesired behavior.
A standalone console application (source code without compilation errors) that helps us reproduce your problem on our end.

As soon as you have these resources ready, we’ll start investigating your issue and provide you with more information. Thanks for your cooperation.

PS: To attach these resources, please zip them and click the ‘Reply’ button, which brings you to the reply page. At the bottom of that page you can include attachments with the post by clicking the ‘Add/Update’ button.

CiaraMurphy · May 5, 2016, 9:22am

Hello,

I am unable to upload the original documents or the full solution, as this would be a breach of security. The snippet of code above is what is removing the image when it shouldn’t be. I have put together a sample document, which I have attached; the Word document contains an image after the section break. You will notice from the PDF file (produced after conversion) that the image was cut off during the conversion process. From debugging I can see that the IF statement above is only entered when the document contains the section break. If there is no Section Break (Continuous), the IF statement is skipped and the converted document includes the image.

Any help is much appreciated,

Thank You
Ciara

tahir.manzoor · May 6, 2016, 3:55am

Hi Ciara,

Thanks for your inquiry. We suggest calling the Document.UpdatePageLayout method before saving the document to PDF. Hope this helps.

If you still face the problem, please upgrade to the latest version of Aspose.Words for .NET (16.3.0) and let us know how it goes on your side.
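
For readers following along, the suggestion above might look like the minimal sketch below. It assumes the Aspose.Words for .NET package is referenced; the file names are hypothetical placeholders and are not taken from the thread.

```csharp
using Aspose.Words;

class UpdateLayoutBeforeSave
{
    static void Main()
    {
        // Hypothetical paths, for illustration only.
        string inputPath = "input.docx";
        string outputPath = "output.pdf";

        Document doc = new Document(inputPath);

        // ... any changes to sections, headers or footers would go here ...

        // Rebuild the page layout so that it reflects the changes made above.
        doc.UpdatePageLayout();

        // The output format is inferred from the ".pdf" extension.
        doc.Save(outputPath);
    }
}
```

This only refreshes the layout. It cannot bring back content that was already removed from the document model before the save.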

CiaraMurphy · May 6, 2016, 5:50am

Hi Tahir,

Calling Document.UpdatePageLayout did not resolve the issue. I was thinking I could check the section to see whether an image exists. The entire method being used is:

public static void FormatConvertDocs(string sCopiedFileName, bool p_bRemoveSeal, string sTempLocation, out string sPDFName)
{
    string sDocType = System.IO.Path.GetExtension(sCopiedFileName);
    sPDFName = System.IO.Path.GetFileNameWithoutExtension(sCopiedFileName);

    // 4. we need to check if the file is already a PDF, if it is then we don’t need to convert it
    if (sDocType.ToLower() != ".pdf")
    {
        Aspose.Words.Document objDoc = new Aspose.Words.Document(sCopiedFileName);

        if (p_bRemoveSeal)
        {
            // remove the seal - return as txt file?
            objDoc.Save(sTempLocation + sPDFName + ".txt");
            File.Delete(sCopiedFileName);
        }
        else
        {
            int iNumberSections = objDoc.Sections.Count;

            // is there a blank page? if so remove it
            if (objDoc.Sections[iNumberSections - 1].Body.GetText() == "\f")
            {
                string bodyType = objDoc.Sections[iNumberSections - 1].

                // empty last page here do something with it
                objDoc.Sections[iNumberSections - 1].Remove();
            }

            // check for hidden report text - EJO fault fixed by this
            if (objDoc.Sections[0].Body.GetText().ToLower().Contains("this file was created by oracle reports. view this document in page layout mode"))
            {
                int ipara = objDoc.Sections[0].Body.Paragraphs.Count;
                if (objDoc.Sections[0].Body.Paragraphs[1].GetText().ToLower().Contains("this file was created by oracle reports. view this document in page layout mode"))
                {
                    objDoc.Sections[0].Body.Paragraphs[1].Remove();
                }
                else if (objDoc.Sections[0].Body.Paragraphs[0].GetText().ToLower().Contains("this file was created by oracle reports. view this document in page layout mode"))
                {
                    objDoc.Sections[0].Body.Paragraphs[0].Remove();
                }
            }

            // do we have more than one page? then we need to tidy our headers and footers
            if (iNumberSections > 1)
            {
                // are there headers and footers?
                if (objDoc.Sections[0].HeadersFooters.Count > 0)
                {
                    // get the main header
                    Aspose.Words.HeaderFooter x = objDoc.Sections[0].HeadersFooters[0];
                    Aspose.Words.HeaderFooter BottomFooter = objDoc.Sections[0].HeadersFooters[1];

                    // strip all the headers & footers from the document
                    int iNumberDocSections = objDoc.Sections.Count;
                    for (int i = iNumberDocSections - 1; i >= 0; i--)
                    {
                        int iNumberHeaderFooters = objDoc.Sections[i].HeadersFooters.Count;
                        for (int j = iNumberHeaderFooters - 1; j >= 0; j--)
                        {
                            objDoc.Sections[i].HeadersFooters[j].Remove();
                        }
                    }

                    // at this stage headers should be clear
                    objDoc.UpdateFields();
                    objDoc.AcceptAllRevisions();

                    // insert our header and footer
                    objDoc.Sections[0].HeadersFooters.Insert(0, x);
                    objDoc.Sections[0].HeadersFooters.Insert(1, BottomFooter);

                    // Ciara - 06/05/2016
                    objDoc.UpdatePageLayout();

                    // convert the document
                    objDoc.Save(sTempLocation + sPDFName + ".pdf");
                }
                else
                {

                    // no headers or footers
                    objDoc.Save(sTempLocation + sPDFName + ".pdf");
                }
            }
            else
            {

                // we’ve only one page no work to do here
                objDoc.Save(sTempLocation + sPDFName + ".pdf");
            }

            objDoc.Save(sTempLocation + sPDFName + ".pdf");
            File.Delete(sCopiedFileName);
        }

    }
}

All we really do with this is pass in the document location, convert it to PDF (using this method), then save the PDF. What I think we could do is change the part where I check for an empty page, which is:

// is there a blank page? if so remove it
if (objDoc.Sections[iNumberSections - 1].Body.GetText() == "\f")
{
    string bodyType = objDoc.Sections[iNumberSections - 1].

    // empty last page here do something with it
    objDoc.Sections[iNumberSections - 1].Remove();
}

I would like to bypass the line objDoc.Sections[iNumberSections - 1].Remove(); if an image exists in this section. Do you know how I could do this?

Thanks
Ciara

tahir.manzoor · May 9, 2016, 2:48am

Hi Ciara,

Thanks for sharing the details. Please remove the following code from your application. This code snippet removes the section that contains only the image.

// is there a blank page? if so remove it
if (objDoc.Sections[iNumberSections - 1].Body.GetText() == "\f")
{
    string bodyType = objDoc.Sections[iNumberSections - 1].
    // empty last page here do something with it
    objDoc.Sections[iNumberSections - 1].Remove();
}

It seems that you want to remove empty pages from the end of the document. If this is the case, you can use the following code snippet to do so. Hope this helps.

// Remove the empty paragraphs if necessary.
while (!objDoc.LastSection.Body.LastParagraph.HasChildNodes)
{
    // Stop if the node before the last paragraph is not a paragraph (for example, a table).
    if (objDoc.LastSection.Body.LastParagraph.PreviousSibling != null &&
        objDoc.LastSection.Body.LastParagraph.PreviousSibling.NodeType != NodeType.Paragraph)
        break;
    objDoc.LastSection.Body.LastParagraph.Remove();
    // If the current section becomes empty, we should remove it.
    if (!objDoc.LastSection.Body.HasChildNodes)
        objDoc.LastSection.Remove();
    // We should exit the loop if the document becomes empty.
    if (!objDoc.HasChildNodes)
        break;
}
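
As for the earlier question about skipping the removal when the last section holds an image, one possible approach is sketched below. The thread reports that Body.GetText() returns "\f" even when the section contains a picture, so the text check alone cannot tell the two cases apart. The class and method names (SectionChecks, ContainsImage, RemoveLastSectionIfBlank) are hypothetical helpers, not part of Aspose.Words, and the sketch assumes the image is stored as a Shape node.

```csharp
using Aspose.Words;
using Aspose.Words.Drawing;

static class SectionChecks
{
    // Hypothetical helper: returns true when the section body holds
    // at least one shape that carries image data.
    public static bool ContainsImage(Section section)
    {
        // 'true' makes the search include nested nodes such as runs inside paragraphs.
        foreach (Shape shape in section.Body.GetChildNodes(NodeType.Shape, true))
        {
            if (shape.HasImage)
                return true;
        }
        return false;
    }

    // Hypothetical helper: removes the last section only if it has no text and no image.
    public static void RemoveLastSectionIfBlank(Document doc)
    {
        Section lastSection = doc.LastSection;
        if (lastSection == null)
            return;

        // Same text check as in the original post, plus the image guard.
        if (lastSection.Body.GetText() == "\f" && !ContainsImage(lastSection))
            lastSection.Remove();
    }
}
```

Calling SectionChecks.RemoveLastSectionIfBlank(objDoc) in place of the original blank-page check would keep a trailing section that contains only a picture. Pictures stored in other node types would need an additional check.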