Hello,
I have an issue when converting Word documents to PDF. If there is a “Section Break (Continuous)” at the end of the Word document and I insert an image after this break, the image is removed during the conversion. If the image is inserted before the section break, it's fine. I use the following IF statement to strip out headers and footers. It seems to be the section of code that removes the image when it is inserted after the “Section Break (Continuous)”, but the code should only be stripping out the headers and footers:
// do we have more than one page? then we need to tidy our headers and footers
if (iNumberSections > 1)
{
// are there headers and footers?
if (objDoc.Sections[0].HeadersFooters.Count > 0)
{
// get the main header
Aspose.Words.HeaderFooter x = objDoc.Sections[0].HeadersFooters[0];
Aspose.Words.HeaderFooter BottomFooter = objDoc.Sections[0].HeadersFooters[1];
// strip all the headers & footers from the document
int iNumberDocSections = objDoc.Sections.Count;
for (int i = iNumberDocSections - 1; i >= 0; i--)
{
int iNumberHeaderFooters = objDoc.Sections[i].HeadersFooters.Count;
for (int j = iNumberHeaderFooters - 1; j >= 0; j--)
{
objDoc.Sections[i].HeadersFooters[j].Remove();
}
}
// at this stage headers should be clear
objDoc.UpdateFields();
objDoc.AcceptAllRevisions();
// insert our header and footer
objDoc.Sections[0].HeadersFooters.Insert(0, x);
objDoc.Sections[0].HeadersFooters.Insert(1, BottomFooter);
// convert the document
objDoc.Save(sTempLocation + sPDFName + ".pdf");
}
else
{
// no headers or footers
objDoc.Save(sTempLocation + sPDFName + ".pdf");
}
}
else
{
// we’ve only one page no work to do here
objDoc.Save(sTempLocation + sPDFName + ".pdf");
}
objDoc.Save(sTempLocation + sPDFName + ".pdf");
File.Delete(sCopiedFileName);
}
}
Does anyone know why this is happening, or how to stop it?
Thanks
Hi there,
Thanks for your inquiry. To ensure a timely and accurate response, please attach the following resources here for testing:
- Your input Word document.
- The output PDF that shows the undesired behavior.
- A standalone console application (source code without compilation errors) that helps us reproduce your problem on our end.
As soon as you have these pieces of information ready, we’ll start investigating your issue and provide you with more information. Thanks for your cooperation.
PS: To attach these resources, please zip them and click the ‘Reply’ button, which will bring you to the reply page. At the bottom of that page you can include attachments with your post by clicking the ‘Add/Update’ button.
Hello,
I am unable to upload the original documents or the full solution, as this would be a breach of security. The snippet of code above is what is removing the image when it shouldn’t. I have put together a sample document, which I have attached; the Word document contains an image after the section break. You will notice from the PDF file (produced after conversion) that the image was cut off during the conversion process. From debugging, I can see that the IF statement above is only entered when the document contains the section break. If there is no Section Break (Continuous), the IF statement is skipped and the document is converted with the image included.
Any help is much appreciated,
Thank You
Ciara
Hi Ciara,
Thanks for your inquiry. We suggest calling the Document.UpdatePageLayout method before saving the document to PDF. Hope this helps you.
If you still face the problem, please upgrade to the latest version of Aspose.Words for .NET (16.3.0) and let us know how it goes on your side.
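As a minimal sketch of that suggestion (the class name and file paths are placeholders for illustration, not taken from the posts above), the layout is rebuilt after the document has been modified and just before it is saved:

```csharp
using Aspose.Words;

class UpdateLayoutBeforeSave
{
    static void Main()
    {
        // Hypothetical input path, for illustration only.
        Document doc = new Document(@"C:\Temp\input.docx");

        // ... modify the document here (for example, strip headers and footers) ...

        // Rebuild the page layout so the saved PDF reflects the modified document.
        doc.UpdatePageLayout();

        // The output format is inferred from the ".pdf" extension.
        doc.Save(@"C:\Temp\output.pdf");
    }
}
```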
Hi Tahir,
Calling Document.UpdatePageLayout did not resolve the issue. I was thinking I could check the section to see whether an image exists. The entire method being used is:
public static void FormatConvertDocs(string sCopiedFileName, bool p_bRemoveSeal, string sTempLocation, out string sPDFName)
{
string sDocType = System.IO.Path.GetExtension(sCopiedFileName);
sPDFName = System.IO.Path.GetFileNameWithoutExtension(sCopiedFileName);
// 4. we need to check if the type is a document, if it is then we don’t need to convert it
if (sDocType.ToLower() != ".pdf")
{
Aspose.Words.Document objDoc = new Aspose.Words.Document(sCopiedFileName);
if (p_bRemoveSeal)
{
// remove the seal - return as txt file?
objDoc.Save(sTempLocation + sPDFName + ".txt");
File.Delete(sCopiedFileName);
}
else
{
int iNumberSections = objDoc.Sections.Count;
// is there a blank page? if so remove it
if (objDoc.Sections[iNumberSections - 1].Body.GetText() == "\f")
{
string bodyType = objDoc.Sections[iNumberSections - 1].
// empty last page here do something with it
objDoc.Sections[iNumberSections - 1].Remove();
}
// check for hidden report text - EJO fault fixed by this
if (objDoc.Sections[0].Body.GetText().ToLower().Contains("this file was created by oracle reports. view this document in page layout mode"))
{
int ipara = objDoc.Sections[0].Body.Paragraphs.Count;
if (objDoc.Sections[0].Body.Paragraphs[1].GetText().ToLower().Contains("this file was created by oracle reports. view this document in page layout mode"))
{
objDoc.Sections[0].Body.Paragraphs[1].Remove();
}
else if (objDoc.Sections[0].Body.Paragraphs[0].GetText().ToLower().Contains("this file was created by oracle reports. view this document in page layout mode"))
{
objDoc.Sections[0].Body.Paragraphs[0].Remove();
}
}
// do we have more than one page? then we need to tidy our headers and footers
if (iNumberSections > 1)
{
// are there headers and footers?
if (objDoc.Sections[0].HeadersFooters.Count > 0)
{
// get the main header
Aspose.Words.HeaderFooter x = objDoc.Sections[0].HeadersFooters[0];
Aspose.Words.HeaderFooter BottomFooter = objDoc.Sections[0].HeadersFooters[1];
// strip all the headers & footers from the document
int iNumberDocSections = objDoc.Sections.Count;
for (int i = iNumberDocSections - 1; i >= 0; i--)
{
int iNumberHeaderFooters = objDoc.Sections[i].HeadersFooters.Count;
for (int j = iNumberHeaderFooters - 1; j >= 0; j--)
{
objDoc.Sections[i].HeadersFooters[j].Remove();
}
}
// at this stage headers should be clear
objDoc.UpdateFields();
objDoc.AcceptAllRevisions();
// insert our header and footer
objDoc.Sections[0].HeadersFooters.Insert(0, x);
objDoc.Sections[0].HeadersFooters.Insert(1, BottomFooter);
// Ciara - 06/05/2016
objDoc.UpdatePageLayout();
// convert the document
objDoc.Save(sTempLocation + sPDFName + ".pdf");
}
else
{
// no headers or footers
objDoc.Save(sTempLocation + sPDFName + ".pdf");
}
}
else
{
// we’ve only one page no work to do here
objDoc.Save(sTempLocation + sPDFName + ".pdf");
}
objDoc.Save(sTempLocation + sPDFName + ".pdf");
File.Delete(sCopiedFileName);
}
}
}
All we really do with this is pass in the document location, convert it to PDF (using this method), then save the PDF. I think the change belongs where I check for an empty page, which is:
// is there a blank page? if so remove it
if (objDoc.Sections[iNumberSections - 1].Body.GetText() == "\f")
{
string bodyType = objDoc.Sections[iNumberSections - 1].
// empty last page here do something with it
objDoc.Sections[iNumberSections - 1].Remove();
}
I would like to bypass the line objDoc.Sections[iNumberSections - 1].Remove(); if an image exists in this section. Do you know how I could do this?
Thanks
Ciara
Hi Ciara,
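Thanks for sharing the details. Please remove the following code from your application. This snippet removes the section that contains only the image: an image adds no characters to the text returned by GetText(), so that section's body text is just "\f" and the check treats it as a blank page.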
// is there a blank page? if so remove it
if (objDoc.Sections[iNumberSections - 1].Body.GetText() == "\f")
{
string bodyType = objDoc.Sections[iNumberSections - 1].
// empty last page here do something with it
objDoc.Sections[iNumberSections - 1].Remove();
}
It seems that you want to remove empty pages from the end of the document. If so, you can use the following code snippet instead. Hope this helps you.
// Remove the empty paragraphs if necessary.
while (!objDoc.LastSection.Body.LastParagraph.HasChildNodes)
{
if (objDoc.LastSection.Body.LastParagraph.PreviousSibling != null &&
objDoc.LastSection.Body.LastParagraph.PreviousSibling.NodeType != NodeType.Paragraph)
break;
objDoc.LastSection.Body.LastParagraph.Remove();
// If the current section becomes empty, we should remove it.
if (!objDoc.LastSection.Body.HasChildNodes)
objDoc.LastSection.Remove();
// We should exit the loop if the document becomes empty.
if (!objDoc.HasChildNodes)
break;
}
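If you would rather keep the original blank-page check and only skip the removal when the last section holds an image, as asked earlier in the thread, a minimal sketch might look like the following. The names BlankSectionHelper, ContainsImage and RemoveBlankLastSection are hypothetical, and the sketch assumes the image is stored as a Shape node inside the section body; it has not been run against the attached sample document.

```csharp
using Aspose.Words;
using Aspose.Words.Drawing;

static class BlankSectionHelper
{
    // Hypothetical helper: returns true if the section body contains
    // at least one Shape node that holds an image.
    public static bool ContainsImage(Section section)
    {
        foreach (Shape shape in section.Body.GetChildNodes(NodeType.Shape, true))
        {
            if (shape.HasImage)
                return true;
        }
        return false;
    }

    // Removes the last section only when its text is a lone section break
    // AND it contains no image, so an image-only section is kept.
    public static void RemoveBlankLastSection(Document doc)
    {
        Section last = doc.LastSection;
        if (doc.Sections.Count > 1
            && last.Body.GetText() == "\f"
            && !ContainsImage(last))
        {
            last.Remove();
        }
    }
}
```

In the method above this would be called as BlankSectionHelper.RemoveBlankLastSection(objDoc) in place of the original blank-page check. Because removing a section changes objDoc.Sections.Count, iNumberSections would need to be read after this call rather than before it.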