Hidden text in save as PDF and page count

I want to save a Word document as PDF, but without the hidden text. Is there a way to do this?

I am also trying to count the pages of the Word document, but the count is calculated as if all hidden text were shown.

Can I do both of these while keeping the hidden text hidden?

Thanks
Fiona

Hi Fiona,

Thanks for your inquiry. You can use the Font.Hidden property to find the hidden text in the document. It is true if the font is formatted as hidden text.

In your case, I suggest that you first remove the hidden text from the document and then convert the document to PDF. Hope this helps you. Please let us know if you have any more queries.

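Document doc = new Document(MyDir + "in.docx");
foreach (Paragraph par in doc.GetChildNodes(NodeType.Paragraph, true))
{
    // Make the paragraph mark visible so the paragraph break itself is kept.
    par.ParagraphBreakFont.Hidden = false;
    // ToArray() takes a snapshot so that runs can be removed while iterating.
    foreach (Run run in par.GetChildNodes(NodeType.Run, true).ToArray())
    {
        // Remove the hidden run so that its text does not reach the PDF.
        if (run.Font.Hidden)
            run.Remove();
    }
}
doc.Save(MyDir + "Out.pdf");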

Thanks for the quick reply. I will try this out.

Fiona

If there is hidden text within tables etc., can I just delete the runs that are hidden? How can I tell whether the whole table needs to be deleted?
Also, if I delete all of the hidden runs, will their owning paragraphs be deleted as well?

Thanks
Fiona

Hi Fiona,

Thanks for your inquiry. The code shared in my last post here deletes the hidden Run nodes. Deleting runs does not delete their owning paragraphs or tables, but you can remove the paragraphs and tables that are left empty. Please use the Node.ToString(SaveFormat) method to export the content of a node as plain text, as shown in the following code snippet, and delete the node if the text is empty. Hope this helps you.

If you still face problem, please share your input document here for testing. We will then provide you more information on this along with code.

Document doc = new Document(MyDir + "in.docx");
// ToArray() takes a snapshot so that nodes can be removed while iterating.
foreach (Paragraph par in doc.GetChildNodes(NodeType.Paragraph, true).ToArray())
{
    par.ParagraphBreakFont.Hidden = false;
    foreach (Run run in par.GetChildNodes(NodeType.Run, true).ToArray())
    {
        // Delete the hidden Run
        if (run.Font.Hidden)
            run.Remove();
    }
    // Remove the empty Paragraph
    // (this checks text only, so a paragraph holding just an image also counts as empty)
    if (par.ToString(SaveFormat.Text).Trim() == String.Empty)
        par.Remove();
}
// Remove the empty Tables
foreach (Table table in doc.GetChildNodes(NodeType.Table, true).ToArray())
{
    if (table.ToString(SaveFormat.Text).Trim() == String.Empty)
        table.Remove();
}
doc.Save(MyDir + "Out.docx");

Thanks. Presumably, to delete the runs, I just need to call run.Remove()?

Thanks
Fiona

Hi Fiona,

Thanks for your inquiry. Yes, you can delete a Run node by using the Run.Remove method. Please let us know if you face any issues.
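
A minimal sketch of that approach, which also covers the page count asked about at the start of the thread. It assumes Aspose.Words for .NET is referenced; the class name HiddenTextDemo and the file names are placeholders, not something from your project. Treat it as a starting point rather than a definitive implementation, since how hidden text is laid out may depend on the version you are using.

```csharp
using System;
using Aspose.Words;

public static class HiddenTextDemo // hypothetical name, for illustration only
{
    public static void Main()
    {
        // Placeholder input path; replace with your own document.
        Document doc = new Document("in.docx");

        // ToArray() takes a snapshot of the collection, so runs can be
        // removed safely while the loop is still going through them.
        foreach (Run run in doc.GetChildNodes(NodeType.Run, true).ToArray())
        {
            if (run.Font.Hidden)
                run.Remove();
        }

        // Rebuild the page layout so that PageCount reflects the document
        // after the hidden runs have been removed.
        doc.UpdatePageLayout();
        Console.WriteLine("Pages without hidden text: " + doc.PageCount);

        // Placeholder output path.
        doc.Save("Out.pdf");
    }
}
```

Because the runs are removed from the in-memory document only, the original file on disk keeps its hidden text as long as you save under a different name.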

Hi,

If this ever gets implemented, it should be done as an opt-in variant that can be controlled via a new property. Otherwise it would do considerable harm to our existing use cases where we want the hidden text to be shown (especially when saving Flat OPC files as text files).

kind regards

Hi there,

Thanks for your inquiry. You may use a DocumentVisitor as shown below to remove the hidden formatting from text, so that the text is shown. Hope this helps you.

public class HiddenText : DocumentVisitor
{
    /// <summary>
    /// Called when a Run node is encountered in the document.
    /// </summary>
    public override VisitorAction VisitRun(Run run)
    {
        if (run.Font.Hidden)
            run.Font.Hidden = false;
        // Let the visitor continue visiting other nodes.
        return VisitorAction.Continue;
    }
}

Document doc = new Document(MyDir + "in.docx");
doc.Accept(new HiddenText());