Get Paragraphs in Word Document and Apply Font Formatting on List Paragraphs using VB.NET Code

Hi, support:

I would like to report two bugs in Aspose.Words.dll.
Bug 1: the DLL cannot obtain any paragraphs from the document. Please see AbnormalSample1.zip (download it and change its file extension to ".doc").

Bug 2: the DLL does not apply all of the paragraph format settings to the given paragraph. Only settings such as line spacing, space before the paragraph, and space after the paragraph are applied; settings such as font name and font size are ignored. Please see AbnormalSample2.zip (download it and change its file extension to ".doc").

Please check them and fix them!
Thanks!
Ducaisoft

@ducaisoft,

I am afraid we do not see any ZIP files attached to your previous post. If your ZIP files exceed the attachment size limit of this forum, you may upload them to Dropbox or any other file hosting service and share the download links here for testing. To keep the file sizes small, please do not include the Aspose.Words DLL files in the ZIP packages. Thanks for your cooperation.

Sorry for this!
I forgot to upload the files!

AbnormalSample1.zip (29 KB)
AbnormalSample2.zip (15.5 KB)

The bug is that Para.ParagraphFormat.Style.Font.Name = "Arial" and Para.ParagraphFormat.Style.Font.Size = Global.Aspose.Words.ConvertUtil.InchToPoint(20 / 72) are not actually applied to this paragraph!

@ducaisoft,

The archives AbnormalSample1.zip and AbnormalSample2.zip you attached seem to be corrupted or damaged. WinRAR, for example, gives the following message:

The archive is either in unknown format or damaged

Upon extracting with 7-Zip, we see the following content.

So, please ZIP these resources and reattach them here for testing. Thanks for your cooperation.

Dear support,

Please change the file extension from ".zip" to ".doc"; the files will then open correctly!

@ducaisoft,

For "AbnormalSample1.doc", can you please elaborate on how we can reproduce this bug on our end? Please provide a piece of source code that reproduces the issue.

We checked the following code, and it iterates over the Paragraph collection correctly:

Document doc = new Document("E:\\Temp\\AbnormalSample1.doc");

DocumentBuilder builder = new DocumentBuilder();
foreach(Paragraph para in doc.FirstSection.Body.GetChildNodes(NodeType.Paragraph, true))
{
    builder.Writeln(para.ToString(SaveFormat.Text));
}

builder.Document.Save("E:\\Temp\\20.1.doc");

Secondly, you can use the following code to apply font formatting to the text of an existing paragraph:

Document doc = new Document("E:\\Temp\\AbnormalSample2.doc");

Paragraph firstPara = doc.FirstSection.Body.Paragraphs[1];
foreach (Run run in firstPara.Runs)
{
    run.Font.Bold = true;
    run.Font.Color = Color.Red;
    run.Font.Size = 18;
    run.Font.Name = "Arial";
}

doc.Save("E:\\Temp\\20.1.doc");

Thanks for your response!
For "AbnormalSample1.doc", my VB.NET code is as follows:

For Each Para As Global.Aspose.Words.Paragraph In doc.GetChildNodes(Global.Aspose.Words.NodeType.Paragraph, True)
    Dim s As String = Para.Range.Text.Trim() ' Here s is always an empty string!
    ' some code
Next

For "AbnormalSample2.doc", after applying your code, another bug appears:
if a paragraph contains text such as an auto-numbering field (a list label), that text is not included in the paragraph's Runs collection, so it is not formatted.
Please refer to the samples (PS: please change the file extension from ".zip" to ".doc").

TestFile.zip (20 KB)
CorrectDesiredOut.zip (20 KB)
UnexpectedOutByWords.dll.zip (11.5 KB)

@ducaisoft,

Regarding the empty string problem, the issue occurs because the AbnormalSample1.doc document contains six headers and footers with empty Paragraphs in them. The following line takes into account all the Paragraphs in the document (including those in the headers and footers).

doc.GetChildNodes(Global.Aspose.Words.NodeType.Paragraph, True)

To skip the Paragraphs in headers and footers, please try the following code:

Document doc = new Document("E:\\Temp\\AbnormalSample1.doc");
DocumentBuilder builder = new DocumentBuilder();

foreach (Paragraph para in doc.FirstSection.Body.GetChildNodes(NodeType.Paragraph, true))
    builder.Writeln(para.ToString(SaveFormat.Text).Trim());

builder.Document.Save("E:\\Temp\\20.2.doc");
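
Since the original question was asked in VB.NET, the same idea can be sketched in that language. This is only a minimal sketch under a few assumptions: it walks every section rather than only the first one, it prints the text to the console instead of writing it into a new document, and the module name and file path are placeholders.

```vbnet
Imports System
Imports Aspose.Words

Module BodyParagraphsSketch
    Sub Main()
        ' Placeholder path reused from the thread; adjust to your own file.
        Dim doc As New Document("E:\Temp\AbnormalSample1.doc")

        For Each sec As Section In doc.Sections
            ' A section without a body has no main-text paragraphs to read.
            If sec.Body Is Nothing Then Continue For

            ' Section.Body holds the main text only, so the paragraphs that
            ' live in headers and footers are never visited here.
            For Each para As Paragraph In sec.Body.GetChildNodes(NodeType.Paragraph, True)
                Dim s As String = para.ToString(SaveFormat.Text).Trim()

                ' Body paragraphs can also be empty, so skip those too.
                If s.Length > 0 Then
                    Console.WriteLine(s)
                End If
            Next
        Next
    End Sub
End Module
```

If you prefer to keep your original loop over doc.GetChildNodes, a check such as Para.GetAncestor(NodeType.HeaderFooter) Is Nothing should filter out the header and footer paragraphs in a similar way.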

Please try using the following code to specify font formatting of list labels:

Document doc = new Document("E:\\Temp\\testFile.doc");

foreach(Paragraph para in doc.GetChildNodes(NodeType.Paragraph, true))
{
    foreach (Run run in para.Runs)
    {
        run.Font.Bold = true;
        run.Font.Color = Color.Red;
        run.Font.Size = 18;
        run.Font.Name = "Arial";
    }

    if (para.IsListItem)
    {
        para.ListLabel.Font.Bold = true;
        para.ListLabel.Font.Color = Color.Red;
        para.ListLabel.Font.Size = 18;
        para.ListLabel.Font.Name = "Arial";
    }
}

doc.Save("E:\\Temp\\20.2.docx");
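
For reference, a VB.NET version of the code above might look like the following. It is a sketch rather than a tested solution: the ApplyFont helper and the module name are hypothetical and were introduced only to avoid repeating the four font assignments, and the file paths are the placeholders used earlier in this thread.

```vbnet
Imports System.Drawing
Imports Aspose.Words

Module ListLabelFormattingSketch
    ' Hypothetical helper: applies the same settings to any Aspose.Words font.
    ' The type is fully qualified because System.Drawing also defines a Font class.
    Private Sub ApplyFont(ByVal f As Aspose.Words.Font)
        f.Bold = True
        f.Color = Color.Red
        f.Size = 18
        f.Name = "Arial"
    End Sub

    Sub Main()
        Dim doc As New Document("E:\Temp\testFile.doc")

        For Each para As Paragraph In doc.GetChildNodes(NodeType.Paragraph, True)
            ' Format the ordinary text of the paragraph.
            For Each r As Run In para.Runs
                ApplyFont(r.Font)
            Next

            ' The list number or bullet is not part of the Runs collection,
            ' so it is formatted separately through ListLabel.
            If para.IsListItem Then
                ApplyFont(para.ListLabel.Font)
            End If
        Next

        doc.Save("E:\Temp\20.2.docx")
    End Sub
End Module
```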

Hope this helps.