Hi Team,
I want to remove empty new lines from my converted Word document.
NewLinesIssue.PNG (128.0 KB) In this image you can see that the document has extra lines between the content. I want to remove them.
sample1.docx (29.2 KB) This is my converted Word document.
Could you please help me?
@nethmi You can use code like the following to remove empty paragraphs from your document:
Document doc = new Document(@"C:\Temp\in.docx");
// Snapshot the live collection so nodes can be removed safely while iterating.
Node[] paragraphs = doc.GetChildNodes(NodeType.Paragraph, true).ToArray();
foreach (Paragraph p in paragraphs)
{
    // If the paragraph has no text, it might still contain a shape or a bookmark.
    // Move its content to the next or previous paragraph.
    if (string.IsNullOrEmpty(p.ToString(SaveFormat.Text).Trim()))
    {
        if (p.NextSibling != null && p.NextSibling.NodeType == NodeType.Paragraph)
        {
            while (p.HasChildNodes)
                ((Paragraph)p.NextSibling).PrependChild(p.LastChild);
        }
        else if (p.PreviousSibling != null && p.PreviousSibling.NodeType == NodeType.Paragraph)
        {
            while (p.HasChildNodes)
                ((Paragraph)p.PreviousSibling).AppendChild(p.FirstChild);
        }
    }
    // Empty paragraphs can be removed.
    if (!p.HasChildNodes)
        p.Remove();
}
doc.Save(@"C:\Temp\out.docx");
Hi team,
Now it's working, but I have a new issue. issue.PNG (126.2 KB)
The expected output is: expected.PNG (138.9 KB)
Could you please check that?
@nethmi You can try removing only actually empty paragraphs:
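// Snapshot the live collection so nodes can be removed safely while iterating.
Node[] paragraphs = doc.GetChildNodes(NodeType.Paragraph, true).ToArray();
foreach (Paragraph p in paragraphs)
{
    // Remove only paragraphs that have no child nodes at all.
    if (!p.HasChildNodes)
        p.Remove();
}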
Or remove only paragraphs that do not contain shapes:
// Snapshot the live collection so nodes can be removed safely while iterating.
Node[] paragraphs = doc.GetChildNodes(NodeType.Paragraph, true).ToArray();
foreach (Paragraph p in paragraphs)
{
    // Remove paragraphs that have no text and contain no shapes.
    if (string.IsNullOrEmpty(p.ToString(SaveFormat.Text).Trim()) &&
        p.GetChildNodes(NodeType.Shape, true).Count == 0)
        p.Remove();
}
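The two checks above differ in what they treat as "empty", and it can help to see that on a tiny document before running them on real input. The following is only a minimal sketch, assuming Aspose.Words for .NET is referenced; the class name, the bookmark name "bm", and the sample text are made up for illustration. It prints, for each paragraph, whether it has no child nodes, no text, and no shapes.

```csharp
using System;
using Aspose.Words;

class EmptyParagraphChecks
{
    static void Main()
    {
        // Build a small in-memory document (hypothetical content for illustration).
        Document doc = new Document();
        DocumentBuilder builder = new DocumentBuilder(doc);
        builder.Writeln("Some text");   // paragraph with text
        builder.Writeln();              // paragraph with no content
        builder.StartBookmark("bm");    // paragraph that holds only a bookmark
        builder.EndBookmark("bm");
        builder.Writeln();

        foreach (Paragraph p in doc.FirstSection.Body.Paragraphs)
        {
            // True only when the paragraph has no child nodes of any kind.
            bool noChildren = !p.HasChildNodes;
            // True when the paragraph exports to whitespace only.
            bool noText = string.IsNullOrEmpty(p.ToString(SaveFormat.Text).Trim());
            // True when the paragraph contains no Shape nodes.
            bool noShapes = p.GetChildNodes(NodeType.Shape, true).Count == 0;
            Console.WriteLine($"noChildren={noChildren}, noText={noText}, noShapes={noShapes}");
        }
    }
}
```

A paragraph that holds only a bookmark or a floating image has child nodes but no text, so the first check keeps it while a text-only check would remove it. Printing the three flags for your own document should show which condition matches the paragraphs you want to drop.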
Hi Team,
My second issue is still not solved, because both problems occur with the shapes. Could you please give me an implementation only for paragraphs with wrapped images?
// This is my method implementation for wrapping
private static void SetImageLayout(Document document, HtmlDocument htmlDocument)
{
    HtmlNodeCollection images = htmlDocument.DocumentNode.SelectNodes("//img");
    NodeCollection shapes = document.GetChildNodes(NodeType.Shape, true);
    if (images != null)
    {
        var imgIndex = 0;
        foreach (Shape shape in shapes)
        {
            var image = images.ElementAt(imgIndex);
            if (image.HasClass("fr-fil"))
            {
                shape.WrapType = WrapType.Square;
            }
            else
            {
                shape.HorizontalAlignment = HorizontalAlignment.Center;
            }
            shape.AllowOverlap = false;
            imgIndex++;
        }
    }
}
Or do you have any suggestions?
@nethmi You can modify your code like the following to skip paragraphs that contain inline shapes:
// Snapshot the live collection so nodes can be removed safely while iterating.
Node[] paragraphs = doc.GetChildNodes(NodeType.Paragraph, true).ToArray();
foreach (Paragraph p in paragraphs)
{
    // If the paragraph has no text, it might still contain a shape or a bookmark.
    // Move its content to the next or previous paragraph.
    if (string.IsNullOrEmpty(p.ToString(SaveFormat.Text).Trim()))
    {
        // Skip paragraphs with inline shapes. A flag is needed because a
        // "continue" inside the inner loop would only advance that loop.
        bool hasInlineShape = false;
        foreach (Shape s in p.GetChildNodes(NodeType.Shape, true))
        {
            if (s.WrapType == WrapType.Inline)
            {
                hasInlineShape = true;
                break;
            }
        }
        if (hasInlineShape)
            continue;

        if (p.NextSibling != null && p.NextSibling.NodeType == NodeType.Paragraph)
        {
            while (p.HasChildNodes)
                ((Paragraph)p.NextSibling).PrependChild(p.LastChild);
        }
        else if (p.PreviousSibling != null && p.PreviousSibling.NodeType == NodeType.Paragraph)
        {
            while (p.HasChildNodes)
                ((Paragraph)p.PreviousSibling).AppendChild(p.FirstChild);
        }
    }
    // Empty paragraphs can be removed.
    if (!p.HasChildNodes)
        p.Remove();
}
Hi Team,
It's working. Thank you very much.