I have a document with some tables. Each table contains data fields, bookmarks, images, etc.
I want to remove the table structure but keep the data inside the table. I can do this manually in a Word document with the following steps:
Select the table you want to remove.
Go to the Layout tab and select the Convert to Text option under Data.
Choose the Tab option in the dialog box and click OK.
Kindly let me know whether Aspose offers a way to do the above programmatically.
Hi Allan,
Thanks for your inquiry. You can use the following code to convert a table to text:
Document doc = new Document(MyDir + @"in.docx");
DocumentBuilder builder = new DocumentBuilder(doc);
// Take a snapshot of the tables, because the loop removes them from the document
Node[] tables = doc.GetChildNodes(NodeType.Table, true).ToArray();
foreach (Table table in tables)
{
    // Create a new paragraph for the string table
    Paragraph par = new Paragraph(doc);
    // Insert this paragraph after the table
    table.ParentNode.InsertAfter(par, table);
    builder.MoveTo(par);
    // A monospaced font keeps the text columns aligned
    builder.Font.Name = "Courier New";
    // Insert the string table
    builder.Writeln(ConvertTable(table));
    table.Remove();
}
doc.Save(MyDir + @"out.docx");
/// <summary>
/// Converts a table to a string.
/// </summary>
/// <param name="tab">Input table.</param>
/// <returns>String that represents the content of the input table.</returns>
private static string ConvertTable(Table tab)
{
    string output = string.Empty;
    // Calculate the maximum string length of each table column
    // (ArrayList requires "using System.Collections;")
    ArrayList columnWidths = new ArrayList();
    int tableWidth = 0;
    string horizontalBorder = string.Empty;
    // Loop through all rows in the table
    foreach (Row row in tab.Rows)
    {
        // Loop through all cells in the current row
        foreach (Cell cell in row.Cells)
        {
            int cellIndex = row.Cells.IndexOf(cell);
            int cellLength = cell.ToTxt().Length;
            if (columnWidths.Count > cellIndex)
            {
                if ((int)columnWidths[cellIndex] < cellLength)
                {
                    columnWidths[cellIndex] = cellLength;
                }
            }
            else
            {
                columnWidths.Add(cellLength);
            }
        }
    }
    // Calculate the width of the table
    for (int index = 0; index < columnWidths.Count; index++)
    {
        tableWidth += (int)columnWidths[index];
    }
    // One "|" after each column plus the leading "|" of each row
    tableWidth += columnWidths.Count + 1;
    // Build the horizontal border
    for (int index = 0; index < tableWidth; index++)
    {
        horizontalBorder += "-";
    }
    horizontalBorder += "\r\n";
    // Insert the top border
    output += horizontalBorder;
    // Loop through all rows in the table
    foreach (Row row in tab.Rows)
    {
        string currentRow = "|";
        // Loop through all cells in the current row
        foreach (Cell cell in row.Cells)
        {
            int cellIndex = row.Cells.IndexOf(cell);
            // Replace line breaks in the cell text with spaces
            string currentCell = cell.ToTxt().Replace("\r", " ").Replace("\n", " ");
            // Pad the cell text with spaces up to the column width
            while (currentCell.Length < (int)columnWidths[cellIndex])
            {
                currentCell += " ";
            }
            // Insert the vertical border
            currentRow += currentCell + "|";
        }
        output += currentRow + "\r\n";
        // Insert the horizontal border
        output += horizontalBorder;
    }
    return output;
}
I hope this helps.
Best regards,
Thanks for your reply.
I ran into an issue with your sample code. If the table contains bookmarks, I can no longer see them after converting the table to text; the bookmarks are removed by the conversion. What should I do about this?
Hi Allan,
Thanks for your inquiry. We are checking this scenario and will get back to you soon.
Best regards,
Hi Allan,
To update you: we have logged a task in our issue tracking system for our development team to implement functionality similar to Microsoft's "Convert To Text" feature. Your ticket number is WORDSNET-10750. Your request has been linked to this task, and you will be notified as soon as it is resolved. Sorry for the inconvenience.
Best regards,
@paybills We have added examples that mimic the "Convert to Text" MS Word function.
    {
        for (Cell cell = row.FirstCell; cell != null; cell = cell.NextCell)
        {
            Console.WriteLine(cell.GetText());
        }
    }
    //ExEnd
}
[Test]
public void ConvertWithParagraphMark()
{
Document doc = new Document(MyDir + "Nested tables.docx");
Table table = (Table)doc.GetChild(NodeType.Table, 0, true);
// Replace the table with the new paragraph
ConvertTable(table);
table.Remove();
doc.Save(ArtifactsDir + "output.docx");
if (cellText == string.Empty)
break;
foreach (Paragraph cellPara in cell.Paragraphs)
currentNode = table.ParentNode.InsertAfter(cellPara.Clone(true), currentNode);
}
}
}
[Test]
public void ConvertWith()
{
    Document doc = new Document(MyDir + "Nested tables.docx");
    Table table = (Table)doc.GetChild(NodeType.Table, 0, true);
    // Convert the table to text with the specified separator.
    ConvertWith(ControlChar.Tab, table);
    // Remove the table after conversion.
    table.Remove();
    doc.Save(ArtifactsDir + "Table.ConvertWith.docx");
The issues you have found earlier (filed as WORDSNET-10750) have been fixed in the Aspose.Words for .NET 24.8 update, which is also available on NuGet.
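For readers who want to see the separator-based idea end to end, here is a minimal sketch. It is not the library's own ConvertWith example (whose helper is not shown in this thread). The class name TableToTextSketch and the method ConvertRowsToParagraphs are hypothetical. The sketch assumes bookmarks and images sit inside the cell paragraphs as inline nodes, so it moves those nodes rather than copying text, which is what should keep the bookmarks alive. Nested tables and block-level bookmarks are not handled.

```csharp
using Aspose.Words;
using Aspose.Words.Tables;

public static class TableToTextSketch
{
    // Hypothetical helper (not part of Aspose.Words): turns every row of the table
    // into one paragraph, joining the cell contents with the given separator,
    // then removes the table itself.
    public static void ConvertRowsToParagraphs(Table table, string separator)
    {
        DocumentBase doc = table.Document;
        Node insertAfter = table;

        foreach (Row row in table.Rows)
        {
            Paragraph rowPara = new Paragraph(doc);

            foreach (Cell cell in row.Cells)
            {
                foreach (Paragraph cellPara in cell.Paragraphs)
                {
                    // Snapshot the children first, because AppendChild moves each node
                    // out of the cell paragraph. Moving (rather than reading text)
                    // carries runs, images and bookmark start/end nodes along.
                    foreach (Node child in cellPara.GetChildNodes(NodeType.Any, false).ToArray())
                        rowPara.AppendChild(child);
                }

                // Separate cells, but do not add a trailing separator after the last one.
                if (cell != row.LastCell)
                    rowPara.AppendChild(new Run(doc, separator));
            }

            // Keep the rows in their original order after the table.
            insertAfter = table.ParentNode.InsertAfter(rowPara, insertAfter);
        }

        table.Remove();
    }
}
```

It could be called as TableToTextSketch.ConvertRowsToParagraphs(table, ControlChar.Tab) to approximate Word's Tab option. Note that a cell with several paragraphs is flattened into a single line here; on 24.8 or later, the examples shipped with the library are the better starting point.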