How to get the indent that a tab character would advance to

If processing a Word document that has a paragraph like this:
\t This is some text
where the \t is a tab character, can I use Aspose to determine which tab stop that tab will take me to? I have seen GetEffectiveTabStops, but that doesn’t necessarily tell me which tab stop the tab ends on.

I essentially want to normalize a paragraph where a user has used space bar and tab characters to get the text to line up (instead of using the proper left indent property). So I want to calculate what the “effective” indent is (i.e. where the non-whitespace text starts).

Hi Simon,

Thanks for your inquiry. Could you please share your input and expected output documents here for our reference? We will investigate how you want the final Word output to be generated, and then provide more information along with code.

In the attached file, in Word if you hide paragraph marks they look like they’re in the same list, but if you turn on paragraph marks you’ll see that the second one is actually typed out (using spaces and tabs to get it to line up).

So for the “Hello there” paragraph I can easily use Aspose to get me the numbering indent and paragraph indent. However for the “An example” paragraph Aspose will tell me that there is no number and a zero paragraph indent. So what I am doing is detecting that a paragraph starts with a number, and then I’m trying to calculate what the numbering indent and paragraph indent of that paragraph would be if it were done with paragraph properties instead of it being typed out, so that I can match it up to any lists with the same indents.
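To make the matching step concrete, here is a minimal sketch of how the two apparent indents might be compared against candidate list levels. Everything in it is an assumption for illustration: the `ListIndentMatcher` class, the tuple field names and the 0.5 pt tolerance are hypothetical, and the candidate values would have to be read from the real lists in the document.

```csharp
using System;
using System.Collections.Generic;

public static class ListIndentMatcher
{
    // Returns the index of the first candidate level whose number indent and
    // text indent both lie within `tolerance` points of the measured values,
    // or -1 when no candidate matches. All values are in points.
    public static int FindMatchingLevel(
        IReadOnlyList<(double NumberIndent, double TextIndent)> levels,
        double apparentNumberIndent,
        double apparentTextIndent,
        double tolerance = 0.5) // hypothetical tolerance, tune to taste
    {
        for (int i = 0; i < levels.Count; i++)
        {
            bool numberMatches =
                Math.Abs(levels[i].NumberIndent - apparentNumberIndent) <= tolerance;
            bool textMatches =
                Math.Abs(levels[i].TextIndent - apparentTextIndent) <= tolerance;
            if (numberMatches && textMatches)
                return i;
        }
        return -1;
    }
}
```

A tolerance is used rather than exact equality because measured whitespace widths rarely land on exactly the same value as an indent stored in the paragraph properties.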

I’m going to call these indents the “apparent” indent because if you were to print it, it looks as though it’s indented to that level, even though it’s using white space characters to do so.

I could do it using Graphics.MeasureString but that won’t work with tabs because the tabs use tabstops. So, how would you suggest I calculate the “apparent indent” to the number and the start of the paragraph for the “An example” paragraph?
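For reference, the calculation being asked about can be approximated by hand: spaces advance by a measured width, and each tab jumps to the next tab stop to its right. The sketch below is only that approximation, not how Word or Aspose lays out text. The `ApparentIndent` class and its parameters are hypothetical; the custom stops are assumed to come from something like GetEffectiveTabStops, the space width from a measurement such as Graphics.MeasureString, and the default tab interval is assumed to be known. It treats every stop as left-aligned and ignores special cases such as hanging indents.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ApparentIndent
{
    private const double Epsilon = 1e-6;

    // Returns the position (in points) a tab typed at `current` advances to.
    public static double NextTabStop(
        double current, IEnumerable<double> customStops, double defaultInterval)
    {
        // The first custom stop strictly to the right of the current position wins.
        foreach (double stop in customStops.OrderBy(s => s))
        {
            if (stop > current + Epsilon)
                return stop;
        }
        // Otherwise fall back to the next multiple of the default interval.
        return (Math.Floor((current + Epsilon) / defaultInterval) + 1) * defaultInterval;
    }

    // Walks the leading spaces and tabs of `text` and returns the position
    // (in points) at which the first non-whitespace character starts.
    public static double Measure(
        string text, double startIndent, double spaceWidth,
        IEnumerable<double> customStops, double defaultInterval)
    {
        double[] stops = customStops.ToArray();
        double x = startIndent;
        foreach (char c in text)
        {
            if (c == ' ')
                x += spaceWidth;
            else if (c == '\t')
                x = NextTabStop(x, stops, defaultInterval);
            else
                break; // first visible character reached
        }
        return x;
    }
}
```

For example, `ApparentIndent.Measure(" \tNew text", 0, 3, new double[0], 36)` returns 36: the space moves to 3 pt and the tab then jumps to the first default stop. A position taken from the document layout would still be the more reliable figure, since it reflects the rules this sketch leaves out.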

Hi Simon,

Thanks for sharing the details. The second paragraph in your document does not contain a tab stop; rather, it contains a tab character (ControlChar.Tab) and spaces. Please check the attached image for details. You can check for the tab character in a paragraph using the following code example. Hope this helps you.

Document doc = new Document(MyDir + "Example.docx");
// Iterate through all paragraphs in the document
foreach (Paragraph para in doc.GetChildNodes(NodeType.Paragraph, true))
{
    Console.WriteLine(para.GetText().Contains(ControlChar.Tab));
}
// Insert the same spaces and tab character into the document.
DocumentBuilder builder = new DocumentBuilder(doc);
builder.MoveToDocumentEnd();
builder.Writeln();
builder.Writeln(" " + ControlChar.Tab + "New text");
doc.Save(MyDir + "output.docx");

OK thanks for clearing that up for me, I thought tab stops were used whenever tab characters were used.

However I suspect you’ve missed the point here. I want to know what the INDENT to the number is. i.e. translate the whitespace characters (be it spaces or tabs) into an equivalent indent value so that I can remove the whitespace and modify the paragraph format so that the end result looks the same, but is using paragraph formatting instead of whitespace. Anyway, given it’s just a tab character and not using a tab stop can I just use the Graphics.MeasureString to get the value?

No, it doesn’t look like it. I’ve uploaded another example file. The tab character is definitely using the tab stop I added. Please advise on how I can calculate the “apparent indent”.

Perhaps I need to insert a dummy run and then use the LayoutCollector to calculate the indent in pixels and then convert it back to points… do you have a suggestion as to how I can achieve the above?

Hi Simon,

Thanks for your inquiry. The Aspose.Words.Layout namespace provides classes that give access to layout information, such as which page a particular document element is on and where on that page it is positioned once the document is formatted into pages.

We suggest using the LayoutCollector.GetEntity method to get an opaque position for the LayoutEnumerator that corresponds to the specified node. You can assign the returned value to LayoutEnumerator.Current, provided the document being enumerated and the document of the node are the same.

All text of the document is stored in runs of text. A Run node may contain one or more characters. If you need to navigate to a particular Run, you can insert a bookmark right before it and then get the position of that bookmark. Please check the following code example. The code of the FindAndInsertBookmark class is attached to this post.

Moreover, you can use the ListLevel.NumberPosition property to get or set the position (in points) of the number or bullet for the list level, and the ListLevel.TextPosition property to get or set the position (in points) of the second line of wrapping text for the list level.

Hope this helps you.

Document doc = new Document(MyDir + "Example.docx");
// Iterate through all paragraphs in the document
foreach (Paragraph para in doc.GetChildNodes(NodeType.Paragraph, true))
{
    Console.WriteLine(para.ParagraphFormat.LeftIndent);
    if (para.IsListItem)
    {
        // Use the level actually applied to this paragraph rather than always level 0.
        Console.WriteLine(para.ListFormat.ListLevel.NumberPosition);
        Console.WriteLine(para.ListFormat.ListLevel.TextPosition);
    }
}
Paragraph paragraph1 = (Paragraph)doc.GetChild(NodeType.Paragraph, 1, true);
paragraph1.Range.Replace("An example", "", new FindReplaceOptions { ReplacingCallback = new FindAndInsertBookmark("bookmark2") });
Bookmark bm = paragraph1.Range.Bookmarks["bookmark2"];
LayoutCollector collector = new LayoutCollector(doc);
LayoutEnumerator layoutEnumerator = new LayoutEnumerator(doc);
var renderObject = collector.GetEntity(bm.BookmarkStart);
layoutEnumerator.Current = renderObject;
RectangleF location2 = layoutEnumerator.Rectangle;
// The rectangle is relative to the page, so subtract the left margin to get the indent.
Console.WriteLine("Calculated position of 'An example': " + (location2.X - doc.FirstSection.PageSetup.LeftMargin));
bm.Remove();
doc.Save(MyDir + "output.docx");