TextFragmentAbsorber returns positions with padding

Hello, I need to highlight some words in a Word document that has been converted to a PNG image. My pipeline is:

  1. Get Word document “Contract1.doc”
  2. Convert “Contract1.doc” to PDF
  3. Find words’ bounding rectangles using TextFragmentAbsorber
  4. Save source document as PNG image “aspose_test.png”
  5. For each rectangle found, draw its bounding box on the image “aspose_test.png”

See code below:

// Assumed usings: System.Drawing, Aspose.Words, Aspose.Words.Saving, Aspose.Pdf.Text
private static void ExampleForAsposeForum()
{
    var wordDocument = new Aspose.Words.Document(@"D:\Contract1.doc");
    wordDocument.Save("./aspose_test.pdf", SaveFormat.Pdf);
    var imageSaveOptions = new ImageSaveOptions(SaveFormat.Png)
    {
        PageIndex = 0,
        Resolution = 300
    };

    wordDocument.Save("./aspose_test.png", imageSaveOptions);
      
    using (var img = new Bitmap("./aspose_test.png"))
    {
        var pdfDocument = new Aspose.Pdf.Document("./aspose_test.pdf");
        var words = new[]
        {
            "189981",
            "Боцмана",
            "«ТехноСистемы»",
            "Заказчика",
            "\"Ивановское\"",
            "Договора подряда",
            "СНиПов",
            "Цена\\s+договора,"
        };

        var regex = string.Join("|", words);
        var textFragmentAbsorber = new TextFragmentAbsorber(regex, new TextSearchOptions(true));
        textFragmentAbsorber.Visit(pdfDocument);

        foreach (TextFragment textFragment in textFragmentAbsorber.TextFragments)
        {
            var rect = new Rectangle(
                (int)textFragment.Rectangle.LLX, 
                (int)pdfDocument.Pages[1].MediaBox.Height - (int)textFragment.Rectangle.Height - (int)textFragment.Rectangle.LLY,
                (int)textFragment.Rectangle.Width, 
                (int)textFragment.Rectangle.Height);
          
            var vRatio = img.Height / pdfDocument.Pages[1].MediaBox.Height;
            var hRatio = img.Width / pdfDocument.Pages[1].MediaBox.Width;

            var scaledRect = new Rectangle(
                (int)(rect.X * hRatio),
                (int)(rect.Y * vRatio),
                (int)(rect.Width * hRatio),
                (int)(rect.Height * vRatio));

            using (var g = Graphics.FromImage(img))
            using (var pen = new Pen(Color.Red, 5))
            {
                g.DrawRectangle(pen, scaledRect);
            }
        }

        // Save once, after all rectangles have been drawn.
        img.Save("./aspose_test_highlight.png");
    }   
}

The problem is that for some tokens (“189981”, “«ТехноСистемы»”) the bounding rectangles are shifted slightly to the left.
I have attached the source document and the resulting image to this post: testdata.zip (358.8 KB)
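
A side note on the conversion step: the code above casts each PDF coordinate to int before scaling, so up to one point (roughly 4 px at 300 DPI) can be lost per coordinate. Below is a minimal sketch of the same point-to-pixel conversion done in floating point, with rounding only at the end. `PdfToPixel` and `ToPixels` are hypothetical names used for illustration and are not part of Aspose, and this is not claimed to be the cause of the shift described here.

```csharp
using System;

// Hypothetical helper, not part of any Aspose API.
static class PdfToPixel
{
    // Converts a rectangle given in PDF points (origin at the bottom-left of the page)
    // to pixel coordinates (origin at the top-left of the image).
    public static (int X, int Y, int Width, int Height) ToPixels(
        double llx, double lly, double urx, double ury,
        double pageWidthPt, double pageHeightPt,
        int imageWidthPx, int imageHeightPx)
    {
        // Separate scale factors for each axis.
        double hRatio = imageWidthPx / pageWidthPt;
        double vRatio = imageHeightPx / pageHeightPt;

        // Scale first, in floating point; flip the Y axis using the page height.
        double left = llx * hRatio;
        double right = urx * hRatio;
        double top = (pageHeightPt - ury) * vRatio;
        double bottom = (pageHeightPt - lly) * vRatio;

        // Round outwards once, so the box never cuts into the text.
        int x = (int)Math.Floor(left);
        int y = (int)Math.Floor(top);
        int width = (int)Math.Ceiling(right) - x;
        int height = (int)Math.Ceiling(bottom) - y;
        return (x, y, width, height);
    }
}
```

For example, on a hypothetical 600 x 800 pt page rendered to 2400 x 3200 px, `PdfToPixel.ToPixels(100.6, 700.2, 150.4, 712.9, 600, 800, 2400, 3200)` gives `(402, 348, 200, 52)`, whereas truncating 100.6 to 100 before scaling would put the left edge at 400.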

@feeeper

I have worked with the data you shared and was able to reproduce the issue in our environment. A ticket with ID PDFNET-43773 has been logged in our issue management system for further investigation and resolution. The issue ID has been linked to this thread, so you will receive a notification as soon as the issue is resolved.

We are sorry for the inconvenience.