Hello, I need to highlight some words in a Word document that has been converted to a PNG image. My pipeline is:
- Get Word document “Contract1.doc”
- Convert “Contract1.doc” to PDF
- Find the words’ bounding rectangles using TextFragmentAbsorber
- Save source document as PNG image “aspose_test.png”
- For each rectangle found, draw the bounding rectangle on the image “aspose_test.png”
See code below:
using System.Drawing;
using Aspose.Pdf.Text;
using Aspose.Words;
using Aspose.Words.Saving;

private static void ExampleForAsposeForum()
{
    // Convert the Word document to PDF (for text search) and to PNG (for drawing).
    var wordDocument = new Aspose.Words.Document(@"D:\Contract1.doc");
    wordDocument.Save("./aspose_test.pdf", SaveFormat.Pdf);
    var imageSaveOptions = new ImageSaveOptions(SaveFormat.Png)
    {
        PageIndex = 0,
        Resolution = 300
    };
    wordDocument.Save("./aspose_test.png", imageSaveOptions);

    using (var img = new Bitmap("./aspose_test.png"))
    using (var g = Graphics.FromImage(img))
    using (var pen = new Pen(Color.Red, 5))
    {
        var pdfDocument = new Aspose.Pdf.Document("./aspose_test.pdf");
        var words = new[]
        {
            "189981",
            "Боцмана",
            "«ТехноСистемы»",
            "Заказчика",
            "\"Ивановское\"",
            "Договора подряда",
            "СНиПов",
            "Цена\\s+договора,"
        };

        // Combine all tokens into one regular expression and search the PDF.
        var regex = string.Join("|", words);
        var textFragmentAbsorber = new TextFragmentAbsorber(regex, new TextSearchOptions(true));
        textFragmentAbsorber.Visit(pdfDocument);

        // Scale factors from PDF points (first page) to image pixels.
        var mediaBox = pdfDocument.Pages[1].MediaBox;
        var vRatio = img.Height / mediaBox.Height;
        var hRatio = img.Width / mediaBox.Width;

        foreach (TextFragment textFragment in textFragmentAbsorber.TextFragments)
        {
            // The PDF origin is bottom-left, the image origin is top-left, so flip Y.
            var rect = new Rectangle(
                (int)textFragment.Rectangle.LLX,
                (int)mediaBox.Height - (int)textFragment.Rectangle.Height - (int)textFragment.Rectangle.LLY,
                (int)textFragment.Rectangle.Width,
                (int)textFragment.Rectangle.Height);
            var scaledRect = new Rectangle(
                (int)(rect.X * hRatio),
                (int)(rect.Y * vRatio),
                (int)(rect.Width * hRatio),
                (int)(rect.Height * vRatio)); // height uses the vertical ratio
            g.DrawRectangle(pen, scaledRect);
        }

        // Save the annotated image once, after all rectangles are drawn.
        img.Save("./aspose_test_highlight.png");
    }
}
The problem is that for some tokens (“189981”, “«ТехноСистемы»”) the bounding rectangles I get are shifted slightly to the left.
I have attached the source document and the resulting image to this post: testdata.zip (358.8 KB)
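
The coordinate mapping performed in the loop above (PDF points with a bottom-left origin, converted to image pixels with a top-left origin) can be sketched in isolation as follows. This is only a minimal sketch: `PdfToPixelMapper` and `ToPixelRect` are hypothetical names, not part of Aspose, and the function takes plain numbers rather than Aspose types. Unlike the posted code, it keeps everything in floating point and rounds only at the end. Whether the integer casts before scaling account for the shift described above is an assumption that has not been verified against the attached document.

```csharp
using System;
using System.Drawing;

public static class PdfToPixelMapper
{
    // Hypothetical helper (not an Aspose API): maps a rectangle given in PDF points
    // (origin at the bottom-left corner) to image pixels (origin at the top-left corner).
    public static Rectangle ToPixelRect(
        double llx, double lly, double urx, double ury,
        double pageWidthPt, double pageHeightPt,
        int imageWidthPx, int imageHeightPx)
    {
        // Pixels per PDF point along each axis.
        double hRatio = imageWidthPx / pageWidthPt;
        double vRatio = imageHeightPx / pageHeightPt;

        // Scale in floating point; flip Y because the PDF Y axis points up.
        double left = llx * hRatio;
        double right = urx * hRatio;
        double top = (pageHeightPt - ury) * vRatio;
        double bottom = (pageHeightPt - lly) * vRatio;

        // Round outward only at the very end so the box does not cut into the text.
        return Rectangle.FromLTRB(
            (int)Math.Floor(left), (int)Math.Floor(top),
            (int)Math.Ceiling(right), (int)Math.Ceiling(bottom));
    }
}
```

For example, with a scale factor of 4 px per point, an LLX of 10.9 pt maps to pixel 43 here, whereas casting to `int` first (10 pt) and then scaling gives pixel 40. At 300 DPI the factor is about 4.17 px per point, so truncating before scaling can move an edge by up to roughly 4 px.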