I need to search a PDF document with a regex that produces a positive match only if two words are both found, in any order. Does the Aspose.PDF regex engine support this? I have tested other regular expressions successfully.
Thank you for contacting support.
The criteria for a text search are defined entirely by the regular expression, and you can use different .NET approaches to meet your requirements. If you face any problem while working with regular expressions and the Aspose.PDF for .NET API, please feel free to contact us with all the details and we will be more than happy to assist you further.
The regex below does not appear to work. I would expect a positive match if both of those words appear anywhere in the PDF, in any order: (?=.*test)(?=.*long)
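Aspose.Pdf.Document pdfDocument = new Aspose.Pdf.Document(file.FullName);
Aspose.Pdf.Text.TextFragmentAbsorber textFragmentAbsorber = new Aspose.Pdf.Text.TextFragmentAbsorber();
// Lookahead pattern: expected to match when both "test" and "long" are present
textFragmentAbsorber.Phrase = "(?=.*test)(?=.*long)";
textFragmentAbsorber.TextSearchOptions.IsRegularExpressionUsed = true;
// Run the search across all pages of the document
pdfDocument.Pages.Accept(textFragmentAbsorber);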
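For comparison, here is a minimal sketch of how the same lookahead pattern behaves in plain .NET (System.Text.RegularExpressions), independent of Aspose.PDF. The class name LookaheadDemo, the helper ContainsBoth, and the sample string are hypothetical and used only for illustration. It assumes the text to search is already available as a string.

```csharp
using System;
using System.Text.RegularExpressions;

public static class LookaheadDemo
{
    // Hypothetical helper: true when both words occur somewhere in the text,
    // in any order. Each lookahead is checked from the same start position.
    public static bool ContainsBoth(string text, string first, string second)
    {
        string pattern = "(?=.*" + Regex.Escape(first) + ")"
                       + "(?=.*" + Regex.Escape(second) + ")";
        // Singleline lets '.' match newlines, so the two words may sit on
        // different lines of the text.
        return Regex.IsMatch(text, pattern, RegexOptions.Singleline);
    }

    public static void Main()
    {
        // Assumed sample text: the two words are on different lines.
        string sample = "a long document\nused as a test";

        Console.WriteLine(ContainsBoth(sample, "test", "long"));    // True
        Console.WriteLine(ContainsBoth(sample, "test", "missing")); // False

        // Without Singleline, '.' stops at the newline, so the same pattern
        // fails when the words are on different lines.
        Console.WriteLine(Regex.IsMatch(sample, "(?=.*test)(?=.*long)")); // False
    }
}
```

A pattern made only of lookaheads produces a zero-length match in .NET, so it answers a yes/no question rather than returning the matched text.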
Thank you for elaborating further.
Would you please share the sample PDF file you are working with, and verify this expression in an online regex tester or on any other platform? That will show whether it works elsewhere and the problem occurs only with the Aspose.PDF for .NET API.
Here is a sample of it matching https://regex101.com/r/21CapV/1
Here is the file lookaround.pdf (30.8 KB)
Thank you for sharing the requested data.
We have investigated your requirements, and regular expressions that use lookaround (lookahead/lookbehind) constructs do not seem to be supported at the moment. Therefore, an investigation ticket with ID PDFNET-45204 has been logged in our issue management system for further investigation and resolution. The ticket ID has been linked with this thread so that you will receive a notification as soon as the ticket is resolved.
We are sorry for the inconvenience.