Hi Team,
Hi Navaneethan,
TextFragmentAbsorber("(?i)Line", new TextSearchOptions(true));
Where should the search text go? Suppose I am searching for the text "testtext". Can you please share the code?
Hi Navaneethan,
//create TextFragmentAbsorber object to find all the phrases matching the regular expression
TextFragmentAbsorber textFragmentAbsorber = new TextFragmentAbsorber(@"(?i)\bthe\b");
//set text search option to specify regular expression usage
TextSearchOptions textSearchOptions = new TextSearchOptions(true);
textFragmentAbsorber.TextSearchOptions = textSearchOptions;
Please feel free to contact us for any further assistance.
Best Regards,
Hi Team,
It works when I hardcode the value, for example,
Hi Navaneethan,
TextFragmentAbsorber textFragmentAbsorber = new TextFragmentAbsorber(@"(?i)\b" + keyword + @"\b");
@tilal.ahmad
Hi!
I am stuck on a similar issue!
I am using this, but it gives an error that says: 'Syntax error on token "@", delete this token'.
I am working in Java.
Please help!
Please try removing the '@' from the expression. If the issue still persists, please share your sample PDF document and code snippet, and we will help you further.
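For readers following along: the '@' prefix is C# verbatim-string syntax and does not exist in Java. In Java source the regex token \b has to be written as "\\b", because a single "\b" inside a Java string literal is a backspace character. The sketch below uses only the standard java.util.regex package to show the resulting pattern; the class and method names are made up for illustration, and it assumes the keyword contains no regex metacharacters.

```java
import java.util.regex.Pattern;

public class WordBoundaryPattern {
    // Builds a case-insensitive, whole-word regex for the given keyword.
    // Assumption: the keyword contains no regex metacharacters.
    public static String buildRegex(String keyword) {
        // "\\b" in Java source is the two-character regex token \b (word boundary).
        return "(?i)\\b" + keyword + "\\b";
    }

    public static void main(String[] args) {
        String regex = buildRegex("testtext");
        System.out.println(regex); // prints (?i)\btesttext\b
        // Matches regardless of case when the keyword is a whole word.
        System.out.println(Pattern.compile(regex).matcher("Some TestText here").find()); // prints true
        // Does not match when the keyword is embedded in a longer word.
        System.out.println(Pattern.compile(regex).matcher("sometesttexthere").find()); // prints false
    }
}
```

The string returned by buildRegex is what would be passed to the TextFragmentAbsorber constructor together with TextSearchOptions(true).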
@asad.ali
I tried it, but it’s not working!
I am actually listing the number of docs containing the string that is searched.
Here is my code :
import java.io.File;
import com.aspose.pdf.*;

String strFind = "Test";
int count = 0;
File[] files = new File("E:\\").listFiles();
for (File file : files) {
    if (file.isFile()) {
        String fileName = file.getName();
        // Only process PDF files
        if (fileName.endsWith(".pdf")) {
            //System.out.println("Processing document: " + fileName);
            com.aspose.pdf.Document pdfDoc = new com.aspose.pdf.Document(file.getAbsolutePath());
            // (?i) makes the match case-insensitive; \\b marks a word boundary (no '@' prefix in Java)
            TextFragmentAbsorber textFragmentAbsorber = new TextFragmentAbsorber("(?i)\\b" + strFind + "\\b");
            TextSearchOptions textSearchOptions = new TextSearchOptions(true);
            textFragmentAbsorber.setTextSearchOptions(textSearchOptions);
            // Accept the absorber for all pages of the document
            pdfDoc.getPages().accept(textFragmentAbsorber);
            // Get the extracted text fragments into collection
            TextFragmentCollection textFragmentCollection = textFragmentAbsorber.getTextFragments();
            for (TextFragment textFragment : (Iterable<TextFragment>) textFragmentCollection) {
                // Compare string content, not references
                if (!"".equals(textFragment.getText())) {
                    count++;
                }
            }
            if (count > 0) {
                System.out.println("E:\\" + file.getName() + " || Count=" + count);
            }
            // Reset the counter for the next document
            count = 0;
        }
    }
}
I want my search to be case-insensitive. I know one way is to use "(?i)".
But I want to pass the search text as a variable.
Please help!
We have tested the scenario in our environment using the following code snippet with one of our sample PDF documents and did not notice any issue. The API was able to find text based on regular expressions:
Document pdfDoc = new Document(dataDir + "sample.pdf");
TextSearchOptions textSearchOptions = new TextSearchOptions(true);
String regex = "(?i)\\Small Demonstration\\b";
TextFragmentAbsorber textFragmentAbsorber = new TextFragmentAbsorber(regex, textSearchOptions);
pdfDoc.getPages().accept(textFragmentAbsorber);
TextFragmentCollection textFragmentCollection = textFragmentAbsorber.getTextFragments();
int count = 0;
for (TextFragment textFragment : textFragmentCollection) {
    count++;
    System.out.println(count + ". " + textFragment.getText());
}
Could you please share your sample PDF document along with the text you want to extract? We will test the scenario further in our environment and address it accordingly.
This is working.
But I actually wanted to use a variable instead of passing hard-coded text.
Also, (?i)\Small Demonstration\b seems to be missing a 'b', I guess!
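A side note on the missing 'b': in a regular expression, \S means "any non-whitespace character", so the pattern as posted most likely still matched "Small Demonstration" by accident, just without the leading word boundary. The sketch below checks this with the standard java.util.regex package. It assumes the Aspose.PDF search follows the same regex semantics, which is not confirmed in this thread; the class and method names are illustrative only.

```java
import java.util.regex.Pattern;

public class MissingBoundaryDemo {
    // Returns true if the regex matches anywhere in the text.
    public static boolean matches(String regex, String text) {
        return Pattern.compile(regex).matcher(text).find();
    }

    public static void main(String[] args) {
        // As posted: \S (any non-whitespace) followed by "mall Demonstration".
        String asPosted = "(?i)\\Small Demonstration\\b";
        // Presumably intended: word boundary, then the literal phrase.
        String intended = "(?i)\\bSmall Demonstration\\b";

        System.out.println(matches(asPosted, "A Small Demonstration of search")); // prints true
        System.out.println(matches(asPosted, "An xmall demonstration"));          // prints true
        System.out.println(matches(intended, "A Small Demonstration of search")); // prints true
        System.out.println(matches(intended, "An xmall demonstration"));          // prints false
    }
}
```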
Well, I have done it successfully with the variable too!
String find = "(?i)\\b" + strFind + "\\b";
TextFragmentAbsorber textFragmentAbsorber = new TextFragmentAbsorber(find);
This is working for me!
Anyway, thanks a lot for your cooperation!
Thank you for your kind feedback.
We are glad to know that things are now working in your environment. Please feel free to contact us if you need any further assistance.
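One caveat when the search text comes from a variable: if it contains regex metacharacters such as '.' or '+', concatenating it directly changes the meaning of the pattern. A possible safeguard is to wrap the variable with Pattern.quote, as sketched below with the standard java.util.regex package. Whether TextFragmentAbsorber accepts the \Q...\E quoting that Pattern.quote produces is an assumption not verified in this thread, and the class and method names are hypothetical.

```java
import java.util.regex.Pattern;

public class QuotedKeywordPattern {
    // Builds a case-insensitive, whole-word regex that treats the keyword literally.
    public static String buildRegex(String keyword) {
        // Pattern.quote wraps the keyword in \Q...\E so metacharacters lose their meaning.
        return "(?i)\\b" + Pattern.quote(keyword) + "\\b";
    }

    public static void main(String[] args) {
        String unquoted = "(?i)\\b" + "v1.0" + "\\b"; // '.' matches any character here
        String quoted = buildRegex("v1.0");           // '.' matches only a literal dot

        System.out.println(Pattern.compile(unquoted).matcher("Release v1x0 notes").find()); // prints true
        System.out.println(Pattern.compile(quoted).matcher("Release v1x0 notes").find());   // prints false
        System.out.println(Pattern.compile(quoted).matcher("Release V1.0 notes").find());   // prints true
    }
}
```

Note that \b only matches next to a letter, digit, or underscore, so keywords that start or end with punctuation may need a different boundary check.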