Hello,
We use the Aspose.PDF for .NET package (22.9). I would like to detect or ignore hidden text present in the text layer of a PDF file. Generally, it is text that has been modified by a user. The fragment.TextState.Invisible property always returns False for all text in the document.
Please note that this is a blocking issue for our ongoing development. We are willing to pay if necessary to have this bug fixed, and our future use of your SDK depends on its resolution.
26025923_26025923_org.pdf (229.0 KB)
The added information is:
NOTE D’HONORAIRES n° 1
Facture n° 01.11.22
The hidden information is:
NOTE D’HONORAIRES N° 06 Solde
Facture n° 06 / 02 / 13
@Magali_ISAIA
Can you please share the complete sample code snippet for our reference so that we can test the scenario in our environment accordingly and address it?
using System;
using System.IO;
using Aspose.Pdf.Text;

public static void Run()
{
    //ExStart: AddAndSearchHiddenText
    // The path to the documents directory.
    //string dataDir = RunExamples.GetDataDir_AsposePdf_Text();
    string dataDir = "X:\\Formulaires\\YoozingonLine\\02-Input convert\\";
    if (!Directory.Exists(dataDir))
    {
        Console.WriteLine("Directory does not exist: " + dataDir);
        return;
    }

    foreach (string fileName in Directory.GetFiles(dataDir))
    {
        // Process PDF files only (case-insensitive extension check).
        if (!string.Equals(Path.GetExtension(fileName), ".pdf", StringComparison.OrdinalIgnoreCase))
            continue;

        // The report is written next to the PDF, with a .txt extension.
        string textFile = Path.ChangeExtension(fileName, ".txt");
        Console.WriteLine("****** pdf--> " + fileName + " ---> ******");
        try
        {
            // The using blocks dispose the document and close the writer even if an exception is thrown.
            using (Aspose.Pdf.Document pdfDocument = new Aspose.Pdf.Document(fileName))
            using (StreamWriter sw = new StreamWriter(textFile))
            {
                // Collect the text fragments from every page.
                TextFragmentAbsorber absorber = new TextFragmentAbsorber();
                pdfDocument.Pages.Accept(absorber);
                foreach (TextFragment fragment in absorber.TextFragments)
                {
                    // Invisible is reported as False for every fragment, including the hidden text.
                    string line = string.Format("Text '{0}' on pos {1} invisibility: {2} ",
                        fragment.Text, fragment.Position, fragment.TextState.Invisible);
                    Console.WriteLine(line);
                    sw.WriteLine(line);
                }
            }
        }
        catch (Exception e)
        {
            Console.WriteLine("Exception: " + e.Message);
        }
    }
    //ExEnd: AddAndSearchHiddenText
}
@Magali_ISAIA
The issue has been logged as PDFNET-53159 in our issue tracking system for further investigation. We will look into the details of this scenario and let you know as soon as the ticket is resolved. Please be patient and allow us some time.
We are sorry for the inconvenience.
The issue you reported earlier (filed as PDFNET-53159) has been fixed in Aspose.PDF for .NET 23.11.