Hi
I want to split a PDF document into individual pages, except where “2PW” appears on a page. Where “2PW” is found, that page and the following page should be output together as a single file.
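To make the intended rule concrete, here is a minimal sketch of the grouping on its own, without any Aspose calls. `GroupPages` and the `hasMarker` list are hypothetical names used only for illustration; `hasMarker(i)` stands in for "page i + 1 contains 2PW", however that ends up being detected.

```vb
Imports System
Imports System.Collections.Generic

Module PageGrouping
    ' Hypothetical helper: hasMarker(i) is True when page i + 1 contains "2PW".
    ' Returns groups of 1-based page numbers, one group per output file.
    Function GroupPages(hasMarker As IList(Of Boolean)) As List(Of List(Of Integer))
        Dim groups As New List(Of List(Of Integer))
        Dim i As Integer = 0
        While i < hasMarker.Count
            Dim group As New List(Of Integer) From {i + 1}
            ' A marked page takes the following page with it, if there is one.
            If hasMarker(i) AndAlso i + 1 < hasMarker.Count Then
                group.Add(i + 2)
                i += 1
            End If
            groups.Add(group)
            i += 1
        End While
        Return groups
    End Function

    Sub Main()
        ' Page 2 is marked, so pages 2 and 3 share one output file.
        Dim marks As Boolean() = {False, True, False, False}
        For Each g As List(Of Integer) In GroupPages(marks)
            Console.WriteLine(String.Join(",", g))
        Next
    End Sub
End Module
```

For the four-page example in `Main`, this gives three groups: page 1 alone, pages 2 and 3 together, and page 4 alone. The second page of a pair is not checked for the marker in this sketch; it is simply attached to the page before it.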
I search each page of the PDF individually for “2PW”. The problem is that once the phrase is found on one page, every subsequent page also reports a match, even when the phrase isn’t present on those pages.
Please find example code below and a test document attached. I am using the most recent Aspose.PDF for .NET DLL, version 17.9.0.0.
Dim licPDF As Aspose.Pdf.License = New Aspose.Pdf.License
licPDF.SetLicense(clsF2F.Globals.PathLicences)
Dim txtInputPath As String = "C:\Temp\In"
Dim txtOutputPath As String = "C:\Temp\Out"
Dim strFiles As String() = System.IO.Directory.GetFiles(txtInputPath, "*.pdf", IO.SearchOption.TopDirectoryOnly)
For Each strFile As String In strFiles
Dim pdfDoc As Aspose.Pdf.Document = Nothing
pdfDoc = New Aspose.Pdf.Document(strFile)
Dim textFragmentAbsorber As New Aspose.Pdf.Text.TextFragmentAbsorber("2PW")
Dim pdfNewDoc As Aspose.Pdf.Document = Nothing
Dim intPage As Integer = 0
Dim int2PageWarrant As Integer = 0
' int2PageWarrant states: 0 = not in a 2-page warrant
'   1 = "2PW" found; this is the first page
'   2 = this is the second page
For Each PdfP As Aspose.Pdf.Page In pdfDoc.Pages
intPage += 1
' Search the current page for "2PW", the indicator that this is a 2-page Warrant document
pdfDoc.Pages(intPage).Accept(textFragmentAbsorber)
Dim textFragmentCollection As Aspose.Pdf.Text.TextFragmentCollection = Nothing
textFragmentCollection = textFragmentAbsorber.TextFragments
If Not IsNothing(textFragmentCollection) Then
' Loop through the fragments; any match marks this page as the first page of a 2-page warrant
For Each textFragment As Aspose.Pdf.Text.TextFragment In textFragmentCollection
int2PageWarrant = 1
Next
End If
Dim strOutputFile As String = ""
If int2PageWarrant = 0 Then
pdfNewDoc = New Aspose.Pdf.Document
pdfNewDoc.Pages.Add(PdfP)
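' Path.Combine inserts the directory separator that plain concatenation would omit
strOutputFile = System.IO.Path.Combine(txtOutputPath, System.IO.Path.GetFileNameWithoutExtension(strFile) & "_" & intPage.ToString().PadLeft(4, "0"c) & ".pdf")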
pdfNewDoc.Save(strOutputFile)
ElseIf int2PageWarrant = 1 Then
pdfNewDoc = New Aspose.Pdf.Document
pdfNewDoc.Pages.Add(PdfP)
int2PageWarrant += 1
ElseIf int2PageWarrant = 2 Then
pdfNewDoc.Pages.Add(PdfP)
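' Path.Combine inserts the directory separator that plain concatenation would omit
strOutputFile = System.IO.Path.Combine(txtOutputPath, System.IO.Path.GetFileNameWithoutExtension(strFile) & "_" & intPage.ToString().PadLeft(4, "0"c) & "II.pdf")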
pdfNewDoc.Save(strOutputFile)
int2PageWarrant = 0
Else
Throw New Exception("Unexpected Warrant page count")
End If
Next
Next
TEST2.pdf (87.6 KB)
Many Thanks
Ian Tyrrell