Calling ReadToEnd on the text file produced by PdfExtractor no longer works properly in version 5.4.0. It used to return all the text for a page, but now returns only the first character. Here is my sample code:
Public Sub GetTextFromPDF(ByVal sPDF As String, ByVal sTempFolder As String)
    Dim oPDFText As New Aspose.Pdf.Kit.PdfExtractor
    oPDFText.BindPdf(sPDF)
    oPDFText.ExtractText()
    Dim nPage As Integer = 1
    Do While oPDFText.HasNextPageText
        ' Write the next page's text to a temporary file
        Dim sTextFile As String
        sTextFile = System.IO.Path.Combine(sTempFolder, "PdfToTextPage" & nPage.ToString & ".txt")
        oPDFText.GetNextPageText(sTextFile)
        ' Read the file back; Using disposes the reader before the file is deleted
        Dim sText As String
        Using oRead As System.IO.StreamReader = System.IO.File.OpenText(sTextFile)
            sText = oRead.ReadToEnd.Trim
        End Using
        System.IO.File.Delete(sTextFile)
        If sText = "" Then
            MsgBox("No Text.", MsgBoxStyle.OkOnly)
        Else
            MsgBox("Page " & nPage.ToString & vbCrLf & sText, MsgBoxStyle.OkOnly)
        End If
        ' Advance the page counter for every page, not only for empty ones
        nPage = nPage + 1
    Loop
End Sub
I have also attached my test file.