ReadToEnd in PDFExtractor is no longer working properly in version 5.4.0

ReadToEnd no longer works properly on the output of PdfExtractor in version 5.4.0. It used to return all the text for a page, but now it returns only the first character. Here is my sample code:

Public Sub GetTextFromPDF(ByVal sPDF As String, ByVal sTempFolder As String)

    Dim oPDFText As New Aspose.Pdf.Kit.PdfExtractor

    oPDFText.BindPdf(sPDF)
    oPDFText.ExtractText()
    Dim nPage As Integer = 1
    Do While oPDFText.HasNextPageText
        ' Write the current page's text to a temporary file, then read it back.
        Dim sTextFile As String
        sTextFile = sTempFolder & "PdfToTextPage" & nPage.ToString & ".txt"
        oPDFText.GetNextPageText(sTextFile)
        Dim oRead As System.IO.StreamReader
        oRead = System.IO.File.OpenText(sTextFile)
        Dim sText = oRead.ReadToEnd.Trim
        ' Release the reader before deleting the temporary file.
        oRead.Dispose()
        oRead = Nothing
        System.IO.File.Delete(sTextFile)
        If sText = "" Then
            MsgBox("No Text.", MsgBoxStyle.OkOnly)
        Else
            MsgBox("Page " & nPage.ToString & vbCrLf & sText, MsgBoxStyle.OkOnly)
        End If
        ' Advance the page counter for every page, with or without text.
        nPage = nPage + 1
    Loop

End Sub

I have also attached my test file.

Hi,

I tested the issue with your file and code snippet, and a small change to the code made it work on my end with the latest version. You only need to specify the encoding when extracting the text, as shown below:

oPDFText.ExtractText(Encoding.UTF8)

We have improved the text extraction mechanism since version 4.2.0; you may check the details in this post. Also, please make sure that a license is applied before extracting text, as the text extraction feature is quite limited in evaluation mode.
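
For reference, here is a rough sketch of how the original routine might look with the encoding change applied. The Aspose.Pdf.Kit calls are only the ones already quoted in this thread (BindPdf, ExtractText with an encoding, HasNextPageText, GetNextPageText); the module and sub names are hypothetical, and reading the file back with File.ReadAllText and an explicit UTF-8 encoding is an assumption made so that the reader and the extractor agree on the encoding. It has not been run against your file.

```vbnet
Imports System.IO
Imports System.Text

Public Module PdfTextSketch

    ' Hypothetical helper name; the PdfExtractor calls are those quoted in this thread.
    Public Sub ShowPageText(ByVal sPDF As String, ByVal sTempFolder As String)
        Dim oPDFText As New Aspose.Pdf.Kit.PdfExtractor()
        oPDFText.BindPdf(sPDF)
        ' Request UTF-8 output so it matches the encoding used to read the file back.
        oPDFText.ExtractText(Encoding.UTF8)

        Dim nPage As Integer = 1
        Do While oPDFText.HasNextPageText()
            ' Path.Combine avoids depending on a trailing separator in sTempFolder.
            Dim sTextFile As String = Path.Combine(sTempFolder, "PdfToTextPage" & nPage.ToString() & ".txt")
            oPDFText.GetNextPageText(sTextFile)

            ' Assumption: read with the same encoding that was passed to ExtractText.
            Dim sText As String = File.ReadAllText(sTextFile, Encoding.UTF8).Trim()
            File.Delete(sTextFile)

            If sText = "" Then
                MsgBox("No Text.", MsgBoxStyle.OkOnly)
            Else
                MsgBox("Page " & nPage.ToString() & vbCrLf & sText, MsgBoxStyle.OkOnly)
            End If
            nPage += 1
        Loop
    End Sub

End Module
```

File.ReadAllText opens, reads, and closes the file in one call, so there is no reader left to dispose before the temporary file is deleted.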

I hope this helps. If you have any further questions, please do let us know.
Regards,

Thanks for the updated code, that worked.