Aspose.Pdf.InvalidPdfFileFormatException: Aspose.PDF ‘Trailer not found’ when called through C++/CLI

I have a .NET assembly (4.7.2) that uses Aspose.PDF to extract text from a PDF file. I call this assembly from an MFC/C++ application through a managed C++ (C++/CLI) layer in the middle. That is, the C++ application calls a C++/CLI wrapper, which in turn calls the .NET assembly. Until now I was using Aspose.PDF 24.1.0 and it worked fine. I have now upgraded to 25.6.1, and it throws a ‘Trailer not found’ exception when I try to open the PDF file (Document document = new Document(@"C:\file1.pdf")). The same PDF file works when I use the same assembly directly in a sample C# (4.7.2) application. That tells me there is nothing wrong with the file; I also tried multiple PDF files and all of them failed. Something goes wrong when the assembly is called through C++/CLI. Any recommendations or suggestions will be greatly appreciated.

@subhupk1, could you provide an example of code and a corresponding PDF file?

Here is a stripped-down version of the code, with the PDF attached. I hope this will do; otherwise, please let me know.

File1.pdf (17.1 KB)

C# assembly code
// Stripped-down snippet: the return type and return statement are omitted
ExtractText(string file, string outputTextFile)
{
	using (Aspose.Pdf.Document pdfDocument = new Aspose.Pdf.Document(file))
	{
		if (pdfDocument != null && pdfDocument.Pages != null && pdfDocument.Pages.Count > 0)
		{
			// Create a TextAbsorber object to extract text
			Aspose.Pdf.Text.TextAbsorber textAbsorber = new Aspose.Pdf.Text.TextAbsorber();
			// Accept the absorber for all the pages
			pdfDocument.Pages.Accept(textAbsorber);
			// Get the extracted text
			string extractedText = textAbsorber.Text;
			// Create a writer for the output file; the using block closes it even if the write throws
			using (TextWriter tw = new StreamWriter(outputTextFile))
			{
				// Write the extracted text to the file
				tw.WriteLine(extractedText);
			}
		}
	}
}

C++/CLI code
bool FileConvertorCLR::ExtractText(const wchar_t* sourceFile, const wchar_t* targetFile)
{
	bool ret = false;

	// Return early on null paths instead of falling through to the managed call
	if (sourceFile == NULL || targetFile == NULL)
		return false;

	System::String^ clrSource = gcnew System::String(sourceFile);
	System::String^ clrTarget = gcnew System::String(targetFile);

	FileConvertor^ fc = gcnew FileConvertor();

	Results::ResultStatus^ result = fc->ExtractText(clrSource, clrTarget);

	// Stripped-down snippet: the code that maps result to ret is omitted
	return ret;
}

Client C++/MFC code

void ConvertFile()
{
	CString csError;

	FileConvertorCLR fl;
	if (!fl.ConvertFile(L"C:\\File1.pdf", L"C:\\Text1.txt"))
	{
		const wchar_t* error = fl.getLastError();

		if (error != NULL)
			csError.Format(L"%s", error);
	}
}
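
Since the exception complains about a missing trailer, one way to rule out the file and the path on the native side is to check the raw bytes before they reach the wrapper. The following is only a minimal sketch in standard C++, not part of Aspose.PDF: the names PdfMarkers, inspectPdfBytes and inspectPdfFile are hypothetical, the 1024-byte tail window is an arbitrary choice, and the path is taken as a narrow string for portability (the code above passes wchar_t paths). It relies only on the PDF file layout: a header starting with %PDF-, and a startxref keyword followed by %%EOF near the end.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <fstream>
#include <iterator>
#include <string>

// Hypothetical helper type: records which structural markers were found.
struct PdfMarkers {
    bool hasHeader = false;     // data starts with "%PDF-"
    bool hasStartXref = false;  // "startxref" appears near the end
    bool hasEof = false;        // "%%EOF" appears near the end
};

// Inspect PDF bytes that are already in memory.
PdfMarkers inspectPdfBytes(const std::string& bytes)
{
    PdfMarkers m;
    m.hasHeader = bytes.compare(0, 5, "%PDF-") == 0;

    // Only the tail is searched for the end-of-file markers.
    // The 1024-byte window is an assumption made for this sketch.
    const std::size_t window = std::min<std::size_t>(bytes.size(), 1024);
    const std::string tail = bytes.substr(bytes.size() - window);
    m.hasStartXref = tail.find("startxref") != std::string::npos;
    m.hasEof = tail.find("%%EOF") != std::string::npos;
    return m;
}

// Read the whole file in binary mode and inspect it.
// Returns false if the file cannot be opened from this process.
bool inspectPdfFile(const std::string& path, PdfMarkers& out)
{
    std::ifstream in(path, std::ios::binary);
    if (!in)
        return false;

    const std::string bytes((std::istreambuf_iterator<char>(in)),
                            std::istreambuf_iterator<char>());
    out = inspectPdfBytes(bytes);
    return true;
}
```

Calling inspectPdfFile on C:\File1.pdf from the MFC client just before the call into FileConvertorCLR would show whether the native process can open the file and sees the expected markers. If all three flags are set, the file and path are probably not the cause, and the difference is more likely in how the library is loaded or run under the C++/CLI host.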

@subhupk1 There is a working example of a wrapper in the attached zip file.
Instructions:

  1. Unzip wrapper.zip.
  2. Open Visual Studio as an Administrator. This is necessary to register the tlb file in the system registry.
  3. Open Wrapper.sln.
  4. Select the “Release” configuration.
  5. Build the solution. This step generates and registers the TextRetriever.tlb file, as well as the corresponding .tlh and .tli files.
  6. Open wrapper.cpp in the editor.
  7. Delete the empty main function and uncomment the commented code.
  8. Build the Wrapper project and run it (Ctrl-F5).

Notes:

  1. TextRetriever is a wrapper around the Aspose.PDF library. It exposes the ITextRetriever interface to COM.
  2. The Wrapper project interacts with TextRetriever through COM.

Wrapper.zip (23.0 KB)

wrapper.png (116.1 KB)

Thank you, Alexander. The sample works, but it requires at least .NET 4.8 (4.7.2 doesn’t work).