Trying to read a .pdf file with Aspose and only get "Evaluation Only"

I’ve got the following code to read a .pdf file, and the text only contains: “Evaluation Only. Created with Aspose.PDF. Copyright 2002-2023 Aspose Pty Ltd.
Levi-Civita symbo”.

        // Open PDF document
        Document pdfDocument = new Document(@"c:\kiersten\levi-civita.pdf");

        // Create TextAbsorber object to extract text
        TextAbsorber textAbsorber = new TextAbsorber();
        // Accept the absorber for all pages
        pdfDocument.Pages.Accept(textAbsorber);
        // Get the extracted text
        string extractedText = textAbsorber.Text;
        // Write the extracted text to a file; dataDir is the output folder, defined elsewhere.
        // The using block closes the stream even if the write throws.
        using (TextWriter tw = new StreamWriter(dataDir + "extracted-text.txt"))
        {
            tw.WriteLine(extractedText);
        }

@karkiekie

Please make sure to use a valid license or a free 30-day temporary license in order to evaluate the API at its full capacity. If the issue still persists, please share your sample file with us so that we can test the scenario in our environment and address it accordingly.
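As a minimal sketch of what that looks like in code: the license is applied once, before the document is loaded, so that text extraction is no longer limited to evaluation output. The license file path below is a hypothetical placeholder (it does not come from this thread); the PDF and output paths reuse the ones from the question.

```csharp
using System.IO;
using Aspose.Pdf;
using Aspose.Pdf.Text;

class ExtractTextWithLicense
{
    static void Main()
    {
        // Apply the Aspose.PDF license before any other Aspose.PDF call.
        // Hypothetical path: point this at your own Aspose.PDF license file.
        Aspose.Pdf.License license = new Aspose.Pdf.License();
        license.SetLicense(@"c:\projects\Aspose.PDF.NET.lic");

        // Open the PDF document; Document is disposable, so wrap it in using.
        using (Document pdfDocument = new Document(@"c:\kiersten\levi-civita.pdf"))
        {
            // Extract text from all pages.
            TextAbsorber textAbsorber = new TextAbsorber();
            pdfDocument.Pages.Accept(textAbsorber);

            // Write the extracted text next to the source file.
            File.WriteAllText(@"c:\kiersten\extracted-text.txt", textAbsorber.Text);
        }
    }
}
```

If the license is missing or is not applied before the document is opened, the extracted text is expected to start with the "Evaluation Only" notice again, as in the original question.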

Thank you! I got it and it is now reading the entire .pdf file. Now here comes the next part. The .pdf file has TeX and I need to read that in and convert it to raw TeX. So I got the Aspose.TeX for .NET temporary license and followed the code example below, but I get the error message “the license is not valid for this product” inside my .NET C# Windows Forms app.

      //PDF to TEX
        string pathToOutputDirectory = "c:\\kiersten\\";

        //Aspose.TeX.License = new Aspose.TeX.License();

        Aspose.Pdf.License lic = new Aspose.Pdf.License();
        lic.SetLicense(@"c:\projects\Aspose.TeX.NET.lic");

        // Load input PDF
        Aspose.Pdf.Document doc = new Aspose.Pdf.Document("c:\\kiersten\\levi-civita.pdf");

        // Create Tex save option          
        TeXSaveOptions saveOptions = new TeXSaveOptions();

        // Set path in TeXSaveOptions
        saveOptions.OutDirectoryPath = pathToOutputDirectory;

        // Save the source PDF file as TEX file           
        doc.Save(pathToOutputDirectory + "PDFToTeX_out.tex", saveOptions);

        System.Console.WriteLine("Done");

image.png (45.4 KB), levi-civita.pdf (85.6 KB)

@karkiekie

The most likely reason is that the license you applied for and obtained is for Aspose.PDF, which is why it is not compatible with Aspose.TeX. Please obtain a temporary license for Aspose.TeX as well so that you can test with it.

You are correct. This example here would not work: https://kb.aspose.com/pdf/net/how-to-convert-pdf-to-latex-in-csharp/

When I changed it to:

        //PDF to TEX
        string pathToOutputDirectory = "c:\\kiersten\\";

        Aspose.TeX.License lic = new Aspose.TeX.License();
        lic.SetLicense(@"c:\projects\Aspose.TeX.NET.lic");

        // Load input PDF
        Document doc = new Document("c:\\kiersten\\levi-civita.pdf");

        // Create Tex save option          
        TeXSaveOptions saveOptions = new TeXSaveOptions();

        // Set path in TeXSaveOptions
        saveOptions.OutDirectoryPath = pathToOutputDirectory;

        // Save the source PDF file as TEX file           
        doc.Save(pathToOutputDirectory + "PDFToTeX_out.tex", saveOptions);

        System.Console.WriteLine("Done");

It works!

Do you have a way for me to only pull out the equations from the .pdf file so that I don’t get all of that font stuff and other non-math text in my resultant .TeX file?

Thank you for all of your help Ali!

@karkiekie

We are afraid that this is not possible. The mathematical content is present as plain text inside the PDF, and the API does not offer any feature to recognize and isolate only the mathematical equations in a PDF document.