While converting XPS to PDF, the saved searchable PDF file is empty

Hi Team,

I am getting an empty PDF when running the code below. Can you help me with the correct settings?
I am using Aspose.OCR 22.10 and Aspose.Page 22.10.
Input
XPS docs.7z (325.2 KB)
Output
converted.pdf (1.2 KB)

        AsposeOcr api = new AsposeOcr();
        License lic = new License();
        lic.SetLicense("Aspose.OCR.NET.lic");

        string path = @"D:\OCR\TIFFs\XPS docs\Sample2.xps";
        // Initialize XPS input stream
        using (System.IO.MemoryStream xpsStream = new MemoryStream(File.ReadAllBytes(path)))
        // Initialize PDF output stream
        using (System.IO.MemoryStream pdfStream = new MemoryStream())
        {
            {
                // Load the XPS document from the file path (xpsStream above is not used)
                XpsDocument document = new XpsDocument(path, new XpsLoadOptions());

                Aspose.Page.XPS.Presentation.Pdf.PdfSaveOptions options = new Aspose.Page.XPS.Presentation.Pdf.PdfSaveOptions()
                {
                    JpegQualityLevel = 100,
                    ImageCompression = Aspose.Page.XPS.Presentation.Pdf.PdfImageCompression.LzwBaselinePredictor,
                    TextCompression = Aspose.Page.XPS.Presentation.Pdf.PdfTextCompression.Flate,
                    PageNumbers = new int[] { 1, 2 }
                };

                // Create rendering device for PDF format
                Aspose.Page.XPS.Presentation.Pdf.PdfDevice device = new Aspose.Page.XPS.Presentation.Pdf.PdfDevice(pdfStream);

                document.Save(device, options);

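                List<RecognitionResult> result = new List<RecognitionResult>();

                // Rewind the PDF stream so that it is read from the beginning
                pdfStream.Position = 0;

                // Run OCR on the in-memory PDF
                result.AddRange(api.RecognizePdf(pdfStream, new DocumentRecognitionSettings()));

                // Save the recognition results as a searchable PDF
                AsposeOcr.SaveMultipageDocument(@"D:\OCR\TIFFs\XPS docs\converted.pdf", SaveFormat.Pdf, result);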
            }
        }

@Gpatil

We have reproduced this issue in our environment and logged it in our issue tracking system as OCRNET-611. We will investigate it further and keep you posted on the status of the fix. Thank you for your patience.

We are sorry for the inconvenience.

Hi @asad.ali

Will it be possible to get this fix in the upcoming release? :slight_smile:

@Gpatil

The ticket is currently under investigation and we do not have an ETA yet. We will let you know as soon as we have an update. Thank you for your patience.

Hi @asad.ali, the ticket status appears to be resolved. When will this fix be released? I tried Aspose.OCR 22.11.1, but the fix does not seem to be part of that version.

@Gpatil

Aspose.OCR is an image recognition library, so you will get better results if you first convert the XPS document to images and then run recognition on those images:

            AsposeOcr api = new AsposeOcr();
            License lic = new License();
            lic.SetLicense("Aspose.Total.Product.Family.lic");

            string path = @"D:\imgs\ISSUES\NET611\XPS docs\Sample2.xps";
            // Initialize XPS input stream
            using (System.IO.MemoryStream xpsStream = new MemoryStream(File.ReadAllBytes(path)))
            // Initialize PDF output stream
            using (System.IO.MemoryStream pdfStream = new MemoryStream())
            {
                {
                    // Load the XPS document from the file path (xpsStream above is not used)
                    XpsDocument document = new XpsDocument(path, new XpsLoadOptions());
                    PngSaveOptions options = new PngSaveOptions()
                    {
                        SmoothingMode = System.Drawing.Drawing2D.SmoothingMode.HighQuality,
                        Resolution = 300
                    };

                    // Create rendering device for image
                    ImageDevice device = new ImageDevice();
                    document.Save(device, options);

                    List<RecognitionResult> result = new List<RecognitionResult>();
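                    // Iterate through document partitions (fixed documents, in XPS terms)
                    for (int i = 0; i < device.Result.Length; i++)
                    {
                        // Iterate through partition pages
                        for (int j = 0; j < device.Result[i].Length; j++)
                        {
                            // Copy the rendered page image into a stream and recognize it
                            using (MemoryStream imageStream = new MemoryStream())
                            {
                                imageStream.Write(device.Result[i][j], 0, device.Result[i][j].Length);
                                // Rewind so the image is read from the beginning
                                imageStream.Position = 0;
                                var res = api.RecognizeImage(imageStream, new RecognitionSettings());
                                result.Add(res);
                            }
                        }
                    }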
                    AsposeOcr.SaveMultipageDocument(@"converted.pdf", SaveFormat.Pdf, result);
                }
            }

converted.pdf (131.0 KB)

Thanks @asad.ali, I will try this and let you know.

@Gpatil

Sure, please take your time.

Thanks @asad.ali, it is working as expected :slight_smile: