Free Support Forum - aspose.com

Convert mhtml to pdf in blocks in single page

Hi,
I need to convert HTML/MHTML to PDF with the content laid out in blocks/columns.
I should be able to read the data and copy it into the PDF in blocks.
I have attached the input MHTML and the expected output PDF for reference (Aspose.zip, 2.8 MB).

@TejKamal_Thotakuri

Thank you for contacting support.

You can convert the MHTML file to a PDF document, then extract the text from that PDF and use it to build a multi-column PDF file. The code snippet below converts the MHTML file to a PDF document; please also see the documentation articles for reference.

        // Load the MHTML file using MHT-specific load options
        Document document = new Document(dataDir + "Source.mht", new Aspose.Pdf.MhtLoadOptions());
        // Save the output as a PDF document
        document.Save(dataDir + "MHTMLtoPDF_out.pdf");

We hope this will be helpful. Please feel free to contact us if you need any further assistance.

Thanks for the reply.
Could you please share an example of reading data from an existing PDF and copying it into a multi-column PDF while keeping the formatting?

@TejKamal_Thotakuri

We are looking into details and will get back to you soon with our findings.

Hi,
Is there any update on this, please?

@TejKamal_Thotakuri

We have investigated the scenario and found that you can extract text from a PDF document, but it is extracted as plain text, so the original formatting is not preserved. You can then create a multi-column PDF document from the extracted text, as in the code snippet below:

        // Open the PDF generated from the MHTML file in the previous step
        Document pdfDocument = new Document(dataDir + "MHTMLtoPDF_out.pdf");

        // Create TextAbsorber object to extract text
        TextAbsorber textAbsorber = new TextAbsorber();
        // Accept the absorber for all the pages
        pdfDocument.Pages.Accept(textAbsorber);
        // Get the extracted text
        string extractedText = textAbsorber.Text;
        
        Document doc = new Document();
        // Specify the left margin info for the PDF file
        doc.PageInfo.Margin.Left = 40;
        // Specify the Right margin info for the PDF file
        doc.PageInfo.Margin.Right = 40;
        Aspose.Pdf.Page page = doc.Pages.Add();

        // Create a Graph to hold a horizontal line at the top of the page
        Aspose.Pdf.Drawing.Graph graph1 = new Aspose.Pdf.Drawing.Graph(500, 2);
        // Add the graph to the page's paragraphs collection
        page.Paragraphs.Add(graph1);

        // Specify the coordinates for the line
        float[] posArr = new float[] { 1, 2, 500, 2 };
        Aspose.Pdf.Drawing.Line l1 = new Aspose.Pdf.Drawing.Line(posArr);
        graph1.Shapes.Add(l1);

        // Create a FloatingBox to hold the multi-column content
        Aspose.Pdf.FloatingBox box = new Aspose.Pdf.FloatingBox();
        // Use five columns in the floating box
        box.ColumnInfo.ColumnCount = 5;
        // Set the spacing between the columns
        box.ColumnInfo.ColumnSpacing = "5";
        // Set the width of each column (one value per column)
        box.ColumnInfo.ColumnWidths = "105 105 105 105 105";
        // Create a Graph object to draw a line
        Aspose.Pdf.Drawing.Graph graph2 = new Aspose.Pdf.Drawing.Graph(50, 10);
        // Specify the coordinates for the line
        float[] posArr2 = new float[] { 1, 10, 100, 10 };
        Aspose.Pdf.Drawing.Line l2 = new Aspose.Pdf.Drawing.Line(posArr2);
        graph2.Shapes.Add(l2);

        // Add the graph to the floating box's paragraphs collection
        box.Paragraphs.Add(graph2);

        // Add the extracted text; it flows across the columns of the floating box
        TextFragment text2 = new TextFragment(extractedText);
        box.Paragraphs.Add(text2);

        // Add the floating box to the page
        page.Paragraphs.Add(box);

        // Save the multi-column PDF file
        string outputPath = dataDir + "CreateMultiColumnPdf_out.pdf";
        doc.Save(outputPath);

You may change page and column dimensions as per your requirements. We hope this will be helpful. Please feel free to contact us if you need any further assistance.
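
If you change the page or column dimensions, the widths string has to be recalculated by hand. As a rough sketch only, the helper below splits the usable page width evenly between the columns and builds a space-separated string in the same form as the ColumnWidths value used above. Please note that EqualColumnWidths and the ColumnLayout class are hypothetical names and not part of Aspose.PDF, and the calculation assumes the spacing applies only between adjacent columns. The page width of 595 points in the usage line is also an assumption (A4); please check the actual page width of your document.

```csharp
using System;
using System.Globalization;
using System.Linq;

static class ColumnLayout
{
    // Hypothetical helper, not part of Aspose.PDF.
    // Divides the width left after margins and inter-column spacing
    // evenly between the columns and returns e.g. "99 99 99 99 99".
    public static string EqualColumnWidths(
        double pageWidth, double leftMargin, double rightMargin,
        int columnCount, double columnSpacing)
    {
        if (columnCount < 1)
            throw new ArgumentOutOfRangeException(nameof(columnCount));

        // Assumption: spacing is applied between adjacent columns only.
        double usable = pageWidth - leftMargin - rightMargin
                        - columnSpacing * (columnCount - 1);
        if (usable <= 0)
            throw new ArgumentException("Margins and spacing leave no room for columns.");

        // Round down so the columns never exceed the usable width.
        double width = Math.Floor(usable / columnCount);
        string formatted = width.ToString(CultureInfo.InvariantCulture);
        return string.Join(" ", Enumerable.Repeat(formatted, columnCount));
    }
}
```

With the margins and spacing from the snippet above, it could be used like this:

```csharp
// Assumed page width of 595 points (A4), margins of 40, five columns, spacing of 5
box.ColumnInfo.ColumnWidths = ColumnLayout.EqualColumnWidths(595, 40, 40, 5, 5);
```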