How to save excel files as text with page numbers as a line separators

Ramyarao · June 1, 2022, 10:02am

Hi,

We are using aspose.cells to convert excel files to text in java. Suppose a single sheet contains multiple pages, we would like to have line separators in the text files for every new page. How can we achieve this?

In the attached file, there are 92 pages. we want to have line separators in the output text file with the page number as the footer after every new page

we know that iterating through pages in possible in pdf (using Page class). Is it possible in excel sheet?
Can you please share the code to iterate through pages in the sheet?
file_example_XLS_5000.zip (178.8 KB)

We are referring sample code from the below link
Convert XLS to TXT via Java | products.aspose.com

amjad.sahi · June 1, 2022, 11:42am

@Ramyarao,

See the following sample code to accomplish your task for your requirements for your reference:
e.g.
Sample code:

        Workbook workbook = new Workbook("f:\\files\\file_example_XLS_5000.xls");
        Worksheet worksheet = workbook.getWorksheets().get(0);
        worksheet.getPageSetup().setOrientation(PageOrientationType.LANDSCAPE);
        ImageOrPrintOptions printoption = new ImageOrPrintOptions();
        printoption.setPrintingPage(PrintingPageType.DEFAULT);
        CellArea [] area = worksheet.getPrintingPageBreaks(printoption);        

        int strow = 0;
        int stcol = 0;
        for(int i =0 ;i<area.length;i++)
        {
            //Get each page starting range/area.
            strow = area[i].StartRow;
            stcol = area[i].StartColumn;

            worksheet.getCells().insertRow(strow +i);
            worksheet.getCells().get(strow +i,stcol).putValue("            ______Page " + (i+1) + "_____");
        }

        workbook.save("f:\\files\\out1.txt");

Hope, this helps a bit.

Ramyarao · June 2, 2022, 3:48am

Hi,

Thanks for the code. Please confirm if printoption is defined as below.

ImageOrPrintOptions printoption = new ImageOrPrintOptions()

However, this does not add page numbers at the right places.

Example: In the previously attached xlsx file, page 1 ends after the data “54 Kathleen Hanner Female United States 25 15/10/2017”.
image.png (3.3 KB)

But in the text file, page 1 ends after 33 Jona Grindle Female Great Britain 26 21/05/2015 6458
** _Page 2**

Thank you

simon.zhao · June 2, 2022, 7:15am

@Ramyarao
1, The row height is automatic, but the cahed row height is not correct, so you have to autofit row height when loading the file
2, There is an issue in the old codes that print Orientation is changed…
Please try the following codes:
LoadOptions loadOptions = new LoadOptions();
AutoFitterOptions fitterOptions = new AutoFitterOptions();
fitterOptions.setOnlyAuto(true);
loadOptions.setAutoFitterOptions(fitterOptions);
Workbook workbook = new Workbook(“d:\Filetemp\file_example_XLS_5000.xls”,loadOptions);
Worksheet worksheet = workbook.getWorksheets().get(0);
ImageOrPrintOptions printoption = new ImageOrPrintOptions();
CellArea [] area = worksheet.getPrintingPageBreaks(printoption);

        int strow = 0;
        int stcol = 0;
        int count = 0;
        HashMap<Integer, Integer> map = new HashMap<Integer, Integer>();
        for(int i =0 ;i<area.length;i++)
        {
            //Get each page starting range/area.
            strow = area[i].StartRow;
            if(map.containsKey(strow))
            {
            	continue;
            }
            map.put(strow, strow);
            stcol = area[i].StartColumn;

            worksheet.getCells().insertRow(strow + count);
        
            worksheet.getCells().get(strow + count,stcol).putValue("            ______Page " + (count+1) + "_____");
            count++;
        }

        workbook.save("d:\\Filetemp\\dest.txt");

Ramyarao · June 2, 2022, 10:17am

Hi,

This code logic does not work for the attached xlsx file.

In the file, page 1 ends “Dummy value 40 694 1,105 9/12/2019 14:15”. But in the output txt file page 1 ends at “Dummy value 42 619 1,665 9/12/2019 18:35
_Page 2”

Please let me know what is missing heretest 3.zip (14.8 KB)

amjad.sahi · June 2, 2022, 11:42am

@Ramyarao,

This is due to the fact that you are doing in code:

1). To place text like “______Page x _____”, we are inserting blank row on every page in the code. By inserting blank row, it will consequently shift rows down a bit. I think you should understand that to place/show line breaks in a text file means you got to insert (new) row at the specific location/spot first and then add some text/line dashes to that row cell(s). That would surely move below rows down further a bit and consequently (due to those extra inserted rows) you cannot get similar page breaks for text file to its original XLSX file.

2). See the above lines of code. When you are auto-fitting the rows, the hidden rows are also shown and auto-fitted. In your XLSX file, third row is hidden which will be set visible in the output text file after using the above lines. So, surely, unhiding rows will move/shifts rows down further.

In short, perform the task (as per the code segment) manually in MS Excel and you will get similar results.

Ramyarao · June 2, 2022, 1:06pm

Thank you so much for the explanation.

One last question, is there no way to get page number information from the header/footer and then convert the excel file to text along with this page info instead of inserting new row and adding page number info?

amjad.sahi · June 2, 2022, 2:19pm

@Ramyarao,

Please note, headers/footers are only effective if you use native Excel file formats (XLS, XLSX, XLSM, XLSB, etc.) or render to other file types (e.g., PDF, image, SVG, ODS, etc.). For text formats (CSV, Tab Dlimited, TXT), you cannot have/render headers, footers or other binary data in it. Even you may open your file (having headers/footers in tact) into MS Excel and save to text format manually and you will notice headers/footers are missing too.

Ramyarao · June 5, 2022, 9:36am

Hi,

Thanks for the detailed explanation.

This code works fine when run locally after loading with autofit(on windows) for the previously attached file[file_example_XLS_5000.zip|attachment]

But when the same code is run in AWS lambda, the output varies.(some rows shifts down or up)

Is there any difference in the way of loading excel file in local and in AWS Lambda?

amjad.sahi · June 5, 2022, 2:06pm

@Ramyarao,

There should be no significant difference if you have installed underlying fonts (used in the workbook) and gdiplus library in AWS lambda. Anyways you may share sample files and screenshots for reference. Also, you may skip auto-fit rows operation if it makes any difference?