Invalid cell when using HtmlString and a LightCellsDataProvider

I have the following code in a .Net 4.5 console application. For some reason, it does not like my HtmlString when using the LightCellsDataProvider interface. It allows the excel file to be written but then I get an error when I try and open it. But, it works just fine if I am not using the LightCellsDataProvider interface. Here is some sample code. I am using version 17.9. Any ideas on how to get this working? I need to do some rich text formatting inside of the cells.

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Threading.Tasks;

using Aspose.Cells;

namespace ExcelDocumentCreator
{
	class Program
	{
		static void Main(string[] args)
		{

			try
			{
				// Setup License.
				Aspose.Cells.License license = new Aspose.Cells.License();
				license.SetLicense("Aspose.Total.lic");

				string tempFolder = @"C:\Temp";

				Workbook workbook = new Workbook();
				Worksheet worksheet = workbook.Worksheets[0];

				//workbook.Save(tempFolder + "\\output_out.xlsx");

				//Cell cell = worksheet.Cells["A1"];
				//cell.HtmlString = "<Font Style=\"FONT-WEIGHT: bold;FONT-FAMILY: Arial;FONT-SIZE: 10.5pt;COLOR: #222222;\">Other</Font><Font Style=\"FONT-FAMILY: Arial;FONT-SIZE: 10.5pt;COLOR: #222222;\">: &lt;15% of body: $100; ≥15% of body: $500. This benefit is payable only once per covered person, </Font><Font Style=\"FONT-WEIGHT: bold;FONT-STYLE: italic;FONT-FAMILY: Arial;FONT-SIZE: 10.5pt;COLOR: #222222;\">per accident</Font><Font Style=\"FONT-FAMILY: Arial;FONT-SIZE: 10.5pt;COLOR: #222222;\">.</Font>";
				
				ZenLightCellsDataProvider provider = new ZenLightCellsDataProvider();
				OoxmlSaveOptions saveOptions = new OoxmlSaveOptions(SaveFormat.Xlsx);
				saveOptions.LightCellsDataProvider = (LightCellsDataProvider)provider;

				workbook.Save(tempFolder + "\\output_out.xlsx", saveOptions);

			}
			catch (Exception exc)
			{
				Console.Write(exc);
			}
		}
	}

	public class ZenLightCellsDataProvider : LightCellsDataProvider
	{
		int currentRow = -1;
		int currentColumn = -1;

		int maxCellLength = 32767;
		int maxExportItems = 1048500;

		#region Constructor

		public ZenLightCellsDataProvider()
			: base()
		{

		}

		#endregion

		#region Properties

		public long Rows
		{
			get { return (long)currentRow; }
		}

		#endregion

		#region LightCellsDataProvider Members

		public bool IsGatherString()
		{
			return false;
		}

		public int NextCell()
		{
			currentColumn++;

			if (currentColumn > 0)
				return -1;
			else
				return currentColumn;
		}

		public int NextRow()
		{
			currentRow++;

			// if we've passed the max rows, stop getting more
			if (currentRow > 0)
				return -1;


			currentColumn = -1;
			return currentRow;
		}

		public void StartCell(Cell cell)
		{
			//cell.HtmlString = "<Font Style=\"FONT-WEIGHT: bold;FONT-STYLE: italic;TEXT-DECORATION: underline;FONT-FAMILY: Arial;FONT-SIZE: 11pt;COLOR: #ff0000;\">This is simple HTML formatted text.</Font>";
			cell.HtmlString = "<Font Style=\"FONT-WEIGHT: bold;FONT-FAMILY: Arial;FONT-SIZE: 10.5pt;COLOR: #222222;\">Other</Font><Font Style=\"FONT-FAMILY: Arial;FONT-SIZE: 10.5pt;COLOR: #222222;\">: &lt;15% of body: $100; ≥15% of body: $500. This benefit is payable only once per covered person, </Font><Font Style=\"FONT-WEIGHT: bold;FONT-STYLE: italic;FONT-FAMILY: Arial;FONT-SIZE: 10.5pt;COLOR: #222222;\">per accident</Font><Font Style=\"FONT-FAMILY: Arial;FONT-SIZE: 10.5pt;COLOR: #222222;\">.</Font>";

		}

		public void StartRow(Row row)
		{
		}

		public bool StartSheet(int sheetIndex)
		{
			if (sheetIndex == 0)
			{
				return true;
			}
			else
				return false;
		}

		#endregion


	}
}

@chris.curtis.mozenda

Thanks for using Aspose APis.

Cell.HtmlString is not supported in LightCells mode because it will cause other objects (such as the inner styles) changed and then gives unexpected results. LightCells mode is designed mainly for saving large data set of cell’s simple value, the special optimization make it impossible to support all complex features. However, user still can set the value and style separately for Cell in LightCellsDataProvider.

@shakeel.faiz,

Thank you for the reply. In my case, I need to have the flexibility to write very large data sets to the excel workbook and I need the ability to do in cell formatting like underline specific text and change the font color. What do you suggest?

@chris.curtis.mozenda

Please provide us some simplified two Excel files. The first Excel file should be generated with LightCellsDataProvider and then for the second file, you open the output Excel file generated by LightCellsDataProvider in Microsoft Excel and then format the cells manually using Microsoft Excel. These two Excel files will help us understand the nature of your issue better and we will help you asap.

AlbertaAngus-ChangeDetection-SampleChangeMarkup.zip (16.5 KB)

I have attached a zip file with two excel files inside. One shows the spreadsheet with the html markup in it that I want to be able to use as formatting in the cell. The other one has the in cell formatting that I am trying to achieve with the html markup.

These are only two sample files. I need to able to achieve this on large data sets up to 1 Million rows and with content of up to 30,000 characters in any particular cell. The app that we write allows customers to change the contents of the spreadsheet and we are flexible to allow for it.

I look forward to your suggestions. Please let me know if I can provide any more details.

@chris.curtis.mozenda

Thanks for your files. Please spare us some time, we will update you in next week most probably on Monday or Tuesday.

@chris.curtis.mozenda,

For your provided sample, we found you need rich text for cell values which is not supported by LightCells. Rich text settings(including the content itself) must be gathered and written into a string pool when generating the resultant file. It means all such kind of data must be kept in memory while saving the workbook and I am afraid, LightCells API can give no help in this case. So, we think you can filter your datasource before saving the workbook, i.e., find those cells which need to use rich text, and then instantiate them with the correct data and formatting (by Cell.HtmlString or Cell.Characters). After all the rich text cells have been set correctly, you can now use Workbook.Save with LightCells to fill other cells data accordingly and generate the resultant file.

Hope, this helps a bit.

Thank you for the reply.

I just want to make sure that I understand this entirely. Are you saying that there is no way to write to the excel spreadsheet as a stream during the save process if I need to use rich cell formatting?

I am confused by this because there is a save option that allows the caller to pass in a HttpResponse. There is also another one that allows you to pass in a generic stream to the save method. However, I don’t see a way to add content to the worksheet when using this save option other than to use the LightCells interface.

As far as I can tell, the spreadsheet has to be fully populated before calling these save functions. If this is the case, then I would assume the entire spreadsheet has to be stored in memory.

Here are the two save functions I was referencing:

Do I understand this correctly? In my mind, it seems to defeat the purpose of using a stream if I have to fully prepare the spreadsheet in memory before calling the save method.

Am I missing something?

@chris.curtis.mozenda,

Well, we are not sure how you mix the usage of saving workbook to a Stream and saving a workbook with LightCells APIs. You can save a workbook to streams with or without LightCells APIs. What we cannot support is set the rich text in the implementation of LightCellsDataProvider. For example, assume you have three cells, A1, A2, A3 and only A2 needs to be set as richly formatted text, then you should set A2’s content and formatting before calling workbook.Save(). For A1, A3, you can still set their content in the implementation of LightCells.

Saving the workbook to streams or file path has nothing to do with LightCells API. You can save the workbook to stream with LightCells or normal mode for sure.

I don’t think I made myself clear. Let’s assume I choose to not use the LightCellsApi at all so I can use the rich text feature. Now, from what I can tell, I have to load all of the content of the new spreadsheet into memory before calling the worksheet.save function. Is this true? And, if it is true, does that mean that all of the content that I add to that spreadsheet will be stored in memory until I save the spreadsheet to a file and then destroy the object in memory?

@chris.curtis.mozenda,

Yes, your understanding is correct. If there is memory issue, you may try to use MemoryPreference options if it makes any difference, see the document for your reference:

Hope, this helps a bit.

As a follow-up question: We currently are using your library in a 32 bit application which brings about some memory constraints for large worksheets. Am I right to assume that if I change our code to compile in a 64 bit application that your dll would be able take advance of the extra memory available to the application?

@chris.curtis.mozenda,

Well, our Dlls are compiled with option “Any CPU”, I think it should take the advantage when using 64-bits environment
so we think your understanding is correct.