Exception "An item with the same key has already been added" when converting PDF to HTML (C#)

Sri79 · January 25, 2020, 5:50pm

When PDF files are exported to HTML in parallel, we get the error “An item with the same key has already been added. Key: Adobe-CNS1-UCS2”.

You can use any PDF sample, or the sample provided in my other thread.

Even without parallel tasks, only one request is processed successfully when multiple requests arrive at the same time. The other request fails.

We are using the latest version of Aspose.PDF for .NET.

using System;
using System.Collections.Generic;
using System.IO;
using System.Threading.Tasks;
using Aspose.Pdf;          // Document, HtmlSaveOptions
using Aspose.Pdf.Facades;  // PdfFileEditor

static void Main()
{
	SetLicense();

	List<Task> runningTasks = new List<Task>();

	// Task 1: split sample1.pdf into single-page PDFs, then convert each page to HTML.
	runningTasks.Add(Task.Run(() =>
	{
		var outFolder2 = Path.Combine(AppDomain.CurrentDomain.BaseDirectory, "out");
		var pdfFile = @"C:\sample1.pdf";
		var pdfFileBytes1 = File.ReadAllBytes(pdfFile);
		using (MemoryStream pdfStream = new MemoryStream(pdfFileBytes1))
		{
			// %NUM% is replaced with the page number by SplitToPages.
			var pdfFilePath = Path.Combine(outFolder2, $"{Path.GetFileNameWithoutExtension(pdfFile)}_%NUM%.pdf");
			PdfFileEditor pdfEditor = new PdfFileEditor();
			pdfEditor.SplitToPages(pdfStream, pdfFilePath);
		}
		var splittedFiles = new DirectoryInfo(outFolder2).GetFiles($"{Path.GetFileNameWithoutExtension(pdfFile)}_*.pdf");
		foreach (var sf in splittedFiles)
		{
			var outFile = Path.ChangeExtension(sf.FullName, "html");
			using (Document doc = new Document(sf.FullName))
			{
				// The exception is thrown here when both tasks save at the same time.
				doc.Save(outFile, new HtmlSaveOptions
				{
					FixedLayout = true,
					SplitIntoPages = false,
				});
			}
		}
	}));

	// Task 2: same steps for sample2.pdf, using a "c_" prefix so the outputs do not collide.
	runningTasks.Add(Task.Run(() =>
	{
		var outFolder2 = Path.Combine(AppDomain.CurrentDomain.BaseDirectory, "out");
		var pdfFile = @"C:\sample2.pdf";
		var pdfFileBytes = File.ReadAllBytes(pdfFile);
		using (MemoryStream pdfStream = new MemoryStream(pdfFileBytes))
		{
			var pdfFilePath = Path.Combine(outFolder2, $"c_{Path.GetFileNameWithoutExtension(pdfFile)}_%NUM%.pdf");
			PdfFileEditor pdfEditor = new PdfFileEditor();
			pdfEditor.SplitToPages(pdfStream, pdfFilePath);
		}
		var splittedFiles = new DirectoryInfo(outFolder2).GetFiles($"c_{Path.GetFileNameWithoutExtension(pdfFile)}_*.pdf");
		foreach (var sf in splittedFiles)
		{
			var outFile = Path.ChangeExtension(sf.FullName, "html");
			using (Document doc = new Document(sf.FullName))
			{
				doc.Save(outFile, new HtmlSaveOptions
				{
					FixedLayout = true,
					SplitIntoPages = false,
				});
			}
		}
	}));

	Task.WaitAll(runningTasks.ToArray());
}
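
The wording of the message matches what Dictionary<TKey, TValue>.Add throws for a duplicate key, so one guess (an assumption, not confirmed by Aspose) is that the conversion touches shared state that is not thread-safe. Under that assumption, a possible stopgap is to keep the tasks but let only one conversion run at a time. The sketch below is unverified against Aspose.PDF itself; SerializedConversion, Gate, and ConvertAll are hypothetical names and not part of any library.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

// Hypothetical helper (not part of Aspose.PDF): callers keep their parallel
// tasks, but the conversion step itself is forced to run one at a time.
public static class SerializedConversion
{
    // Single process-wide lock shared by every conversion.
    private static readonly object Gate = new object();

    // Starts one task per input and holds Gate around each convertOne call,
    // so no two conversions overlap. Returns the number of inputs processed.
    // Exceptions thrown by convertOne surface from Task.WaitAll as an
    // AggregateException.
    public static int ConvertAll(IEnumerable<string> inputs, Action<string> convertOne)
    {
        if (inputs == null) throw new ArgumentNullException(nameof(inputs));
        if (convertOne == null) throw new ArgumentNullException(nameof(convertOne));

        Task[] tasks = inputs
            .Select(input => Task.Run(() =>
            {
                lock (Gate)
                {
                    convertOne(input);
                }
            }))
            .ToArray();

        Task.WaitAll(tasks);
        return tasks.Length;
    }
}
```

In the repro above, convertOne would be the body of the foreach loop (open the Document, call doc.Save with the HtmlSaveOptions). This gives up parallelism for the conversion step, so it only makes sense as a temporary measure, and it does not help if the conflicting requests run in separate processes.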

Adnan.Ahmad · January 25, 2020, 9:25pm

@Sri79,

Could you please share the source files along with the generated output so that we can investigate further?

Sri79 · January 26, 2020, 9:07am

@Adnan.Ahmad

As mentioned earlier, you can use any PDF file with the code given above. Alternatively, you can use the PDF file from my other post, Exception "An item with same key has already been added" using Aspose.PDF for .NET.

Adnan.Ahmad · January 26, 2020, 8:23pm

@Sri79,

Thanks for contacting support.

I have looked into your issue and created an investigation ticket with ID PDFNET-47616 in our issue tracking system, so that it can be investigated and resolved as soon as possible.

Sri79 · April 29, 2020, 8:39am

Any update on this? It's been a few months.

Adnan.Ahmad · April 29, 2020, 9:48pm

@Sri79,

I regret to inform you that the issue is still unresolved. We appreciate your patience and will share good news with you soon.