Hi,
Using the 2011.12.01 version of Aspose.PDF.dll, we have encountered a multithreading problem.
According to the documentation, all public methods of the Page class should be thread-safe, but the code below fails with an ArgumentException: "Item has already been added. Key in dictionary: ' ' Key being added: ' '". If we put a lock around the page.Accept call, everything works as expected.
We would like to run similar code without the lock in order to improve performance. Page.Accept is the most time-consuming operation, and we are working with multiple files of more than 300 pages each.
Code to reproduce (assuming we have 10 sample files, each with 3 pages):
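private static readonly object _lock = new object();

static void Main(string[] args)
{
    // Build the list of sample files: TestData\sample0.pdf .. TestData\sample9.pdf
    List<string> docs = new List<string>();
    for (int i = 0; i < 10; i++)
        docs.Add(String.Format("TestData\\sample{0}.pdf", i));

    // Parse the documents concurrently, each one on a worker thread
    Parallel.ForEach(docs, ParseDocument);
}

static void ParseDocument(string path)
{
    // Each thread works on its own Document and its own absorber
    Document document = new Document(path);
    foreach (Page page in document.Pages)
    {
        TextFragmentAbsorber textFragmentAbsorber = new TextFragmentAbsorber();
        // Uncommenting the lock below makes the exception go away
        //lock (_lock)
        page.Accept(textFragmentAbsorber);
    }
}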