# .NET Core C# beginner, needing to parse a Word doc

#1

I have a project where I would like a form to upload a Word document and then parse it into clauses/paragraphs (HTML), which will be stored in SQL as the same clause/paragraph parts in HTML. I am currently tasked with writing a program similar to one written before me, where the programmer chose to use mammoth and store the output in a NoSQL solution. Where do I start with Aspose.Words to achieve this? Thanks so much for your time.

I would also like to use the Aspose.PDF product to do the same thing.

#2

For this case, the Aspose.Words for .NET code is pretty straightforward. You can first store the HTML string of every Paragraph found in the Word document in an ArrayList and then store the list items in the database.

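using System.Collections;
using Aspose.Words;
using Aspose.Words.Saving;

ArrayList htmls = new ArrayList();
Document doc = new Document("E:\\temp\\in.docx");

HtmlSaveOptions opts = new HtmlSaveOptions(SaveFormat.Html);
opts.PrettyFormat = true;
opts.ExportImagesAsBase64 = true; // etc
foreach (Paragraph para in doc.GetChildNodes(NodeType.Paragraph, true))
{
    // Convert each paragraph to an HTML string and collect it in the list
    htmls.Add(para.ToString(opts));
}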

Please also refer to Aspose.Words’ documentation.

Please post your Aspose.PDF related queries in Aspose.PDF forum where you will be guided appropriately.
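
The second step mentioned above, storing the list items in a database, might look something like the following. This is only a rough sketch under stated assumptions: the `ClauseStore` class, the `Clauses` table, and its `Position` and `Html` columns are hypothetical names that do not come from this thread, and the Microsoft.Data.SqlClient package is assumed to be the SQL Server client in use.

```csharp
using System.Collections;
using Microsoft.Data.SqlClient;

public static class ClauseStore // hypothetical helper, not part of Aspose.Words
{
    // Writes each paragraph's HTML string as one row, keeping document order.
    // Assumes a table like: Clauses (Position INT, Html NVARCHAR(MAX)).
    public static void SaveClauses(ArrayList htmls, string connectionString)
    {
        using var connection = new SqlConnection(connectionString);
        connection.Open();

        for (int i = 0; i < htmls.Count; i++)
        {
            using var command = new SqlCommand(
                "INSERT INTO Clauses (Position, Html) VALUES (@position, @html)",
                connection);

            // Parameters keep the HTML from being interpreted as SQL.
            command.Parameters.AddWithValue("@position", i);
            command.Parameters.AddWithValue("@html", (string)htmls[i]);
            command.ExecuteNonQuery();
        }
    }
}
```

With the `htmls` list from the snippet above, this would be called as `ClauseStore.SaveClauses(htmls, connectionString)`. For large documents, wrapping the loop in a transaction or using a bulk insert would likely be a better fit.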

#3

Thank you very much. How would I use new Document() to load the uploaded document as a stream from `
[HttpPost]
public async Task PostFormData([FromForm] IFormFile file)
{