Hi,
I have attached a test program and a test RTF. I want to convert the RTF into PDF with a specific width and height, but if the page is too narrow I can no longer calculate the height, because the LayoutCollector returns layout entities with wrong rectangles. What am I doing wrong? Or is this a bug?
If you start the program with the following arguments, it works: “generate --fileType pdf -i [InputDirectory] -o [OutputDirectory] --wh 160 100 -u mm -l aspose -c 100 -p RTF_1_”
But with these it goes wrong: “generate --fileType pdf -i [InputDirectory] -o [OutputDirectory] --wh 159 100 -u mm -l aspose -c 100 -p RTF_1_”
You can find the code snippet in CommonTest.Aspose/AsposeDocumentConverter.cs/GetContentHeight (line 292).
In the 160x100 case, the Rectangle.Top of FirstSection/FirstParagraph is 0, and the Rectangle.Bottom of LastSection/LastParagraph is 278. In the second case, the Top is 267 and the Bottom is 278.
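For reference, the two page widths can be compared in points with a small sketch. It assumes that `-u mm` means millimeters and that the layout rectangles are reported in points (1 inch = 25.4 mm = 72 pt); the class and method names are hypothetical and not part of the attached project.

```csharp
using System;

public static class PageUnits
{
    // Assumption: page sizes are given in millimeters and the layout
    // coordinates are in points. 1 inch = 25.4 mm = 72 points.
    public static double MillimetersToPoints(double mm)
    {
        return mm * 72.0 / 25.4;
    }
}
```

Under these assumptions, 160 mm is about 453.54 pt, 159 mm is about 450.71 pt, and the 100 mm page height is about 283.46 pt, so a Bottom of 278 still fits on one page.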
Unfortunately, your requirement and issue detail are not clear. Please share some more detail about your issue along with a simple console application (source code without compilation errors) that helps us reproduce the problem on our end, and attach it here for testing. We will investigate the issue and provide you more information on it.
Hi Tahir,
sorry if I was not clear enough!
Everything is there: an RTF for the test, and the source code in the other zip.
As far as I can see, the LayoutCollector does not work properly if the RTF contains some kind of embedded objects.
Some more details about our use case, in case they help explain the issue: I need to split and convert RTF documents into small PDFs with a specific page size, and every page has a different size. I do the following: I set the page size, take the first page, and save it. Then I take the rest of the document, set the page size again, take the first page, save it, and so on… This procedure is extended with another feature: a page should be exactly as high as its content. For that reason I calculate this height, but if page.width is smaller than image.width, I can no longer calculate the height, because of the LayoutCollector.
I hope it helps!
Please share the steps that you are using to execute your application. Perhaps, there is some other way to achieve your requirement. So, please also share your expected output.
Morning Tahir,
the steps to execute: open a command line, navigate to the exe, type “CommonTest.exe”, paste these arguments, set the input directory where the RTF is located, set the output directory, and press Enter. (The -p RTF_1_ is the prefix used to select a subset of the RTFs; you can remove it if you rename the RTF file.)
Expected result: if a page contains only an image, resizing the page should not change the Rectangle.Top of the image on that page.
Please give the code to a developer. He or she can debug it and will then see what the difference between the two cases is at the given place, and can maybe decide whether it is a bug or whether I am doing something wrong when I ask for the coordinates.
Thanks for sharing the detail. You are extracting document’s pages using PageSplitter utility. We suggest you please use Document.ExtractPages instead.
We have tested the scenario using your project and have not found any issue with attached output PDF files. CurrentDocument_1.Pdf (439.3 KB) output.pdf (442.9 KB)
If this output is not correct according to your requirement, please ZIP and attach your expected output Word/PDF document. We will then investigate this issue further and share the code example according to your requirement.
Hi Tahir,
thanks for your reply!
With which arguments did you get this output? I suppose with “… --wh 160 100 …”. Please also try “… --wh 159 100 …”; then it runs into an endless loop because of the LayoutCollector.
Hi Tahir,
good news, it works!
A few days or weeks ago I asked you how I could calculate the height of the content on a page of the document (or maybe I found it in the forum). I got a solution, but it was apparently wrong.
private static float GetContentHeight(Document currentDocument, float pageHeight)
{
    LayoutCollector layoutCollector = new LayoutCollector(currentDocument);
    LayoutEnumerator layoutEnumerator = new LayoutEnumerator(currentDocument);

    // Position the enumerator on the first paragraph and read its top edge.
    layoutEnumerator.Current = layoutCollector.GetEntity(currentDocument.FirstSection.Body.FirstParagraph);
    var top = layoutEnumerator.Rectangle.Top;
    var pageIndex = layoutEnumerator.PageIndex;

    // Position the enumerator on the last paragraph and read its bottom edge.
    layoutEnumerator.Current = layoutCollector.GetEntity(currentDocument.LastSection.Body.LastParagraph);
    var bottom = layoutEnumerator.Rectangle.Bottom;

    // If the last paragraph is on a later page, add one page height.
    if (layoutEnumerator.PageIndex > pageIndex)
    {
        bottom += pageHeight;
    }

    return bottom - top;
}
The correct return value here is just bottom, not bottom - top.
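The difference can be sketched with plain numbers, without the layout classes. This is only an illustration of the arithmetic: the class and method names are hypothetical, and the values used afterwards are the Top and Bottom values reported earlier in this thread.

```csharp
using System;

public static class ContentHeightSketch
{
    // Height measured from the top of the page to the bottom edge of the
    // last paragraph. As in the posted method, one page height is added
    // when the last paragraph lies on a later page than the first one.
    public static float FromPageTop(float lastBottom, int firstPageIndex,
                                    int lastPageIndex, float pageHeight)
    {
        float bottom = lastBottom;
        if (lastPageIndex > firstPageIndex)
        {
            bottom += pageHeight;
        }
        return bottom;
    }

    // The posted variant: it additionally subtracts the top of the first
    // paragraph, so the result shrinks when that paragraph starts lower.
    public static float AsPosted(float firstTop, float lastBottom, int firstPageIndex,
                                 int lastPageIndex, float pageHeight)
    {
        return FromPageTop(lastBottom, firstPageIndex, lastPageIndex, pageHeight) - firstTop;
    }
}
```

With Top = 0 and Bottom = 278 both variants give 278. With Top = 267 and Bottom = 278 on the same page, the posted variant gives 11, while measuring from the page top still gives 278.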
Thank you for your help and for the support!
Hi Tahir,
one question is still open: how can I calculate the width of the content on a page? I would like to crop the white areas from the small pages. If, for example, a page contains only an image that is narrower than the page itself, or only short lines or a list of short items, then the page could also be narrower.
Please ZIP and attach your input and expected output Word documents. Please manually create your expected Word document using Microsoft Word and attach it here for our reference. We will investigate how you want your final Word output to be generated. We will then provide you more information on this along with code.
Hi Tahir,
I have attached the current test solution. In the launch.settings you can set the -i [InputDirectory], where the attached RTF is located, and the -o [OutputDirectory]; otherwise you can test it with the current arguments.
Since I started using Document.ExtractPages, I have some other problems too, so here are my expectations/problems:
1. With the current arguments, when I set PageSetup.PageHeight (AsposeDocumentConverter.cs/SetPageSize [line 229]) on the first page, the third square moves to the next page, even though GetContentHeight returns the right height, which is smaller than the original size. Why? If I set the height to Bottom+1, it works, but that would be a dummy workaround.
2. Since I am using Document.ExtractPages instead of DocumentSplitter, some pages are empty. How can I avoid that?
3. As you can see, this sample RTF document contains some squares with different widths, and in the output the first few pages are wider than the content. I would like to make these pages as wide as their content.
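One possible way to think about the content width is as the bounding box of all layout rectangles on a page. The following is a minimal sketch under that assumption: it only shows the bounding-box step on `System.Drawing.RectangleF` values, the class and method names are hypothetical, and collecting the rectangles from the document layout is not shown.

```csharp
using System;
using System.Collections.Generic;
using System.Drawing;

public static class ContentBounds
{
    // Returns the smallest rectangle that contains all given rectangles,
    // or RectangleF.Empty when the sequence is empty.
    public static RectangleF Union(IEnumerable<RectangleF> rectangles)
    {
        bool any = false;
        RectangleF bounds = RectangleF.Empty;
        foreach (RectangleF r in rectangles)
        {
            // Start from the first rectangle, then grow the bounds.
            bounds = any ? RectangleF.Union(bounds, r) : r;
            any = true;
        }
        return bounds;
    }
}
```

For two hypothetical rectangles, (X=10, Y=0, W=100, H=50) and (X=30, Y=60, W=200, H=40), the bounds have Left = 10, Right = 230 and Bottom = 100. Whether the cropped page should use bounds.Right or bounds.Width plus margins depends on how the page margins are handled.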
To ensure a timely and accurate response, please attach the requested resources here for our reference. We will then provide you code example according to your requirement.
We have tested the Document.ExtractPages method using the following code example and have not found any issue with it. Please make sure that you are using it correctly.
Document doc = new Document(MyDir + "RTF_0_2_ImageSplitTest.rtf");

// Save every page of the document as a separate file.
for (int i = 0; i < doc.PageCount; i++)
{
    Document dst = doc.ExtractPages(i, 1);
    dst.Save(MyDir + "output" + i + ".docx");
}