How do you stop words from being split across two lines?

Dear Team


We have an issue using Aspose.Pdf. We need to convert HTML to PDF, but words are being split across lines. The code and sample output are attached for reference. Please help.

Code

Dim pdf As Aspose.Pdf.Generator.Pdf = New Aspose.Pdf.Generator.Pdf()

'set the license file
Dim lic As New Aspose.Pdf.License()
lic.SetLicense(System.Web.HttpContext.Current.Server.MapPath("AsposeLicense/Aspose.Total.lic"))

' add the section to PDF document sections collection
Dim section As Aspose.Pdf.Generator.Section = pdf.Sections.Add()
section.PageInfo.PageWidth = Aspose.Pdf.Generator.PageSize.LetterWidth
section.PageInfo.PageHeight = Aspose.Pdf.Generator.PageSize.LetterHeight

Dim marginInfo As Aspose.Pdf.Generator.MarginInfo = New Aspose.Pdf.Generator.MarginInfo()
marginInfo.Top = 0
marginInfo.Bottom = 0
marginInfo.Left = 0
marginInfo.Right = 0

Dim marginInfotext As Aspose.Pdf.Generator.MarginInfo = New Aspose.Pdf.Generator.MarginInfo()
marginInfotext.Top = 95
marginInfotext.Bottom = 95
marginInfotext.Left = 35
marginInfotext.Right = 35
section.PageInfo.Margin = marginInfotext

Dim txt1 As Aspose.Pdf.Text.TextFragment = New Aspose.Pdf.Text.TextFragment()
txt1.HorizontalAlignment = Aspose.Pdf.HorizontalAlignment.Justify

Dim graph1 As Aspose.Pdf.Generator.Graph = New Aspose.Pdf.Generator.Graph(section)
graph1.Margin.Top = 0
graph1.Margin.Bottom = 0
section.Paragraphs.Add(graph1)

Dim text As Aspose.Pdf.Generator.Text = New Aspose.Pdf.Generator.Text(section, sHtml)
text.IsHtmlTagSupported = True
section.Paragraphs.Add(text)
section.IsSpaced = True

'set page size
pdf.PageSetup.PageWidth = Aspose.Pdf.Generator.PageSize.LetterWidth
pdf.PageSetup.PageHeight = Aspose.Pdf.Generator.PageSize.LetterHeight
pdf.PageSetup.Margin = marginInfo

pdf.Save(sTempPDF)

HTML Source sHtml =

<span style='font-size:11.0pt;font-family:"Calibri","sans-serif";mso-fareast-font-family:Calibri;mso-fareast-theme-font:minor-latin;mso-ansi-language:EN-US;mso-fareast-language:EN-US;mso-bidi-language:AR-SA'>

1.         This is an example of a decision, including special characters like “ô” and “í” and “ã” and other features (like parentheses) and a noun’s possessive forms. –

a.         Some parts of the text will need to be indented.

i).         and some, even further;

2.         All paragraphs will also need a tab character between the number and text, which transfers to the IR intact, and paragraph breaks between each.

<span style='font-size:11.0pt;font-family:"Calibri","sans-serif";mso-fareast-font-family:Calibri;mso-fareast-theme-font:minor-latin;mso-ansi-language:EN-US;mso-fareast-language:EN-US;mso-bidi-language:AR-SA'>

Thanks
Anish

Hi Anish,


Thanks for contacting support.

To generate correct output, I recommend using the DOM approach of the Aspose.Pdf namespace. Please try the following code snippet.

For your reference, I have also attached the PDF file generated on my end. If you have any further queries, please feel free to contact us.

[VB.NET]

' requires: Imports Aspose.Pdf
' load source HTML
Dim doc As Document = New Document("c:/pdftest/input.html", New HtmlLoadOptions())
' save output in PDF
doc.Save("c:/pdftest/output.pdf")

Thanks for the reply. We are still having issues. Please confirm which version we need to use.

We are using Aspose.Pdf version 7.6 and the output is not rendered correctly (attached). Can you please confirm whether the above approach works in version 7.6?
We are not in a position to upgrade to the latest license at this point. Any help will be much appreciated.

Code

Dim doc As Aspose.Pdf.Document = New Aspose.Pdf.Document("C:\text.html", New Aspose.Pdf.HtmlLoadOptions())
doc.Save(sTempPDF)

Version : 7.6

INPUT:


1.         This is an example of a decision, including special characters like "ô" and "í" and "ã" and other features (like parentheses) and a noun's possessive forms. --


a.         Some parts of the text will need to be indented.


i).         and some, even further;


2.         All paragraphs will also need a tab character between the number and text, which transfers to the IR intact, and paragraph breaks between each.



Thanks
Anish

Hi Anish,


Thanks for sharing the details.

I have tested the scenario using Aspose.Pdf for .NET 7.6.0 and managed to reproduce the problem you described. However, when using the latest release, Aspose.Pdf for .NET 9.2.0, I do not see any issue; the PDF file is generated properly. Therefore, to resolve this problem, please try the latest release.

Dear Team,

I guess you have interpreted our requirement incorrectly. When we compared our PDF with the PDF your team shared, the text is displayed differently. In the output you shared, each complete paragraph appears on a single line, which is not what we need. We would like to keep the horizontal size of the PDF page fixed, but words should not be broken across lines. We have attached a screenshot with the highlighted area to show the actual issue.

We request your response at the earliest.

Thanks & Regards,

Anish

Hi Anish,


In your first post, you indicated that some characters of a text line are breaking improperly (a few characters move to the next line), so to resolve this problem I suggested using the new DOM approach of the Aspose.Pdf namespace. With this approach, the line break issue is resolved and the page dimensions are set automatically.

However, in your last post you explained that you need a fixed page width while preventing words from breaking in an undesirable manner. To accomplish this, we need to specify the page dimensions for the resultant PDF. I am preparing the required code snippet and will get back to you soon.
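
As a rough illustration of fixing the page dimensions during HTML loading, the sketch below sets a Letter-sized page on the load options. It assumes an Aspose.Pdf for .NET release in which HtmlLoadOptions exposes a PageInfo property (this may not be the case in older releases such as 7.6), reuses the 95/35 point margins from the first post, and uses a hypothetical output file name. Whether it prevents the mid-word breaks has not been confirmed in this thread.

```vbnet
Imports Aspose.Pdf

Module FixedWidthHtmlToPdf
    Sub Main()
        ' Assumption: this Aspose.Pdf release exposes HtmlLoadOptions.PageInfo.
        Dim options As New HtmlLoadOptions()

        ' US Letter is 8.5 x 11 inches, i.e. 612 x 792 points at 72 points per inch.
        options.PageInfo.Width = 612
        options.PageInfo.Height = 792

        ' Margins taken from the first post in this thread (in points).
        options.PageInfo.Margin.Top = 95
        options.PageInfo.Margin.Bottom = 95
        options.PageInfo.Margin.Left = 35
        options.PageInfo.Margin.Right = 35

        ' Load the HTML with the fixed page setup and save it as PDF.
        ' The output file name is hypothetical.
        Dim doc As New Document("c:/pdftest/input.html", options)
        doc.Save("c:/pdftest/output_letter.pdf")
    End Sub
End Module
```

With these values the text area is 612 - 35 - 35 = 542 points wide, so the HTML content is laid out against a fixed width rather than a width derived from the content.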

Hi Anish,


I have tested the scenario further and observed that when using the Document Object Model of the Aspose.Pdf namespace, the contents are rendered in the PDF in the same format as they appear in the source HTML. To render the text in a larger font so that the contents wrap to the next line, we can search through the complete PDF document and increase the font size of each TextFragment. However, during my testing I observed that the size of the TextFragments increases but the characters overlap each other instead of wrapping to subsequent lines. I have logged this problem as PDFNEWNET-36886 in our issue tracking system. We will look further into the details of this problem and keep you updated on the status of the correction. Please be patient and spare us a little time. We are sorry for the inconvenience.

[C#]

// requires: using System; using Aspose.Pdf; using Aspose.Pdf.Text;

// load source HTML
// ('load' was not defined in the original post; default HtmlLoadOptions are assumed here)
HtmlLoadOptions load = new HtmlLoadOptions();
Document doc = new Document("c:/pdftest/Research+Report.html", load);

// create a TextFragmentAbsorber to find all the phrases matching the regular expression
TextFragmentAbsorber textFragmentAbsorber = new TextFragmentAbsorber(@"[\S]+");

// set text search option to specify regular expression usage
TextSearchOptions textSearchOptions = new TextSearchOptions(true);
textFragmentAbsorber.TextSearchOptions = textSearchOptions;

// accept the absorber for all the pages
doc.Pages.Accept(textFragmentAbsorber);

// get the extracted text fragments
TextFragmentCollection textFragmentCollection = textFragmentAbsorber.TextFragments;

// loop through the fragments
foreach (TextFragment textFragment in textFragmentCollection)
{
    Console.WriteLine("Font Size : {0} ", textFragment.TextState.FontSize);
    // update font size
    textFragment.TextState.FontSize = 20;
}

// save output in PDF
doc.Save("c:/pdftest/DOM_HTML_output.pdf");

Hi,

Increasing the font size does not seem to be a good solution for our requirement, as the content we provided is just sample text. The text varies, and we have to maintain a constant font across the whole document. Also, when we try the trial version of Aspose 9.2, we get an error saying that a maximum of 4 text components can be added in the trial version. Can we expect a solution soon? We would also like to confirm that the functionality works with v9.2 before we purchase the latest version.

Thanks,

Anish

Hi Anish,

To test the latest version without any limitations, you may request a 30-day temporary license. For further details, please visit the "Get a temporary license" page.

I also have this issue of having words split on the next line.
I was using an older version of Aspose.DLL.

I have read the above post and I’ve downloaded latest version.

I replaced the code line:
Dim doc = New Aspose.Pdf.Generator.Pdf(fs)

with

Dim doc = New Aspose.Pdf.Document(filePath, New Aspose.Pdf.HtmlLoadOptions())

and I am getting all kinds of errors:

  1. doc.PageSetup.PageWidth = Aspose.Pdf.Generator.PageSize.LetterWidth (error: PageSetup is not a member of Aspose.Pdf.Document)
  2. Dim RecommendationSection = doc.Sections.Add() (error message: Sections is not a member of Aspose.Pdf.Document)
  3. RecommendationSection.PageInfo.Margin.Top = marginTop

I am new to Aspose.
For issue 1, is it safe to replace that line with:
doc.PageInfo.Width = Aspose.Pdf.Generator.PageSize.LetterWidth?

How do I fix issue 2? I guess that fixing issue 2 will probably fix issue 3 too.

Thank you!

lbusuioc:
1. doc.PageSetup.PageWidth = Aspose.Pdf.Generator.PageSize.LetterWidth (error: PageSetup is not a member of Aspose.Pdf.Document)

Hi Laura,

Thanks for using our APIs.

Aspose.Pdf and Aspose.Pdf.Generator are separate namespaces, and when you use a class from one namespace, the classes you use with it should come from the same namespace. In the issue above, you are using the Document object from the Aspose.Pdf namespace together with a page-size value from the Aspose.Pdf.Generator namespace. Please try the following:

[VB.NET]

Dim doc As Document = New Document()
doc.Pages.Add()
doc.Pages(1).PageInfo.Width = Aspose.Pdf.PageSize.A4.Width
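
Applied to the Letter page size and margins from the first post in this thread, the same idea might look like the sketch below. The values 612 and 792 are the Letter dimensions in points (8.5 x 11 inches at 72 points per inch), written as literals so that no Aspose.Pdf.Generator constants are needed; the output path is a hypothetical name.

```vbnet
Imports Aspose.Pdf

Module LetterPageSetup
    Sub Main()
        Dim doc As New Document()

        ' Pages.Add() returns the newly created page.
        Dim page As Page = doc.Pages.Add()

        ' Letter size in points, set through the DOM PageInfo object.
        page.PageInfo.Width = 612
        page.PageInfo.Height = 792

        ' Margins from the first post (in points).
        page.PageInfo.Margin.Top = 95
        page.PageInfo.Margin.Bottom = 95
        page.PageInfo.Margin.Left = 35
        page.PageInfo.Margin.Right = 35

        ' Hypothetical output path.
        doc.Save("c:/pdftest/letter_page.pdf")
    End Sub
End Module
```

Everything here comes from the Aspose.Pdf namespace, which is the point of the advice above: the page size and margins are set on the page's PageInfo rather than on a Generator PageSetup or Section.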


lbusuioc:
2) Dim RecommendationSection = doc.Sections.Add() (error message: Sections is not a member of Aspose.Pdf.Document)

Please try the following code lines.

[VB.NET]

Dim doc As Document = New Document()
Dim RecommendationSection = doc.Pages.Add()


lbusuioc:
3) RecommendationSection.PageInfo.Margin.Top = marginTop

Please try the following code lines to set the margin of the first page of the PDF file.

[VB.NET]

Dim doc As Document = New Document()
Dim RecommendationSection = doc.Pages.Add()
RecommendationSection.PageInfo.Margin.Top = 10

If you have any further queries, please feel free to contact us.

Thank you for the prompt response.

I do not think that

Dim RecommendationSection = doc.Pages.Add()

would work for me because this is not really supposed to be a new page but a section on the page. See the code below that has property IsNewPage=false.

Is there any other way?



Dim RecommendationSection = agenda.Sections.Add()
RecommendationSection.PageInfo.Margin.Top = marginTop
RecommendationSection.PageInfo.Margin.Left = marginleft
RecommendationSection.PageInfo.Margin.Right = marginRight
RecommendationSection.PageInfo.Margin.Bottom = marginBottom
RecommendationSection.IsNewPage = False



I am also attaching a document with the code sections I am having trouble converting to the new classes. Can you please provide feedback?

I spent about a day without much luck in converting my complex code to work with the new classes in Aspose.Pdf.Document.

I tested my application (the code is using Aspose.Pdf.Generator classes) using the latest version of Aspose.Pdf dll.
The split text issue is still there with your latest dll version. Is there any way you could fix this bug in a future release?
Using the Aspose.Pdf.Document classes instead of Aspose.Pdf.Generator is a big development effort for us.
Please let me know.

Thank you

lbusuioc:
Thank you for the prompt response. I do not think that Dim RecommendationSection = doc.Pages.Add() would work for me because this is not really supposed to be a new page but a section on the page. See the code below that has property IsNewPage=false. Is there any other way?

Dim RecommendationSection=agenda.Sections.Add()
RecommendationSection.PageInfo.Margin.Top = marginTop
RecommendationSection.PageInfo.Margin.Left = marginleft
RecommendationSection.PageInfo.Margin.Right = marginRight
RecommendationSection.PageInfo.Margin.Bottom = marginBottom
RecommendationSection.IsNewPage = False

I am also attaching a document with the code sections I am having trouble converting to the new classes. Can you please provide feedback?

Hi Laura,

The method doc.Pages.Add() belongs to the new Document Object Model (DOM) of the Aspose.Pdf namespace, whereas agenda.Sections.Add() belongs to the legacy Aspose.Pdf.Generator approach. Please note that Pages.Add() does in fact add a new page to the PDF document. As suggested earlier, please try the new DOM, as it is a better approach than the Generator namespace.

lbusuioc:
I spent about a day without much luck in converting my complex code to work with the new classes in Aspose.Pdf.Document.

I tested my application (the code is using Aspose.Pdf.Generator classes) using the latest version of Aspose.Pdf dll.
The split text issue is still there with your latest dll version. Is there any way you could fix this bug in a future release?
Using the Aspose.Pdf.Document classes instead of Aspose.Pdf.Generator is a big development effort for us.
Please let me know.

Thank you

Hi Laura,

Aspose.Pdf.Generator is a legacy approach; please note that all fixes and enhancements are being introduced in the new Document Object Model (DOM) of the Aspose.Pdf namespace. If you encounter any issues while migrating your code to the new DOM approach, please share some details and we will try to assist you. We are sorry for the inconvenience.

I have attached a document with some code sections I am having issues with.

It is two posts above this post.
Can you look at that document and let me know the new classes/methods I should use?

Thank you

I have some questions I posted above related to migration to new classes.

It is an attachment to my post. Can you take a look and respond?

Thank you!

Hi there,


Thanks for your inquiry. Our support team will review this inquiry shortly and get back to you as soon as possible. Please hold tight.

Thanks,

Hi Laura,


Thanks for your patience.

Please find below the code updated to the new Document Object Model.

[VB.NET]

' requires: Imports System.IO (for FileStream)
Dim agenda = New Aspose.Pdf.Document(New FileStream("", FileMode.Open))
Dim RecommendationSection = agenda.Pages.Add()
RecommendationSection.PageInfo.Margin.Top = 10
RecommendationSection.PageInfo.Margin.Left = 10
RecommendationSection.PageInfo.Margin.Right = 10
RecommendationSection.PageInfo.Margin.Bottom = 10
RecommendationSection.IsNewPage = False


The last line (RecommendationSection.IsNewPage = False), highlighted in yellow in the original post, cannot be used because the Section object does not exist in the new DOM. As a new page has already been created in the second line, this line is not needed. We will convert the rest of the code to the new DOM and share it shortly.
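
Since the DOM has no Section object, content that previously lived in several sections of one page can, as a minimal sketch, be added as consecutive paragraphs of a single Page. The example below assumes two text blocks with placeholder text and a hypothetical output path; each block carries its own Margin to approximate per-section spacing.

```vbnet
Imports Aspose.Pdf
Imports Aspose.Pdf.Text

Module SamePageBlocks
    Sub Main()
        Dim agenda As New Document()
        Dim page As Page = agenda.Pages.Add()

        ' First block (placeholder text): plays the role of the first section.
        Dim heading As New TextFragment("Recommendations")
        heading.TextState.FontSize = 14
        page.Paragraphs.Add(heading)

        ' Second block (placeholder text): added to the same page, so it
        ' follows the heading instead of starting on a new page.
        Dim body As New TextFragment("Body text of the recommendation.")
        body.Margin.Top = 10
        body.Margin.Left = 10
        page.Paragraphs.Add(body)

        ' Hypothetical output path.
        agenda.Save("c:/pdftest/same_page_blocks.pdf")
    End Sub
End Module
```

Paragraphs added to one page are laid out one after another, so a separate IsNewPage = False setting is not required to keep them together.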

Hi Laura,


Please find below the conversion of the second code snippet to the DOM approach. I also noticed that the second and third code snippets look identical except for a couple of lines. The code for the new DOM approach is given below.

[VB.NET]

' requires: Imports Aspose.Pdf and Imports Aspose.Pdf.Text
Dim agenda = New Aspose.Pdf.Document()
Dim RecommendationSection = agenda.Pages.Add()

Dim itemCategoryTextObj = New Aspose.Pdf.Text.TextFragment("regItemCategText")
itemCategoryTextObj.TextState.Underline = True
itemCategoryTextObj.TextState.FontStyle = FontStyles.Bold
itemCategoryTextObj.TextState.Font = FontRepository.FindFont("TimesNewRoman")
itemCategoryTextObj.TextState.FontSize = 14
itemCategoryTextObj.TextState.HorizontalAlignment = HorizontalAlignment.Center
itemCategoryTextObj.Margin.Top = 0
itemCategoryTextObj.Margin.Bottom = 2.5 'Changed bottom margin
'itemCategoryTextObj..FirstLineIndent = -9.0

RecommendationSection.Paragraphs.Add(itemCategoryTextObj)
agenda.Save("c:/pdftest/TextState_utilization.pdf")