Problem when using DocumentBuilder.InsertHtml

Hi

Thanks for your request. Please attach your input and output documents and code here for testing. I will check the issue and provide you more information.

Best regards.

I am providing the information.

Hi

Thanks for your request. Could you please attach your code here for testing? I will check the problem on my side and provide you more information.

Best regards,

OK, please check it now.

Hi

Thank you for the additional information. There is no direct way to set the width of a table using Aspose.Words, but you can set the width of its cells, so you can calculate the width of each column instead. Please see the code below, which demonstrates how to calculate the column width.

// Calculate the usable page width (page width minus the left and right margins)
PageSetup ps = builder.CurrentSection.PageSetup;
double pageWidth = ps.PageWidth - ps.RightMargin - ps.LeftMargin;
// Since we know the number of columns in our table (4 here), we can calculate the width of each column.
double colWidth = pageWidth / 4;
// Set the default column width
builder.CellFormat.Width = colWidth;
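
As a rough illustration of the arithmetic in the snippet above, here is a self-contained sketch with no Aspose.Words dependency. The class and method names (ColumnWidths, UsableWidth, EqualColumnWidth) are made up for this example, and the page size used in the note below (612 pt wide with 72 pt margins) is an assumed value, not taken from your document.

```csharp
using System;

// Hypothetical helper, for illustration only; it is not part of Aspose.Words.
public static class ColumnWidths
{
    // Usable width between the left and right margins, in points.
    public static double UsableWidth(double pageWidth, double leftMargin, double rightMargin)
    {
        return pageWidth - leftMargin - rightMargin;
    }

    // Width of each column when the usable width is shared equally.
    public static double EqualColumnWidth(double usableWidth, int columnCount)
    {
        if (columnCount <= 0)
            throw new ArgumentOutOfRangeException(nameof(columnCount));
        return usableWidth / columnCount;
    }
}
```

With the assumed values, ColumnWidths.EqualColumnWidth(ColumnWidths.UsableWidth(612, 72, 72), 4) returns 117, which is the kind of value the snippet above assigns to builder.CellFormat.Width.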

Hope this helps.

Best regards,

Hi,

I have a problem with a regular expression:

Regex r = new Regex("href\s*=\s*(?:(?:\"(?[^\"]*)\")|(?[^\s]* ))");

This is my code, but it gives an error.

I want to extract the links (hrefs) from my HTML page and write them into the Word document.

I am specifying the starting and ending HTML tags, and I want the links between them: the href link on one line and the title between the tags on another line.

Please help me.

Hi

Thanks for your inquiry. Unfortunately, I am not sure what you would like to achieve. Could you please show me your code and attach your output and expected documents? I will check the issue and provide you more information.

Best regards,

I have already attached the input/output document. Please check it again.

Hi

Thank you for the additional information. What I can see in the document you have attached is HTML code. I suppose this is your input. Is that right? Then, if I understand you correctly, you use your regular expression to process this HTML in some way. Please provide simple code that shows how you process your HTML.

Actually, it is not quite clear from the document you have attached, what the expected output is. Please clarify.

Best regards,

http://highoncoding.com/Articles/105_HTML_Screen_Scraping_using_C___Net_WebClient.aspx

Like this:

using System.Net;
using System.Text;
using System.IO; // If you plan to write to a file
using System.Collections; // ArrayList
using System.Text.RegularExpressions; // Regex

// creates a button
protected System.Web.UI.WebControls.Button Button1;
// creates a byte array
private byte[] aRequestHTML;
// creates a string
private string myString = null;
// creates a datagrid
protected System.Web.UI.WebControls.DataGrid DataGrid1;
// creates a textbox
protected System.Web.UI.WebControls.TextBox TextBox1;
// creates the label
protected System.Web.UI.WebControls.Label Label1;
// creates the arraylist
private ArrayList a = new ArrayList();

Okay, now let's see the button click code that does the actual work.

private void Button1_Click(object sender, System.EventArgs e)
{
    // make an object of the WebClient class
    WebClient objWebClient = new WebClient();
    // gets the HTML from the url written in the textbox
    aRequestHTML = objWebClient.DownloadData(TextBox1.Text);
    // creates UTf8 encoding object
    UTF8Encoding utf8 = new UTF8Encoding();
    // gets the UTF8 encoding of all the html we got in aRequestHTML
    myString = utf8.GetString(aRequestHTML);
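    // this is a regular expression to check for the urls
    // NOTE: as pasted here, "(?[" is not a valid grouping construct; the group names after "(?" appear to have been lost
    Regex r = new Regex("href\\s*=\\s*(?:(?:\"(?[^\"]*)\")|(?[^\\s]* ))");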
    // get all the matches depending upon the regular expression
    MatchCollection mcl = r.Matches(myString);

    foreach (Match ml in mcl)
    {
        foreach (Group g in ml.Groups)
        {
            string b = g.Value + "";
            // Add the extracted urls to the array list
            a.Add(b);

        }
    }
    // assign arraylist to the datasource
    DataGrid1.DataSource = a;
    // bind the data source to the grid
    DataGrid1.DataBind();

    // The following lines write the downloaded HTML (myString) to the file named test.txt
    StreamWriter sw = new StreamWriter(Server.MapPath("test.txt"));
    sw.Write(myString);
    sw.Close();
}

Hi

Thank you for the additional information. Now it is clearer what you would like to achieve. However, I have another question: how is this related to Aspose.Words?

If you need to extract all links from your HTML and write them into a Word document, you can try using code like the following:

// Get your HTML string
string html = File.ReadAllText(@"Test001\test.html");
// Create a regular expression that matches links.
Regex urlRegex = new Regex("href\\s*=\\s*[\"']+(http(s)?://([\\w-]+\\.)+[\\w-]+(/[\\w- ./?%&=]*)?)[\"']+");
MatchCollection matches = urlRegex.Matches(html);
// Write the matched URLs into the Word document
Document doc = new Document();
DocumentBuilder builder = new DocumentBuilder(doc);
foreach (Match match in matches)
{
    builder.Writeln(match.Groups[1].Value);
    // If you need to insert the URLs into the Word document as links, uncomment the next 4 lines and comment out the previous one.
    //builder.Font.Underline = Underline.Single;
    //builder.Font.Color = Color.Blue;
    //builder.InsertHyperlink(match.Groups[1].Value, match.Groups[1].Value, false);
    //builder.Writeln();
}
// Save output document
doc.Save(@"Test001\out.doc");
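
If you also need the link title on a separate line from the URL, something along the following lines might work. This is only a sketch with no Aspose.Words dependency: the LinkExtractor class, its Extract method, and the group names "url" and "text" are made up for this example. A regular expression is a fragile way to parse HTML, so please treat the pattern as a starting point rather than a complete solution.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper, for illustration only; it is not part of Aspose.Words.
public static class LinkExtractor
{
    // Matches <a ... href="URL" ...>TEXT</a>.
    // Group "url" is the quoted href value, group "text" is the content between the tags.
    private static readonly Regex AnchorRegex = new Regex(
        "<a\\b[^>]*?\\bhref\\s*=\\s*[\"'](?<url>[^\"']*)[\"'][^>]*>(?<text>.*?)</a>",
        RegexOptions.IgnoreCase | RegexOptions.Singleline);

    // Returns (URL, title) pairs in the order the links appear in the HTML.
    public static List<KeyValuePair<string, string>> Extract(string html)
    {
        var links = new List<KeyValuePair<string, string>>();
        foreach (Match m in AnchorRegex.Matches(html))
        {
            // Strip any nested tags from the link title and trim whitespace.
            string text = Regex.Replace(m.Groups["text"].Value, "<[^>]+>", "").Trim();
            links.Add(new KeyValuePair<string, string>(m.Groups["url"].Value, text));
        }
        return links;
    }
}
```

Each returned pair could then be written to the document with two builder.Writeln calls, one for the Key (the URL) and one for the Value (the title), inside a loop like the one above.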

Best regards,

The issues you have found earlier (filed as WORDSNET-228) have been fixed in this .NET update and this Java update.



The issues you have found earlier (filed as WORDSNET-866) have been fixed in this .NET update and this Java update.



The issues you have found earlier have been fixed in this update.