Here is a very good class for parsing HTML. It saved me a lot of time. Project page: http://www.codeplex.com/Wiki/View.aspx?ProjectName=htmlagilitypack
For example, here is how you would fix all hrefs in an HTML file:

    // Requires: using HtmlAgilityPack;
    HtmlDocument doc = new HtmlDocument();
    doc.Load("file.htm");
    // Select every <a> element that has an href attribute.
    // Released versions of the library expose the root node as DocumentNode.
    foreach (HtmlNode link in doc.DocumentNode.SelectNodes("//a[@href]"))
    {
        HtmlAttribute att = link.Attributes["href"];
        // FixLink is your own helper, not part of the library.
        att.Value = FixLink(att);
    }
    doc.Save("file.htm");

If you want to participate in the project (that's the whole purpose of putting the source there, right?), use the forums or drop me a note (simon underscore mourier at hotmail dot com)!

Happy coding, scraping, scanning, html-ing, xhtml-ing, etc... :^)

Simon Mourier.