How to get first-level elements from an HTML file with HTML Agility Pack & C#
I want to get the first-level elements by parsing an HTML file with HTML Agility Pack. For example, the result should look like this:
<html> <body> <div class="header">....</div> <div class="main">.....</div> <div class="right">...</div> <div class="left">....</div> <div class="footer">...</div> </body> </html>
Each of these contains other tags. I want to extract the text that exists in the website, but separately: for example, the right side separately, the left side separately, the footer, and so on.
Can anyone help me?
Thanks.
Use HtmlAgilityPack to load the webpage from a given URL, then parse it by selecting the corresponding tags.
// requires: using HtmlAgilityPack;
HtmlWeb page = new HtmlWeb();
HtmlDocument doc = page.Load("http://www.google.com");
If you want to select a specific div by its class name (such as 'header' or 'main'), you can do so through the DocumentNode property of the document object.
string mainText = doc.DocumentNode.SelectSingleNode("//div[@class=\"main\"]").InnerText;
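One thing worth guarding against: SelectSingleNode returns null when nothing matches, so reading InnerText directly can throw a NullReferenceException. Below is a minimal sketch of a null-safe version. The class SingleNodeExample, the helper GetFirstDivText, and the sample HTML string are made up for illustration; the document is loaded from a string with LoadHtml so the example does not depend on a live URL.

```csharp
using System;
using HtmlAgilityPack;

public static class SingleNodeExample
{
    // Hypothetical helper (not part of HtmlAgilityPack): returns the trimmed text
    // of the first <div> whose class attribute is exactly className, or null if
    // no such div exists.
    public static string GetFirstDivText(string html, string className)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        // @class='x' is an exact match on the attribute value; a div with
        // class="main wide" would need contains(@class, 'main') instead.
        HtmlNode node = doc.DocumentNode.SelectSingleNode("//div[@class='" + className + "']");

        // SelectSingleNode returns null when there is no match.
        return node == null ? null : node.InnerText.Trim();
    }

    public static void Main()
    {
        // Assumed sample markup, modeled on the layout in the question.
        string html = "<html><body><div class=\"main\">Main text</div></body></html>";

        Console.WriteLine(GetFirstDivText(html, "main") ?? "(not found)");
        Console.WriteLine(GetFirstDivText(html, "footer") ?? "(not found)");
    }
}
```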
Chances are, though, that you have several tags in the HTML that are members of the 'main' class, so you either have to select them all and iterate over the collection, or be more precise when you select a single node.
To get a collection of the tags that are in the class 'main', use the DocumentNode.SelectNodes method instead.
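Iterating over the result of SelectNodes could look roughly like the sketch below. The class NodeCollectionExample, the helper GetTexts, and the sample HTML are hypothetical names and data used only for illustration. The helper takes the XPath expression as a parameter, so the same code can collect every div of class 'main' or, assuming the layout from the question, the first-level divs directly under body with "//body/div".

```csharp
using System;
using System.Collections.Generic;
using HtmlAgilityPack;

public static class NodeCollectionExample
{
    // Hypothetical helper (not part of HtmlAgilityPack): returns the trimmed
    // inner text of every node matched by the given XPath expression.
    public static List<string> GetTexts(string html, string xpath)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        var texts = new List<string>();

        // SelectNodes returns null (not an empty collection) when nothing matches.
        HtmlNodeCollection nodes = doc.DocumentNode.SelectNodes(xpath);
        if (nodes == null)
        {
            return texts;
        }

        foreach (HtmlNode node in nodes)
        {
            texts.Add(node.InnerText.Trim());
        }
        return texts;
    }

    public static void Main()
    {
        // Assumed sample markup, modeled on the layout in the question.
        string html =
            "<html><body>" +
            "<div class=\"header\">Top</div>" +
            "<div class=\"main\">Middle <div class=\"main\">Nested</div></div>" +
            "<div class=\"footer\">Bottom</div>" +
            "</body></html>";

        // Every div of class 'main', at any depth.
        foreach (string text in GetTexts(html, "//div[@class='main']"))
        {
            Console.WriteLine("main: " + text);
        }

        // Only the first-level divs, i.e. direct children of <body>.
        foreach (string text in GetTexts(html, "//body/div"))
        {
            Console.WriteLine("first-level: " + text);
        }
    }
}
```

Note that InnerText of an outer div includes the text of its nested tags, so selecting only the first-level divs gives one block of text per page section.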
I suggest you take a look at this question for some of the basics and for links to the tutorials available.