python - BeautifulSoup doesn't return all HTML
So I'm trying to extract the value from a line of HTML that looks like this:
<input type="hidden" name="_ref_ck" value="41d875b47692bb0211ada153004a663f">
and to get the value I'm doing:
self.ref = soup.find("input",{"name":"_ref_ck"}).get("value")
This works fine for me, but when I gave the program to a friend of mine to beta test, he got an error like this:
Traceback (most recent call last):
  File "c:\users\daniel\appdata\local\temp\rar$di85.192\invent manager.py", line 262, in onok
    self.main = gui(None, -1, 'inventory manager')
  File "c:\users\daniel\appdata\local\temp\rar$di85.192\invent manager.py", line 284, in __init__
    self.inv.login(log.user)
  File "c:\users\daniel\appdata\local\temp\rar$di85.192\invent manager.py", line 34, in login
    self.get_ref_ck()
  File "c:\users\daniel\appdata\local\temp\rar$di85.192\invent manager.py", line 43, in get_ref_ck
    self.ref = soup.find('input',{'name':'_ref_ck'}).get("value")
AttributeError: 'NoneType' object has no attribute 'get'
which means soup.find() is returning None for some reason.
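So I told him to send me the HTML that the request returns, and that was fine. Then I told him to give me the soup, and it only had the top part of the page. I can't figure out why.
This means BeautifulSoup is only returning part of the HTML it is receiving.
My question is why this happens, or whether there is an easy way to do this with a regex or something else. Thanks!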
Here's a quick pyparsing-based solution walkthrough:
Import the HTML parsing helpers from pyparsing:
>>> from pyparsing import makeHTMLTags, withAttribute
Define your desired tag expression (makeHTMLTags returns starting and ending tag matching expressions; you only want the starting expression, so take the 0'th returned value).
>>> inputTag = makeHTMLTags("input")[0]
You only want input tags having a name attribute equal to "_ref_ck", so use withAttribute to do the filtering:
>>> inputTag.setParseAction(withAttribute(name="_ref_ck"))
Now define your sample input, and use the inputTag expression definition to search for a match.
>>> html = '''<input type="hidden" name="_ref_ck" value="41d875b47692bb0211ada153004a663f">'''
>>> tagdata = inputTag.searchString(html)[0]
Call tagdata.dump() to see all the parsed tokens and the available named results.
>>> print(tagdata.dump())
['input', ['type', 'hidden'], ['name', '_ref_ck'], ['value', '41d875b47692bb0211ada153004a663f'], False]
- empty: False
- name: _ref_ck
- startInput: ['input', ['type', 'hidden'], ['name', '_ref_ck'], ['value', '41d875b47692bb0211ada153004a663f'], False]
  - empty: False
  - name: _ref_ck
  - tag: input
  - type: hidden
  - value: 41d875b47692bb0211ada153004a663f
- tag: input
- type: hidden
- value: 41d875b47692bb0211ada153004a663f
Use tagdata.value to get the value attribute:
>>> print(tagdata.value)
41d875b47692bb0211ada153004a663f
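If you'd rather not add pyparsing as a dependency, the same lookup can probably be done with only the standard library. The following is a rough sketch, not a drop-in replacement for the code in the question: it swaps in html.parser.HTMLParser, and InputValueFinder and find_input_value are names made up here for illustration. It also returns None when no matching tag is found, so the caller can check for that instead of hitting the AttributeError from the traceback.

```python
from html.parser import HTMLParser


class InputValueFinder(HTMLParser):
    """Hypothetical helper: remember the value of the first matching <input>."""

    def __init__(self, wanted_name):
        super().__init__()
        self.wanted_name = wanted_name
        self.value = None  # stays None if no matching tag is seen

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag names and passes attrs as (name, value) pairs.
        if tag == "input" and self.value is None:
            attr_map = dict(attrs)
            if attr_map.get("name") == self.wanted_name:
                self.value = attr_map.get("value")


def find_input_value(html, name):
    """Return the value attribute of the first <input name=...>, or None."""
    finder = InputValueFinder(name)
    finder.feed(html)
    finder.close()
    return finder.value


html = '<input type="hidden" name="_ref_ck" value="41d875b47692bb0211ada153004a663f">'
ref = find_input_value(html, "_ref_ck")
if ref is None:
    # Handle the missing tag explicitly instead of calling .get() on None.
    print("no _ref_ck input found")
else:
    print(ref)
```

The explicit None check is the part worth keeping whichever parser you end up using, since a page that comes back truncated or different on another machine will otherwise fail in the same way as in the question.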