python - BeautifulSoup doesn't return all HTML
So I'm trying to extract the value from a line of HTML that looks like this:
<input type="hidden" name="_ref_ck" value="41d875b47692bb0211ada153004a663f">
and to get the value I'm doing:
self.ref = soup.find("input",{"name":"_ref_ck"}).get("value")
and it works fine for me, but I gave the program to a friend of mine to beta test, and he is getting an error like this:
Traceback (most recent call last):
  File "c:\users\daniel\appdata\local\temp\rar$di85.192\invent manager.py", line 262, in onok
    self.main = gui(None, -1, 'inventory manager')
  File "c:\users\daniel\appdata\local\temp\rar$di85.192\invent manager.py", line 284, in __init__
    self.inv.login(log.user)
  File "c:\users\daniel\appdata\local\temp\rar$di85.192\invent manager.py", line 34, in login
    self.get_ref_ck()
  File "c:\users\daniel\appdata\local\temp\rar$di85.192\invent manager.py", line 43, in get_ref_ck
    self.ref = soup.find('input',{'name':'_ref_ck'}).get("value")
AttributeError: 'NoneType' object has no attribute 'get'
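which means soup.find is returning None for some reason, i.e. BeautifulSoup did not find the tag.
So I told him to send me the HTML that the request returns, and it was fine. Then I told him to give me the soup, and it only had the top part of the page. I can't figure out why.
This means BS is returning only part of the HTML it is receiving.
My question is why, or whether there is an easy way to do this with a regex or something else. Thanks!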
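Since the raw HTML from the request was fine and only the soup was truncated, one way to sidestep the soup is to scan the raw HTML directly. Below is a minimal sketch using only the standard library's html.parser module, assuming Python 3. The names RefCkFinder and find_ref_ck are made up for illustration and are not part of the original program.

```python
from html.parser import HTMLParser


class RefCkFinder(HTMLParser):
    """Remembers the value of the first <input name="_ref_ck"> tag seen."""

    def __init__(self):
        super().__init__()
        self.value = None

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; tag and names are lowercased
        if tag == "input" and self.value is None:
            attr_map = dict(attrs)
            if attr_map.get("name") == "_ref_ck":
                self.value = attr_map.get("value")


def find_ref_ck(raw_html):
    """Return the _ref_ck value, or None when the tag is absent."""
    finder = RefCkFinder()
    finder.feed(raw_html)
    finder.close()
    return finder.value


sample = '<input type="hidden" name="_ref_ck" value="41d875b47692bb0211ada153004a663f">'
print(find_ref_ck(sample))
```

Returning None for a missing tag lets the caller check the result explicitly instead of hitting the AttributeError shown in the traceback.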
Here's a quick pyparsing-based solution walkthrough:
Import the HTML parsing helpers from pyparsing:
>>> from pyparsing import makeHTMLTags, withAttribute
Define the desired tag expression (makeHTMLTags returns both a starting and an ending tag matching expression; we only want the starting expression, so we take the 0'th returned value).
>>> inputtag = makeHTMLTags("input")[0]
We only want input tags having a name attribute equal to "_ref_ck", so use withAttribute to do this filtering:
>>> inputtag.setParseAction(withAttribute(name="_ref_ck"))
Now define the sample input, and use the inputtag expression definition to search for a match.
>>> html = '''<input type="hidden" name="_ref_ck" value="41d875b47692bb0211ada153004a663f">'''
>>> tagdata = inputtag.searchString(html)[0]
Call tagdata.dump() to see all the parsed tokens and the available named results.
>>> print (tagdata.dump())
['input', ['type', 'hidden'], ['name', '_ref_ck'], ['value', '41d875b47692bb0211ada153004a663f'], False]
- empty: False
- name: _ref_ck
- startInput: ['input', ['type', 'hidden'], ['name', '_ref_ck'], ['value', '41d875b47692bb0211ada153004a663f'], False]
  - empty: False
  - name: _ref_ck
  - tag: input
  - type: hidden
  - value: 41d875b47692bb0211ada153004a663f
- tag: input
- type: hidden
- value: 41d875b47692bb0211ada153004a663f
Use tagdata.value to get the value attribute:
>>> print (tagdata.value)
41d875b47692bb0211ada153004a663f
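The question also asks about a regex, so here is a minimal sketch using only the standard library's re module. It assumes the tag is written like the sample, with double-quoted attribute values; the helper name extract_ref_ck and the two pattern names are hypothetical. A regex like this is brittle against arbitrary HTML (single quotes, unquoted values, or attributes such as data-value would trip it up), so treat it as a quick workaround rather than a general parser.

```python
import re

# Match the whole <input ...> tag that carries name="_ref_ck",
# regardless of where the name attribute sits inside the tag.
REF_CK_TAG = re.compile(r'<input\b[^>]*\bname="_ref_ck"[^>]*>', re.IGNORECASE)

# Pull the double-quoted value attribute out of the matched tag text.
VALUE_ATTR = re.compile(r'\bvalue="([^"]*)"', re.IGNORECASE)


def extract_ref_ck(raw_html):
    """Return the _ref_ck value from raw HTML, or None if it is not found."""
    tag = REF_CK_TAG.search(raw_html)
    if tag is None:
        return None
    value = VALUE_ATTR.search(tag.group(0))
    return value.group(1) if value else None


html = '''<input type="hidden" name="_ref_ck" value="41d875b47692bb0211ada153004a663f">'''
print(extract_ref_ck(html))
```

Matching the tag first and the value second keeps the pattern independent of attribute order.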