python - BeautifulSoup doesn't return all HTML


I'm trying to extract the value from a line of HTML that looks like this:

<input type="hidden" name="_ref_ck" value="41d875b47692bb0211ada153004a663f"> 

To get the value, I'm doing:

self.ref = soup.find("input",{"name":"_ref_ck"}).get("value") 

This works fine for me, but I gave the program to a friend of mine to beta test, and he is getting an error like this:

Traceback (most recent call last):
  File "c:\users\daniel\appdata\local\temp\rar$di85.192\invent manager.py", line 262, in onok
    self.main = gui(None, -1, 'inventory manager')
  File "c:\users\daniel\appdata\local\temp\rar$di85.192\invent manager.py", line 284, in __init__
    self.inv.login(log.user)
  File "c:\users\daniel\appdata\local\temp\rar$di85.192\invent manager.py", line 34, in login
    self.get_ref_ck()
  File "c:\users\daniel\appdata\local\temp\rar$di85.192\invent manager.py", line 43, in get_ref_ck
    self.ref = soup.find('input',{'name':'_ref_ck'}).get("value")
AttributeError: 'NoneType' object has no attribute 'get'

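This means soup.find() is returning None for some reason, i.e. BeautifulSoup is not finding the input tag.

So I told him to send me the HTML that the request returns, and it was fine. Then I told him to give me the soup, and it only had the top part of the page. I can't figure out why.

This means BeautifulSoup is returning only part of the HTML it is receiving.

My question is why this happens, or whether there is an easy way to do this with a regex or something else. Thanks!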
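For the regex route the question asks about, here is a minimal sketch using only the standard library. It assumes Python 3, that attribute values are quoted, and that the tag contains no ">" inside an attribute value. Regexes are fragile against real-world HTML, so treat this as a fallback rather than a robust parser. The helper name find_input_value is hypothetical and not part of the original program.

```python
import re

# Matches a whole <input ...> start tag (assumes no '>' inside attribute values).
INPUT_TAG = re.compile(r"<input\b[^>]*>", re.IGNORECASE)
# Matches name="value" or name='value' pairs inside a tag.
ATTR = re.compile(r"""([\w:-]+)\s*=\s*(?:"([^"]*)"|'([^']*)')""")


def find_input_value(html, name):
    """Return the value attribute of the first <input> whose name matches, else None."""
    for tag in INPUT_TAG.finditer(html):
        attrs = {}
        for key, double_quoted, single_quoted in ATTR.findall(tag.group(0)):
            attrs[key.lower()] = double_quoted or single_quoted
        if attrs.get("name") == name:
            return attrs.get("value")
    return None  # no matching tag: the caller should check for this


html = '<input type="hidden" name="_ref_ck" value="41d875b47692bb0211ada153004a663f">'
ref = find_input_value(html, "_ref_ck")
print(ref)
```

Returning None explicitly and checking for it before use avoids the AttributeError from the traceback when the tag is missing.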

Here's a quick pyparsing-based solution walkthrough:

Import the HTML parsing helpers from pyparsing:

>>> from pyparsing import makeHTMLTags, withAttribute

Define the desired tag expression (makeHTMLTags returns starting and ending tag matching expressions; we only want the starting expression, so we take the 0'th returned value).

>>> inputTag = makeHTMLTags("input")[0]

We only want input tags having the name attribute = "_ref_ck", so use withAttribute to do this filtering:

>>> inputTag.setParseAction(withAttribute(name="_ref_ck"))

Now define a sample input, and use the inputTag expression definition to search for a match.

>>> html = '''<input type="hidden" name="_ref_ck" value="41d875b47692bb0211ada153004a663f">'''
>>> tagdata = inputTag.searchString(html)[0]

Call tagdata.dump() to see all the parsed tokens and the available named results.

>>> print (tagdata.dump())
['input', ['type', 'hidden'], ['name', '_ref_ck'], ['value', '41d875b47692bb0211ada153004a663f'], False]
- empty: False
- name: _ref_ck
- startInput: ['input', ['type', 'hidden'], ['name', '_ref_ck'], ['value', '41d875b47692bb0211ada153004a663f'], False]
  - empty: False
  - name: _ref_ck
  - tag: input
  - type: hidden
  - value: 41d875b47692bb0211ada153004a663f
- tag: input
- type: hidden
- value: 41d875b47692bb0211ada153004a663f

Use tagdata.value to get the value attribute:

>>> print (tagdata.value)
41d875b47692bb0211ada153004a663f
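If installing pyparsing is not an option, the same lookup can be sketched with the standard library's html.parser module. This is a different technique from the pyparsing walkthrough above, not a restatement of it, and it assumes Python 3. The class name InputValueFinder and the function input_value_from_html are hypothetical names chosen for illustration.

```python
from html.parser import HTMLParser


class InputValueFinder(HTMLParser):
    """Records the value of the first <input> tag with a given name attribute."""

    def __init__(self, name):
        super().__init__()
        self.name = name
        self.value = None

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names before calling this.
        if tag != "input" or self.value is not None:
            return
        attr_map = dict(attrs)
        if attr_map.get("name") == self.name:
            self.value = attr_map.get("value")


def input_value_from_html(html, name):
    """Return the matching input's value, or None if no such tag exists."""
    finder = InputValueFinder(name)
    finder.feed(html)
    finder.close()
    return finder.value


html = '''<input type="hidden" name="_ref_ck" value="41d875b47692bb0211ada153004a663f">'''
ref = input_value_from_html(html, "_ref_ck")
if ref is None:
    print("no _ref_ck input found")
else:
    print(ref)
```

As with any of these approaches, checking the result for None before using it turns a missing tag into a handled case instead of an AttributeError.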
