python - Can't read line from html page

- July 15, 2013

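I'm trying to extract the time, in a specific format, from a particular site. The regex itself works (I tried it in a regex tester and it matched), but it does not give the expected result when I run this code in Python: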

    import urllib, re

    sock = urllib.urlopen("http://www.wolframalpha.com/input/?i=time")
    htmlsource = sock.read()
    sock.close()
    ips = re.findall(r'([01]?[0-9]{1}|2[0-3]{1}):[0-5]{1}[0-9]{1}:[0-5]{1}[0-9]{1}', htmlsource)
    print ips

The result:

    ['7', '4']

On regextester.com the time is highlighted in red. I want to extract the time in the following format: xx:xx:xx (24h).

Why is this happening? Thank you!

You have redundant quantifiers in your regexp (those {1}). You can remove them.
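A quick check of that claim, as a minimal sketch: a `{1}` quantifier means "exactly once", which is already the default, so the pattern should behave the same with or without it. The `sample` string is a made-up stand-in, not the real page content.

```python
import re

# Pattern from the question, with the explicit {1} quantifiers.
with_quantifiers = r'([01]?[0-9]{1}|2[0-3]{1}):[0-5]{1}[0-9]{1}:[0-5]{1}[0-9]{1}'
# The same pattern with every {1} removed.
without_quantifiers = r'([01]?[0-9]|2[0-3]):[0-5][0-9]:[0-5][0-9]'

# Hypothetical sample text (assumption), used instead of fetching the page.
sample = "Current time: 07:45:12, later 23:59:59"

# Both patterns should produce identical matches.
same_result = re.findall(with_quantifiers, sample) == re.findall(without_quantifiers, sample)
print(same_result)  # True
```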

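Another thing: when the pattern contains a capturing group, re.findall returns the captures rather than the whole match, which here means only the hours. Change the first capturing group to a non-capturing group (?: ... ), and capture the whole regex: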

((?:[01]?[0-9]|2[0-3]):[0-5][0-9]:[0-5][0-9])

I think this should do it.
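The difference between the two patterns can be seen without touching the network. This is only a sketch: `html_source` is a hypothetical stand-in for the downloaded page, and the variable names are made up for illustration.

```python
import re

# Hypothetical sample text (assumption) standing in for the downloaded HTML.
html_source = "<span>Current time: 07:45:12</span> <span>Updated 14:03:59</span>"

# One capturing group around the hours: findall returns only that group.
hours_only = re.findall(r'([01]?[0-9]|2[0-3]):[0-5][0-9]:[0-5][0-9]', html_source)

# Hours in a non-capturing group, whole pattern captured: findall returns full times.
full_times = re.findall(r'((?:[01]?[0-9]|2[0-3]):[0-5][0-9]:[0-5][0-9])', html_source)

print(hours_only)  # ['07', '14']
print(full_times)  # ['07:45:12', '14:03:59']
```

Dropping the outer parentheses as well would give the same result, since re.findall returns the whole match when the pattern has no capturing groups.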
