python - How can I test this script that accesses URLs through several different proxy servers?
Right now I have this script:

    import json
    import urllib2

    with open('urls.txt') as f:
        urls = [line.rstrip() for line in f]

    with open('proxies.txt') as proxies:
        for line in proxies:
            # each line of proxies.txt is one JSON object
            proxy = json.loads(line)
            proxy_handler = urllib2.ProxyHandler(proxy)
            opener = urllib2.build_opener(proxy_handler)
            urllib2.install_opener(opener)
            for url in urls:
                data = urllib2.urlopen(url).read()
                print data

This is the urls.txt file:
http://myipaddress.com
And this is the proxies.txt file:

    {"https": "https://87.98.216.22:3128"}
    {"https": "http://190.153.7.189:8080"}
    {"https": "http://125.39.68.181:80"}

I got these proxies from http://hidemyass.com
I have been trying to test it by going through the terminal output (a bunch of HTML) and looking to see whether it shows an IP address somewhere, hoping that it is one of the proxy IPs. This doesn't seem to work. Depending on the IP recognition site, it either throws a connection error or tells me that I have to enter validation letters (though the same site viewed through a browser works fine).

So am I going about this in the best way? Is there a simpler way to check which IP address the URL is seeing?

Edit: I heard elsewhere (on another forum) that one way to check whether the URL is being accessed from a different IP is to check for cross headers (like an HTML header that indicates the request was redirected). I can't find any more info on that.
You can use a simpler site for this, one that returns only the IP address as plain text. For example:
code:
    import json
    import urllib2

    with open('urls.txt') as f:
        urls = [line.rstrip() for line in f]

    with open('proxies.txt') as proxies:
        for line in proxies:
            # each line of proxies.txt is one JSON object
            proxy = json.loads(line)
            proxy_handler = urllib2.ProxyHandler(proxy)
            opener = urllib2.build_opener(proxy_handler)
            urllib2.install_opener(opener)
            for url in urls:
                try:
                    data = urllib2.urlopen(url).read()
                    print proxy, "-", data
                except Exception:
                    # connection refused, timeout, bad response, ...
                    print proxy, "- not working"
urls.txt:
http://api.exip.org/?call=ip
proxies.txt:
    {"http": "http://218.108.114.140:8080"}
    {"http": "http://59.47.43.93:8080"}
    {"http": "http://218.108.170.172:80"}
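A detail of the proxies.txt format that is easy to miss: the dictionary key is the scheme of the URL being requested, not the scheme of the proxy itself. An entry under "https" is only consulted for https:// URLs, so a request to an http:// URL goes out directly, without the proxy. The sketch below checks for that mismatch before any request is made. It is written for Python 3, and `proxy_applies` is a hypothetical helper name, not part of urllib.

```python
import json
from urllib.parse import urlsplit


def proxy_applies(proxy, url):
    """Return True if the proxy mapping has an entry for the URL's scheme."""
    return urlsplit(url).scheme.lower() in proxy


# Sample lines in the same format as proxies.txt
lines = [
    '{"http": "http://218.108.114.140:8080"}',
    '{"https": "http://190.153.7.189:8080"}',
]
url = "http://api.exip.org/?call=ip"

for line in lines:
    proxy = json.loads(line)
    status = "used" if proxy_applies(proxy, url) else "ignored (scheme mismatch)"
    print(proxy, "-", status)
```

With an http:// test URL, only the first sample line would actually route the request through a proxy.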
output:
    {u'http': u'http://218.108.114.140:8080'} - 218.108.114.140
    {u'http': u'http://59.47.43.93:8080'} - 118.207.240.161
    {u'http': u'http://218.108.170.172:80'} - not working
    [Finished in 25.4s]
Note: none of these is my real IP.
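The script in this answer is Python 2 (urllib2 and the print statement). A rough Python 3 sketch of the same check follows, using urllib.request, which provides the same ProxyHandler and build_opener. The names `fetch_via_proxy`, `check_proxy`, `load_lines` and `main`, the 10-second timeout, and the `fetch` parameter (there so the logic can be exercised without a live proxy) are assumptions made for illustration.

```python
import json
import urllib.request


def fetch_via_proxy(proxy, url, timeout=10):
    """Fetch url through the given proxy mapping and return the body as text."""
    handler = urllib.request.ProxyHandler(proxy)
    # Use the opener directly instead of install_opener(), so that no
    # process-wide state is changed between proxies.
    opener = urllib.request.build_opener(handler)
    with opener.open(url, timeout=timeout) as response:
        return response.read().decode("utf-8", errors="replace").strip()


def check_proxy(proxy, url, fetch=fetch_via_proxy):
    """Return the text the site reports, or None if the request failed."""
    try:
        return fetch(proxy, url)
    except Exception:  # URLError, socket timeout, malformed response, ...
        return None


def load_lines(path):
    """Read the non-empty, stripped lines of a text file."""
    with open(path) as f:
        return [line.strip() for line in f if line.strip()]


def main():
    """Check every proxy in proxies.txt against every URL in urls.txt."""
    urls = load_lines("urls.txt")
    for line in load_lines("proxies.txt"):
        proxy = json.loads(line)
        for url in urls:
            result = check_proxy(proxy, url)
            print(proxy, "-", result if result is not None else "not working")
```

Calling `main()` with urls.txt and proxies.txt in the working directory prints one line per proxy and URL. The explicit timeout keeps a dead proxy from stalling the whole run.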
Or, if you want to use http://myipaddress.com, you can use BeautifulSoup to extract the exact HTML element that contains the IP.
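If BeautifulSoup is not available, a cruder alternative (a swapped-in technique, not the approach suggested above) is to pull anything that looks like an IPv4 address out of the HTML with a regular expression. A minimal Python 3 sketch follows; `find_ipv4` is a hypothetical helper and the sample HTML is made up. It can also pick up other dotted quads on the page, so the matches are only candidates.

```python
import re

# Four groups of 1-3 digits separated by dots, not embedded in a longer
# dotted number.
IPV4_RE = re.compile(r"(?<![\d.])(?:\d{1,3}\.){3}\d{1,3}(?![\d.])")


def find_ipv4(html):
    """Return unique IPv4-looking strings in html, in order of first appearance."""
    found = []
    for candidate in IPV4_RE.findall(html):
        # Reject matches such as 999.1.1.1 whose parts exceed 255.
        octets_ok = all(int(part) <= 255 for part in candidate.split("."))
        if octets_ok and candidate not in found:
            found.append(candidate)
    return found


# Made-up page body for illustration
sample = "<html><body><p>Your IP is <b>218.108.114.140</b></p></body></html>"
print(find_ipv4(sample))
```

Comparing the returned candidates against the proxy's own address shows whether the site saw the proxy or something else.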