urllib2 (version 2.0a1)
/usr/lib/python2.2/urllib2.py

An extensible library for opening URLs using a variety of protocols
 
The simplest way to use this module is to call the urlopen function,
which accepts a string containing a URL or a Request object (described
below).  It opens the URL and returns the results as a file-like
object; the returned object has some extra methods described below.
 
The OpenerDirector manages a collection of Handler objects that do
all the actual work.  Each Handler implements a particular protocol or
option.  The OpenerDirector is a composite object that invokes the
Handlers needed to open the requested URL.  For example, the
HTTPHandler performs HTTP GET and POST requests and deals with
non-error returns.  The HTTPRedirectHandler automatically deals with
HTTP 301 & 302 redirect errors, and the HTTPDigestAuthHandler deals
with digest authentication.
 
urlopen(url, data=None) -- basic usage is the same as in the original
urllib.  Pass the URL and, optionally, data to POST to an HTTP URL, and
get a file-like object back.  One difference is that you can also pass
a Request instance instead of a URL.  Raises a URLError (a subclass of
IOError); for HTTP errors, raises an HTTPError, which can also be
treated as a valid response.
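
The calling pattern described above can be sketched as follows.  This is
an illustration rather than part of the module: the helper name fetch is
hypothetical, the import fallback assumes a Python 3 interpreter (where
the contents of urllib2 live in urllib.request), and the demonstration
uses a file: URL built from a POSIX-style temporary path so that it runs
without network access.  HTTPError is caught first because it is a
subclass of URLError.

```python
import os
import tempfile

try:
    import urllib2                      # Python 2
except ImportError:
    import urllib.request as urllib2    # assumption: Python 3 home of the same API


def fetch(url, data=None):
    """Return (status, body); status is 'ok', an HTTP code, or 'failed'."""
    try:
        f = urllib2.urlopen(url, data)
    except urllib2.HTTPError as e:
        # An HTTPError is also a response object, so its body can be read.
        return e.code, e.read()
    except urllib2.URLError as e:
        # Any other failure to open the URL.
        return "failed", str(e)
    try:
        return "ok", f.read()
    finally:
        f.close()


# Offline demonstration using a temporary local file.
fd, path = tempfile.mkstemp()
os.write(fd, b"hello from a file URL")
os.close(fd)
try:
    good = fetch("file://" + path)
    bad = fetch("file://" + path + ".missing")
finally:
    os.remove(path)
```

With an http: URL the same helper would return the numeric status code
and the error page body whenever the server answers with an error.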
 
build_opener -- function that creates a new OpenerDirector instance
and installs the default handlers.  Accepts one or more Handlers as
arguments, either instances or Handler classes that it will
instantiate.  If one of the arguments is a subclass of a default
handler, the argument will be installed instead of the default.
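
A small sketch of the replacement rule for build_opener, under stated
assumptions: NoRedirectHandler is a hypothetical subclass, the import
fallback targets Python 3 (urllib.request), and the handlers attribute
inspected at the end is an implementation detail of OpenerDirector that
is not listed in this page.

```python
try:
    import urllib2                      # Python 2
except ImportError:
    import urllib.request as urllib2    # assumption: Python 3 home of the same API


class NoRedirectHandler(urllib2.HTTPRedirectHandler):
    """Hypothetical subclass that declines to follow 301/302 redirects."""

    def http_error_302(self, req, fp, code, msg, headers):
        # Returning None declines the response, which leaves it to the
        # remaining error handlers (in the Python 3 implementation the
        # default error handler then raises HTTPError).
        return None

    http_error_301 = http_error_302


# Passing the class (not an instance) lets build_opener instantiate it.
opener = urllib2.build_opener(NoRedirectHandler)
installed = [type(h) for h in opener.handlers]
```

Because NoRedirectHandler subclasses a default handler, the stock
HTTPRedirectHandler should be absent from the opener while the other
defaults, such as HTTPHandler, remain.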
 
install_opener -- installs a new opener as the default opener.
 
objects of interest:
OpenerDirector --
 
Request -- an object that encapsulates the state of a request.  The
state can be as simple as the URL.  It can also include extra HTTP
headers, e.g. a User-Agent.
 
BaseHandler --
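
As an illustration of the Request object described above, the sketch
below builds a request and attaches a User-Agent header.  The header
value is a placeholder, the import fallback assumes Python 3
(urllib.request), and reading the headers attribute directly is an
assumption about the implementation rather than a documented interface.

```python
try:
    import urllib2                      # Python 2
except ImportError:
    import urllib.request as urllib2    # assumption: Python 3 home of the same API

# Nothing is sent until the request is handed to urlopen() or an opener.
req = urllib2.Request("http://www.python.org/")
req.add_header("User-Agent", "example-client/0.1")   # placeholder value

full_url = req.get_full_url()
# Header-name capitalisation may be normalised, so compare in lower case.
headers = dict((k.lower(), v) for k, v in req.headers.items())
```

The request can then be passed to urlopen() in place of a URL string.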
 
exceptions:
URLError -- a subclass of IOError; individual protocols have their own
specific subclasses
 
HTTPError -- also a valid HTTP response, so you can treat an HTTP error
as an exceptional event or as a valid response
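
The dual nature of HTTPError can be sketched without a network by
constructing one by hand.  The helper name summarize, the URL and the
body are hypothetical, and the import fallback assumes Python 3
(urllib.request); normally the exception is raised by urlopen() rather
than built directly.

```python
import io

try:
    import urllib2                      # Python 2
except ImportError:
    import urllib.request as urllib2    # assumption: Python 3 home of the same API


def summarize(err):
    """Treat an HTTPError as a response: status code, URL and body."""
    return err.code, err.geturl(), err.read()


# Arguments follow the constructor order url, code, msg, hdrs, fp.
err = urllib2.HTTPError("http://example.invalid/missing", 404, "Not Found",
                        {}, io.BytesIO(b"no such page"))

is_url_error = isinstance(err, urllib2.URLError)   # exception side
is_io_error = isinstance(err, IOError)
summary = summarize(err)                            # response side
err.close()
```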
 
internals:
BaseHandler and parent
_call_chain conventions
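
The conventions are not spelled out here, so the following is only a
sketch of what the method names in this page (http_open, ftp_open,
file_open, gopher_open) suggest: a handler offers a <scheme>_open
method, and the director tries handlers in turn until one returns
something other than None.  The echo scheme and both handler classes
are hypothetical, and the import fallback assumes Python 3
(urllib.request).

```python
import io

try:
    import urllib2                      # Python 2
except ImportError:
    import urllib.request as urllib2    # assumption: Python 3 home of the same API

calls = []  # records which handlers the director consulted, in order


class DecliningHandler(urllib2.BaseHandler):
    def echo_open(self, req):
        calls.append("declining")
        return None  # decline, so the next handler in the chain is tried


class EchoHandler(urllib2.BaseHandler):
    def echo_open(self, req):
        calls.append("echo")
        # Everything after the made-up "echo:" scheme becomes the body.
        text = req.get_full_url()[len("echo:"):]
        return io.BytesIO(text.encode("ascii"))


opener = urllib2.OpenerDirector()
opener.add_handler(DecliningHandler())
opener.add_handler(EchoHandler())
body = opener.open("echo:hello").read()
```

Both handlers are consulted: the first declines and the second supplies
the file-like result that open() returns.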
 
Example usage:
 
import urllib2
 
# set up authentication info
authinfo = urllib2.HTTPBasicAuthHandler()
authinfo.add_password('realm', 'host', 'username', 'password')
 
proxy_support = urllib2.ProxyHandler({"http" : "http://ahad-haam:3128"})
 
# build a new opener that adds proxy, authentication and caching FTP handlers
opener = urllib2.build_opener(proxy_support, authinfo, urllib2.CacheFTPHandler)
 
# install it
urllib2.install_opener(opener)
 
f = urllib2.urlopen('http://www.python.org/')

 
Modules
            
base64
ftplib
gopherlib
httplib
inspect
md5
mimetools
mimetypes
os
posixpath
re
rfc822
sha
socket
stat
sys
time
types
urlparse
 
Classes
            
AbstractBasicAuthHandler
HTTPBasicAuthHandler(AbstractBasicAuthHandler, BaseHandler)
ProxyBasicAuthHandler(AbstractBasicAuthHandler, BaseHandler)
AbstractDigestAuthHandler
BaseHandler
AbstractHTTPHandler
HTTPHandler
HTTPSHandler
CustomProxyHandler
FTPHandler
CacheFTPHandler
FileHandler
GopherHandler
HTTPDefaultErrorHandler
HTTPDigestAuthHandler(BaseHandler, AbstractDigestAuthHandler)
HTTPRedirectHandler
ProxyDigestAuthHandler(BaseHandler, AbstractDigestAuthHandler)
ProxyHandler
UnknownHandler
CustomProxy
HTTPPasswordMgr
HTTPPasswordMgrWithDefaultRealm
exceptions.IOError(exceptions.EnvironmentError)
URLError
GopherError
HTTPError(URLError, urllib.addinfourl)
OpenerDirector
OpenerFactory
Request
 
class AbstractBasicAuthHandler
       
   Methods defined here:
__init__(self, password_mgr=None)
http_error_auth_reqed(self, authreq, host, req, headers)
retry_http_basic_auth(self, host, req, realm)

Data and non-method functions defined here:
__doc__ = None
__module__ = 'urllib2'
rx = <_sre.SRE_Pattern object>
 
class AbstractDigestAuthHandler
       
   Methods defined here:
__init__(self, passwd=None)
get_algorithm_impls(self, algorithm)
get_authorization(self, req, chal)
get_entity_digest(self, data, chal)
http_error_auth_reqed(self, authreq, host, req, headers)
retry_http_digest_auth(self, req, auth)

Data and non-method functions defined here:
__doc__ = None
__module__ = 'urllib2'
 
class AbstractHTTPHandler(BaseHandler)
       
   Methods defined here:
do_open(self, http_class, req)

Data and non-method functions defined here:
__doc__ = None
__module__ = 'urllib2'

Methods inherited from BaseHandler:
add_parent(self, parent)
close(self)
 
class BaseHandler
       
   Methods defined here:
add_parent(self, parent)
close(self)

Data and non-method functions defined here:
__doc__ = None
__module__ = 'urllib2'
 
class CacheFTPHandler(FTPHandler)
       
  
Method resolution order:
CacheFTPHandler
FTPHandler
BaseHandler

Methods defined here:
__init__(self)
# XXX would be nice to have pluggable cache strategies
# XXX this stuff is definitely not thread safe
check_cache(self)
connect_ftp(self, user, passwd, host, port, dirs)
setMaxConns(self, m)
setTimeout(self, t)

Data and non-method functions defined here:
__doc__ = None
__module__ = 'urllib2'

Methods inherited from FTPHandler:
ftp_open(self, req)

Methods inherited from BaseHandler:
add_parent(self, parent)
close(self)
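
A brief configuration sketch for CacheFTPHandler.  The values are
arbitrary, the reading of the timeout as seconds follows later library
documentation rather than this page, and the import fallback assumes
Python 3 (urllib.request).

```python
try:
    import urllib2                      # Python 2
except ImportError:
    import urllib.request as urllib2    # assumption: Python 3 home of the same API

ftp_handler = urllib2.CacheFTPHandler()
ftp_handler.setTimeout(30)    # assumed to be seconds
ftp_handler.setMaxConns(4)    # upper bound on cached connections

# As a subclass of FTPHandler, the instance takes the place of the
# default FTP handler when the opener is built.
opener = urllib2.build_opener(ftp_handler)
```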
 
class CustomProxy
      # feature suggested by Duncan Booth
# XXX custom is not a good name
 
   Methods defined here:
__init__(self, proto, func=None, proxy_addr=None)
# either pass a function to the constructor or override handle
get_proxy(self)
handle(self, req)

Data and non-method functions defined here:
__doc__ = None
__module__ = 'urllib2'
 
class CustomProxyHandler(BaseHandler)
       
   Methods defined here:
__init__(self, *proxies)
add_proxy(self, cpo)
do_proxy(self, p, req)
proxy_open(self, req)

Data and non-method functions defined here:
__doc__ = None
__module__ = 'urllib2'

Methods inherited from BaseHandler:
add_parent(self, parent)
close(self)
 
class FTPHandler(BaseHandler)
       
   Methods defined here:
connect_ftp(self, user, passwd, host, port, dirs)
ftp_open(self, req)

Data and non-method functions defined here:
__doc__ = None
__module__ = 'urllib2'

Methods inherited from BaseHandler:
add_parent(self, parent)
close(self)
 
class FileHandler(BaseHandler)
       
   Methods defined here:
file_open(self, req)
# Use local file or FTP depending on form of URL
get_names(self)
open_local_file(self, req)
# not entirely sure what the rules are here

Data and non-method functions defined here:
__doc__ = None
__module__ = 'urllib2'
names = None

Methods inherited from BaseHandler:
add_parent(self, parent)
close(self)
 
class GopherError(URLError)
       
  
Method resolution order:
GopherError
URLError
exceptions.IOError
exceptions.EnvironmentError
exceptions.StandardError
exceptions.Exception

Data and non-method functions defined here:
__doc__ = None
__module__ = 'urllib2'

Methods inherited from URLError:
__init__(self, reason)
# URLError is a sub-type of IOError, but it doesn't share any of
# the implementation.  need to override __init__ and __str__
__str__(self)

Methods inherited from exceptions.Exception:
__getitem__(...)
 
class GopherHandler(BaseHandler)
       
   Methods defined here:
gopher_open(self, req)

Data and non-method functions defined here:
__doc__ = None
__module__ = 'urllib2'

Methods inherited from BaseHandler:
add_parent(self, parent)
close(self)
 
class HTTPBasicAuthHandler(AbstractBasicAuthHandler, BaseHandler)
       
  
Method resolution order:
HTTPBasicAuthHandler
AbstractBasicAuthHandler
BaseHandler

Methods defined here:
http_error_401(self, req, fp, code, msg, headers)

Data and non-method functions defined here:
__doc__ = None
__module__ = 'urllib2'
auth_header = 'Authorization'

Methods inherited from AbstractBasicAuthHandler:
__init__(self, password_mgr=None)
http_error_auth_reqed(self, authreq, host, req, headers)
retry_http_basic_auth(self, host, req, realm)

Data and non-method functions inherited from AbstractBasicAuthHandler:
rx = <_sre.SRE_Pattern object>

Methods inherited from BaseHandler:
add_parent(self, parent)
close(self)
 
class HTTPDefaultErrorHandler(BaseHandler)
       
   Methods defined here:
http_error_default(self, req, fp, code, msg, hdrs)

Data and non-method functions defined here:
__doc__ = None
__module__ = 'urllib2'

Methods inherited from BaseHandler:
add_parent(self, parent)
close(self)
 
class HTTPDigestAuthHandler(BaseHandler, AbstractDigestAuthHandler)
      An authentication protocol defined by RFC 2069
 
Digest authentication improves on basic authentication because it
does not transmit passwords in the clear.
 
  
Method resolution order:
HTTPDigestAuthHandler
BaseHandler
AbstractDigestAuthHandler

Methods defined here:
http_error_401(self, req, fp, code, msg, headers)

Data and non-method functions defined here:
__doc__ = 'An authentication protocol defined by RFC 2069\... does not transmit passwords in the clear.\n '
__module__ = 'urllib2'
header = 'Authorization'

Methods inherited from BaseHandler:
add_parent(self, parent)
close(self)

Methods inherited from AbstractDigestAuthHandler:
__init__(self, passwd=None)
get_algorithm_impls(self, algorithm)
get_authorization(self, req, chal)
get_entity_digest(self, data, chal)
http_error_auth_reqed(self, authreq, host, req, headers)
retry_http_digest_auth(self, req, auth)
 
class HTTPError(URLError, urllib.addinfourl)
      Raised when HTTP error occurs, but also acts like non-error return
 
  
Method resolution order:
HTTPError
URLError
exceptions.IOError
exceptions.EnvironmentError
exceptions.StandardError
exceptions.Exception
urllib.addinfourl
urllib.addbase

Methods defined here:
_HTTPError__super_init = __init__(self, fp, headers, url)
__del__(self)
__init__(self, url, code, msg, hdrs, fp)
__str__(self)

Data and non-method functions defined here:
__doc__ = 'Raised when HTTP error occurs, but also acts like non-error return'
__module__ = 'urllib2'

Methods inherited from exceptions.Exception:
__getitem__(...)

Methods inherited from urllib.addinfourl:
geturl(self)
info(self)

Methods inherited from urllib.addbase:
__repr__(self)
close(self)
 
class HTTPHandler(AbstractHTTPHandler)
       
  
Method resolution order:
HTTPHandler
AbstractHTTPHandler
BaseHandler

Methods defined here:
http_open(self, req)

Data and non-method functions defined here:
__doc__ = None
__module__ = 'urllib2'

Methods inherited from AbstractHTTPHandler:
do_open(self, http_class, req)

Methods inherited from BaseHandler:
add_parent(self, parent)
close(self)
 
class HTTPPasswordMgr
       
   Methods defined here:
__init__(self)
add_password(self, realm, uri, user, passwd)
find_user_password(self, realm, authuri)
is_suburi(self, base, test)
Check if test is below base in a URI tree
 
Both args must be URIs in reduced form.
reduce_uri(self, uri)
Accept netloc or URI and extract only the netloc and path

Data and non-method functions defined here:
__doc__ = None
__module__ = 'urllib2'
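
A sketch of how the password manager might be used, assuming the
behaviour of the Python 3 descendant of this class (urllib.request).
The realm, URI and credentials are placeholders.

```python
try:
    import urllib2                      # Python 2
except ImportError:
    import urllib.request as urllib2    # assumption: Python 3 home of the same API

mgr = urllib2.HTTPPasswordMgr()
mgr.add_password("Members", "http://example.invalid/private/",
                 "alice", "secret")

# A URI below the registered one, in the same realm, finds the credentials.
found = mgr.find_user_password(
    "Members", "http://example.invalid/private/page.html")
# A different realm does not.
other = mgr.find_user_password(
    "Other", "http://example.invalid/private/page.html")

# The manager is then handed to an authentication handler.
auth = urllib2.HTTPBasicAuthHandler(mgr)
opener = urllib2.build_opener(auth)
```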
 
class HTTPPasswordMgrWithDefaultRealm(HTTPPasswordMgr)
       
   Methods defined here:
find_user_password(self, realm, authuri)

Data and non-method functions defined here:
__doc__ = None
__module__ = 'urllib2'

Methods inherited from HTTPPasswordMgr:
__init__(self)
add_password(self, realm, uri, user, passwd)
is_suburi(self, base, test)
Check if test is below base in a URI tree
 
Both args must be URIs in reduced form.
reduce_uri(self, uri)
Accept netloc or URI and extract only the netloc and path
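
The default-realm variant can be sketched the same way.  Using None as
the catch-all realm is taken from later library documentation, not from
this page; the URI and credentials are placeholders, and the import
fallback assumes Python 3 (urllib.request).

```python
try:
    import urllib2                      # Python 2
except ImportError:
    import urllib.request as urllib2    # assumption: Python 3 home of the same API

mgr = urllib2.HTTPPasswordMgrWithDefaultRealm()
# Register credentials under the catch-all realm None.
mgr.add_password(None, "http://example.invalid/", "alice", "secret")

# A lookup for a realm that was never registered falls back to the default.
found = mgr.find_user_password("Whatever Realm",
                               "http://example.invalid/docs/")
```

This is convenient when the realm string a server will send is not
known in advance.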
 
class HTTPRedirectHandler(BaseHandler)
       
   Methods defined here:
http_error_301 = http_error_302(self, req, fp, code, msg, headers)
http_error_302(self, req, fp, code, msg, headers)
# Implementation note: To avoid the server sending us into an
# infinite loop, the request object needs to track what URLs we
# have already seen.  Do this by adding a handler-specific
# attribute to the Request object.

Data and non-method functions defined here:
__doc__ = None
__module__ = 'urllib2'
inf_msg = 'The HTTP server returned a redirect error that ...nfinite loop.\nThe last 302 error message was:\n'

Methods inherited from BaseHandler:
add_parent(self, parent)
close(self)
 
class HTTPSHandler(AbstractHTTPHandler)
       
  
Method resolution order:
HTTPSHandler
AbstractHTTPHandler
BaseHandler

Methods defined here:
https_open(self, req)

Data and non-method functions defined here:
__doc__ = None
__module__ = 'urllib2'

Methods inherited from AbstractHTTPHandler:
do_open(self, http_class, req)

Methods inherited from BaseHandler:
add_parent(self, parent)
close(self)
 
class OpenerDirector
       
   Methods defined here:
__del__(self)
__init__(self)
_call_chain(self, chain, kind, meth_name, *args)
add_handler(self, handler)
close(self)
error(self, proto, *args)
open(self, fullurl, data=None)

Data and non-method functions defined here:
__doc__ = None
__module__ = 'urllib2'
 
class OpenerFactory
      #bleck! don't use this yet
 
   Methods defined here:
add_handler(self, h)
add_proxy_handler(self, ph)
build_opener(self)
replace_handler(self, h)

Data and non-method functions defined here:
__doc__ = None
__module__ = 'urllib2'
default_handlers = [<class urllib2.UnknownHandler>, <class urllib2.HTTPHandler>, <class urllib2.HTTPDefaultErrorHandler>, <class urllib2.HTTPRedirectHandler>, <class urllib2.FTPHandler>, <class urllib2.FileHandler>]
handlers = []
proxy_handlers = [<class urllib2.ProxyHandler>]
replacement_handlers = []
 
class ProxyBasicAuthHandler(AbstractBasicAuthHandler, BaseHandler)
       
  
Method resolution order:
ProxyBasicAuthHandler
AbstractBasicAuthHandler
BaseHandler

Methods defined here:
http_error_407(self, req, fp, code, msg, headers)

Data and non-method functions defined here:
__doc__ = None
__module__ = 'urllib2'
auth_header = 'Proxy-Authorization'

Methods inherited from AbstractBasicAuthHandler:
__init__(self, password_mgr=None)
http_error_auth_reqed(self, authreq, host, req, headers)
retry_http_basic_auth(self, host, req, realm)

Data and non-method functions inherited from AbstractBasicAuthHandler:
rx = <_sre.SRE_Pattern object>

Methods inherited from BaseHandler:
add_parent(self, parent)
close(self)
 
class ProxyDigestAuthHandler(BaseHandler, AbstractDigestAuthHandler)
       
  
Method resolution order:
ProxyDigestAuthHandler
BaseHandler
AbstractDigestAuthHandler

Methods defined here:
http_error_407(self, req, fp, code, msg, headers)

Data and non-method functions defined here:
__doc__ = None
__module__ = 'urllib2'
header = 'Proxy-Authorization'

Methods inherited from BaseHandler:
add_parent(self, parent)
close(self)

Methods inherited from AbstractDigestAuthHandler:
__init__(self, passwd=None)
get_algorithm_impls(self, algorithm)
get_authorization(self, req, chal)
get_entity_digest(self, data, chal)
http_error_auth_reqed(self, authreq, host, req, headers)
retry_http_digest_auth(self, req, auth)
 
class ProxyHandler(BaseHandler)
       
   Methods defined here:
__init__(self, proxies=None)
proxy_open(self, req, proxy, type)

Data and non-method functions defined here:
__doc__ = None
__module__ = 'urllib2'

Methods inherited from BaseHandler:
add_parent(self, parent)
close(self)
 
class Request
       
   Methods defined here:
__getattr__(self, attr)
__init__(self, url, data=None, headers={})
add_data(self, data)
add_header(self, key, val)
get_data(self)
get_full_url(self)
get_host(self)
get_selector(self)
get_type(self)
has_data(self)
set_proxy(self, host, type)

Data and non-method functions defined here:
__doc__ = None
__module__ = 'urllib2'
 
class URLError(exceptions.IOError)
       
  
Method resolution order:
URLError
exceptions.IOError
exceptions.EnvironmentError
exceptions.StandardError
exceptions.Exception

Methods defined here:
__init__(self, reason)
# URLError is a sub-type of IOError, but it doesn't share any of
# the implementation.  need to override __init__ and __str__
__str__(self)

Data and non-method functions defined here:
__doc__ = None
__module__ = 'urllib2'

Methods inherited from exceptions.Exception:
__getitem__(...)
 
class UnknownHandler(BaseHandler)
       
   Methods defined here:
unknown_open(self, req)

Data and non-method functions defined here:
__doc__ = None
__module__ = 'urllib2'

Methods inherited from BaseHandler:
add_parent(self, parent)
close(self)
 
Functions
            
StringIO(...)
StringIO([s]) -- Return a StringIO-like stream for reading or writing
build_opener(*handlers)
Create an opener object from a list of handlers.
 
The opener will use several default handlers, including support
for HTTP and FTP.  If there is a ProxyHandler, it must be at the
front of the list of handlers.  (Yuck.)
 
If any of the handlers passed as arguments are subclasses of the
default handlers, the default handlers will not be used.
encode_digest(digest)
install_opener(opener)
parse_http_list(s)
Parse lists as described by RFC 2068 Section 2.
 
In particular, parse comma-separated lists where the elements of
the list may include quoted-strings.  A quoted-string could
contain a comma.
parse_keqv_list(l)
Parse list of key=value strings where keys are not duplicated.
urlopen(url, data=None)
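
The two parsing helpers listed above, parse_http_list and
parse_keqv_list, can be combined as in this sketch.  The challenge
string is made up, and the import fallback assumes Python 3
(urllib.request), where both functions still exist.

```python
try:
    import urllib2                      # Python 2
except ImportError:
    import urllib.request as urllib2    # assumption: Python 3 home of the same API

# A made-up digest challenge; note the comma inside the quoted qop value.
challenge = 'realm="members", qop="auth,auth-int", nonce="abc123"'

# Split on commas that are not inside quoted strings ...
items = urllib2.parse_http_list(challenge)
# ... then turn the key=value items into a dictionary.
fields = urllib2.parse_keqv_list(items)
```

The quoted comma in the qop value does not split the item, which is the
case the docstring singles out.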
 
Data
__file__ = '/usr/lib/python2.2/urllib2.pyc'
__name__ = 'urllib2'
__version__ = '2.0a1'
_opener = None