Reading the Tornado source: code structure and request flow

2013-04-18 刘太华

Moved from my old blog.

Original link: https://liutaihua.github.io/2013/04/18/tornado-code-source-structure.html


Structure of the source tree:

    tornado
    ├── auth.py
    ├── autoreload.py
    ├── ca-certificates.crt
    ├── curl_httpclient.py
    ├── database.py
    ├── escape.py
    ├── gen.py
    ├── httpclient.py
    ├── httpserver.py
    ├── httputil.py
    ├── __init__.py
    ├── ioloop.py
    ├── iostream.py
    ├── locale.py
    ├── netutil.py
    ├── options.py
    ├── platform
    │   ├── auto.py
    │   ├── common.py
    │   ├── __init__.py
    │   ├── interface.py
    │   ├── posix.py
    │   ├── twisted.py
    │   └── windows.py
    ├── process.py
    ├── simple_httpclient.py
    ├── stack_context.py
    ├── template.py
    ├── test
    │   ├── csv_translations
    │   │   └── fr_FR.csv
    │   ├── gettext_translations
    │   │   └── fr_FR
    │   │       └── LC_MESSAGES
    │   ├── __init__.py
    │   ├── README
    │   ├── static
    │   │   └── robots.txt
    │   └── templates
    │       └── utf8.html
    ├── util.py
    ├── web.py
    ├── web.py~
    ├── websocket.py
    └── wsgi.py

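I set aside some time to read the Tornado source. This afternoon I only got through part of the HTTP-related code: I sorted out its structure, googled some material, and then worked through the logic on my own. My understanding turned out to be basically right, and the article at http://ispe54.blogspot.com/2013/04/tornado-1.html made it clearer still.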

Starting from a simple hello world program, let's follow it into the source:

    import tornado.httpserver
    import tornado.ioloop
    import tornado.web

    # handlers, config and common are defined elsewhere and are not shown here.

    class MainHandler(tornado.web.RequestHandler):
        def get(self):
            return self.finish('hello world')

    # Initialize an instance of the Application class
    class Application(tornado.web.Application):
        def __init__(self):
            handlers.append((r'/', MainHandler))
            tornado.web.Application.__init__(self, handlers, **config.web_config.settings)
            self.session_manager = common.session.TornadoSessionManager(
                config.web_config.settings["session_secret"],
                config.web_config.settings["session_dir"])
            self.db = common.util.get_user_db()
            # ….

    def main():
        http_server = tornado.httpserver.HTTPServer(Application(), xheaders=True)
        http_server.listen(8888)
        tornado.ioloop.IOLoop.instance().start()

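1. First, the Application class. It lives in the tornado.web module alongside RequestHandler and does not contain much code. Application mainly initializes the parameters passed in as settings, adds the handlers from the example code to self.handlers, and finally implements __call__. Later the Application instance is passed to HTTPServer as its callback. A chain of methods inside HTTPServer eventually invokes callback(), and that call is what actually runs this __call__.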
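
To make the "instance as callback" idea concrete, here is a minimal sketch of a callable application object. MiniApplication, HelloHandler and FakeRequest are hypothetical names made up for this illustration, and the routing logic is a simplification rather than Tornado's actual dispatch code.

```python
import re


class FakeRequest:
    """Hypothetical stand-in for an HTTP request; it only carries a path."""

    def __init__(self, path):
        self.path = path


class HelloHandler:
    """Hypothetical handler, playing the role of a RequestHandler subclass."""

    def __init__(self, request):
        self.request = request

    def get(self):
        return "hello world"


class MiniApplication:
    """Simplified sketch of an application object that is itself callable."""

    def __init__(self, handlers):
        # Compile each (pattern, handler_class) pair once, at construction time.
        self.handlers = [(re.compile(pattern + "$"), cls)
                         for pattern, cls in handlers]

    def __call__(self, request):
        # The server only knows it was handed "a callback". Calling the
        # instance lands here, where the path is matched against the routes.
        for pattern, handler_class in self.handlers:
            if pattern.match(request.path):
                return handler_class(request).get()
        return "404: Not Found"


app = MiniApplication([(r"/", HelloHandler)])
callback = app  # what a server would store as its request callback
print(callback(FakeRequest("/")))
print(callback(FakeRequest("/missing")))
```

Because the server only ever calls callback(request), it never needs to know that the callback is a full application with its own routing table.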

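2. The tornado.httpserver.HTTPServer class only provides the basic HTTP server machinery. Together, httpserver.py and the TCPServer class in netutil.py implement the HTTP protocol: they parse the headers and body, build the request, and call back into the application. httpserver.py contains an HTTPConnection class that implements the connection part of the HTTP protocol. The low-level socket handling and I/O buffering are done in TCPServer, which ties ioloop and iostream together.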
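
The "parse headers and body, build a request, call back into the application" step can be sketched roughly as follows. ParsedRequest, parse_request and serve_once are hypothetical names for this illustration; Tornado's real HTTPConnection reads from a non-blocking stream and handles far more cases.

```python
from collections import namedtuple

# Hypothetical request record; Tornado's real HTTPRequest carries much more.
ParsedRequest = namedtuple("ParsedRequest", "method uri version headers body")


def parse_request(raw):
    """Split raw request bytes into request line, headers and body."""
    head, _, body = raw.partition(b"\r\n\r\n")
    lines = head.decode("latin-1").split("\r\n")
    method, uri, version = lines[0].split(" ")
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    # Honour Content-Length so that only the declared body bytes are kept.
    length = int(headers.get("content-length", 0))
    return ParsedRequest(method, uri, version, headers, body[:length])


def serve_once(raw, request_callback):
    """Parse one request and hand it to the application callback."""
    return request_callback(parse_request(raw))


raw = (b"POST /echo HTTP/1.1\r\n"
       b"Host: localhost\r\n"
       b"Content-Length: 5\r\n"
       b"\r\n"
       b"hello")
result = serve_once(raw, lambda req: (req.method, req.uri, req.body))
print(result)
```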

The source describes the HTTPServer class as just a simple HTTP protocol implementation:

A server is defined by a request callback that takes an HTTPRequest instance as an argument and writes a valid HTTP response with HTTPRequest.write. HTTPRequest.finish finishes the request (but does not necessarily close the connection in the case of HTTP/1.1 keep-alive requests). A simple example server that echoes back the URI you requested::

    from tornado import httpserver  # import needed by this excerpt

    def handle_request(request):
        message = "You requested %s\n" % request.uri
        request.write("HTTP/1.1 200 OK\r\nContent-Length: %d\r\n\r\n%s" % (
                      len(message), message))
        request.finish()

    http_server = httpserver.HTTPServer(handle_request)
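
Since the callback only relies on request.uri, request.write and request.finish, it can be exercised without starting a server. The sketch below does that with a hypothetical FakeRequest test double; it illustrates the callback contract and is not how Tornado itself constructs requests.

```python
class FakeRequest:
    """Hypothetical test double exposing only uri, write() and finish()."""

    def __init__(self, uri):
        self.uri = uri
        self.chunks = []
        self.finished = False

    def write(self, chunk):
        # Refuse writes after finish() to mimic a completed response.
        assert not self.finished, "write() after finish()"
        self.chunks.append(chunk)

    def finish(self):
        self.finished = True


def handle_request(request):
    # Same callback as in the docstring example above.
    message = "You requested %s\n" % request.uri
    request.write("HTTP/1.1 200 OK\r\nContent-Length: %d\r\n\r\n%s" % (
        len(message), message))
    request.finish()


request = FakeRequest("/index.html")
handle_request(request)
response = "".join(request.chunks)
print(response)
```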

HTTPServer is a subclass of tornado.netutil.TCPServer. Its constructor __init__ adds a few attributes, and it overrides TCPServer's handle_stream:

    def handle_stream(self, stream, address):
        HTTPConnection(stream, address, self.request_callback,
                       self.no_keep_alive, self.xheaders)

The handle_stream method is called from the _handle_connection method inside TCPServer:

    def handle_stream(self, stream, address):
        """Override to handle a new IOStream from an incoming connection."""
        raise NotImplementedError()

    def _handle_connection(self, connection, address):
        if self.ssl_options is not None:
            assert ssl, "Python 2.6+ and OpenSSL required for SSL"
            try:
                connection = ssl.wrap_socket(connection,
                                             server_side=True,
                                             do_handshake_on_connect=False,
                                             **self.ssl_options)
            except ssl.SSLError, err:
                if err.args[0] == ssl.SSL_ERROR_EOF:
                    return connection.close()
                else:
                    raise
            except socket.error, err:
                if err.args[0] == errno.ECONNABORTED:
                    return connection.close()
                else:
                    raise
        try:
            if self.ssl_options is not None:
                stream = SSLIOStream(connection, io_loop=self.io_loop)
            else:
                stream = IOStream(connection, io_loop=self.io_loop)
            self.handle_stream(stream, address)
        except Exception:
            logging.error("Error in connection callback", exc_info=True)
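
Stripped of the SSL details, the excerpt above is a base class that owns the connection plumbing and a hook that subclasses fill in. Here is a minimal sketch of that pattern, written in Python 3; MiniTCPServer, MiniHTTPServer and FakeStream are hypothetical names, and the connection is just a string standing in for a socket.

```python
import logging


class FakeStream:
    """Hypothetical wrapper standing in for IOStream."""

    def __init__(self, connection):
        self.connection = connection


class MiniTCPServer:
    """Sketch of the base class: it owns the connection plumbing."""

    def handle_stream(self, stream, address):
        """Subclasses override this hook."""
        raise NotImplementedError()

    def _handle_connection(self, connection, address):
        try:
            stream = FakeStream(connection)
            self.handle_stream(stream, address)
        except Exception:
            # One failing connection is logged instead of crashing the server.
            logging.error("Error in connection callback", exc_info=True)


class MiniHTTPServer(MiniTCPServer):
    """Sketch of the subclass: it only decides what to do with a stream."""

    def __init__(self, request_callback):
        self.request_callback = request_callback

    def handle_stream(self, stream, address):
        self.request_callback(stream, address)


seen = []
server = MiniHTTPServer(
    lambda stream, address: seen.append((stream.connection, address)))
server._handle_connection("conn-1", ("127.0.0.1", 50000))
print(seen)

# The base class never implements the hook, so the error is caught and logged.
MiniTCPServer()._handle_connection("conn-2", ("127.0.0.1", 50001))
```

The try/except around the hook is the notable design choice: an exception raised while handling one connection is logged and does not propagate out of the server's accept path.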

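_handle_connection is in turn reached through add_socket. Going back to the top level, everything starts with the http_server.listen(port) line in the hello world program: it starts listening, and the listen method calls add_socket internally. handle_stream, for its part, creates an HTTPConnection to handle the connection part of the HTTP protocol. HTTPConnection invokes the callback, which is the __call__ of the Application class, and finally the request data is passed on to the class that holds the actual request logic, web.RequestHandler.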
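
The whole chain can be traced with a small simulation. This is only a sketch under stated assumptions: FakeLoop, MiniHTTPServer, MiniHTTPConnection and MiniApplication are hypothetical classes, sockets are replaced by plain strings, and the "loop" fires a handler by hand instead of polling real file descriptors.

```python
class FakeLoop:
    """Hypothetical event loop: maps a socket name to its accept handler."""

    def __init__(self):
        self.handlers = {}

    def add_handler(self, sock, handler):
        self.handlers[sock] = handler

    def fire(self, sock, connection, address):
        # Simulates the loop noticing that `sock` has a pending connection.
        self.handlers[sock](connection, address)


class MiniHTTPConnection:
    def __init__(self, stream, address, request_callback, trace):
        trace.append("HTTPConnection")
        # A real connection would first parse the request from the stream.
        request_callback(stream)


class MiniHTTPServer:
    def __init__(self, request_callback, io_loop, trace):
        self.request_callback = request_callback
        self.io_loop = io_loop
        self.trace = trace

    def listen(self, port):
        self.trace.append("listen")
        self.add_socket("socket:%d" % port)

    def add_socket(self, sock):
        self.trace.append("add_socket")
        self.io_loop.add_handler(sock, self._handle_connection)

    def _handle_connection(self, connection, address):
        self.trace.append("_handle_connection")
        self.handle_stream(connection, address)

    def handle_stream(self, stream, address):
        self.trace.append("handle_stream")
        MiniHTTPConnection(stream, address, self.request_callback, self.trace)


class MiniApplication:
    def __init__(self, trace):
        self.trace = trace

    def __call__(self, request):
        self.trace.append("Application.__call__")
        # Stands in for dispatching to a RequestHandler method.
        self.trace.append("RequestHandler.get")


trace = []
loop = FakeLoop()
server = MiniHTTPServer(MiniApplication(trace), loop, trace)
server.listen(8888)
loop.fire("socket:8888", "GET / HTTP/1.1", ("127.0.0.1", 50000))
print(" -> ".join(trace))
```

Note that listen only registers the handler. Nothing after add_socket runs until the loop reports an incoming connection, which is why IOLoop.instance().start() has to come last in the hello world program.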

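This is quite laborious to explain in words, so I made a mind map (you may need a proxy to get past the firewall to view it):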