Docs.

2025-01-30 23:09:44 +00:00 · 2013-04-05 11:55:28 +13:00 · 2013-04-05 11:55:28 +13:00 · ca9c60d2eb
commit ca9c60d2eb
parent e3fd0e838d
4 changed files with 58 additions and 101 deletions
--- a/doc-src/howmitmproxy.html
+++ b/doc-src/howmitmproxy.html
@ -17,13 +17,13 @@ and most reliable way to intercept traffic. The proxy protocol is codified in
 the [HTTP RFC](http://www.ietf.org/rfc/rfc2068.txt), so the behaviour of both
 the client and the server is well defined, and usually reliable. In the
 simplest possible interaction with mitmproxy, a client connects directly to the
-proxy, and makes a request that looks like this: 
+proxy, and makes a request that looks like this:
+
+<pre>GET http://example.com/index.html HTTP/1.1</pre>

-<pre>GET http://example.com/index.html HTTP/1.1</pre> 
-                
 This is a proxy GET request - an extended form of the vanilla HTTP GET request
 that includes a schema and host specification, and it includes all the
-information mitmproxy needs to proceed. 
+information mitmproxy needs to proceed.

 <img src="explicit.png"/>

@ -39,7 +39,7 @@ information mitmproxy needs to proceed.

        <tr>

-            <td><b>2</b></td> 
+            <td><b>2</b></td>

            <td>Mitmproxy connects to the upstream server and simply forwards
            the request on.</td>
@ -67,49 +67,17 @@ flow of requests and responses are completely opaque to the proxy.

 ## The MITM in mitmproxy

-This is where mitmproxy's fundamental trick comes into play. The MITM in its
-name stands for Man-In-The-Middle - a reference to the process we use to
-intercept and interfere with these theoretially opaque data streams. The basic
-idea is to pretend to be the server to the client, and pretend to be the client
-to the server, while we sit in the middle decoding traffic from both sides. The
-tricky part is that the [Certificate
-Authority](http://en.wikipedia.org/wiki/Certificate_authority) system is
-designed to prevent exactly this attack, by allowing a trusted third-party to
-cryptographically sign a server's SSL certificates to verify that they are
-legit. If this signature is from a non-trusted party, a secure client will
-simply drop the connection and refuse to proceed. Despite the many shortcomings
-of the CA system as it exists today, this is usually fatal to attempts to MITM
-an SSL connection for analysis.
+This is where mitmproxy's fundamental trick comes into play. The MITM in its name stands for Man-In-The-Middle - a reference to the process we use to intercept and interfere with these theoretically opaque data streams. The basic idea is to pretend to be the server to the client, and pretend to be the client to the server, while we sit in the middle decoding traffic from both sides. The tricky part is that the [Certificate Authority](http://en.wikipedia.org/wiki/Certificate_authority) system is designed to prevent exactly this attack, by allowing a trusted third-party to cryptographically sign a server's SSL certificates to verify that they are legit. If this signature doesn't match or is from a non-trusted party, a secure client will simply drop the connection and refuse to proceed. Despite the many shortcomings of the CA system as it exists today, this is usually fatal to attempts to MITM an SSL connection for analysis. Our answer to this conundrum is to become a trusted Certificate Authority ourselves. Mitmproxy includes a full CA implementation that generates interception certificates on the fly. To get the client to trust these certificates, we [register mitmproxy as a trusted CA with the device manually](@!urlTo("ssl.html")!@).

-Our answer to this conundrum is to become a trusted Certificate Authority
-ourselves. Mitmproxy includes a full CA implementation that generates
-interception certificates on the fly. To get the client to trust these
-certificates, we [register mitmproxy as a trusted CA with the device
-manually](@!urlTo("ssl.html")!@).
+## Complication 1: What's the remote hostname?

-## Complication 1: What's the remote hostname? 
-
-To proceed with this plan, we need to know the domain name to use in the
-interception certificate - the client will verify that the certificate is for
-the domain it's connecting to, and abort if this is not the case. At first
-blush, it seems that the CONNECT request above gives us all we need - in this
-example, both of these values are "example.com".  But what if the client had
-initiated the connection as follows: 
+To proceed with this plan, we need to know the domain name to use in the interception certificate - the client will verify that the certificate is for the domain it's connecting to, and abort if this is not the case. At first blush, it seems that the CONNECT request above gives us all we need - in this example, both of these values are "example.com".  But what if the client had initiated the connection as follows:

 <pre>CONNECT 10.1.1.1:443 HTTP/1.1</pre>

-Using the IP address is perfectly legitimate because it gives us enough
-information to initiate the pipe, even though it doesn't reveal the remote
-hostname. 
+Using the IP address is perfectly legitimate because it gives us enough information to initiate the pipe, even though it doesn't reveal the remote hostname.

-Mitmproxy has a cunning mechanism that smooths this over - [upstream
-certificate sniffing](@!urlTo("features/upstreamcerts.html")!@). As soon as we
-see the CONNECT request, we pause the client part of the conversation, and
-initiate a simultaneous connection to the server. We complete the SSL handshake
-with the server, and inspect the certificates it used. Now, we use the Common
-Name in the upstream SSL certificates to generate the dummy certificate for the
-client. Voila, we have the correct hostname to present to the client, even if
-it was never specified.
+Mitmproxy has a cunning mechanism that smooths this over - [upstream certificate sniffing](@!urlTo("features/upstreamcerts.html")!@). As soon as we see the CONNECT request, we pause the client part of the conversation, and initiate a simultaneous connection to the server. We complete the SSL handshake with the server, and inspect the certificates it used. Now, we use the Common Name in the upstream SSL certificates to generate the dummy certificate for the client. Voila, we have the correct hostname to present to the client, even if it was never specified.


 ## Complication 2: Subject Alternative Name
@ -127,56 +95,33 @@ them to the generated dummy certificate.

 ## Complication 3: Server Name Indication

-One of the big limitations of conventional SSL is that each certificate
-requires its own IP address. This means that you couldn't do virtual hosting
-where multiple domains with independent certificates share the same IP address.
-In a world with a rapidly shrinking IPv4 address pool this is a problem, and we
-have a solution in the form of the [Server Name
-Indication](http://en.wikipedia.org/wiki/Server_Name_Indication) extension to
-the SSL and TLS protocols. This lets the client specify the remote server name
-at the start of the SSL handshake, which then lets the server select the right
-certificate to complete the process.
+One of the big limitations of vanilla SSL is that each certificate requires its own IP address. This means that you couldn't do virtual hosting where multiple domains with independent certificates share the same IP address. In a world with a rapidly shrinking IPv4 address pool this is a problem, and we have a solution in the form of the [Server Name Indication](http://en.wikipedia.org/wiki/Server_Name_Indication) extension to the SSL and TLS protocols. This lets the client specify the remote server name at the start of the SSL handshake, which then lets the server select the right certificate to complete the process.

-SNI breaks our upstream certificate sniffing process, because when we connect
-without using SNI, we get served a default certificate that may have nothing to
-do with the certificate expected by the client. The solution is another tricky
-complication to the client connection process. After the client connects, we
-allow the SSL handshake to continue until just _after_ the SNI value has been
-passed to us. Now we can pause the conversation, and initiate an upstream
-connection using the correct SNI value, which then serves us the correct
-upstream certificate, from which we can extract the expected CN and SANs.
-
-There's another wrinkle here. Due to a limitation of the SSL library mitmproxy
-uses, we can't detect that a connection _hasn't_ sent an SNI request until it's
-too late for upstream certificate sniffing. In practice, we therefore make a
-vanilla SSL connection upstream to sniff non-SNI certificates, and then discard
-the connection if the client sends an SNI notification. If you're watching your
-traffic with a packet sniffer, you'll see two connections to the server when an
-SNI request is made, the first of which is immediately closed after the SSL
-handshake. Luckily, this is almost never an issue in practice.
+SNI breaks our upstream certificate sniffing process, because when we connect without using SNI, we get served a default certificate that may have nothing to do with the certificate expected by the client. The solution is another tricky complication to the client connection process. After the client connects, we allow the SSL handshake to continue until just _after_ the SNI value has been passed to us. Now we can pause the conversation, and initiate an upstream connection using the correct SNI value, which then serves us the correct upstream certificate, from which we can extract the expected CN and SANs.

+There's another wrinkle here. Due to a limitation of the SSL library mitmproxy uses, we can't detect that a connection _hasn't_ sent an SNI request until it's too late for upstream certificate sniffing. In practice, we therefore make a vanilla SSL connection upstream to sniff non-SNI certificates, and then discard the connection if the client sends an SNI notification. If you're watching your traffic with a packet sniffer, you'll see two connections to the server when an SNI request is made, the first of which is immediately closed after the SSL handshake. Luckily, this is almost never an issue in practice.

 ## Putting it all together

-Lets put all of this together into the complete explicitly proxied HTTPS flow. 
+Let's put all of this together into the complete explicitly proxied HTTPS flow.

 <img src="explicit_https.png"/>

 <table class="table">
    <tbody>
        <tr>
-            <td><b>1</b></td> 
+            <td><b>1</b></td>
            <td>The client makes a connection to mitmproxy, and issues an HTTP
            CONNECT request.</td>
        </tr>
        <tr>
-            <td><b>2</b></td> 
+            <td><b>2</b></td>

            <td>Mitmproxy responds with a 200 Connection Established, as if it
            has set up the CONNECT pipe.</td>
        </tr>
        <tr>
-            <td><b>3</b></td> 
+            <td><b>3</b></td>

            <td>The client believes it's talking to the remote server, and
            initiates the SSL connection. It uses SNI to indicate the hostname
@ -184,33 +129,33 @@ Lets put all of this together into the complete explicitly proxied HTTPS flow.
        </tr>

        <tr>
-            <td><b>4</b></td> 
+            <td><b>4</b></td>

            <td>Mitmproxy connects to the server, and establishes an SSL
            connection using the SNI hostname indicated by the client.</td>
-            
+
        </tr>
        <tr>
-            <td><b>5</b></td> 
+            <td><b>5</b></td>

            <td>The server responds with the matching SSL certificate, which
            contains the CN and SAN values needed to generate the interception
            certificate.</td>
        </tr>
        <tr>
-            <td><b>6</b></td> 
+            <td><b>6</b></td>

            <td>Mitmproxy generates the interception cert, and continues the
            client SSL handshake paused in step 3.</td>
        </tr>
        <tr>
-            <td><b>7</b></td> 
+            <td><b>7</b></td>

            <td>The client sends the request over the established SSL
            connection.</td>
        </tr>
        <tr>
-            <td><b>7</b></td> 
+            <td><b>8</b></td>

            <td>Mitmproxy passes the request on to the server over the SSL
            connection initiated in step 4.</td>
@ -234,11 +179,11 @@ redirection mechanism that transparently reroutes a TCP connection destined for
 a server on the Internet to a listening proxy server. This usually takes the
 form of a firewall on the same host as the proxy server -
 [iptables](http://www.netfilter.org/) on Linux or
-[pf](http://en.wikipedia.org/wiki/PF_(firewall)) on OSX. Once the client has
+[pf](http://en.wikipedia.org/wiki/PF_\(firewall\)) on OSX. Once the client has
 initiated the connection, it makes a vanilla HTTP request, which might look
 something like this:

-<pre>GET /index.html HTTP/1.1</pre> 
+<pre>GET /index.html HTTP/1.1</pre>

 Note that this request differs from the explicit proxy variation, in that it
 omits the scheme and hostname. How, then, do we know which upstream host to
@ -258,11 +203,11 @@ this information, the process is fairly straight-forward.
 <table class="table">
    <tbody>
        <tr>
-            <td><b>1</b></td> 
+            <td><b>1</b></td>
            <td>The client makes a connection to the server.</td>
        </tr>
        <tr>
-            <td><b>2</b></td> 
+            <td><b>2</b></td>

            <td>The router redirects the connection to mitmproxy, which is
            typically listening on a local port of the same host. Mitmproxy
@ -270,16 +215,16 @@ this information, the process is fairly straight-forward.
            destination was.</td>
        </tr>
        <tr>
-            <td><b>3</b></td> 
+            <td><b>3</b></td>

            <td>Now, we simply read the client's request...</td>
        </tr>

        <tr>
-            <td><b>4</b></td> 
+            <td><b>4</b></td>

            <td>... and forward it upstream.</td>
-            
+
        </tr>
    </tbody>
 </table>
@ -300,11 +245,11 @@ and cope with SNI.
 <table class="table">
    <tbody>
        <tr>
-            <td><b>1</b></td> 
+            <td><b>1</b></td>
            <td>The client makes a connection to the server.</td>
        </tr>
        <tr>
-            <td><b>2</b></td> 
+            <td><b>2</b></td>

            <td>The router redirects the connection to mitmproxy, which is
            typically listening on a local port of the same host. Mitmproxy
@ -312,7 +257,7 @@ and cope with SNI.
            destination was.</td>
        </tr>
        <tr>
-            <td><b>3</b></td> 
+            <td><b>3</b></td>

            <td>The client believes it's talking to the remote server, and
            initiates the SSL connection. It uses SNI to indicate the hostname
@ -320,33 +265,33 @@ and cope with SNI.
        </tr>

        <tr>
-            <td><b>4</b></td> 
+            <td><b>4</b></td>

            <td>Mitmproxy connects to the server, and establishes an SSL
            connection using the SNI hostname indicated by the client.</td>
-            
+
        </tr>
        <tr>
-            <td><b>5</b></td> 
+            <td><b>5</b></td>

            <td>The server responds with the matching SSL certificate, which
            contains the CN and SAN values needed to generate the interception
            certificate.</td>
        </tr>
        <tr>
-            <td><b>6</b></td> 
+            <td><b>6</b></td>

            <td>Mitmproxy generates the interception cert, and continues the
            client SSL handshake paused in step 3.</td>
        </tr>
        <tr>
-            <td><b>7</b></td> 
+            <td><b>7</b></td>

            <td>The client sends the request over the established SSL
            connection.</td>
        </tr>
        <tr>
-            <td><b>7</b></td> 
+            <td><b>8</b></td>

            <td>Mitmproxy passes the request on to the server over the SSL
            connection initiated in step 4.</td>
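
Note on the howmitmproxy.html changes above: steps 4 and 5 of the HTTPS flow (connect upstream with the client's SNI value, then read the CN and SANs from the server certificate) can be sketched with Python's standard `ssl` module. This is a minimal illustration under stated assumptions, not mitmproxy's own implementation: the helper names `names_from_peercert` and `sniff_upstream` are hypothetical, and the sketch validates the upstream certificate against the system trust store, which the documentation does not specify.

```python
import socket
import ssl


def names_from_peercert(cert):
    """Extract (common_name, dns_sans) from a certificate dict shaped like
    the return value of ssl.SSLSocket.getpeercert()."""
    common_name = None
    # "subject" is a tuple of RDNs, each a tuple of (key, value) pairs.
    for rdn in cert.get("subject", ()):
        for key, value in rdn:
            if key == "commonName":
                common_name = value
    # "subjectAltName" is a tuple of (kind, value) pairs; keep DNS names only.
    sans = [value for kind, value in cert.get("subjectAltName", ())
            if kind == "DNS"]
    return common_name, sans


def sniff_upstream(host, port=443, sni=None, timeout=10.0):
    """Hypothetical helper: complete a TLS handshake with the upstream
    server, sending `sni` as the server name when the client supplied one,
    and return the names an interception certificate would need."""
    context = ssl.create_default_context()
    with socket.create_connection((host, port), timeout=timeout) as sock:
        # server_hostname drives both the SNI extension and hostname checks.
        with context.wrap_socket(sock, server_hostname=sni or host) as tls:
            return names_from_peercert(tls.getpeercert())
```

For example, `sniff_upstream("10.1.1.1", sni="example.com")` would connect to the IP address from the CONNECT request while asking the server for the certificate belonging to `example.com`.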
--- a/doc-src/transparent.html
+++ b/doc-src/transparent.html
@ -1,3 +1,15 @@


+When a transparent proxy is used, traffic is redirected into a proxy at the network layer, without
+any client configuration being required. This makes transparent proxying ideal for those situations
+where you can't change client behaviour - proxy-oblivious Android applications being a common
+example.

+To set up transparent proxying, we need two new components. The first is a
+redirection mechanism that transparently reroutes a TCP connection destined for
+a server on the Internet to a listening proxy server. This usually takes the
+form of a firewall on the same host as the proxy server -
+[iptables](http://www.netfilter.org/) on Linux or
+[pf](http://en.wikipedia.org/wiki/PF_\(firewall\)) on OSX. When the proxy receives a redirected connection, it sees a vanilla HTTP request, without a host specification. This is where the second new component comes in - a host module that allows us to query the redirector for the original destination of the TCP connection.
+
+At the moment, mitmproxy supports transparent proxying on OSX Lion and above, and all current flavors of Linux.
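
Note on the transparent.html text above: the "host module" that queries the redirector for the original destination can be illustrated for the Linux/iptables case, where the kernel exposes the pre-redirect address through the `SO_ORIGINAL_DST` socket option. This is a rough sketch assuming IPv4 and Linux only; the function names are hypothetical and it is not taken from mitmproxy's source. The OSX/pf case works differently and is not covered here.

```python
import socket
import struct

SOL_IP = 0            # Linux value of SOL_IP (assumed platform: Linux)
SO_ORIGINAL_DST = 80  # defined in <linux/netfilter_ipv4.h>


def parse_sockaddr_in(raw):
    """Decode a 16-byte IPv4 sockaddr_in into (ip, port).

    Layout: 2 bytes family, 2 bytes port (network order),
    4 bytes address, 8 bytes padding.
    """
    port, packed_ip = struct.unpack("!2xH4s8x", raw)
    return socket.inet_ntoa(packed_ip), port


def original_destination(client_sock):
    """Hypothetical helper: ask the kernel where a redirected TCP
    connection was originally headed before iptables rerouted it."""
    raw = client_sock.getsockopt(SOL_IP, SO_ORIGINAL_DST, 16)
    return parse_sockaddr_in(raw)
```

A transparent proxy would call `original_destination()` on each accepted client socket and use the result as the upstream address, since the redirected HTTP request itself carries no host specification in its request line.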
--- a/doc-src/transparent/osx.html
+++ b/doc-src/transparent/osx.html
@ -20,7 +20,7 @@ OSX.
 <pre class="terminal">rdr on en2 inet proto tcp to any port 80 -&gt; 127.0.0.1 port 8080
 rdr on en2 inet proto tcp to any port 443 -&gt; 127.0.0.1 port 8080
 </pre>
-        
+
        These rules tell pf to redirect all traffic destined for port 80 or 443
        to the local mitmproxy instance running on port 8080. You should
        replace <b>en2</b> with the interface on which your test device will
@ -28,7 +28,7 @@ rdr on en2 inet proto tcp to any port 443 -&gt; 127.0.0.1 port 8080

    </li>

-    <li> Configure pf with the rules: 
+    <li> Configure pf with the rules:

        <pre class="terminal">sudo pfctl -f pf.conf</pre>

@ -40,9 +40,6 @@ rdr on en2 inet proto tcp to any port 443 -&gt; 127.0.0.1 port 8080

    </li>

-    <li> Configure your test device to use the host on which mitmproxy is
-    running as the default gateway.</li>
-
    <li> Configure sudoers to allow mitmproxy to access pfctl. Edit the file
    <b>/etc/sudoers</b> on your system as root. Add the following line to the end
    of the file:
@ -55,7 +52,7 @@ rdr on en2 inet proto tcp to any port 443 -&gt; 127.0.0.1 port 8080
    you're special feel free to tighten the restriction up to the user running
    mitmproxy.</li>

-    <li> Finally, fire up mitmproxy. You probably want a command like this:
+    <li> Fire up mitmproxy. You probably want a command like this:

        <pre class="terminal">mitmproxy -T --host</pre>

@ -65,4 +62,8 @@ rdr on en2 inet proto tcp to any port 443 -&gt; 127.0.0.1 port 8080

    </li>

+    <li> Finally, configure your test device to use the host on which mitmproxy is
+    running as the default gateway.</li>
+
+
 </ol>
--- a/libmproxy/app.py
+++ b/libmproxy/app.py
@ -5,4 +5,3 @@ mapp = flask.Flask(__name__)
@mapp.route("/")
 def hello():
    return "mitmproxy"
-