mitmproxy/README.mkd
2012-04-30 10:09:16 +12:00

308 lines
8.2 KiB
Markdown

__pathod__ is a pathological HTTP/S daemon, useful for testing and torturing
HTTP clients. At __pathod__'s heart is a tiny, terse language for crafting HTTP
responses. The simplest way to use __pathod__ is to fire up the daemon, and
specify the response behaviour you want using this language in the request URL.
Here's a minimal example:
http://localhost:9999/p/200
Everything after the "/p/" path component is a response specifier - in this
case just a vanilla 200 OK response. See the docs below to get (much) fancier.
You can also add anchors to the __pathod__ server that serve a fixed response
whenever a matching URL is requested:
pathod --anchor "/foo=200"
Here, "/foo" a regex specifying the anchor path, and the part after the "=" is
a response specifier.
__pathod__ also has a nifty built-in web interface, which lets you play with
the language by previewing responses, exposes activity logs, online help and
various other goodies. Try it by visiting the server root:
http://localhost:9999
Specifying Responses
====================
The general form of a response is as follows:
code[MESSAGE]:[colon-separated list of features]
Here's the simplest possible response specification, returning just an HTTP 200
OK message with no headers and no content:
200
We can embellish this a bit by specifying an optional custom HTTP response
message (if we don't, __pathod__ automatically creates an appropriate one). By
default for a 200 response code the message is "OK", but we can change it like
this:
200"YAY"
The quoted string here is an example of a Value Specifier, a syntax that is
used throughout the __pathod__ response specification language. In this case, the
quotes mean we're specifying a literal string, but there are many other fun
things we can do. For example, we can tell __pathod__ to generate 100k of random
ASCII letters instead:
200@100k,ascii_letters
Full documentation on the value specification syntax can be found in the next
section.
Following the response code specifier is a colon-separated list of features.
For instance, this specifies a response with a body consisting of 1 megabyte of
random data:
200:b@1m
And this is the same response with an ETag header added:
200:b@1m:h"Etag"="foo"
Both the header name and the header value are full value specifiers. Here's the
same response again, but with a 1k randomly generated header name:
200:b@1m:h@1k,ascii_letters="foo"
A few specific headers have shortcuts, because they're used so often. The
shortcut for the content-type header is "c":
200:b@1m:c"text/json"
That's it for the basic response definition. Now we can start mucking with the
responses to break clients. One common hard-to-test circumstance is hangs or
slow responses. __pathod__ has a pause operator that you can use to define
precisely when and how long the server should hang. Here, for instance, we hang
for 120 seconds after sending 50 bytes (counted from the first byte of the HTTP
response):
200:b@1m:p120,50
If that's not long enough, we can tell __pathod__ to hang forever:
200:b@1m:p120,f
Or to send all data, and then hang without disconnecting:
200:b@1m:p120,a
We can also ask __pathod__ to hang randomly:
200:b@1m:pr,a
There is a similar mechanism for dropping connections mid-response. So, we can
tell __pathod__ to disconnect after sending 50 bytes:
200:b@1m:d50
Or randomly:
200:b@1m:dr
All of these features can be combined. Here's a response that pauses twice,
once at 10 bytes and once at 20, then disconnects at 5000:
200:b@1m:p10,10:p20,10:d5000
Features
========
#### hKEY=VALUE
Set a header. Both KEY and VALUE are full _Value Specifiers_.
#### bVALUE
Set the body. VALUE is a _Value Specifier_. When the body is set, __pathod__ will
automatically set the appropriate Content-Length header.
#### cVALUE
A shortcut for setting the Content-Type header. Equivalent to:
h"Content-Type"=VALUE
#### lVALUE
A shortcut for setting the Location header. Equivalent to:
h"Content-Type"=VALUE
#### dOFFSET
Disconnect after OFFSET bytes. The offset can also be "r", in which case __pathod__
will disconnect at a random point in the response.
#### pSECONDS,OFFSET
Pause for SECONDS seconds after OFFSET bytes. SECONDS can also be "f" to pause
forever. OFFSET can also be "r" to generate a random offset, or "a" for an
offset just after all data has been sent.
Value Specifiers
================
There are three different flavours of value specification.
### Literal
Literal values are specified as a quoted strings:
"foo"
Either single or double quotes are accepted, and quotes can be escaped with
backslashes within the string:
'fo\'o'
### Files
You can load a value from a specified file path. To do so, you have to specify
a _staticdir_ option to __pathod__ on the command-line, like so:
pathod -d ~/myassets
All paths are relative paths under this directory. File loads are indicated by
starting the value specifier with the left angle bracket:
<my/path
The path value can also be a quoted string, with the same syntax as literals:
<"my/path"
### Generated values
An @-symbol lead-in specifies that generated data should be used. There are two
components to a generator specification - a size, and a data type. By default
__pathod__ assumes a data type of "bytes".
Here's a value specifier for generating 100 bytes:
@100
You can use standard suffixes to indicate larger values. Here, for instance, is
a specifier for generating 100 megabytes:
@100m
Data is generated and served efficiently - if you really want to send a
terabyte of data to a client, __pathod__ can do it. The supported suffixes are:
b = 1024**0 (bytes)
k = 1024**1 (kilobytes)
m = 1024**2 (megabytes)
g = 1024**3 (gigabytes)
t = 1024**4 (terabytes)
Data types are separated from the size specification by a comma. This
specification generates 100mb of ASCII:
@100m,ascii
Supported data types are:
ascii_letters
ascii_lowercase
ascii_uppercase
digits
hexdigits
letters
lowercase
octdigits
printable
punctuation
uppercase
whitespace
ascii
bytes
# API
__pathod__ exposes a simple API, intended to make it possible to drive and
inspect the daemon remotely for use in unit testing and the like. The next
release will include a client-side library that makes this transparent.
### /api/log
Returns the current log buffer. At the moment the buffer size is 500 entries -
when the log grows larger than this, older entries are discarded. The returned
data is a JSON dictionary, with the form:
{
'logs': [ ENTRIES ]
}
Where each entry looks like this:
{
# Record of actions taken at specified byte offsets
'actions': [(200, 'disconnect'), (10, 'pause', 1)],
# HTTP return code
'code': 200,
# Request duration in seconds
'duration': 0.00020599365234375,
# ID unique to this invocation of pathod
'id': 2,
# The request that triggered the response
'request': {
'full_url': 'http://testing:9999/p/200:b@1000:p1,10:d200',
'headers': {
'Accept': '*/*',
'Host': 'localhost:9999',
'User-Agent': 'curl/7.21.4'
},
'host': 'localhost:9999',
'method': 'POST',
'path': '/p/200:b@1000:p1,10:d200',
'protocol': 'http',
'query': '',
'remote_address': ('10.0.0.234', 63448),
'uri': '/p/200:b@1000:p1,10:d200',
'version': 'HTTP/1.1'
},
# The response spec that was served. You can re-parse this to get full
# details on the response.
'spec': '200:b@1000:p1,10:d200',
# Time at which response startd.
'started': 1335735586.469218
}
You can preview the JSON data returned for a log entry through the built-in web
interface.
### /api/log/clear
A POST to this URL clears the log buffer.
# Installing
__pathod__ requires Tornado 2.2.1 or later. If you already have __pip__ on your
system, installing __pathod__ and its dependencies is dead simple:
pip install pathod
The project uses the __pry__ unit testing framework, which you can get here:
http://github.com/cortesi/pry