mitmproxy/README.mkd

312 lines
8.3 KiB
Markdown
Raw Normal View History

2012-04-29 05:37:47 +00:00
__pathod__ is a collection of pathological tools for testing and torturing HTTP
clients and servers. The project has three components:
- __pathod__, an pathological HTTP daemon.
- __pathoc__, a fiendishly perverse HTTP client.
- __libpathod.test__, an API for easily using __pathod__ and __pathoc__ in unit tests.
At __pathod__'s heart is a tiny, terse language for crafting HTTP responses.
The simplest way to use __pathod__ is to fire up the daemon, and specify the
response behaviour you want using this language in the request URL. Here's a
minimal example:
2012-04-29 05:37:47 +00:00
http://localhost:9999/p/200
2012-04-29 09:51:03 +00:00
Everything after the "/p/" path component is a response specifier - in this
case just a vanilla 200 OK response. See the docs below to get (much) fancier.
You can also add anchors to the __pathod__ server that serve a fixed response
whenever a matching URL is requested:
2012-04-29 05:37:47 +00:00
2012-04-29 06:28:46 +00:00
pathod --anchor "/foo=200"
2012-04-29 05:37:47 +00:00
2012-04-29 09:51:03 +00:00
Here, "/foo" a regex specifying the anchor path, and the part after the "=" is
a response specifier.
2012-04-29 05:37:47 +00:00
2012-04-29 09:51:03 +00:00
__pathod__ also has a nifty built-in web interface, which lets you play with
the language by previewing responses, exposes activity logs, online help and
various other goodies. Try it by visiting the server root:
2012-04-29 05:37:47 +00:00
http://localhost:9999
Specifying Responses
====================
The general form of a response is as follows:
code[MESSAGE]:[colon-separated list of features]
Here's the simplest possible response specification, returning just an HTTP 200
OK message with no headers and no content:
200
We can embellish this a bit by specifying an optional custom HTTP response
2012-04-29 09:43:28 +00:00
message (if we don't, __pathod__ automatically creates an appropriate one). By
2012-04-29 06:28:46 +00:00
default for a 200 response code the message is "OK", but we can change it like
this:
2012-04-29 05:37:47 +00:00
200"YAY"
2012-04-29 06:28:46 +00:00
The quoted string here is an example of a Value Specifier, a syntax that is
2012-04-29 09:43:28 +00:00
used throughout the __pathod__ response specification language. In this case, the
2012-04-29 06:28:46 +00:00
quotes mean we're specifying a literal string, but there are many other fun
2012-04-29 09:43:28 +00:00
things we can do. For example, we can tell __pathod__ to generate 100k of random
2012-04-29 06:28:46 +00:00
ASCII letters instead:
2012-04-29 05:37:47 +00:00
200@100k,ascii_letters
2012-04-29 06:28:46 +00:00
Full documentation on the value specification syntax can be found in the next
section.
2012-04-29 05:37:47 +00:00
2012-04-29 09:51:03 +00:00
Following the response code specifier is a colon-separated list of features.
2012-04-29 05:37:47 +00:00
For instance, this specifies a response with a body consisting of 1 megabyte of
random data:
200:b@1m
And this is the same response with an ETag header added:
200:b@1m:h"Etag"="foo"
Both the header name and the header value are full value specifiers. Here's the
same response again, but with a 1k randomly generated header name:
200:b@1m:h@1k,ascii_letters="foo"
A few specific headers have shortcuts, because they're used so often. The
2012-04-29 09:51:03 +00:00
shortcut for the content-type header is "c":
2012-04-29 05:37:47 +00:00
200:b@1m:c"text/json"
That's it for the basic response definition. Now we can start mucking with the
responses to break clients. One common hard-to-test circumstance is hangs or
2012-04-29 09:43:28 +00:00
slow responses. __pathod__ has a pause operator that you can use to define
2012-04-29 05:37:47 +00:00
precisely when and how long the server should hang. Here, for instance, we hang
for 120 seconds after sending 50 bytes (counted from the first byte of the HTTP
response):
200:b@1m:p120,50
2012-04-29 09:43:28 +00:00
If that's not long enough, we can tell __pathod__ to hang forever:
2012-04-29 05:37:47 +00:00
200:b@1m:p120,f
Or to send all data, and then hang without disconnecting:
200:b@1m:p120,a
2012-04-29 09:43:28 +00:00
We can also ask __pathod__ to hang randomly:
2012-04-29 05:37:47 +00:00
200:b@1m:pr,a
2012-04-29 06:28:46 +00:00
There is a similar mechanism for dropping connections mid-response. So, we can
2012-04-29 09:43:28 +00:00
tell __pathod__ to disconnect after sending 50 bytes:
2012-04-29 05:37:47 +00:00
200:b@1m:d50
Or randomly:
200:b@1m:dr
All of these features can be combined. Here's a response that pauses twice,
2012-04-29 06:28:46 +00:00
once at 10 bytes and once at 20, then disconnects at 5000:
2012-04-29 05:37:47 +00:00
200:b@1m:p10,10:p20,10:d5000
Features
========
#### hKEY=VALUE
2012-04-29 05:37:47 +00:00
2012-04-29 06:28:46 +00:00
Set a header. Both KEY and VALUE are full _Value Specifiers_.
#### bVALUE
2012-04-29 06:28:46 +00:00
2012-04-29 09:43:28 +00:00
Set the body. VALUE is a _Value Specifier_. When the body is set, __pathod__ will
2012-04-29 06:28:46 +00:00
automatically set the appropriate Content-Length header.
#### cVALUE
2012-04-29 06:28:46 +00:00
A shortcut for setting the Content-Type header. Equivalent to:
h"Content-Type"=VALUE
#### lVALUE
2012-04-29 06:28:46 +00:00
A shortcut for setting the Location header. Equivalent to:
h"Content-Type"=VALUE
#### dOFFSET
2012-04-29 06:28:46 +00:00
2012-04-29 09:43:28 +00:00
Disconnect after OFFSET bytes. The offset can also be "r", in which case __pathod__
2012-04-29 06:28:46 +00:00
will disconnect at a random point in the response.
#### pSECONDS,OFFSET
2012-04-29 06:28:46 +00:00
Pause for SECONDS seconds after OFFSET bytes. SECONDS can also be "f" to pause
forever. OFFSET can also be "r" to generate a random offset, or "a" for an
offset just after all data has been sent.
2012-04-29 05:37:47 +00:00
Value Specifiers
2012-04-29 06:28:46 +00:00
================
2012-04-29 05:37:47 +00:00
2012-04-29 06:42:06 +00:00
There are three different flavours of value specification.
### Literal
Literal values are specified as a quoted strings:
"foo"
Either single or double quotes are accepted, and quotes can be escaped with
backslashes within the string:
'fo\'o'
### Files
You can load a value from a specified file path. To do so, you have to specify
2012-04-29 09:43:28 +00:00
a _staticdir_ option to __pathod__ on the command-line, like so:
2012-04-29 06:42:06 +00:00
pathod -d ~/myassets
All paths are relative paths under this directory. File loads are indicated by
starting the value specifier with the left angle bracket:
<my/path
The path value can also be a quoted string, with the same syntax as literals:
<"my/path"
### Generated values
An @-symbol lead-in specifies that generated data should be used. There are two
components to a generator specification - a size, and a data type. By default
2012-04-29 09:43:28 +00:00
__pathod__ assumes a data type of "bytes".
2012-04-29 06:42:06 +00:00
Here's a value specifier for generating 100 bytes:
@100
You can use standard suffixes to indicate larger values. Here, for instance, is
a specifier for generating 100 megabytes:
@100m
2012-04-29 21:46:49 +00:00
Data is generated and served efficiently - if you really want to send a
terabyte of data to a client, __pathod__ can do it. The supported suffixes are:
2012-04-29 06:42:06 +00:00
b = 1024**0 (bytes)
k = 1024**1 (kilobytes)
m = 1024**2 (megabytes)
g = 1024**3 (gigabytes)
t = 1024**4 (terabytes)
Data types are separated from the size specification by a comma. This
specification generates 100mb of ASCII:
@100m,ascii
Supported data types are:
ascii_letters
ascii_lowercase
ascii_uppercase
digits
hexdigits
letters
lowercase
octdigits
printable
punctuation
uppercase
whitespace
ascii
bytes
2012-04-29 10:08:35 +00:00
2012-04-29 21:46:49 +00:00
# API
__pathod__ exposes a simple API, intended to make it possible to drive and
inspect the daemon remotely for use in unit testing and the like.
2012-04-29 21:46:49 +00:00
### /api/log
Returns the current log buffer. At the moment the buffer size is 500 entries -
when the log grows larger than this, older entries are discarded. The returned
data is a JSON dictionary, with the form:
{
'logs': [ ENTRIES ]
}
Where each entry looks like this:
{
# Record of actions taken at specified byte offsets
'actions': [(200, 'disconnect'), (10, 'pause', 1)],
# HTTP return code
'code': 200,
# Request duration in seconds
'duration': 0.00020599365234375,
# ID unique to this invocation of pathod
'id': 2,
# The request that triggered the response
'request': {
'full_url': 'http://testing:9999/p/200:b@1000:p1,10:d200',
'headers': {
'Accept': '*/*',
'Host': 'localhost:9999',
'User-Agent': 'curl/7.21.4'
},
'host': 'localhost:9999',
'method': 'POST',
'path': '/p/200:b@1000:p1,10:d200',
'protocol': 'http',
'query': '',
'remote_address': ('10.0.0.234', 63448),
'uri': '/p/200:b@1000:p1,10:d200',
'version': 'HTTP/1.1'
},
# The response spec that was served. You can re-parse this to get full
# details on the response.
'spec': '200:b@1000:p1,10:d200',
# Time at which response startd.
'started': 1335735586.469218
}
2012-04-29 21:54:49 +00:00
You can preview the JSON data returned for a log entry through the built-in web
interface.
2012-04-29 21:54:49 +00:00
### /api/log/clear
A POST to this URL clears the log buffer.
2012-04-29 21:46:49 +00:00
2012-04-29 10:08:35 +00:00
2012-04-29 10:13:47 +00:00
# Installing
2012-04-29 10:08:35 +00:00
2012-06-21 04:58:10 +00:00
If you already have __pip__ on your system, installing __pathod__ and its
dependencies is dead simple:
2012-04-29 10:08:35 +00:00
pip install pathod
The project uses the __nose__ unit testing framework, which you can get here:
2012-04-29 10:13:47 +00:00