The liveweb-proxy can be configured using various command-line options and/or a config file.
The config file can be specified as:
$ liveweb-proxy -c liveweb.ini
or:
$ liveweb-proxy --config liveweb.ini
This section describes the available config settings. For each config setting, there is a command line option with the same name.
For example, the config setting archive-format is available as the command-line argument --archive-format.
The config file is specified in INI format. Here is a sample config file.
[liveweb]
archive-format = arc
output-directory = /tmp/records
dns-timeout = 2s
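To check what a config file contains before passing it to liveweb-proxy, the sample above can be read with Python's standard configparser module. This is only a sketch for inspecting the file; it is not how liveweb-proxy itself loads its settings, and the helper name load_settings is made up for this example.

```python
import configparser

# The sample config from this section, inlined so the example is self-contained.
SAMPLE = """\
[liveweb]
archive-format = arc
output-directory = /tmp/records
dns-timeout = 2s
"""

def load_settings(text):
    """Return the [liveweb] section as a plain dict of strings (hypothetical helper)."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    return dict(parser["liveweb"])

settings = load_settings(SAMPLE)
for name, value in sorted(settings.items()):
    print("%s = %s" % (name, value))
```

All values come back as strings, so a value such as 2s still has to be interpreted by the application.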
archive-format
Specifies the archive format. Should be one of arc or warc.
The default value is arc.
Warning
As of now only arc is supported.
output-directory
Output directory to write ARC/WARC files. Default value is “records”.
filename-pattern
The pattern of the filename, specified as a Python string formatting template. The default value is live-%(timestamp)s-%(serial)05d.arc.gz.
Available substitutions are timestamp, serial, pid, fqdn (fully qualified domain name) and port.
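The following sketch shows how such a template expands with Python's % operator. The substitution values (timestamp layout, host name, port, pid) are invented for illustration; the actual values are supplied by liveweb-proxy at run time.

```python
# Default pattern from this section.
default_pattern = "live-%(timestamp)s-%(serial)05d.arc.gz"

# Illustrative values only; the timestamp layout is an assumption.
values = {
    "timestamp": "20120101123456",
    "serial": 7,
    "pid": 4321,
    "fqdn": "crawler.example.org",
    "port": 7070,
}

# Keys that the pattern does not mention are simply ignored.
filename = default_pattern % values
print(filename)  # live-20120101123456-00007.arc.gz

# A custom pattern that also records the host and port.
custom_pattern = "live-%(timestamp)s-%(fqdn)s-%(port)s-%(serial)05d.arc.gz"
custom_filename = custom_pattern % values
print(custom_filename)
```

Note that %(serial)05d pads the serial number with zeros to five digits, which keeps files sorted by name in creation order.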
filesize-limit
The limit on the size of a file. If a file crosses this size, it will be closed and a new file will be created to write new records.
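The rollover behaviour described for filesize-limit can be pictured with the minimal sketch below. The class RollingWriter, its file naming and its byte-based limit are assumptions made for this example and are not taken from the liveweb source.

```python
import os
import tempfile

class RollingWriter:
    """Hypothetical writer that starts a new file once the size limit is crossed."""

    def __init__(self, directory, size_limit):
        self.directory = directory
        self.size_limit = size_limit  # limit in bytes (assumption)
        self.serial = 0
        self.file = None

    def _open_next(self):
        path = os.path.join(self.directory, "live-%05d.arc" % self.serial)
        self.serial += 1
        self.file = open(path, "wb")

    def write(self, record):
        if self.file is None:
            self._open_next()
        self.file.write(record)
        # Records are never split: the file is closed only after the
        # record that pushed it over the limit has been written in full.
        if self.file.tell() > self.size_limit:
            self.close()

    def close(self):
        if self.file is not None:
            self.file.close()
            self.file = None

with tempfile.TemporaryDirectory() as directory:
    writer = RollingWriter(directory, size_limit=10)
    for _ in range(3):
        writer.write(b"record")  # 6 bytes each
    writer.close()
    sizes = {
        name: os.path.getsize(os.path.join(directory, name))
        for name in sorted(os.listdir(directory))
    }
print(sizes)
```

In this sketch a file can end up larger than the limit by at most one record, because the check happens after the write.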
num-writers
The number of concurrent writers.
The default value is 1.
cache
Type of cache to use. Available options are redis, sqlite and none.
The default value is none.
redis-host
redis-port
redis-db
Redis host, port and db number. Used only when cache=redis.
redis-expire-time
Expire time to set in redis. Used only when cache=redis.
The default value is 1h (1 hour).
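Time values in this section are written with a unit suffix, such as 2s, 2m and 1h. The sketch below converts such strings to seconds. The function parse_duration is hypothetical, and it accepts only the three suffixes that appear in this document; whether liveweb-proxy accepts other units or bare numbers is not stated here.

```python
import re

# Seconds per unit, for the suffixes used in this document.
_UNIT_SECONDS = {"s": 1, "m": 60, "h": 3600}

def parse_duration(text):
    """Convert a string such as '2s', '2m' or '1h' to seconds (hypothetical helper)."""
    match = re.fullmatch(r"\s*(\d+(?:\.\d+)?)\s*([smh])\s*", text)
    if match is None:
        raise ValueError("unrecognised duration: %r" % (text,))
    number, unit = match.groups()
    return float(number) * _UNIT_SECONDS[unit]

print(parse_duration("1h"))
print(parse_duration("2m"))
```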
redis-max-record-size
Maximum allowed size of a record that can be cached. Used only when cache=redis.
The default value is 100KB.
sqlite-db
Path to the sqlite database to use. This option is valid only when cache=sqlite.
The default value is liveweb.db.
default-timeout
This is the default timeout value for connect-timeout, initial-data-timeout and read-timeout.
The default value is 10s.
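The fallback relationship between default-timeout and the three specific timeouts can be sketched as follows. The function resolve_timeouts is a hypothetical illustration of the rule described above, not code from liveweb-proxy.

```python
def resolve_timeouts(settings):
    """Fill in unset timeouts from default-timeout (hypothetical helper)."""
    default = settings.get("default-timeout", "10s")
    names = ("connect-timeout", "initial-data-timeout", "read-timeout")
    # A timeout that is set explicitly wins over the default.
    return {name: settings.get(name, default) for name in names}

timeouts = resolve_timeouts({"default-timeout": "5s", "read-timeout": "30s"})
for name, value in timeouts.items():
    print("%s = %s" % (name, value))
```

Here connect-timeout and initial-data-timeout fall back to 5s, while read-timeout keeps its explicit value of 30s.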
dns-timeout
Specifies the maximum amount of time a DNS resolution can take.
Python doesn’t provide a way to specify a DNS timeout. On Linux, the DNS timeout can be specified via the RES_OPTIONS environment variable. This environment variable is set at application startup based on this config setting.
If unspecified, the DNS timeout is decided by the system default behavior.
See the resolv.conf man page for more details.
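As a rough illustration of the mechanism, the sketch below sets the resolver's timeout option through RES_OPTIONS. The function set_dns_timeout is hypothetical and may differ from what liveweb-proxy does at startup. The timeout:n syntax comes from the resolv.conf man page, where it is the wait per query attempt, so the total lookup time can be longer when several attempts or name servers are involved.

```python
import os

def set_dns_timeout(seconds, environ=os.environ):
    """Set the resolver 'timeout' option in RES_OPTIONS (hypothetical helper)."""
    options = environ.get("RES_OPTIONS", "").split()
    # Replace any existing timeout option and keep the others untouched.
    options = [opt for opt in options if not opt.startswith("timeout:")]
    options.append("timeout:%d" % seconds)
    environ["RES_OPTIONS"] = " ".join(options)

# Demonstrated on a plain dict so the real environment is left alone.
env = {"RES_OPTIONS": "attempts:1 timeout:5"}
set_dns_timeout(2, env)
print(env["RES_OPTIONS"])
```

The variable typically has to be in place before the process performs its first lookup, which is consistent with it being set at startup.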
connect-timeout
Specifies the connect timeout in seconds. Connections that take longer to establish will be aborted.
initial-data-timeout
Specifies the maximum time allowed before receiving initial data (HTTP headers) from the remote server.
read-timeout
Specifies the read timeout in seconds. This indicates the idle time. If no data is received for more than this time, the request will fail.
max-request-time
Specifies the total amount of time an HTTP request can take. If it takes more than this, the current request will fail.
The default value is 2m.
max-response-size
Specifies the maximum allowed size of response.
The default value is 100MB.
user-agent
Specifies the value of the User-Agent request header.
The default value is ia_archiver(OS-Wayback).
http-passthrough
This is a boolean parameter. Setting it to true makes liveweb-proxy work like an HTTP proxy with archiving. Useful for testing and recording personal browsing.