The configuration file should be located at:
/etc/apache2/apache2.conf
The access log format is set by the line starting with "LogFormat", followed by the list of field codes.
The most commonly used format strings are:
LogFormat "%h %l %u %t \"%r\" %>s %O" common
LogFormat "%h %l %u %t \"%r\" %>s %O \"%{Referer}i\" \"%{User-agent}i\"" combined
A suggested format string, which makes the complete set of LogDoctor's features available, is:
LogFormat "%{%F %T}t %H %m %U %q %>s %I %O %D \"%{Referer}i\" \"%{Cookie}i\" \"%{User-agent}i\" %{c}h" combined
The string above should be preferred, but alternatives can be used as well, such as:
LogFormat "%{sec}t \"%r\" %q %<s %I %O %D \"%{Referer}i\" \"%{Cookie}i\" \"%{User-agent}i\" %h" combined
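To illustrate how a line written with the suggested format string can be split back into its fields, here is a minimal Python sketch. It is not LogDoctor's parser: the regular expression, the field names and the sample log line are assumptions made up for this example.

```python
import re
from datetime import datetime

# A double-quoted field that may contain backslash-escaped characters.
QUOTED = r'"((?:[^"\\]|\\.)*)"'

# Hypothetical pattern mirroring the suggested format string, field by field.
LINE = re.compile(
    r"(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) "      # %{%F %T}t
    r"(\S+) (\S+) (\S+) (\S*) "                    # %H %m %U %q (may be empty)
    r"(\d{3}) (\d+) (\d+) (\d+) "                  # %>s %I %O %D
    + " ".join([QUOTED] * 3)                       # Referer, Cookie, User-agent
    + r" (\S+)"                                    # %{c}h
)

# Illustrative field names, not the ones used by LogDoctor.
NAMES = ("time", "protocol", "method", "uri", "query", "status",
         "bytes_received", "bytes_sent", "time_taken",
         "referer", "cookie", "user_agent", "client")

def parse_line(line):
    """Return a dict of fields, or None if the line does not match."""
    match = LINE.fullmatch(line)
    if match is None:
        return None
    record = dict(zip(NAMES, match.groups()))
    # %F %T corresponds to YYYY-MM-DD hh:mm:ss
    record["time"] = datetime.strptime(record["time"], "%Y-%m-%d %H:%M:%S")
    return record

# Made-up sample line, for illustration only.
sample = ('2024-01-15 10:30:00 HTTP/1.1 GET /index.php ?id=1 200 512 1024 2500 '
          '"-" "-" "Mozilla/5.0 (X11; Linux x86_64)" 192.0.2.10')
record = parse_line(sample)
print(record["uri"], record["query"], record["user_agent"])
```

Because every header field is wrapped in double quotes, the spaces inside the user agent do not interfere with the space-separated fields around it.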
If you're using your own custom string, please keep in mind that parsing is not magic. When you define it, think about which characters can appear inside each field and choose separators that cannot be confused with the field's content.
For example, a URI (%U) can't contain whitespace, so it is safe to use a space to separate this field from the previous and the next one. The User-Agent (%{User-agent}i), on the other hand, may contain spaces as well as parentheses, brackets, dashes, etc., so it's better to pick an appropriate separator (double quotes are a good choice, since they get escaped while logging).
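The effect of the separator choice can be seen in a short Python sketch. The log line below is a made-up example for a hypothetical format such as %m %U "%{User-agent}i", and the splitting function is only an illustration, not the way LogDoctor works internally.

```python
import re

# Hypothetical log line: method, URI and a double-quoted user agent.
line = 'GET /index.html "Mozilla/5.0 (X11; Linux x86_64)"'

# Naive approach: split on every space. The user agent is broken apart.
naive_fields = line.split(" ")

# Quote-aware approach: a field is either a double-quoted run (which may
# contain backslash-escaped characters) or a run of non-space characters.
FIELD = re.compile(r'"((?:[^"\\]|\\.)*)"|(\S+)')

def split_fields(text):
    """Return the fields of `text`, keeping quoted fields in one piece."""
    fields = []
    for match in FIELD.finditer(text):
        quoted, bare = match.groups()
        fields.append(quoted if quoted is not None else bare)
    return fields

quoted_fields = split_fields(line)
print(len(naive_fields), naive_fields)
print(len(quoted_fields), quoted_fields)
```

The naive split yields six pieces for what are really three fields, while the quote-aware split keeps the user agent whole.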
Although Apache2 does support some control characters (also known as escape sequences), it is recommended not to use them inside format strings.
In particular, a carriage return will most likely overwrite the data of the previous fields, making it very difficult to tell where the current field ends (especially for fields like URIs, queries and user agents) and nearly impossible to retrieve the overwritten data. This leads to a polluted database, unrealistic statistics and/or crashes during execution.
As for the new line character, there is no reason to use it other than for testing purposes. The same is true for the horizontal tab: a simple space is a better choice.
The only control characters supported by Apache2 are \n, \t and \r. Any other escape sequence will not be interpreted and will be treated as plain text.
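A format string can be checked for these escape sequences before it is put into the configuration. The following Python sketch is only an illustration: the function name is invented, and it assumes that the new line, horizontal tab and carriage return are written as \n, \t and \r in the format string.

```python
import re

# Assumption: the control characters appear in the format string as a
# backslash followed by one of these letters.
ESCAPE_NAMES = {"n": "new line", "t": "horizontal tab", "r": "carriage return"}

# Match a backslash followed by any character, so that an escaped
# backslash (\\) is consumed as a pair and never misread as \n, \t or \r.
ESCAPE = re.compile(r"\\(.)")

def find_control_escapes(format_string):
    """Return (position, name) for each control-character escape found."""
    return [
        (m.start(), ESCAPE_NAMES[m.group(1)])
        for m in ESCAPE.finditer(format_string)
        if m.group(1) in ESCAPE_NAMES
    ]

clean = r'%h %l %u %t \"%r\" %>s %O'
risky = r'%h\t%U\r%q\n'
print(find_control_escapes(clean))
print(find_control_escapes(risky))
```

Escaped double quotes (\") are not reported, since they are ordinary separators rather than control characters.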
Only the following fields will be considered, meaning that only these fields' data will be stored and used for the statistics.
Code | Information
---|---
%% | The percent sign character. It results in a single percent sign and is treated as normal text (by both Apache and LogDoctor).
%t | Time the request was received, in the format [DD/Mon/YYYY:hh:mm:ss ±TZ]. The last number (TZ) indicates the timezone offset from GMT.
%{FORMAT}t | Time the request was received, in the form given by FORMAT, which should be in an extended strftime format. The following format tokens are supported by LogDoctor (any others will be discarded, even if valid):
%r | First line of the request, equivalent to: %m %U%q %H.
%H | The request protocol (HTTP/v, HTTPS/v).
%m | The request method (GET, POST, HEAD, ...).
%U | The URI path requested, not including any query string.
%q | Query string (if any).
%s | HTTP status code of the original request (before any internal redirection).
%>s | Final HTTP status code (in case the request has been internally redirected).
%I | Bytes received, including request and headers (you need to enable mod_logio to use this).
%O | Bytes sent, including headers (you need to enable mod_logio to use this).
%T | The time taken to serve the request, in seconds.
%{UNIT}T | The time taken to serve the request, in the time unit given by UNIT (only available in 2.4.13 and later). Valid units are: s (seconds), ms (milliseconds), us (microseconds).
%D | The time taken to serve the request, in microseconds.
%h | IP address of the client (remote hostname).
%{c}h | Like %h, but always reports the hostname of the underlying TCP connection and not any modifications to the remote hostname made by modules like mod_remoteip.
%{VARNAME}i | The contents of the VARNAME: header line(s) in the request sent to the server. Supported varnames (by LogDoctor) are:
Any field other than the ones above won't be considered by LogDoctor.
When generating a log sample, these fields will appear as 'DISCARDED'.
If you aren't using the logs for any other purpose, please remove unnecessary fields to make the process faster and reduce the possibility of errors.
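To get an idea of which fields in a format string would be discarded, the check can be sketched in Python as follows. This is a simplified illustration and not LogDoctor's own logic: it looks only at the field letter, does not validate arguments such as FORMAT, UNIT or VARNAME, and ignores Apache's conditional status lists. The names used are invented for the example.

```python
import re

# Field letters taken from the table above. Assumption: only the letter is
# checked here; the argument inside {...} is not validated.
CONSIDERED = set("trHmUqsIOTDhi")

# Simplified directive syntax: '%', optional '<' or '>', optional '{arg}',
# then one letter. '%%' is matched separately as a literal percent sign.
DIRECTIVE = re.compile(r"%(?:%|[<>]?(?:\{[^}]*\})?([A-Za-z]))")

def classify(format_string):
    """Pair every field in the format string with 'considered' or 'DISCARDED'."""
    result = []
    for match in DIRECTIVE.finditer(format_string):
        letter = match.group(1)
        if letter is None:  # '%%' is plain text, not a field
            continue
        label = "considered" if letter in CONSIDERED else "DISCARDED"
        result.append((match.group(0), label))
    return result

common = r'%h %l %u %t \"%r\" %>s %O'
for directive, label in classify(common):
    print(f"{directive:6} {label}")
```

For the "common" format string, %l and %u are the fields that could be removed if the logs serve no other purpose.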