The main advantages are that you are using an address database system that can easily be accessed from any number of other programs via ODBC, perl, java, PHP, ... You can easily hook up ezmlm with your customer database, etc. ezmlm programs compiled with SQL support (and when available also those compiled with support for other SQL servers) are entirely backwards compatible. You can mix SQL dbs with normal ezmlm dbs, and convert lists between them.
The main disadvantages of the SQL version are that you need to be familiar with the SQL server, the binaries are quite a bit larger, and you are trusting your addresses to a large database program, rather than a small and easily audited set of ezmlm programs. Also, the SQL server becomes a single point of failure.
Ezmlm with SQL support continues to rely on qmail stability. If connection fails, ezmlm aborts with a temporary error causing redelivery at a later time point.
The basic philosophy is that the database can be on any host (if you use SENDER restrictions, connectivity to the main host is more important than to the sublists), and you choose the database and ``table root'' names. The default database is ``ezmlm'' and the default table root is ``list''. Each list has a separate table root. Any number of lists can share a database.
The main list address table is named with the table root only, others have that name with various suffixes. In the following ``list'' is used as the table root.
List subscriber addresses.
Digest list subscriber addresses.
List subscriber alias addresses. Used only if SENDER restrictions are used for the list. This is configured in the default SQL list setup, but a local (ezmlm-style non-SQL) database could also be used.
List deny addresses. This table is created, but the default configuration, if it uses the ``deny'' addresses at all, will do so with a local database.
Moderator addresses. Created for completeness, but not used in the default configuration. If moderators are used, the addresses are stored in a local database.
For each of the above tables, there is a ``*_slog'' table that contains one row per transaction against the corresponding address table. The entries contain a time stamp, the subscription address; a direction indicator (``-'' for removals, ``+'' for additions); a type indicator (blank for ezmlm-manage, ``m'' for ``manual'', ``p'' for ``probe, i.e. bounce handling; and the subscriber ``From:'' line contents (only additions and only when made by ezmlm-manage or by ``ezmlm-sub(1) -n'').
For both the list and the digest list, there are a pair of tables that log messages:
The main list stores the message number and a pseudo-random cookie in this table when it processes the message. The cookie is derived from the secret DIR/key, the message sender and the message number. Thus, it is non-repeating and virtually impossible to guess beforehand. Sublists will check that the cookie sent with the message is the same as the one received with the message.
The digest list is created similarly, except that it is ezmlm-get(1) that originates the message and creates the cookie. This is done in ``list_digest_cookie''.
Both the main list and the sublists make entries in this table. Each entry consists of a time stamp, a message number, a list number, and a code. The code is 0 for message arrival, 1 for ``finished processing'', 2 for ``receipt received'' and -1 for bounce. The lists will refuse to process messages that do not have the correct cookie, or if the message already has an entry with a code of greater than 0. To inject a message at the sublist, an attacker would have to inject a message with the correct code before the list has processed the ``real'' message, or subvert the SQL server. In practice, this is very hard to do, unless the attacker has broken security at the database server or a sublist. This authentication mechanism is intended to make it safe to sublist moderated lists. It also blocks any message duplication between main list and sublist from being propagated to the subscribers.
The codes 2 for ``receipt received'' and -1 for bounce are entered by
ezmlm-receipt(1) at the main list. This program is configured instead
of ezmlm-return(1) if the main list was set up with ``ezmlm-make -w6''.
ezmlm-receipt(1) checks the cookie of messages addresses to
mainlocal-return-receipt@mainhost
and if correct enters the ``receipt
received'' code. This address is normally in the subscriber database with
a hash of 98, so that each list sends a message to the address after
all subscriber addresses.
Bounces of sublist messages should not lead to removal of the sublist from the database. ezmlm-receipt(1) will instead log the bounce to the ``list_mlog'' table. It will also store up to 50 bounces in the bounce directory. This helps error detection and diagnosis. After the first 50 bounces, no more bounces are stored, until you manually remove the old ones. This is to prevent filling up your hard disk in case a configuration error causes a deluge of bounces.
The digest list is treated in the same manner. Here, the tables
is ``list_digest_mlog'' and the feedback address is
mainlocal-digest-return-receipt@mainhost
.
To use SQL database support, you have to compile the programs with SQL support. Currently, only MySQL support is available. See INSTALL.idx in the package on how to do this.
The programs with SQL support will work exactly like the normal programs for standard lists. However, if the file sql exists in the basedir, it turns on the SQL mode and it is expected to contain SQL server connect info in the format
``host:port:user:password:database:table''
Here, ``Host''
is the SQL database server host, ``port'' can be left blank to use
the default port, ``user'' and ``password'' are connection credentials
for a user you need to define and grant access to the database. ``Table'' is
the name of the address table (``list'' in the examples above
and ``list_digest'' for the corresponding digest list). For list
clusters, ``:sublist'' is suffixed to this info and it is the name/address
of the sublist.
For each address database, you also need to create the address table as well as the ``*_slog'' subscription log table. In addition, you should create a ``*_cookie'' and ``*_mlog'' table for message logging. This is all it takes to start using an SQL database.
Two programs are supplied in the distribution to make it easier to create the database user and tables. Also, ezmlm-make(1) has support for setting up SQL-enabled lists.
ezmlm-mktab(1) will create the necessary tables:
% ezmlm-mktab -d table
Pipe this into the SQL client with
the appropriate administrator credentials needed to create
tables (see MySQL documentation, e.g.
http://www.tcx.se/).
For most lists, the only addresses that are stored in the SQL database are the subscribers of list and digest, and the ``allow'' aliases. It is NOT normally advisable to store moderator addresses there, since they are needed only at the main list and secrecy is more important. ``Deny'' addresses are few and again only needed at the main list. ``Allow'' are put in the SQL database when using the default ezmlmrc file only to make all relevant addresses manipulatable via the SQL server. The other tables are created, in case they are wanted (the cost for having them as empty table is zero). The basedir/sql file is the decision point. If it exists, an SQL table is used; if not a local ezmlm db is used.
Create a user that has full access to the database from the list host. How to do this depends on the RDBMS.
ezmlm-make(1) supports SQL-enabled lists with the ``-6'' switch:
% ezmlm-make other_switches -6 'host:port:user:pw:db:table' \
dir dot local host
Will create an SQL-enabled list that uses the SQL server for the main list
subscribers, digest list subscribers (if configured) and ``allow'' poster
alias addresses (if configured).
ezmlm-sub(1), ezmlm-unsub(1), and ezmlm-list(1) work as you would expect also with a SQL-enabled list. ezmlm-list(1) may be minimally slower (depending on network speed) if the SQL server is not local. ezmlm-sub(1) and ezmlm-unsub(1) will be faster, but this is noticeable only with very large subscriber lists and addition/removal of large numbers of addresses (more than several thousands).
Just like other programs, ezmlm-list(1), ezmlm-sub(1), and ezmlm-unsub(1) will work with normal address databases in the absence of DIR/sql. However, they also have a ``-M'' switch to force this behavior even in the presence of DIR/sql. This is used to convert an address database from the standard type to the SQL type:
% ezmlm-list -M dir | xargs ezmlm-sub dir
or from the SQL version to the standard type:
% ezmlm-list dir | xargs ezmlm-sub -M dir
To synchronize the two, remove one and then update it with ezmlm-sub(1) from
the other. Alternatively, sort the ezmlm-list(1) output for both, use
diff and sed/awk to get separate files of the differences, and use ezmlm-sub(1)
and ezmlm-unsub(1) to apply the differences to the appropriate database.
This type of conversion can serve as a convenient means to convert a list from one type to another, to back up databases, and to move subscriber addresses from a standard list to a SQL table for other purposes, or from a SQL database to a standard mailing list (you may need to use addresses from a SQL table, without wanting your lists to be dependent on an SQL server for day to day operation).
Note: This inter-conversion requires the DIR/sql file. If you do not run the list against an SQL server, you need to disable deliveries before you temporarily create this file. Otherwise, the list will run against the SQL database during the time DIR/sql exists.
ezmlm-idx-0.40 simplifies the SQL support and queries over ezmlm-idx-0.32 at the cost of dropping distributed sublist support. We have figured out a simpler way to support the latter, which hopefully will be incorporated into ezmlm in the future (written under contract).
With the simplification, the queries are very straight forward, and tuning is indicated only under extreme circumstances (very many very large and busy lists or constant addition/removal of many addresses).
Weekly to monthly error checks on MySQL tables is recommended. Best is to use:
# isamchk -s -O readbuffer=2M */*.ISM
Other options allow automatic correction of errors, but are dangerous if
tables are accessed while isamchk is running.
Other isamchk options allow recovery of space after frequent insert/delete of addresses (can also be done with ``OPTIMIZE TABLE''), key optimization, etc. See the MySQL documentation ( http://www.tcx.se) for more info.