--- spamprobe-1.4d.orig/spamprobe.1 +++ spamprobe-1.4d/spamprobe.1 @@ -1,907 +1,927 @@ -.TH spamprobe 1 "December 2005" "Version 1.4" SpamProbe -.SH NAME -spamprobe \- a bayesian spam filter -.SH SYNOPSIS -.B spamprobe -[options] [filename...] - -.SH INTRODUCTION - -SpamProbe can be used in conjunction with procmail or similar program -to filter email. SpamProbe uses a statistical algorithm to identify -the key words and phrases in email and determine which emails are -legitimate and which are spam. The algorithm used by SpamProbe is -based on an excellent article by Paul Graham. He describes the basic -idea and his results. You can read his article here: - - http://www.paulgraham.com/spam.html - - -.SH COMMAND LINE USAGE - -SpamProbe accepts a small set of commands and a growing set of options -on the command line in addition to zero or more file names of mboxes. -The general usage is: - - spamprobe [options] [filename...] - +'\" t +.\" Title: SPAMPROBE +.\" Author: [see the "AUTHOR" section] +.\" Generator: DocBook XSL Stylesheets v1.75.2 +.\" Date: 05/24/2010 +.\" Manual: User commands +.\" Source: User commands +.\" Language: English +.\" +.TH "SPAMPROBE" "1" "05/24/2010" "User commands" "User commands" +.\" ----------------------------------------------------------------- +.\" * Define some portability stuff +.\" ----------------------------------------------------------------- +.\" ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +.\" http://bugs.debian.org/507673 +.\" http://lists.gnu.org/archive/html/groff/2009-02/msg00013.html +.\" ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +.ie \n(.g .ds Aq \(aq +.el .ds Aq ' +.\" ----------------------------------------------------------------- +.\" * set default formatting +.\" ----------------------------------------------------------------- +.\" disable hyphenation +.nh +.\" disable justification (adjust text to left margin only) +.ad l +.\" 
----------------------------------------------------------------- +.\" * MAIN CONTENT STARTS HERE * +.\" ----------------------------------------------------------------- +.SH "NAME" +spamprobe \- A Bayesian spam filter +.SH "SYNOPSIS" +.HP \w'\fBspamprobe\fR\ 'u +\fBspamprobe\fR [\fIoptions\fR] \fI\ command\ \fR [\fIfiles\fR\ \&.\&.\&.] +.SH "DESCRIPTION" +.PP +SpamProbe +is a spam filter relying on a Bayesian analysis of the frequency of words used in spam and non\-spam emails received by an individual person\&. The process is completely automatic and tailors itself to the kinds of emails that each person receives\&. +.PP +SpamProbe +recognizes and decodes MIME attachments in quoted\-printable and base64 encoding\&. Image attachments are considered as words that can signal a spam\&. By default, it ignores HTML tags for scoring purpose\&. +.PP +SpamProbe +supports MBOX, MBX and Maildir mailbox formats\&. These formats are automatically detected for mailboxes used as parameters of +SpamProbe +commands\&. +.PP +\fBspamprobe\fR +is designed to be used in mail delivery agents (MDAs) like +\fBprocmail\fR(1) +or +\fBmaildrop\fR(1) +to help in identifying spam\&. +.SH "OPTIONS" +.PP The recognized options are: - - -a char - - By default SpamProbe converts non-ascii characters (characters - with the most significant bit set to 1) into the letter 'z'. This - is useful for lumping all Asian characters into a single word for - easy recognition. The -a option allows you to change the - character to something else if you don't like the letter 'z' for - some reason. - - -c - - Tells spamprobe to create the database directory if it does not - already exist. Normally spamprobe exits with a usage error if - the database directory does not already exist. - - -C number - - Tells SpamProbe to assign a default, somewhat neutral, probability - to any term that does not have a weighted (good count doubled) - count of at least number in the database. 
This prevents terms - which have been seen only a few times from having an unreasonable - influence on the score of an email containing them. - - The default value is 5. For example if number is 5 then in order - for a term to use its calculated probability it must have been - seen 3 times in good mails, or 2 times in good mails and once in - spam, or 5 times in spam, or some other combination adding up to - at least 5. - - -d directory - - By default SpamProbe stores its database in a directory named - .spamprobe under your home directory. The -d option allows you to - specify a different directory to use. This is necessary if your - home directory is NFS mounted for example. - - The directory name can be prefixed with a special code to force - SpamProbe to use a particular type of data file format. The type - codes depend on how your copy of SpamProbe was compiled. Defined - types include: - - Example Description - -d pbl:path Forces the use of PBL data file. - -d hash:path Forces the use of an mmapped hash file. - -d split:path Forces the use of a hash file and ISAM - file (may provide better precision than - plain hash in some cases). - - The hash: option can also specify a desired file size in megabytes - before the path. For example -d hash:19:path would cause - SpamProbe to use a 19 MB hash file. The size must be in the range - of 1-100. The default hash file size is 16 MB. Because hash - files have a fixed size and capacity they should be cleaned - relatively often using the cleanup command (see below) to prevent - them from becoming full or being slowed by too many hash key - collisions. - - Hash files provide better performance than either of the ISAM - options (PBL or Berkeley DB). However hash files do not store the - original terms. Only a 32 bit hash key is stored with each term. - This prevents a user from exploring the terms in the database - using the dump command to see what words are particularly spammy - or hammy. 
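The prefix rules for the -d option described above can be sketched in Python. This is only an illustration under stated assumptions: the helper name db_option is hypothetical and not part of SpamProbe, and rejecting a size for non-hash types is an assumption based on the text mentioning a size only for hash:. The type codes and the 1-100 MB range come from the description above; which types actually work depends on how your copy of SpamProbe was compiled.

```python
def db_option(path, kind=None, size_mb=None):
    """Build the value passed to spamprobe's -d option (hypothetical helper)."""
    if kind is None:
        if size_mb is not None:
            raise ValueError("a size can only be given with the hash: prefix")
        return path                      # plain directory, default file format
    if kind not in ("pbl", "hash", "split"):
        raise ValueError("unknown type code: %s" % kind)
    if size_mb is None:
        return "%s:%s" % (kind, path)    # e.g. pbl:path
    if kind != "hash":
        # Assumption: the text only documents a size for hash files.
        raise ValueError("a size can only be given with the hash: prefix")
    if not 1 <= size_mb <= 100:
        raise ValueError("hash file size must be in the range 1-100 MB")
    return "hash:%d:%s" % (size_mb, path)


print(db_option("path", "hash", 19))
```

The printed value is the same form as the -d hash:19:path example above.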
- - -D directory - - Tells SpamProbe to use the database in the specified directory - (must be different than the one specified with the -d option) as a - shared database from which to draw terms that are not defined in - the user's own database. This can be used to provide a baseline - database shared by all users on a system (in the -D directory) and - a private database unique to each user of the system - ($HOME/.spamprobe or -d directory). - - -g field_name - - Tells SpamProbe what header to look for previous score and message - digest in. Default is X-SpamProbe. Field name is not case - sensitive. Used by all commands except receive. - - -h - - By default SpamProbe removes HTML markup from the text in emails - to help avoid false positives. The -h option allows you to - override this behavior and force SpamProbe to include words from - within HTML tags in its word counts. Note that SpamProbe always - counts any URLs in hrefs within tags whether -h is used or not. - Use of this option is discouraged. It can increase the rate of - spam detection slightly but unless the user receives a significant - amount of HTML emails it also tends to increase the number of - false positives. - - -H option - - By default SpamProbe only scans a meaningful subset of headers - from the email message when searching for words to score. The -H - option allows the user to specify additional headers to scan. - Legal values are "all", "nox", "none", or "normal". "all" scans - all headers, "nox" scans all headers except those starting with - X-, "none" does not scan headers, and "normal" scans the normal - set of headers. - - In addition to those values you can also explicitly add a header - to the list of headers to process by adding the header name in - lower case preceded by a plus sign. Multiple headers can be - specified by using multiple -H options. 
For example, to include - only the From and Received headers in your train command you could - run spamprobe as follows: - - spamprobe -Hnone -H+from -H+received train - - You can also selectively ignore headers that would otherwise be - processed by using -H-headername. For example to process all - headers except for Subject you could run spamprobe as follows: - - spamprobe -Hall -H-subject train - - To process the normal set of headers but also add the SpamAssassin - header X-Spam-Status you could run spamprobe as follows: - - spamprobe -H+x-spam-status train - - -l number - - Changes the spam probability threshold for emails from the default - (0.7) to number. The number must be between 0 and 1. Generally - the value should be above 0.5 to avoid a high false positive rate. - Lower numbers tend to produce more false positives while higher - numbers tend to reduce accuracy. - - -m - - Forces SpamProbe to use mbox format for reading emails in receive - mode. Normally SpamProbe assumes that the input to receive mode - contains a single message so it doesn't look for message breaks. - - -M - - Forces SpamProbe to treat the entire input as a single message. - This ignores From lines and Content-Length headers in the input. - - -o option_name - - Enables special options by name. Currently the only special - options are: - - -o graham - - Causes SpamProbe to emulate the filtering algorithm originally - outlined in A Plan For Spam. - - -o honor-status-header - - Causes SpamProbe to ignore messages if they have a Status: - header containing a capital D. Some mail servers use this - status to indicate a message that has been flagged for - deletion but has not yet been purged from the file. - - DO NOT use this option with the receive or train command in - your procmailrc file! Doing so could allow spammers to bypass - the filter. This option is meant to be used with the - train-spam and train-good commands in scripts that - periodically update the database. 
- - -o honor-xstatus-header - - Causes SpamProbe to ignore messages if they have a X-Status: - header containing a capital D. Some mail servers use this - status to indicate a message that has been flagged for - deletion but has not yet been purged from the file. - - DO NOT use this option with the receive or train command in - your procmailrc file! Doing so could allow spammers to bypass - the filter. This option is meant to be used with the - train-spam and train-good commands in scripts that - periodically update the database. - - -o ignore-body - - Causes SpamProbe to ignore terms from the message body when - computing a score. This is not normally recommended but might - be useful in conjunction with some other filter. For example, - the whitelist option (see below) implicitly ignores the - message body. - - -o orig-score - - Causes SpamProbe to use its original scoring algorithm that - produces excellent results but tends to generate scores of - either 0 or 1 for all messages. - - -o suspicious-tags - - Causes SpamProbe to scan the contents of "suspicious" tags for - tokens rather than simply throwing them out. Currently only - font tags are scanned but other tags may be added to this list - in later versions. - - -o tokenized - - Causes SpamProbe to read tokens one per line rather than - processing the input as mbox format. This allows users to - completely replace the standard spamprobe tokenizer if they - wish and instead use some external program as a tokenizer. - For example in your procmailrc file you could use: - - SCORE=| tokenize.pl | /bin/spamprobe -o tokenized train - - In this mode SpamProbe considers a blank line to indicate the - end of one message's tokens and the start of a new message's - tokens. SpamProbe computes a message digest based on the - lines of text containing the tokens. - - -o whitelist - - Causes SpamProbe to use information from the email's headers - to identify whether or not the email is from a legitimate - correspondent. 
The message body is ignored as are any never - before seen terms and phrases in the headers. This option can - be used with the score command in a procmailrc file to use a - Bayesian white list in conjunction with some other filter or - rule external to SpamProbe. - - The -o option can be used multiple times and all requested options - will be applied. Note that some options might conflict with each - other in which case the last option would take precedence. - - -p number - - Changes the maximum number of words per phrase. Default value is - two. Increasing the limit improves accuracy somewhat but - increases database size. Experiments indicate that increasing - beyond two is not worth the extra cost in space. - - -P number - - Causes spamprobe to perform a purge of all terms with junk count - less than or equal to 2 after every number messages are processed. - Using this option when classifying a large collection of spam can - prevent the database from growing overly large at the cost of more - processing time and possible loss of precision. - - -r number - - Changes the number of times that a single word/phrase can occur - in the top words array used to calculate the score for each - message. Allowing repeats reduces the number of words overall - (since a single word occupies more than one slot) but allows words - which occur frequently in the message to have a higher weight. - Generally this is changed only for optimization purposes. - - -R - - Causes spamprobe to treat the input as a single message and to - base its exit code on whether or not that message was spam. The - exit code will be 0 if the message was spam or 1 if the message - was good. - - -s number - - SpamProbe maintains an in memory cache of the words it has seen in - previous messages to reduce disk I/O and improve performance. By - default the cache will contain the most recently accessed 2,500 - terms. This number can be changed using the -s option. 
Using a - larger cache size will cause SpamProbe to use more memory and, - potentially, to perform less database I/O. - - A value of zero causes SpamProbe to use 100,000 as the limit which - effectively means that the cache will only be flushed at program - exit (unless you have really enormous mailbox files). The cache - doesn't affect receive, dump, or export but has a significant - impact on the others. - - -T - Causes SpamProbe to write out the top terms associated with each - message in addition to its normal output. Works with find-good, - find-spam, and score. - - -v - - Tells SpamProbe to write debugging information to stderr. This - can be useful for debugging or for seeing which terms SpamProbe - used to score each email. - - -V - - Prints version and copyright information and then exits. - - -w number - - Changes the number of most significant words/phrases used by - SpamProbe to calculate the score for each message. Generally this - is changed only for optimization purposes. - - -x - - Normally SpamProbe uses only a fixed number of top terms (as set - by the -w command line option) when scoring emails. The -x option - can be used to allow the array to be extended past the max size if - more terms are available with probabilities <= 0.1 or >= 0.9. - - -X - - An interesting variation on the scoring settings. Equivalent to - using "-w5 -r5 -x" so that generally only words with probabilities - <= 0.1 or >= 0.9 are used and word frequencies in the email count - heavily towards the score. Tests have shown that this setting - tends to be safer (fewer false positives) and have higher recall - (proper classification of spams previously scored as spam) - although its predictive power isn't quite as good as the default - settings. WARNING: This setting might work best with a fairly - large corpus; it has not been tested with a small corpus so it - might be very inaccurate with fewer than 1000 total messages. 
- - -Y - - Assume traditional Berkeley mailbox format, ignoring any - Content-Length: fields. - - -7 - - Tells SpamProbe to ignore any characters with the most significant - bit set to 1 instead of mapping them to the letter 'z'. - - -8 - - Tells SpamProbe to store all characters even if their most - significant bit is set to 1. - - -SpamProbe recognizes the following commands: - - spamprobe help [command] - - With no arguments spamprobe lists all of the valid commands. - If one or more commands are specified after the word help, - spamprobe will print a more verbose description of each command. - - spamprobe create-db - - If no database currently exists spamprobe will attempt to create - one and then exit. This can be used to bootstrap a new - installation. Strictly speaking this command is not necessary - since the train-spam, train-good, and auto-train commands will also - create a database if none already exists but some users like to - create a database as a separate installation step. - - spamprobe create-config - - Writes a new configuration file named spamprobe.hdl into the - database directory (normally $HOME/.spamprobe). Any existing - configuration file will be overwritten so be sure to make a copy - before invoking this command. - - spamprobe receive [filename...] - - Tells SpamProbe to read its standard input (or a file specified - after the receive command) and score it using the current - databases. Once the message has been scored the message is - classified as either spam or non-spam and its word counts are - written to the appropriate database. The message's score is - written to stdout along with a single word. For example: - - SPAM 0.9999999 595f0150587edd7b395691964069d7af - - or - - GOOD 0.0200000 595f0150587edd7b395691964069d7af - - The string of numbers and letters after the score is the message's - "digest", a 32 character number which uniquely identifies the - message. 
The digest is used by SpamProbe to recognize messages - that it has processed previously so that it can keep its word - counts consistent if the message is reclassified. - - Using the -T option additionally lists the terms used to produce - the score along with their counts (number of times they were found - in the message). - - spamprobe train [filename...] - - Functionally identical to receive except that the database is only - modified if the message was "difficult" to classify. In practice - this can reduce the number of database updates to as little as 10% - of messages received. - - spamprobe score [filename...] - - Similar to receive except that the database is not modified in - any way. - - spamprobe summarize [filename...] - - Similar to score except that it prints a short summary and score - for each message. This can be useful when testing. Using the -T - option additionally lists the terms used to produce the score along - with their counts (number of times they were found in the message). - - spamprobe find-spam [filename...] - - Similar to score except that it prints a short summary and score - for each message that is determined to be spam. This can be useful - when testing. Using the -T option additionally lists the terms - used to produce the score along with their counts (number of times - they were found in the message). - - spamprobe find-good [filename...] - - Similar to score except that it prints a short summary and score - for each message that is determined to be good. This can be useful - when testing. Using the -T option additionally lists the terms - used to produce the score along with their counts (number of times - they were found in the message). - - spamprobe auto-train {SPAM|GOOD filename...}... - - Attempts to efficiently build a database from all of the named - files. You may specify one or more files of each type. 
Prior to - each set of file names you must include the word SPAM or GOOD to - indicate what type of mail is contained in the files which follow - on the command line. - - The case of the SPAM and GOOD keywords is important. Any number of - file names can be specified between the keywords. The command line - format is very flexible. You can even use a find command in - backticks to process whole directory trees of files. For example: - - spamprobe auto-train SPAM spams/* GOOD `find hams -type f` - - SpamProbe pre-scans the files to determine how many emails of each - type exist and then trains on hams and spams in a random sequence - that balances the inflow of each type so that the train command can - work most effectively. For example if you had 400 hams and 400 - spams, auto-train will generally process one spam, then one ham, - etc. If you had 4000 spams and 400 hams then auto-train will - generally process 10 spams, then one ham, etc. - - Since this command will likely take a long time to run it is often - desirable to use it with the -v option to see progress information - as the messages are processed. - - spamprobe -v auto-train SPAM spams/* GOOD hams/* - - spamprobe good [filename...] - - Scans each file (or stdin if no file is specified) and reclassifies - every email in the file as non-spam. The databases are updated - appropriately. Messages previously classified as good (recognized - using their MD5 digest or message ids) are ignored. Messages - previously classified as spam are reclassified as good. - - spamprobe train-good [filename...] - - Functionally identical to the "good" command except that it only - updates the database for messages that are either incorrectly - classified (i.e. classified as spam) or are "difficult" to - classify. In practice this can reduce the number of database updates - to as little as 10% of messages. - - spamprobe spam [filename...] 
- - Scans each file (or stdin if no file is specified) and reclassifies - every email in the file as spam. The databases are updated - appropriately. Messages previously classified as spam (recognized - using their MD5 digest or message ids) are ignored. Messages - previously classified as good are reclassified as spam. - - spamprobe train-spam [filename...] - - Functionally identical to the "spam" command except that it only - updates the database for messages that are either incorrectly - classified (i.e. classified as good) or are "difficult" to - classify. In practice this can reduce the number of database updates - to as little as 10% of messages. - - spamprobe remove [filename...] - - Scans each file (or stdin if no file is specified) and removes its - term counts from the database. Messages which are not in the - database (recognized using their MD5 digest or message ids) are - ignored. - - spamprobe cleanup [ junk_count [ max_age ] ]... - - Scans the database and removes all terms with junk_count or less - (default 2) which have not had their counts modified in at least - max_age days (default 7). You can specify multiple count/age pairs - on a single command line but must specify both a count and an age - for all but the last count. This should be run periodically to - keep the database from growing endlessly. - - For my own email I use cron to run the cleanup command every day - and delete all terms with count of 2 or less that have not been - modified in the last two weeks. 
Here is the excerpt from my - crontab: - - 3 0 * * * /home/brian/bin/spamprobe cleanup 2 14 - - Alternatively you might want to use a much higher count (1000 in - this example) for terms that have not been seen in roughly six - months: - - 3 0 * * * /home/brian/bin/spamprobe cleanup 1000 180 2 14 - - Because of the way that PBL and BerkeleyDB work the database file - will not actually shrink, but newly added terms will be able to use - the space previously occupied by any removed terms so that the - file's growth should be significantly slower if this command is - used. - - To actually shrink the database you can build a new one using the - BerkeleyDB utility programs db_dump and db_load (Berkeley DB only) - or the spamprobe import and export commands (either database - library). For example: - - cd ~ - mkdir new.spamprobe - spamprobe export | spamprobe -d new.spamprobe import - mv .spamprobe old.spamprobe - mv new.spamprobe .spamprobe - - The -P option can also be used to limit the rate of growth of the - database when importing a large number of emails. For example if - you want to classify 1000 emails and want SP to purge rare terms - every 100 messages use a command such as: - - spamprobe -P 100 good goodmailboxname - - Using -P slows down the classification but can avoid the need to - use the db_dump trick. Using -P only makes sense when classifying - a large number of messages. - - spamprobe purge [ junk_count ] - - Similar to cleanup but forces the immediate deletion of all terms - with total count less than junk_count (default is 2) no matter how - long it has been since they were modified (i.e. even if they were - just added today). This could be handy immediately after - classifying a large mailbox of historical spam or good email to - make room for the next batch. - - spamprobe purge-terms regex - - Similar to purge except that it removes from the database all terms - which match the specified regular expression. 
Be careful with this - command because it could remove many more terms than you expect. - Use dump with the same regex before running this command to see - exactly what will be deleted. - - spamprobe edit-term term good_count spam_count - - Can be used to specifically set the good and spam counts of a term. - Whether this is truly useful is doubtful but it is provided for - completeness' sake. For example it could be used to force a - particular word to be very spammy or very good: - - spamprobe edit-term nigeria 0 1000000 - spamprobe edit-term burton 10000000 0 - - spamprobe dump [ regex ] - - Prints the contents of the word counts database one word per line - in human readable format with spam probability, good count, spam - count, flags, and word in columns separated by whitespace. PBL and - Berkeley DB sort terms alphabetically. The standard unix sort - command can be used to sort the terms as desired. For example to - list all words from "most spammy" to "least spammy" use this command: - - spamprobe dump | sort -k 1nr -k 3nr - - To list all words from "most good" to "least good" use this - command: - - spamprobe dump | sort -k 1n -k 2nr - - Optionally you can specify a regular expression. If specified - SpamProbe will only dump terms matching the regular expression. - For example: - - spamprobe dump 'finance' - spamprobe dump '\bfinance\b' - spamprobe dump 'HSubject_.*finance' - - spamprobe tokenize [ filename ] - - Prints the tokens found in the file one word per line in human - readable format with spam probability, good count, spam count, - message count, and word in columns separated by whitespace. Terms - are listed in the order in which they were encountered in the - message. The standard unix sort command can be used to sort the - terms as desired. 
For example to list all words from "most spammy" - to "least spammy" use this command: - - spamprobe tokenize filename | sort -k 1nr -k 3nr - - To list all words from "most good" to "least good" use this - command: - - spamprobe tokenize filename | sort -k 1n -k 2nr - - spamprobe export - - Similar to the dump command but prints the counts and words in a - comma separated format with the words surrounded by double quotes. - This can be more useful for importing into some databases. - - spamprobe import - - Reads the specified files which must contain export data written by - the export command. The terms and counts from this file are added - to the database. This can be used to convert a database from a - prior version. - - spamprobe exec command - - Obtains an exclusive lock on the database and then executes the - command using system(3). If multiple arguments are given after - "exec" they are combined to form the command to be executed. This - command can be used when you want to perform some operation on the - database without interference from incoming mail. For example, to - back up your .spamprobe directory using tar you could do something - like this: - - cd - spamprobe exec tar czf spamprobe-data.tar.gz .spamprobe - - If you simply want to hold the lock while interactively running - commands in a different xterm you could use "spamprobe exec read". - The linux read program simply reads a line of text from your - terminal so the lock would effectively be held until you pressed - the enter key. Another option would be to use a shell as the - command and type the commands into that shell: - - spamprobe exec /bin/bash - ls - date - exit - - Be careful not to run spamprobe in the shell though since the - spamprobe in the shell will wind up deadlocked waiting for the - spamprobe running the exec command to release its lock. - - spamprobe exec-shared command - - Same as exec except that a shared lock is used. 
This may be more - appropriate if you are backing up your database since operations - like score (but not train or receive) could still be performed on - the database while the backup was running. - - -.SH SETUP OF SPAMPROBE FOR USERS - -Once you have a spamprobe executable copy it to someplace in your PATH -so that procmail can find it. Then create a directory for SpamProbe -to store its databases in. By default SpamProbe wants to use the -directory ~/.spamprobe. You must create this directory manually in -order to run SpamProbe or else specify some other directory using the --d option. Something like this should suffice: - - mkdir ~/.spamprobe - -SpamProbe can use either the PBL or Berkeley DB library for its -databases. Both are fast on local file systems but very slow over -NFS. Please ensure that your spamprobe directory is on a local file -system to ensure good performance. - -.SH NOTES USING HASH DATABASE - -SpamProbe can use a simple, fixed size hash data file as an -alternative to PBL or BDB. There are two advantages to the hash -format. The first is speed. In my experiments the hash file format -is around 2x the speed of PBL (ranged from 1.8x to 3.5x). The second -advantage is that the hash data file size is fixed. You choose a size -when you create the file and it never changes. File size can be -anywhere from 1-100 MB. You need to choose a size large enough to hold -your terms with room to spare. More on that later. - -The hash file format also has significant disadvantages. Because the -file size is fixed you must monitor the file to ensure that it does -not become overly full. When the file becomes more than half full -performance will suffer. Also the hash format does not store original -terms so you cannot use the dump command to learn what terms are -spammy or hammy in your database. Finally, the hash format is -imprecise. Hash collisions can cause the counts from different terms -to be mixed together which can reduce accuracy. 
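As a rough sizing aid, the "no more than half full" rule can be turned into a small Python sketch. It relies on the 12 bytes per term figure that these notes quote for the hash file and on the 1-100 MB limit. The function name estimate_hash_file_mb is hypothetical, and treating 1 MB as 10**6 bytes is an assumption; SpamProbe itself may round differently.

```python
BYTES_PER_TERM = 12   # per-term storage figure quoted in these notes
HEADROOM = 2          # leave room for twice as many terms (file at most half full)


def estimate_hash_file_mb(num_terms):
    """Estimate a hash file size in MB for num_terms terms (hypothetical helper)."""
    needed_bytes = num_terms * BYTES_PER_TERM * HEADROOM
    # Round up to whole megabytes; assumes 1 MB = 10**6 bytes.
    size_mb = max(1, -(-needed_bytes // 1_000_000))
    if size_mb > 100:
        raise ValueError("needs more than the 100 MB a hash file allows")
    return size_mb


print(estimate_hash_file_mb(2_200_000))
```

The result is the number to place between hash: and the directory in the -d option.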
- -To create a hash data file you add a prefix to the directory name in -the -d command line option. You can specify just the directory like -this: - - spamprobe -d hash:$HOME/.spamprobe - -or you can add a size in megabytes for the file like this: - - spamprobe -d hash:42:$HOME/.spamprobe - -The size is only used when a file is first created. SP auto detects -the size of an existing hash file. You need to allow enough space for -twice as many terms as you are likely to have in your file. In my -database I have 2.2 million terms. That required a database of about 53 -MB. SP uses 12 bytes per term in the hash file so you can estimate -the file size you'll need by multiplying the number of terms by 24. - -The hash format does not store the original terms. Instead it stores -the 32 bit hash code for each term. You can do just about anything -with a hash file that you could with a PBL file including -import/export, edit-term, cleanup, purge, etc. You can export -your PBL database and import it to build a hash file (note that you -cannot go the other direction) and you can export one hash file and -import into a new one to enlarge your file. - -.SH MAILDIR FORMAT - -SpamProbe will accept a maildir directory name anywhere that an Mbox -or MBX file name can be specified. When SpamProbe encounters a -Maildir mailbox (directory) name it will automatically process all of -the non-hidden files in the cur and new subdirectories of the mailbox. -There is no need to individually specify these subdirectories. - - -.SH GETTING STARTED - -SpamProbe is not a stand alone mail filter. It doesn't sort your mail -or split it into different mailboxes. Instead it relies on some other -program such as procmail to actually file your mail for you. What -SpamProbe does do is track the word counts in good and spam emails and -generate a score for each email that indicates whether or not it is -likely to be spam. 
Scores range from 0 to 1 with any score of 0.9 or -higher indicating a probable spam. - -Personally I use SpamProbe with procmail to filter my incoming email -into mail boxes. I have procmail score each inbound email using -SpamProbe and insert a special header into each email containing its -score. Then I have procmail move spams into a special mailbox. - -No spam filter is perfect and SpamProbe sometimes makes mistakes. To -correct those mistakes I have a special mailbox that I put undetected -spams into. I run SpamProbe periodically and have it reclassify any -emails in that mailbox as spam so that it will make a better guess the -next time around. - -This is not a procmail primer. You will need to ensure that you have -procmail and formail installed before you can use this technique. -Also I recommend that you read the procmail documentation so that you -can fully understand this example and adapt it to your own needs. -That having been said, my .procmailrc file looks like this: - - MAILDIR=$HOME/IMAP - - :0 c - saved - - :0 - SCORE=| /home/brian/bin/spamprobe train - :0 wf - | formail -I "X-SpamProbe: $SCORE" - :0 a: - *^X-SpamProbe: SPAM - spamprobe - -I use IMAP to fetch my email so my mailboxes all live in a directory -named IMAP on my mail server. - -NOTE: The first stanza copies all incoming emails into a special mbox -called saved. SpamProbe IS BETA SOFTWARE and though it works well for -me it is possible that it could somehow lose emails. Caution is -always a good idea. That having been said, with the procmailrc file -as shown above the worst that could happen if SpamProbe crashes is -that the email would not be scored properly and procmail would deliver -it to your inbox. Of course if procmail crashes all bets are off. - -The second stanza runs spamprobe in "train" mode to score the email, -classify it as either spam or good, and possibly update the database. 
-The train command tries to minimize the number of database updates by -only updating the database with terms from an incoming message if -there was insufficient confidence in the message's score. The train -command always updates the database on the first 1500 of each type -received. This ensures that sufficient email is classified to allow -the filter to operate reliably. - -The next stanza runs formail to add a custom header to the email -containing the SpamProbe score. The final stanza uses the contents of -the custom header to file detected spams into a special mbox named -spamprobe. - -As an alternative to using the train command, you can run spamprobe in -"receive" mode. In that mode SpamProbe scores the email and then -classifies it as either spam or good based on the score. It always -automatically adds the word counts for the email to the appropriate -database. This is essentially like running in score mode followed -immediately by either spam or good mode. It produces more database -I/O and a bigger database but ensures that every message has its terms -reflected in the database. Personally I use train mode. A sample -procmailrc file using the receive command looks like this: - - MAILDIR=$HOME/IMAP - - :0 c - saved - - :0 - SCORE=| /home/brian/bin/spamprobe receive - :0 wf - | formail -I "X-SpamProbe: $SCORE" - :0 a: - *^X-SpamProbe: SPAM - spamprobe - - -.SH MAKING CORRECTIONS - -SpamProbe is not perfect. It is able to detect over 99% of the spams -that I receive but some still slip through. To correct these missed -emails I run SpamProbe periodically and have it scan a special mbox. -Since I use IMAP to retrieve my emails I can simply drop undetected -spams into this mbox from my mail client. If you use POP or some -other system then you will need to find a way get the undetected spams -into a mbox that spamprobe can see. 
- -Periodically I run a script that scans three special mboxes to correct -errors in judgment: - - #!/bin/sh - - IMAPDIR=$HOME/IMAP - - spamprobe remove $IMAPDIR/remove - spamprobe good $IMAPDIR/nonspam - spamprobe spam $IMAPDIR/spam - spamprobe train-spam $IMAPDIR/spamprobe - -From this example you can see that I use three special mboxes to make -corrections. I copy emails that I don't want spamprobe to store into -the remove mbox. This is useful if you receive email from a friend or -colleague that looks like spam and you don't want it to dilute the -effectiveness of the terms it contains. - -Undetected spams go into the spam mbox. SpamProbe will reclassify -those emails as spam and correct its database accordingly. Note that -doing this does not guarantee that the spam will always be scored as -spam in the future. Some spams are too bland to detect perfectly. -Fortunately those are very rare. - -The nonspam mbox is for any false positives. These are always -possible and it is important to have a way to reclassify them when -they do occur. - -If you are using receive mode rather than train mode then the above -script can be modified to remove the train-spam line. For example: - - #!/bin/sh - - IMAPDIR=$HOME/IMAP - - spamprobe remove $IMAPDIR/remove - spamprobe good $IMAPDIR/nonspam - spamprobe spam $IMAPDIR/spam - -Finally you'll need to build a starting database. Since SpamProbe -relies on word counts from past emails it requires a decent sized -database to be accurate. To build the database select some of your -mboxes containing past emails. Ideally you should have one mbox of -spams and one or more of non-spams. If you don't have any spams handy -then don't worry, SpamProbe will gradually become more accurate as you -receive more spams. Expect a fairly high false negative (i.e. missed -spams) rate as you first start using SpamProbe. - -To import your starting messages use commands such as these. 
The -example assumes that you have non-spams stored in a file named mbox in -your home directory and some spams stored in a file named nasty-spams. -Replace these names with real ones. - - spamprobe good ~/mbox - spamprobe spam ~/nasty-spams - - -.SH SEE ALSO -procmail(1) +.PP +\fB\-a \fR\fB\fIchar\fR\fR +.RS 4 +By default +SpamProbe +converts non\-ascii characters (characters with the most significant bit set to 1) into the letter \*(Aqz\*(Aq\&. This is useful for lumping all Asian characters into a single word for easy recognition\&. The \-a option allows you to change the character to something else if you don\*(Aqt like the letter \*(Aqz\*(Aq for some reason\&. +.RE +.PP +\fB\-c\fR +.RS 4 +Tells +SpamProbe +to create the database directory if it does not already exist\&. Normally +SpamProbe +exits with a usage error if the database directory does not already exist\&. +.RE +.PP +\fB\-C \fR\fB\fInumber\fR\fR +.RS 4 +Tells +SpamProbe +to assign a default, somewhat neutral, probability to any term that does not have a weighted (good count doubled) count of at least +\fInumber \fRin the database\&. This prevents terms which have been seen only a few times from having an unreasonable influence on the score of an email containing them\&. +.sp +The default value is 5\&. For example if +\fInumber \fRis 5 then in order for a term to use its calculated probability it must have been seen 3 times in good mails, or 2 times in good mails and once in spam, or 5 times in spam, or some other combination adding up to at least 5\&. +.RE +.PP +\fB\-d \fR\fB\fI[type:]directory\fR\fR\fB \fR +.RS 4 +By default +SpamProbe +stores its database in a directory named +\&.spamprobe +under your home directory\&. The +\fB\-d\fR +option allows you to specify a different directory to use\&. This is necessary if your home directory is +NFS +mounted for example\&. +.sp +The directory name can be prefixed with a special code to force +SpamProbe +to use a particular type of data file format\&. 
+Defined types include: +.PP +\fB\-d bdb:path\fR +.RS 4 +Forces the use of a Berkeley DB data file\&. +.RE +.PP +\fB\-d hash:path\fR +.RS 4 +Forces the use of an mmapped hash file\&. +.RE +.PP +\fB\-d split:path\fR +.RS 4 +Forces the use of a hash file and ISAM file (may provide better precision than plain hash in some cases)\&. +.RE +.sp +The +\fIhash:\fR +option can also specify a desired file size in megabytes before the path\&. For example +\fB\-d hash:19:path\fR +would cause +SpamProbe +to use a 19 MB hash file\&. The size must be in the range of 1\-100\&. The default hash file size is 16 MB\&. Because hash files have a fixed size and capacity they should be cleaned relatively often using the +\fBcleanup\fR +command (see below) to prevent them from becoming full or being slowed by too many hash key collisions\&. +.sp +Hash files provide better performance than Berkeley DB\&. +However hash files do not store the original terms\&. Only a 32 bit hash key is stored with each term\&. This prevents a user from exploring the terms in the database using the dump command to see what words are particularly spammy or hammy\&. The default data file format is Berkeley DB (bdb)\&. +.RE +.PP +\fB\-D \fR\fB\fIdirectory\fR\fR +.RS 4 +Tells +SpamProbe +to use the database in the specified directory (must be different than the one specified with the +\fB\-d \fRoption) as a shared database from which to draw terms that are not defined in the user\*(Aqs own database\&. This can be used to provide a baseline database shared by all users on a system (in the +\fB\-D \fR +directory) and a private database unique to each user of the system ($HOME/\&.spamprobe +or +\fB\-d\fR +directory)\&. +.RE +.PP +\fB\-g \fR\fB\fIfieldname\fR\fR +.RS 4 +Tells +SpamProbe +what header to look for previous score and message digest in\&. Default is X\-SpamProbe\&. Field name is not case sensitive\&. Used by all commands except +\fBreceive\fR\&.
+.RE +.PP +\fB\-h\fR +.RS 4 +By default +SpamProbe +removes +HTML +markup from the text in emails to help avoid false positives\&. The +\fB\-h\fR +option allows you to override this behavior and force +SpamProbe +to include words from within +HTML +tags in its word counts\&. Note that +SpamProbe +always counts any URLs in hrefs within tags whether +\fB\-h\fR +is used or not\&. Use of this option is discouraged\&. It can increase the rate of spam detection slightly but unless the user receives a significant amount of +HTML +emails it also tends to increase the number of false positives\&. +.RE +.PP +\fB\-H \fR\fB\fIoption\fR\fR +.RS 4 +By default +SpamProbe +only scans a meaningful subset of headers from the email message when searching for words to score\&. The +\fB\-H\fR +option allows the user to specify additional headers to scan\&. Legal values are +\fBall\fR, +\fBnox\fR, +\fBnone\fR, or +\fBnormal\fR\&. +\fBall\fR +scans all headers, +\fBnox \fRscans all headers except those starting with X\-, +\fBnone \fRdoes not scan headers, and +\fBnormal \fRscans the normal set of headers\&. +.sp +In addition to those values you can also explicitly add a header to the list of headers to process by adding the header name in lower case preceded by a plus sign\&. Multiple headers can be specified by using multiple +\fB\-H \fRoptions\&. For example, to include only the +\fBFrom\fR +and +\fBReceived\fR +headers in your +\fBtrain\fR +command you could run +SpamProbe +as follows: +.sp +.if n \{\ +.RS 4 +.\} +.nf +spamprobe \-Hnone \-H+from \-H+received train +.fi +.if n \{\ +.RE +.\} +.sp +To process the normal set of headers but also add the SpamAssassin header X\-SpamStatus you could run +SpamProbe +as follows: +.sp +.if n \{\ +.RS 4 +.\} +.nf +spamprobe \-H+x\-spam\-status train +.fi +.if n \{\ +.RE +.\} +.sp +.RE +.PP +\fB\-l \fR\fB\fInumber\fR\fR +.RS 4 +Changes the spam probability threshold for emails from the default (\fB0\&.7\fR) to +\fInumber\fR\&. 
The number must be a value between 0 and 1\&. Generally the value should be above 0\&.5 to avoid a high false positive rate\&. Lower numbers tend to produce more false positives while higher numbers tend to reduce accuracy\&. +.RE +.PP +\fB\-m\fR +.RS 4 +Forces +SpamProbe +to use +mbox +format for reading emails in +\fBreceive\fR +mode\&. Normally +SpamProbe +assumes that the input to +\fBreceive\fR +mode contains a single message so it doesn\*(Aqt look for message breaks\&. +.RE +.PP +\fB\-M\fR +.RS 4 +Forces +SpamProbe +to treat the entire input as a single message\&. This ignores +\fBFrom\fR +lines and +\fBContent\-Length\fR +headers in the input\&. +.RE +.PP +\fB\-o \fR\fB\fIoption\fR\fR +.RS 4 +Enables special options by name\&. Currently the only special options are: +.PP +\fB\-o graham\fR +.RS 4 +Causes +SpamProbe +to emulate the filtering algorithm originally outlined in +[A Plan For Spam]\&. +.RE +.PP +\fB\-o honor\-status\-header\fR +.RS 4 +Causes +SpamProbe +to ignore messages if they have a Status: header containing a capital D\&. Some mail servers use this status to indicate a message that has been flagged for deletion but has not yet been purged from the file\&. +.sp +DO NOT use this option with the receive or train command in your procmailrc file! Doing so could allow spammers to bypass the filter\&. This option is meant to be used with the +\fBtrain\-spam\fR +and +\fBtrain\-good\fR +commands in scripts that periodically update the database\&. +.RE +.PP +\fB\-o orig\-score\fR +.RS 4 +Causes +SpamProbe +to use its original scoring algorithm that produces excellent results but tends to generate scores of either 0 or 1 for all messages\&. +.RE +.PP +\fB\-o suspicious\-tags\fR +.RS 4 +Causes +SpamProbe +to scan the contents of +\(lqsuspicious\(rq +tags for tokens rather than simply throwing them out\&. Currently only font tags are scanned but other tags may be added to this list in later versions\&. 
+.RE +.PP +\fB\-o tokenized\fR +.RS 4 +Causes +SpamProbe +to read tokens one per line rather than processing the input as mail format\&. This allows users to completely replace the standard +SpamProbe +tokenizer if they wish and instead use some external program as a tokenizer\&. +.sp +In this mode +SpamProbe +considers a blank line to indicate the end of one message\*(Aqs tokens and the start of a new message\*(Aqs tokens\&. +SpamProbe +computes a message digest based on the lines of text containing the tokens\&. +.RE +.sp +The +\fB\-o\fR +option can be used multiple times and all requested options will be applied\&. Note that some options might conflict with each other in which case the last option would take precedence\&. +.RE +.PP +\fB\-p \fR\fB\fInumber\fR\fR +.RS 4 +Changes the maximum number of words per phrase\&. Default value is two\&. Increasing the limit improves accuracy somewhat but increases database size\&. Experiments indicate that increasing beyond two is not worth the extra cost in space\&. +.RE +.PP +\fB\-P \fR\fB\fInumber\fR\fR +.RS 4 +Causes +SpamProbe +to perform a purge of all terms with junk count less than or equal 2 after every number messages are processed\&. Using this option when classifying a large collection of spam can prevent the database from growing overly large at the cost of more processing time and possible loss of precision\&. +.RE +.PP +\fB\-r \fR\fB\fInumber\fR\fR +.RS 4 +Changes the number of times that a single word/phrase can occur in the top words array used to calculate the score for each message\&. Allowing repeats reduces the number of words overall (since a single word occupies more than one slot) but allows words which occur frequently in the message to have a higher weight\&. Generally this is changed only for optimization purposes\&. +.RE +.PP +\fB\-R\fR +.RS 4 +Causes +SpamProbe +to treat the input as a single message and to base its exit code on whether or not that message was spam\&. 
The exit code will be 0 if the message was spam or 1 if the message was good\&. +.RE +.PP +\fB\-s \fR\fB\fInumber\fR\fR +.RS 4 +SpamProbe +maintains an in memory cache of the words it has seen in previous messages to reduce disk I/O and improve performance\&. By default the cache will contain the most recently accessed 2,500 terms\&. This number can be changed using the +\fB\-s\fR +option\&. Using a larger cache size will cause +SpamProbe +to use more memory and, potentially, to perform less database I/O\&. A value of zero causes +SpamProbe +to use 100,000 as the limit which effectively means that the cache will only be flushed at program exit (unless you have really enormous mailbox files)\&. The cache doesn\*(Aqt affect receive, dump, or export but has a significant impact on the others\&. +.RE +.PP +\fB\-T\fR +.RS 4 +Causes +SpamProbe +to write out the top terms associated with each message in addition to its normal output\&. Works with +\fBfind\-good\fR, +\fBfind\-spam\fR, and +\fBscore\fR\&. +.RE +.PP +\fB\-v\fR +.RS 4 +When it appears once on the command line this option tells +SpamProbe +to write verbose information during processing\&. When it appears twice on the command line this option tells +SpamProbe +to write debugging information to stderr\&. This can be useful for debugging or for seeing which terms +SpamProbe +used to score each email\&. +.RE +.PP +\fB\-V\fR +.RS 4 +Prints version and copyright information and then exits\&. +.RE +.PP +\fB\-w \fR\fB\fInumber\fR\fR +.RS 4 +Changes the number of most significant words/phrases used by +SpamProbe +to calculate the score for each message\&. Generally this is changed only for optimization purposes\&. +.RE +.PP +\fB\-x\fR +.RS 4 +Normally +SpamProbe +uses only a fixed number of top terms (as set by the +\fB\-w\fR +command line option) when scoring emails\&.
The +\fB\-x\fR +option can be used to allow the array to be extended past the max size if more terms are available with probabilities <= 0\&.1 or >= 0\&.9\&. +.RE +.PP +\fB\-X\fR +.RS 4 +An interesting variation on the scoring settings\&. Equivalent to using +\fB\-w5 \-r5 \-x\fR +so that generally only words with probabilities <= 0\&.1 or >= 0\&.9 are used and word frequencies in the email count heavily towards the score\&. Tests have shown that this setting tends to be safer (fewer false positives) and have higher recall (proper classification of spams previously scored as spam) although its predictive power isn\*(Aqt quite as good as the default settings\&. WARNING: This setting might work best with a fairly large corpus; it has not been tested with a small corpus so it might be very inaccurate with fewer than 1000 total messages\&. +.RE +.PP +\fB\-Y\fR +.RS 4 +Assume traditional Berkeley mailbox format, ignoring any Content\-Length: fields\&. +.RE +.PP +\fB\-7\fR +.RS 4 +Tells +SpamProbe +to ignore any characters with the most significant bit set to 1 instead of mapping them to the letter \*(Aqz\*(Aq\&. +.RE +.PP +\fB\-8\fR +.RS 4 +Tells +SpamProbe +to store all characters even if their most significant bit is set to 1\&. +.RE +.SH "COMMANDS" +.PP +SpamProbe +recognizes the following commands: +.PP +.PP +\fBspamprobe help\fR [ \fIcommand\fR ] +.RS 4 +With no arguments +SpamProbe +lists all of the valid commands\&. If one or more commands are specified after the word help, +SpamProbe +will print a more verbose description of each command\&. +.RE +.PP +\fBspamprobe create\-db\fR +.RS 4 +If no database currently exists +SpamProbe +will attempt to create one and then exit\&. This can be used to bootstrap a new installation\&.
Strictly speaking this command is not necessary since the +\fBtrain\-spam\fR, +\fBtrain\-good\fR, and +\fBauto\-train\fR +commands will also create a database if none already exists but some users like to create a database as a separate installation step\&. +.RE +.PP +\fBspamprobe create\-config\fR +.RS 4 +Writes a new configuration file named +spamprobe\&.hdl +into the database directory (normally +$HOME/\&.spamprobe)\&. Any existing configuration file will be overwritten so be sure to make a copy before invoking this command\&. +.RE +.PP +\fBspamprobe receive\fR [ \fIfilename\fR\&.\&.\&. ] +.RS 4 +Tells +SpamProbe +to read its standard input (or a file specified after the receive command) and score it using the current databases\&. Once the message has been scored the message is classified as either spam or non\-spam and its word counts are written to the appropriate database\&. The message\*(Aqs score is written to stdout along with a single word\&. For example: +.sp +.if n \{\ +.RS 4 +.\} +.nf +SPAM 0\&.9999999 595f0150587edd7b395691964069d7af +GOOD 0\&.0200000 595f0150587edd7b395691964069d7af + +.fi +.if n \{\ +.RE +.\} +.sp +The string of hex digits after the score is the message\*(Aqs +\(lqMD5\-digest\(rq, a 128 bit number which uniquely identifies the message\&. The digest is used by +SpamProbe +to recognize messages that it has processed previously so that it can keep its word counts consistent if the message is reclassified\&. +.sp +Using the +\fB\-T\fR +option additionally lists the terms used to produce the score along with their counts (number of times they were found in the message)\&. +.RE +.PP +\fBspamprobe train\fR [ \fIfilename\fR\&.\&.\&. ] +.RS 4 +Functionally identical to +\fBreceive\fR +except that the database is only modified if the message was +\(lqdifficult\(rq +to classify\&. In practice this can reduce the number of database updates to as little as 10% of messages received\&. +.RE +.PP +\fBspamprobe score\fR [ \fIfilename\fR\&.\&.\&. 
] +.RS 4 +Similar to receive except that the database is not modified in any way\&. +.RE +.PP +\fBspamprobe summarize\fR [ \fIfilename\fR\&.\&.\&. ] +.RS 4 +Similar to +\fBscore\fR +except that it prints a short summary and score for each message\&. This can be useful when testing\&. Using the +\fB\-T\fR +option additionally lists the terms used to produce the score along with their counts (number of times they were found in the message)\&. +.RE +.PP +\fBspamprobe find\-spam\fR [ \fIfilename\fR\&.\&.\&. ] +.RS 4 +Similar to +\fBscore\fR +except that it prints a short summary and score for each message that is determined to be spam\&. This can be useful when testing\&. Using the +\fB\-T\fR +option additionally lists the terms used to produce the score along with their counts (number of times they were found in the message)\&. +.RE +.PP +\fBspamprobe find\-good\fR [ \fIfilename\fR\&.\&.\&. ] +.RS 4 +Similar to +\fBscore\fR +except that it prints a short summary and score for each message that is determined to be good\&. This can be useful when testing\&. Using the +\fB\-T\fR +option additionally lists the terms used to produce the score along with their counts (number of times they were found in the message)\&. +.RE +.PP +\fBspamprobe auto\-train\fR { SPAM|GOOD \fIfilename\fR \&.\&.\&. } \&.\&.\&. +.RS 4 +Attempts to efficiently build a database from all of the named files\&. You may specify one or more file of each type\&. Prior to each set of file names you must include the word +\fBSPAM\fR +or +\fBGOOD\fR +to indicate what type of mail is contained in the files which follow on the command line\&. +.sp +The case of the +\fBSPAM\fR +and +\fBGOOD\fR +keywords is important\&. Any number of file names can be specified between the keywords\&. The command line format is very flexible\&. You can even use a find command in backticks to process whole directory trees of files\&. 
For example: +.sp +.if n \{\ +.RS 4 +.\} +.nf +spamprobe auto\-train SPAM spams/* GOOD `find hams \-type f` +.fi +.if n \{\ +.RE +.\} +.sp +SpamProbe +pre\-scans the files to determine how many emails of each type exist and then trains on hams and spams in a random sequence that balances the inflow of each type so that the train command can work most effectively\&. For example if you had 400 hams and 400 spams, auto\-train will generally process one spam, then one ham, etc\&. If you had 4000 spams and 400 hams then auto\-train will generally process 10 spams, then one ham, etc\&. +.sp +Since this command will likely take a long time to run it is often desirable to use it with the \-v option to see progress information as the messages are processed\&. +.sp +.if n \{\ +.RS 4 +.\} +.nf +spamprobe \-v auto\-train SPAM spams/* GOOD hams/* +.fi +.if n \{\ +.RE +.\} +.sp +.RE +.PP +\fBspamprobe good\fR [ \fIfilename\fR\&.\&.\&. ] +.RS 4 +Scans each file (or stdin if no file is specified) and reclassifies every email in the file as non\-spam\&. The databases are updated appropriately\&. Messages previously classified as good (recognized using their MD5 digest) are ignored\&. Messages previously classified as spam are reclassified as good\&. +.RE +.PP +\fBspamprobe train\-good\fR [ \fIfilename\fR\&.\&.\&. ] +.RS 4 +Functionally identical to +\fBgood\fR +command except that it only updates the database for messages that are either incorrectly classified (i\&.e\&. classified as spam) or are +\(lqdifficult\(rq +to classify\&. In practice this can reduce the amount of database updates to as little as 10% of messages\&. +.RE +.PP +\fBspamprobe spam\fR [ \fIfilename\fR\&.\&.\&. ] +.RS 4 +Scans each file (or stdin if no file is specified) and reclassifies every email in the file as spam\&. The databases are updated appropriately\&. Messages previously classified as spam (recognized using their MD5 digest of message ids) are ignored\&.
Messages previously classified as good are reclassified as spam\&. +.RE +.PP +\fBspamprobe train\-spam\fR [ \fIfilename\fR\&.\&.\&. ] +.RS 4 +Functionally identical to +\fBspam\fR +command except that it only updates the database for messages that are either incorrectly classified (i\&.e\&. classified as good) or are +\(lqdifficult\(rq +to classify\&. In practice this can reduce amount of database updates to as little as 10% of messages\&. +.RE +.PP +\fBspamprobe remove\fR [ \fIfilename\fR\&.\&.\&. ] +.RS 4 +Scans each file (or stdin if no file is specified) and removes its term counts from the database\&. Messages which are not in the database (recognized using their MD5 digest of message ids) are ignored\&. +.RE +.PP +\fBspamprobe cleanup\fR [ \fIjunk_count\fR [ \fImax_age\fR ] ] +.RS 4 +Scans the database and removes all terms with +\fIjunk_count\fR +or less (default 2) which have not had their counts modified in at least +\fImax_age\fR +days (default 7)\&. You can specify multiple count/age pairs on a single command line but must specify both a count and an age for all but the last count\&. This should be run periodically to keep the database from growing endlessly\&. +.RE +.PP +\fBspamprobe purge\fR [ \fIjunk_count\fR ] +.RS 4 +Similar to cleanup but forces the immediate deletion of all terms with total count less than +\fIjunk_count \fR +(default is 2) no matter how long it has been since they were modified (i\&.e\&. even if they were just added today)\&. This could be handy immediately after classifying a large mailbox of historical spam or good email to make room for the next batch\&. +.RE +.PP +\fBspamprobe purge\-terms\fR \fIregex\fR +.RS 4 +Similar to purge except that it removes from the database all terms which match the specified regular expression\&. Be careful with this command because it could remove many more terms than you expect\&. Use +\fBdump\fR +with the same +\fIregex\fR +before running this command to see exactly what will be deleted\&. 
+.RE +.PP +\fBspamprobe edit\-term\fR \fIterm\fR \fIgood_count\fR \fIspam_count\fR +.RS 4 +Can be used to specifically set the good and spam counts of a term\&. Whether this is truly useful is doubtful but it is provided for completeness sake\&. +.RE +.PP +\fBspamprobe dump\fR [ \fIregex\fR ] +.RS 4 +Prints the contents of the word counts database one word per line in human readable format with spam probability, good count, spam count, flags, and word in columns separated by whitespace\&. When given, the +\fIregex\fR +argument limits output to matching tokens\&. +.RE +.PP +\fBspamprobe tokenize\fR [ \fIfilename\fR ] +.RS 4 +Prints the tokens found in the file one word per line in human readable format with spam probability, good count, spam count, message count, and word in columns separated by whitespace\&. Terms are listed in the order in which they were encountered in the message\&. The standard unix sort command can be used to sort the terms as desired\&. +.RE +.PP +\fBspamprobe export\fR +.RS 4 +Similar to the +\fBdump\fR +command but prints the counts and words in a comma separated format with the words surrounded by double quotes\&. This can be more useful for importing into some databases\&. +.RE +.PP +\fBspamprobe import\fR +.RS 4 +Reads the specified files which must contain export data written by the +\fBexport\fR +command\&. The terms and counts from this file are added to the database\&. This can be used to convert a database from a prior version\&. 
+.RE +.SH "EXAMPLES" +.SS "External Tokenizers" +.PP +Assuming you have a tokenizer tokenize\&.pl, in your procmailrc file you could use: +.sp +.if n \{\ +.RS 4 +.\} +.nf +SCORE=| tokenize\&.pl | /usr/bin/spamprobe \-o tokenized train + +.fi +.if n \{\ +.RE +.\} +.sp +.SS "Querying Mailboxes" +.PP +To list all words from +\(lqmost good\(rq +to +\(lqleast good\(rq +use this command: +.sp +.if n \{\ +.RS 4 +.\} +.nf +spamprobe tokenize \fIfilename\fR | sort \-k 1n \-k 2nr + +.fi +.if n \{\ +.RE +.\} +.PP +To list all words from +\(lqmost spammy\(rq +to +\(lqleast spammy\(rq +use this command: +.sp +.if n \{\ +.RS 4 +.\} +.nf +spamprobe tokenize \fIfilename\fR | sort \-k 1nr \-k 3nr + +.fi +.if n \{\ +.RE +.\} +.sp +.SS "Querying The Database" +.PP +Use +\fBspamprobe dump\fR +to get a human readable list of tokens in +SpamProbe\*(Aqs database\&. +Berkeley DB +sorts terms alphabetically; piping output into the standard unix +\fBsort\fR(1) +command can be used to sort the terms as desired\&. +.PP +To list all words in +SpamProbe\*(Aqs database from +\(lqmost good\(rq +to +\(lqleast good\(rq +use this command: +.sp +.if n \{\ +.RS 4 +.\} +.nf +spamprobe dump | sort \-k 1n \-k 2nr + +.fi +.if n \{\ +.RE +.\} +.PP +To list all words from +\(lqmost spammy\(rq +to +\(lqleast spammy\(rq +use this command: +.sp +.if n \{\ +.RS 4 +.\} +.nf +spamprobe dump | sort \-k 1nr \-k 3nr + +.fi +.if n \{\ +.RE +.\} +.PP +Optionally you can specify a regular expression\&. If specified +SpamProbe +will only dump terms matching the regular expression\&. For example: +.sp +.if n \{\ +.RS 4 +.\} +.nf +spamprobe dump \*(Aqfinance\*(Aq +spamprobe dump \*(Aq\e\ebfinance\e\eb\*(Aq +spamprobe dump \*(AqHSubject_\&.*finance\*(Aq +.fi +.if n \{\ +.RE +.\} +.sp +.SH "DATABASE MAINTENANCE" +.PP +When no provision is taken, +SpamProbe\*(Aqs databases will constantly grow while classifying messages\&.
In order to remove old unused entries, you should run +\fBcleanup\fR +on a regular basis, most easily from +\fBcron\fR(1)\&. +.sp +.if n \{\ +.RS 4 +.\} +.nf +# daily at 00:03 +# remove entries with count <= 2 that haven\*(Aqt +# been touched during the last 2 weeks from +# spamprobe\*(Aqs database +3 0 * * * /usr/bin/spamprobe cleanup 2 14 +.fi +.if n \{\ +.RE +.\} +.PP +Alternatively you might want to use a much higher count (1000 in this example) for terms that have not been seen in roughly six months: +.sp +.if n \{\ +.RS 4 +.\} +.nf +3 0 * * * /home/brian/bin/spamprobe cleanup 1000 180 2 14 +.fi +.if n \{\ +.RE +.\} +.PP +Because of the way that +Berkeley DB +works the database file will not actually shrink, but newly added terms will be able to use the space previously occupied by any removed terms so that the file\*(Aqs growth should be significantly slower if this command is used\&. +.PP +To actually shrink the database you can build a new one using the +Berkeley DB +utility programs +\fBdb_dump\fR(1) +and +\fBdb_load\fR(1) +or the +SpamProbe +\fBimport\fR +and +\fBexport\fR +commands\&. For example: +.sp +.if n \{\ +.RS 4 +.\} +.nf +cd ~ +mkdir new\&.spamprobe +spamprobe export | spamprobe \-d ~/new\&.spamprobe import +mv \&.spamprobe old\&.spamprobe +mv new\&.spamprobe \&.spamprobe + +.fi +.if n \{\ +.RE +.\} +.PP +The +\fB\-P\fR +option can also be used to limit the rate of growth of the database when importing a large number of emails\&. For example if you want to classify 1000 emails and want +SpamProbe +to purge rare terms every 100 messages use a command such as: +.sp +.if n \{\ +.RS 4 +.\} +.nf +spamprobe \-P 100 good goodmailboxname + +.fi +.if n \{\ +.RE +.\} +.PP +Using +\fB\-P\fR +slows down the classification but can avoid the need to use the +\fBexport\fR/\fBimport\fR +trick\&. Note that +\fB\-P\fR +only makes sense when classifying a large number of messages\&. 
+.PP +You may want to force a particular word to be very spammy or extremely good: +.sp +.if n \{\ +.RS 4 +.\} +.nf +spamprobe edit\-term xanax 0 1000000 +spamprobe edit\-term debian 10000000 0 + +.fi +.if n \{\ +.RE +.\} +.sp +Be careful: pinning good terms, at least, tends to help spammers\&. +.SH "BUGS" +.PP +This manual page is still a work in progress\&. In particular, it lacks a description of which headers are processed with +\fB\-H normal\fR +and how terms are generated from headers, as well as a reference to the regex syntax applicable to the +\fBdump\fR +and +\fBpurge\-term\fR +commands\&. +.SH "FILES" +.PP +~/\&.spamprobe +.RS 4 +When not otherwise specified with the +\fB\-d \fR\fB\fIdirectory\fR\fR +option, +SpamProbe +stores its database files in this directory\&. +\fIIt does not automatically create database directories except when explicitly asked to by the \fR\fI\fB\-c\fR\fR\fI command line flag or the \fR\fI\fBcreate\-db\fR\fR\fI command\fR\&. If your home directory is +NFS +mounted, use a different directory on a local disk, since +Berkeley DB +performance suffers badly over +NFS\&. +.RE +.PP +~/\&.spamprobe/spamprobe\&.hdl +.RS 4 +Configuration file for +\fBspamprobe\fR\&. This file is optional\&. It can be initialized with all the default values by the +\fBcreate\-config\fR +command\&. +.RE +.SH "SEE ALSO" +.PP + +\fBprocmail\fR(1) , \fBmaildrop\fR(1) +.SH "AUTHOR" +.PP +SpamProbe +was written by Brian Burton and is published under the +QPL +(Q Public License)\&. +.PP +This manual page was compiled by Siggy Brentrup +bsb@debian\&.org +from the distributed one for the +Debian GNU/Linux +system but may be used by others\&. Permission is granted to copy, distribute and/or modify this document under the terms of the +GPL +version 2\&.
--- spamprobe-1.4d.orig/src/includes/MultiLineSubString.h +++ spamprobe-1.4d/src/includes/MultiLineSubString.h @@ -31,6 +31,7 @@ #ifndef _MultiLineSubString_h #define _MultiLineSubString_h +#include #include "AbstractMultiLineString.h" class MultiLineSubString : public AbstractMultiLineString --- spamprobe-1.4d.orig/src/includes/Ref.h +++ spamprobe-1.4d/src/includes/Ref.h @@ -189,7 +189,7 @@ CRef &operator=(const CRef &other) { - assign(other); + this->assign(other); return *this; } @@ -245,7 +245,7 @@ Ref &operator=(const Ref &other) { - assign(other); + this->assign(other); return *this; } --- spamprobe-1.4d.orig/src/includes/hash.h +++ spamprobe-1.4d/src/includes/hash.h @@ -10,11 +10,13 @@ #ifndef _jenkinshash_h #define _jenkinshash_h +#include + #ifdef __cplusplus extern "C" { #endif -typedef unsigned long int ub4; /* unsigned 4-byte quantities */ +typedef uint32_t ub4; /* unsigned 4-byte quantities */ typedef unsigned char ub1; /* unsigned 1-byte quantities */ #define hashsize(n) ((ub4)1<<(n)) --- spamprobe-1.4d.orig/src/includes/util.h +++ spamprobe-1.4d/src/includes/util.h @@ -42,6 +42,7 @@ #include #include #include +#include #include "Ptr.h" #include "Ref.h" --- spamprobe-1.4d.orig/src/includes/Buffer.h +++ spamprobe-1.4d/src/includes/Buffer.h @@ -32,6 +32,7 @@ #define _Buffer_h #include "Array.h" +#include // // Similar to Array but handles variable length. 
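Illustrative note, not part of the patch: the Ref.h and hash.h hunks above appear to address two separate portability problems. The changelog attributes the `this->` change to a build failure with gcc 4.7. The typical cause of that failure is that an unqualified call to a member of a dependent base class template is not looked up in that base, so the call has to be written as `this->assign(other)`. The `ub4` change replaces `unsigned long`, which is commonly 8 bytes on LP64 platforms such as amd64, with `uint32_t`, so that the typedef really is the 4-byte quantity its comment promises. The sketch below reproduces both points in isolation. `RefBase`, `DemoRef`, `old_ub4`, `old_ub4_bits` and `self_check` are hypothetical names chosen for illustration; only the `ub4` typedef mirrors the patched line, and the real class layout in Ref.h may differ.

```cpp
#include <cassert>
#include <climits>
#include <stdint.h>

typedef uint32_t ub4;           // mirrors the patched typedef: exactly 32 bits
typedef unsigned long old_ub4;  // pre-patch typedef: width depends on the platform ABI

// Hypothetical base class template standing in for whatever provides assign().
template <typename T>
class RefBase {
public:
    RefBase() : m_value() {}
    void assign(const RefBase &other) { m_value = other.m_value; }
    T m_value;
};

// Hypothetical derived template following the operator= pattern in the hunk.
template <typename T>
class DemoRef : public RefBase<T> {
public:
    DemoRef &operator=(const DemoRef &other) {
        // RefBase<T> is a dependent base, so an unqualified assign(other)
        // is not looked up there; this-> defers the lookup to instantiation.
        this->assign(other);
        return *this;
    }
};

// Width of the pre-patch typedef in bits (32 or 64 depending on the ABI).
inline unsigned old_ub4_bits() {
    return static_cast<unsigned>(sizeof(old_ub4) * CHAR_BIT);
}

// Returns true when both properties the patch relies on hold.
inline bool self_check() {
    assert(sizeof(ub4) * CHAR_BIT == 32);  // fixed width on every platform

    DemoRef<int> a, b;
    a.m_value = 7;
    b = a;                                 // goes through this->assign()

    ub4 wrap = 0xFFFFFFFFu;
    wrap += 1;                             // 32-bit unsigned arithmetic wraps to 0
    return b.m_value == 7 && wrap == 0;
}
```

On a typical LP64 Linux system `old_ub4_bits()` would report 64 while `ub4` stays at 32 bits, which is the kind of mismatch that can make stored 32-bit hash keys differ between architectures.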
--- spamprobe-1.4d.orig/src/hdl/HdlTokenizer.cc +++ spamprobe-1.4d/src/hdl/HdlTokenizer.cc @@ -28,6 +28,7 @@ // http://www.cooldevtools.com/qpl.html // +#include #include "AbstractCharReader.h" #include "HdlError.h" #include "HdlToken.h" --- spamprobe-1.4d.orig/src/spamprobe/Command_edit_term.cc +++ spamprobe-1.4d/src/spamprobe/Command_edit_term.cc @@ -28,6 +28,7 @@ // http://www.cooldevtools.com/qpl.html // +#include #include "SpamFilter.h" #include "FrequencyDB.h" #include "CommandConfig.h" --- spamprobe-1.4d.orig/src/spamprobe/Command_purge.cc +++ spamprobe-1.4d/src/spamprobe/Command_purge.cc @@ -28,6 +28,7 @@ // http://www.cooldevtools.com/qpl.html // +#include #include "CleanupManager.h" #include "SpamFilter.h" #include "FrequencyDB.h" --- spamprobe-1.4d.orig/src/spamprobe/Command_exec.cc +++ spamprobe-1.4d/src/spamprobe/Command_exec.cc @@ -28,6 +28,7 @@ // http://www.cooldevtools.com/qpl.html // +#include #include "SpamFilter.h" #include "CommandConfig.h" #include "ConfigManager.h" --- spamprobe-1.4d.orig/src/spamprobe/Command_import.cc +++ spamprobe-1.4d/src/spamprobe/Command_import.cc @@ -28,6 +28,7 @@ // http://www.cooldevtools.com/qpl.html // +#include #include #include "LineReader.h" #include "IstreamCharReader.h" --- spamprobe-1.4d.orig/src/spamprobe/spamprobe.cc +++ spamprobe-1.4d/src/spamprobe/spamprobe.cc @@ -28,6 +28,7 @@ // http://www.cooldevtools.com/qpl.html // +#include #include #include #include --- spamprobe-1.4d.orig/src/spamprobe/Command_cleanup.cc +++ spamprobe-1.4d/src/spamprobe/Command_cleanup.cc @@ -28,6 +28,7 @@ // http://www.cooldevtools.com/qpl.html // +#include #include "CleanupManager.h" #include "SpamFilter.h" #include "FrequencyDB.h" --- spamprobe-1.4d.orig/src/database/WordArray.h +++ spamprobe-1.4d/src/database/WordArray.h @@ -31,6 +31,8 @@ #ifndef _WordArray_h #define _WordArray_h +#include + class WordData; class WordArray @@ -47,7 +49,7 @@ FLAGS_SIZE = 2, }; - typedef unsigned long key_t; + typedef uint32_t key_t; void 
reset(char *buffer, int num_words); --- spamprobe-1.4d.orig/src/database/DatabaseConfig.cc +++ spamprobe-1.4d/src/database/DatabaseConfig.cc @@ -29,6 +29,7 @@ // #include +#include #include "File.h" #include "WordData.h" #include "FrequencyDBImpl.h" --- spamprobe-1.4d.orig/src/parser/AutoTrainMailMessageReader.cc +++ spamprobe-1.4d/src/parser/AutoTrainMailMessageReader.cc @@ -28,6 +28,7 @@ // http://www.cooldevtools.com/qpl.html // +#include #include "MailMessage.h" #include "AutoTrainMailMessageReader.h" --- spamprobe-1.4d.orig/src/parser/MailMessageReader.cc +++ spamprobe-1.4d/src/parser/MailMessageReader.cc @@ -28,6 +28,7 @@ // http://www.cooldevtools.com/qpl.html // +#include #include "RegularExpression.h" #include "MailMessage.h" #include "MailMessageList.h" --- spamprobe-1.4d.orig/src/parser/MbxMailMessageReader.cc +++ spamprobe-1.4d/src/parser/MbxMailMessageReader.cc @@ -28,6 +28,7 @@ // http://www.cooldevtools.com/qpl.html // +#include #include #include "MailMessage.h" #include "MailMessageList.h" --- spamprobe-1.4d.orig/src/parser/PngParser.cc +++ spamprobe-1.4d/src/parser/PngParser.cc @@ -37,6 +37,14 @@ #include "StringReader.h" #include "PngParser.h" +#ifndef int_p_NULL +#define int_p_NULL NULL +#endif + +#ifndef png_infopp_NULL +#define png_infopp_NULL NULL +#endif + PngParser::PngParser(Message *message, AbstractTokenizer *tokenizer, AbstractTokenReceiver *receiver, --- spamprobe-1.4d.orig/src/parser/HtmlTokenizer.cc +++ spamprobe-1.4d/src/parser/HtmlTokenizer.cc @@ -28,6 +28,7 @@ // http://www.cooldevtools.com/qpl.html // +#include #include "AbstractTokenReceiver.h" #include "StringReader.h" #include "RegularExpression.h" --- spamprobe-1.4d.orig/debian/README.Debian +++ spamprobe-1.4d/debian/README.Debian @@ -0,0 +1,54 @@ + Note For First Time Users + -------------------------- + Per default spamprobe uses the directory ~/.spamprobe to store + its database when no directory is specified with the -d option. 
+ IT DOES NOT AUTOMATICALLY CREATE DIRECTORIES when they don't + exist. Either create the directory where you want the + database to reside with mkdir before invoking spamprobe or + use spamprobe's -c command line flag. + + + Upgrading Notes + --------------- + + Since version 1.4d-8, this package uses Berkeley DB 5.1. Previous + versions of this package used Berkeley DB version 4.8 or earlier + databases for storing state. + + This change results in spamprobe choking when trying to modify + old databases (cf. Bug #440939). Section DATABASE MAINTENANCE + in the spamprobe(1) manual page outlines a manual upgrade path. + + Spamprobe allows each user to keep their own databases, which makes + a fully automatic upgrade path infeasible; again I apologize. + + To avoid this issue at the next Berkeley DB migration, converting to + another data file format, like 'hash', is recommended. See the + documentation of the -d option in the spamprobe(1) man page. + + + Configure Options Used + ---------------------- + + When building the package, spamprobe was configured with these + specific options: + + default-8bit - Use 8bit characters by default + cdb - Concurrent Berkeley DB access + gif - Processing of gif images + png - Processing of png images + jpeg - Processing of jpeg images + + Should you experience problems with the latter option, please file + a bug report using a tool like reportbug. + + + Manual Page + ----------- + + Upstream's manual page spamprobe(1) is more or less a reformatting of the + README coming with the tarball, including marketing blurb, build + and installation instructions, which makes it annoying to + use for reference. Debian provides a docbook-xml based version + compiled from the original by Siggy Brentrup (previous maintainer + of this Debian package). --- spamprobe-1.4d.orig/debian/copyright +++ spamprobe-1.4d/debian/copyright @@ -0,0 +1,143 @@ +This package was debianized by Eric Dorland on +Thu, 12 Dec 2002 01:36:07 -0500.
+It was maintained by Siggy Brentrup from April, 6th +2004 to March, 10th 2005. + +It is now maintained by Nicolas Duboc . + +It was downloaded from http://spamprobe.sourceforge.net/ + +Upstream Author: Brian Burton + +Copyright (C) 2002,2003,2004,2005,2006 Burton Computer Corporation ALL +RIGHTS RESERVED + +This program is open source software; you can redistribute it and/or +modify it under the terms of the Q Public License (QPL) version +1.0. Use of this software in whole or in part, including linking it +(modified or unmodified) into other programs is subject to the terms +of the QPL. + +This program is distributed in the hope that it will be useful, but +WITHOUT ANY WARRANTY; without even the implied warranty of +MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the Q Public +License for more details. + +You should have received a copy of the Q Public License along with +this program; see the file LICENSE.txt. If not, visit the Burton +Computer Corporation or CoolDevTools web site QPL pages at: + + http://www.burton-computer.com/qpl.html + +====================================================================== + +The Q Public License Version 1.0 +Copyright (C) 1999 Trolltech AS, Norway. + +Everyone is permitted to copy and distribute this license document. + +The intent of this license is to establish freedom to share and change +the software regulated by this license under the open source model. + +This license applies to any software containing a notice placed by the +copyright holder saying that it may be distributed under the terms of +the Q Public License version 1.0. Such software is herein referred to +as the Software. This license covers modification and distribution of +the Software, use of third-party application programs based on the +Software, and development of free software which uses the Software. + +Granted Rights +-------------- + +1. 
You are granted the non-exclusive rights set forth in this license + provided you agree to and comply with any and all conditions in + this license. Whole or partial distribution of the Software, or + software items that link with the Software, in any form signifies + acceptance of this license. + +2. You may copy and distribute the Software in unmodified form + provided that the entire package, including - but not restricted to + - copyright, trademark notices and disclaimers, as released by the + initial developer of the Software, is distributed. + +3. You may make modifications to the Software and distribute your + modifications, in a form that is separate from the Software, such + as patches. The following restrictions apply to modifications: + +a. Modifications must not alter or remove any copyright notices in the + Software. + +b. When modifications to the Software are released under this license, + a non-exclusive royalty-free right is granted to the initial + developer of the Software to distribute your modification in future + versions of the Software provided such versions remain available + under these terms in addition to any other license(s) of the + initial developer. + +4. You may distribute machine-executable forms of the Software or + machine-executable forms of modified versions of the Software, + provided that you meet these restrictions: + +a. You must include this license document in the distribution. + +b. You must ensure that all recipients of the machine-executable forms + are also able to receive the complete machine-readable source code + to the distributed Software, including all modifications, without + any charge beyond the costs of data transfer, and place prominent + notices in the distribution explaining this. + +c. You must ensure that all modifications included in the + machine-executable forms are available under the terms of this + license. + +5. 
You may use the original or modified versions of the Software to + compile, link and run application programs legally developed by you + or by others. + +6. You may develop application programs, reusable components and other + software items that link with the original or modified versions of + the Software. These items, when distributed, are subject to the + following requirements: + +a. You must ensure that all recipients of machine-executable forms of + these items are also able to receive and use the complete + machine-readable source code to the items without any charge beyond + the costs of data transfer. + +b. You must explicitly license all recipients of your items to use and + re-distribute original and modified versions of the items in both + machine-executable and source code forms. The recipients must be + able to do so without any charges whatsoever, and they must be able + to re-distribute to anyone they choose. + +c. If the items are not available to the general public, and the + initial developer of the Software requests a copy of the items, + then you must supply one. + + +Limitations of Liability +------------------------ + +In no event shall the initial developers or copyright holders be +liable for any damages whatsoever, including - but not restricted to - +lost revenue or profits or other direct, indirect, special, incidental +or consequential damages, even if they have been advised of the +possibility of such damages, except to the extent invariable law, if +any, provides otherwise. + + +No Warranty +----------- + +The Software and this license document are provided AS IS with NO +WARRANTY OF ANY KIND, INCLUDING THE WARRANTY OF DESIGN, +MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. + + +Choice of Law +------------- + +This license is governed by the Laws of the state of Maryland. +Disputes shall be settled by a court selected by Burton Computer +Corporation. 
+ --- spamprobe-1.4d.orig/debian/postrm +++ spamprobe-1.4d/debian/postrm @@ -0,0 +1,7 @@ +#!/bin/sh +# $URL$ +# $Id$ + +set -e + +#DEBHELPER# --- spamprobe-1.4d.orig/debian/config +++ spamprobe-1.4d/debian/config @@ -0,0 +1,12 @@ +#!/bin/sh + +set -e + +. /usr/share/debconf/confmodule + +if [ "$1" = "configure" ] && [ ! -z $2 ]; then + if dpkg --compare-versions "$2" lt "1.4d-8" ; then + db_input critical spamprobe/db51_upgrade || true + db_go || true + fi +fi --- spamprobe-1.4d.orig/debian/templates +++ spamprobe-1.4d/debian/templates @@ -0,0 +1,27 @@ +# These templates have been reviewed by the debian-l10n-english +# team +# +# If modifications/additions/rewording are needed, please ask +# debian-l10n-english@lists.debian.org for advice. +# +# Even minor modifications require translation updates and such +# changes should be coordinated with translators and reviewers. + +Template: spamprobe/db51_upgrade +Type: note +_Description: Upgrading to Berkeley DB 5.1 + As of spamprobe 1.4d-8, the database format changed + to Berkeley DB 5.1 and spamprobe is no longer able to modify + databases using an older format. + . + Since there is no general way to locate all existing databases, no + automatic upgrade is attempted. A manual upgrade path using + spamprobe export/import is outlined in the 'DATABASE MAINTENANCE' section + of the spamprobe(1) manual page. + . + All spamprobe users on this system should be informed of this change + and advised to read the README.Debian file. + . + To avoid this issue at next Berkeley DB migration, converting to + another data file format, like 'hash', is recommended. See + documentation of the -d option in spamprobe(1) man page. --- spamprobe-1.4d.orig/debian/changelog +++ spamprobe-1.4d/debian/changelog @@ -0,0 +1,308 @@ +spamprobe (1.4d-12ubuntu1) trusty; urgency=low + + * Build-depend on version-less libdb-dev, to transition to db5.3. 
+ + -- Dmitrijs Ledkovs Mon, 04 Nov 2013 07:13:31 +0000 + +spamprobe (1.4d-12) unstable; urgency=low + + * Orphaning this package. + + -- Nicolas Duboc Sat, 20 Apr 2013 16:19:12 +0200 + +spamprobe (1.4d-11) unstable; urgency=low + + * Fix FTBFS with gcc 4.7 by using “this->” when needed + Patch from Cyril Brulebois (Closes: #667376). + * Bumped declaration of Debian policy compliance to 3.9.3 (nothing to do). + + -- Nicolas Duboc Tue, 10 Apr 2012 22:18:24 +0200 + +spamprobe (1.4d-10) unstable; urgency=low + + * Updated translation of debconf template: + * Dutch, by Jeroen Schot (closes: #623568) + * Danish, by Joe Hansen (closes: #645817 #608427) + * Japanese, by Hideki Yamane (closes: #645434) + * Czech, by Miroslav Kure (closes: #645855) + * Brazilian Portuguese, by Adriano Rafael Gomes (closes: #646186) + * Portuguese, by Rui Branco (closes: #646397) + * Spanish, by Omar Campagne (closes: #646576) + * Now build-depending on libjpeg-dev instead of libjpeg62-dev + to prepare libjpeg8 transition (closes: #644807) + * Idem for libpng-dev. + + -- Nicolas Duboc Mon, 31 Oct 2011 11:09:00 +0100 + +spamprobe (1.4d-9) unstable; urgency=low + + * Updated translations of debconf template: + * French, by Christian Perrier (closes: #640870) + * Russian, by Yuri Kozlov (closes: #641073) + * Swedish, by Martin Bagge (closes: #641238) + * German, by Helge Kreutzmann (closes: #641841) + * Integrated patch to make spamprobe compile with libpng1.5 + (closes: #641890) + + -- Nicolas Duboc Tue, 11 Oct 2011 14:25:57 +0200 + +spamprobe (1.4d-8) unstable; urgency=low + + * Upgraded to BerkeleyDB 5.1 (closes: #621438) + * Updated doc and debconf template accordingly. + * Added recommended build-arch and build-indep targets in debian/rules. + * fixed a grammatical error in NEWS.Debian + * Bumped declaration of Debian policy compliance to 3.9.2 (nothing to do). 
+ + -- Nicolas Duboc Sun, 04 Sep 2011 22:05:37 +0200 + +spamprobe (1.4d-7) unstable; urgency=low + + * Fixed spelling error in man page (closes: #580950) + * Updated translations of debconf template: + * French, by Christian Perrier (closes: #581183) + * German, by Helge Kreutzmann (closes: #581213) + * Russian, by Yuri Kozlov (closes: #581403) + * Vietnamese, by Clytie Siddall (closes: #581502) + * Italian, by Luca Monducci (closes: #581755) + * Portugese, by Pedro Ribeiro (closes: #581816) + * Czech, by Miroslav Kure (closes: #581992) + * Brazilian Portugese, by Adriano Rafael Gomes (closes: #582112) + * Japanese, by Hideki Yamane (closes: #582304) + * Spanish, by Omar Campagne (closes: #582629) + * Swedish, by Martin Bagge (closes: #582700) + + -- Nicolas Duboc Mon, 24 May 2010 14:38:10 +0200 + +spamprobe (1.4d-6) unstable; urgency=low + + * Now build-depends on libgif-dev instead of obsolete libungif4-dev + (closes: #575862) + * New Spanish debconf template translation (closes: #579204) + * Switched to libdb 4.8 (closes: #548488) + * Updated debconf master template for the libdb upgrade. + * Package maintainer scripts no longer ignore errors + * Added ${misc:Depends} on binary package dependency list + * Bumped debhelper compat version from 4 to 7 + * Added debian/source/format file, but still using 1.0 format for now + * Added Homepage field in control file + * Compliant with Debian policy 3.8.3 (no change) + * Fixed portability issue of hash database on amd64 (closes: #564643) + Patch from Jem Berkes and Francis Russell + + -- Nicolas Duboc Sun, 09 May 2010 10:20:33 +0200 + +spamprobe (1.4d-5) unstable; urgency=low + + * Updated Swedish translation of debconf template by Martin Bagge. + (closes: #491958) + + -- Nicolas Duboc Sat, 09 Aug 2008 16:17:15 +0200 + +spamprobe (1.4d-4) unstable; urgency=low + + * Fixed man page to give correct description of the dump sorting + command lines. 'sort by spamness' and 'sort by goodness' were inverted. 
+ (closes: #453772) + * Corrected typo in man page : 'pipeing' becomes 'piping'. + (closes: #453773) + * Japanese translation of debconf template by Hideki Yamane + (closes: #463312) + * Debian policy compliance upgraded to 3.7.3. + + -- Nicolas Duboc Sun, 11 May 2008 21:35:42 +0200 + +spamprobe (1.4d-3) unstable; urgency=low + + [ Nicolas Duboc ] + * Fixed the NEWS.Debian file syntax. + * Make spamprobe compile with (the current snapshot of) GCC 4.3 + (closes: #417706) + * Renamed the debconf template to prevent confusion with template + used in 0.9h-2. + + [ Christian Perrier ] + * Debconf templates and debian/control reviewed by the debian-l10n- + english team as part of the Smith review project. Closes: #444802 + * [Debconf translation updates] + * Czech. Closes: #445028 + * Galician. Closes: #445055 + * Vietnamese. Closes: #445131 + * Tamil. Closes: #445252 + * Finnish. Closes: #445697 + * Basque. Closes: #445952 + * Italian. Closes: #446156 + * German. Closes: #446279 + * Portuguese. Closes: #446303 + * Russian. Closes: #446654 + * French. Closes: #446979 + * Brazilian Portuguese. Closes: #447030 + + -- Nicolas Duboc Tue, 06 Nov 2007 21:14:23 +0100 + +spamprobe (1.4d-2) unstable; urgency=low + + * Fixed the po files for debconf templates after template change + in 1.4d-1. + * Added po-debconf template translations + - german translation by Matthias Julius (closes: #412773) + - portugueuse translation by Ricardo Silva (closes: #414928) + + -- Nicolas Duboc Mon, 10 Sep 2007 21:50:04 +0200 + +spamprobe (1.4d-1) unstable; urgency=low + + * New upstream release. + * Build with PNG, GIF and JPEG support. + * Acknowledge NMU: + - add watch file + - switch to libdb4.6 (closes: #421954) + - bump to Standards-Version 3.7.2 + * Add a debconf note about needed database upgrade (closes: #440939) + * Slightly updated package description. 
+ + -- Nicolas Duboc Sun, 10 Sep 2007 17:56:54 +0200 + +spamprobe (1.4b-2.1) unstable; urgency=low + + * NMU + * Bump to Standards-Version 3.7.2. + * Switch to db4.6. closes: #421954. + * Add watch file. + + -- Clint Adams Fri, 31 Aug 2007 14:14:40 -0400 + +spamprobe (1.4b-2) unstable; urgency=low + + * Made spamprobe compile with g++ 4.1 (closes: #357478) + (needed explicit inclusion of ) + + -- Nicolas Duboc Wed, 22 Feb 2006 12:53:40 +0100 + +spamprobe (1.4b-1) unstable; urgency=low + + * New upstream release (closes: #342596, #352501, #353685) + * Corrected changelog file to include the license declaration + * Added Swedish translation of debconf template by Daniel Nylander. + (closes: #332964) + * Build depends on libungif4-dev (for scanning of GIF attachments) + * Actually include the debian/NEWS file. + * Build man page in 'build' target (and not 'install') + * Removed the README.txt because the man page is a better reference + for the users. + * Updated description to advertise the GIF feature. + * Updated the man page. + + -- Nicolas Duboc Sun, 19 Feb 2006 14:16:42 +0100 + +spamprobe (1.2a-1) unstable; urgency=low + + * New upstream release. + * Debian policy compliance upgraded to 3.6.2. + * Removed dpatch system which was not used since previous version. + * Added Vietnamese translation of debconf template by Clytie Siddall. + (closes: #318699) + * Now depends on debconf | debconf-2.0. + * Removed technical detail ("C++") from the package description. + * Update of documentation (docbook man page and Debian files). + * Fixed bug preventing relative path with -d option. 
(closes: #246312) + + -- Nicolas Duboc Tue, 2 Aug 2005 12:37:56 +0200 + +spamprobe (1.0a-1) unstable; urgency=low + + * New upstream release (closes: #286313) + * New maintainer (closes: #298368) + * Acknowledging NMU changes (closes: #246794, #258703) + * Removed 10_configure.in_db4.2.dpatch and dependency on automake since + Siggy's patch has been integrated upstream + * spamprobe.db: in "DATABASE MAINTAINANCE" section, the export/import + sample now use -d option with absolute path to workaround bug + #246312 + + -- Nicolas Duboc Thu, 10 Mar 2005 22:24:05 +0100 + +spamprobe (0.9h-2.1) unstable; urgency=low + + * Non-maintainer upload for fixing longstanding l10n issues + * Translations: + - French added. Closes: #246794 + - Japanese added. Closes: #258703 + - Danish added. Thanks to Morten Brix Pedersen + - Russian added. Thanks to Yuri Kozlov + - Basque added. Thanks to Piarres Beobide Egana + - Dutch added. Thanks to Luk Claes + * Lintian fixes: + - Remove initial article in package description + + -- Christian Perrier Thu, 20 Jan 2005 06:56:20 +0100 + +spamprobe (0.9h-2) unstable; urgency=medium + + * Build-depend on xmlto when using it (closes: #242699)! + * README.Debian: prominent notice regarding -c. + * Debconf template warning about Berkeley DB upgrade (closes: #242630). + * Add empty postinst to force db_update warning even when installing + with dpkg. + + -- Siggy Brentrup Thu, 8 Apr 2004 11:39:14 +0200 + +spamprobe (0.9h-1) unstable; urgency=low + + * New upstream release (Closes: #234682). + * New maintainer (Closes: #235887). + * debian/rules: Remove cdbs until dpatch is supported (Bug #241504). + * Build depend on dpatch, docbook-to-man, automake1.8. + * 10_configure.in_db4.2.dpatch: make selection of Berkeley DB version + depending on which libdb*-dev package is installed work. + * debian/README.Debian: provide some packaging info. + * debian/spamprobe.db: compile usable man page from upstream version. 
+ + -- Siggy Brentrup Tue, 6 Apr 2004 20:22:27 +0200 + +spamprobe (0.9e-1) unstable; urgency=low + + * New upstream release. + * debian/rocks: Actually remove this time. + + -- Eric Dorland Mon, 22 Sep 2003 23:15:14 -0400 + +spamprobe (0.9b-1) unstable; urgency=low + + * New upstream release. + * debian/rules: Use cdbs. + * debian/control: + - Build-Depend on cdbs. + - Upgrade Standard-Version to 3.6.1. + * debian/rocks: Remove, no longer necessary. + + -- Eric Dorland Sun, 24 Aug 2003 16:40:48 -0400 + +spamprobe (0.8b-1) unstable; urgency=low + + * New upstream release. + * debian/rules: Update to version 1.52 of CBS. + * spamprobe.1: Add .br to procmail example. (Closes: #180376) + + -- Eric Dorland Mon, 10 Feb 2003 01:41:57 -0500 + +spamprobe (0.8-1) unstable; urgency=low + + * New upstream release. + * Update to version 1.49 of Colin's Build System. + + -- Eric Dorland Sun, 5 Jan 2003 20:28:12 -0500 + +spamprobe (0.7g-2) unstable; urgency=low + + * debian/docs: Add maildrop doc. + * debian/control: Add build-dep on libdb3-dev. (Closes: #174015) + + -- Eric Dorland Mon, 23 Dec 2002 01:17:03 -0500 + +spamprobe (0.7g-1) unstable; urgency=low + + * Initial Release. + + -- Eric Dorland Thu, 12 Dec 2002 01:36:07 -0500 + --- spamprobe-1.4d.orig/debian/postinst +++ spamprobe-1.4d/debian/postinst @@ -0,0 +1,9 @@ +#! /bin/sh +# $URL: svn+bsb://svn.winnegan.de/svn/spamprobe/trunk/debian/postinst $ +# $Id: postinst 28 2004-04-08 09:39:53Z bsb $ + +set -e + +. 
/usr/share/debconf/confmodule + +#DEBHELPER# --- spamprobe-1.4d.orig/debian/control +++ spamprobe-1.4d/debian/control @@ -0,0 +1,21 @@ +Source: spamprobe +Section: mail +Priority: optional +Maintainer: Ubuntu Developers +XSBC-Original-Maintainer: Debian QA Group +Build-Depends: debhelper (>> 7.0.0), po-debconf, libdb-dev, xmlto, libgif-dev, libpng-dev, libjpeg-dev +Standards-Version: 3.9.3 +Homepage: http://spamprobe.sf.net/ + +Package: spamprobe +Architecture: any +Recommends: procmail | maildrop +Depends: debconf | debconf-2.0, ${misc:Depends}, ${shlibs:Depends} +Description: Bayesian spam filter + This package provides a spam filter based on the article 'A Plan for Spam' + by Paul Graham. It uses a database (either BerkeleyDB or a simpler hash + file) to store one- and two-word phrases. Only certain headers are analyzed + and HTML tags are ignored to prevent false positives of legitimate HTML + emails. Image attachments are considered as words that can signal spam. It + can be simply integrated with procmail or maildrop to filter spam on + incoming mail. --- spamprobe-1.4d.orig/debian/rules +++ spamprobe-1.4d/debian/rules @@ -0,0 +1,83 @@ +#!/usr/bin/make -f + +package=spamprobe +PACKAGE=$(package) + +binaries=spamprobe +manpages=spamprobe.1 + +#export DH_VERBOSE=1 + +SHELL=/bin/bash + +DESTDIR=$$(pwd)/debian/$(package) + +ifneq (,$(findstring debug,$(DEB_BUILD_OPTIONS))) +CFLAGS += -g +endif +ifeq (,$(findstring nostrip,$(DEB_BUILD_OPTIONS))) +INSTALL_PROGRAM += -s +endif + +build-arch: Makefile debian/po/templates.pot manpage + $(MAKE) + touch build-arch + +debian/po/templates.pot: debian/templates + @debconf-updatepo + +manpage: debian/spamprobe.db + xmlto man debian/spamprobe.db + +Makefile: + ./configure --prefix=/usr --enable-default-8bit --enable-cdb \ + --with-gif --with-png --with-jpeg + +clean: + $(checkdir) + [ ! -f Makefile ] || $(MAKE) distclean + rm -rf build + dh_clean + rm -rf $$(find . -name "*~") + find .. 
-name $(package)*dsc.asc -size 0 -maxdepth 1 -exec rm {} ";" + +build-indep: + +build: build-arch build-indep + +binary-indep: +# There are no architecture-independent files to be uploaded +# generated by this package. If there were any they would be +# made here. + + +binary-arch: checkroot build + $(checkdir) + + dh_installdirs + dh_installdocs + dh_installchangelogs ChangeLog + $(MAKE) install prefix=$(DESTDIR)/usr mandir=$(DESTDIR)/usr/share/man + dh_strip + dh_compress + dh_installdebconf + dh_installdeb + dh_shlibdeps + dh_gencontrol + dh_md5sums + dpkg --build debian/$(package) .. + + +define checkdir + test -f debian/rules +endef + +# Below here is fairly generic really + +binary: binary-indep binary-arch + +checkroot: + $(checkdir) + dh_testroot + +.PHONY: binary binary-arch binary-indep build build-indep clean checkroot manpage --- spamprobe-1.4d.orig/debian/watch +++ spamprobe-1.4d/debian/watch @@ -0,0 +1,3 @@ +version=3 + +http://sf.net/spamprobe/ spamprobe-(.+)\.tar\.gz --- spamprobe-1.4d.orig/debian/NEWS +++ spamprobe-1.4d/debian/NEWS @@ -0,0 +1,24 @@ +spamprobe (1.4b-1) unstable; urgency=low + + Spamprobe now scans image attachments to score mails. Properties of + the image are used as words that can be classified as spam or + not. That allows spamprobe to classify as spam mails including + attachments used in previous spams. + + This version includes support for Maildir mailboxes. Commands + accepting mailbox pathes now accept MBOX, MBX and Maildir boxes. + + Spamprobe may now use an optional configuration file. The new + command create-config is used to initialize this file. See man page. + + -- Nicolas Duboc Sun, 19 Feb 2006 19:26:00 +0200 + +spamprobe (0.9h-1) unstable; urgency=low + + * Defaults for character size have changed, if you want the old + behaviour, you have to specify -7 on the spamprobe command line. + * Use Berkeley DB 4.2 instead of 3. Existing databases are still + readable but must be manually upgraded using db4.2_upgrade. 
+ + -- Siggy Brentrup Fri, 2 Apr 2004 18:22:32 +0200 + --- spamprobe-1.4d.orig/debian/docs +++ spamprobe-1.4d/debian/docs @@ -0,0 +1 @@ +contrib/README-maildrop.txt --- spamprobe-1.4d.orig/debian/compat +++ spamprobe-1.4d/debian/compat @@ -0,0 +1 @@ +7 --- spamprobe-1.4d.orig/debian/spamprobe.db +++ spamprobe-1.4d/debian/spamprobe.db @@ -0,0 +1,947 @@ + +Siggy"> + Brentrup"> + April 7, 2004"> + 1"> + bsb@debian.org"> + + SPAMPROBE"> + + + Debian GNU/Linux"> + GNU"> + + GPL"> + QPL"> + SpamProbe"> + HTML"> + NFS"> + Berkeley DB"> +]> + + + + SPAMPROBE + 1 + April 6, 2004 + User commands + + + spamprobe + A Bayesian spam filter + + + + spamprobe + options + command + files ... + + + + DESCRIPTION + + &SP; is a spam filter relying on a Bayesian analysis of the + frequency of words used in spam and non-spam emails received by an + individual person. The process is completely automatic and + tailors itself to the kinds of emails that each person + receives. + + &SP; recognizes and decodes MIME attachments in quoted-printable + and base64 encoding. Image attachments are considered as words that + can signal a spam. By default, it ignores HTML tags for scoring + purpose. + + &SP; supports MBOX, MBX and Maildir mailbox formats. These + formats are automatically detected for mailboxes used as + parameters of &SP; commands. + + spamprobe is designed to be used in mail + delivery agents (MDAs) like + + procmail + 1 + or + + maildrop + 1 + to + help in identifying spam. + + + + OPTIONS + The recognized options are: + + + + By default &SP; converts non-ascii characters + (characters with the most significant bit set to 1) into the + letter 'z'. This is useful for lumping all Asian characters + into a single word for easy recognition. The -a option + allows you to change the character to something else if you + don't like the letter 'z' for some reason. + + + + Tells &SP; to create the database directory if + it does not already exist. 
Normally &SP; exits with a + usage error if the database directory does not already + exist. + + + + Tells &SP; to assign a default, somewhat + neutral, probability to any term that does not have a + weighted (good count doubled) count of at least + number in the database. This + prevents terms which have been seen only a few times from + having an unreasonable influence on the score of an email + containing them. + The default value is 5. For example if number + is 5 then in order for a term to use its + calculated probability it must have been seen 3 times in + good mails, or 2 times in good mails and once in spam, or 5 + times in spam, or some other combination adding up to at + least 5. + + + + By default &SP; stores its database in a + directory named .spamprobe under your + home directory. The option allows you + to specify a different directory to use. This is necessary + if your home directory is NFS mounted for + example. + + The directory name can be prefixed with a special code + to force &SP; to use a particular type of data file format. + + Defined types include: + + + + -d bdb:path + Forces the use of Berkeley DB data file. + + + -d hash:path + Forces the use of an mmapped hash + file. + + + -d split:path + Forces the use of a hash file and ISAM + file (may provide better precision than plain hash in + some cases). + + The hash: option + can also specify a desired file size in megabytes before the + path. For example -d hash:19:path would + cause &SP; to use a 19 MB hash file. The size must be in + the range of 1-100. The default hash file size is 16 MB. + Because hash files have a fixed size and capacity they + should be cleaned relatively often using the + cleanup command (see below) to prevent + them from becoming full or being slowed by too many hash key + collisions. + Hash files provide better performance than Berkeley + DB. However hash files + do not store the original terms. Only a 32 bit hash key is + stored with each term. 
This prevents a user from exploring + the terms in the database using the dump command to see what + words are particularly spammy or hammy. The default data + file format is Berkeley DB (bdb). + + + + + Tells &SP; to use the database in the specified + directory (must be different than the one specified with the + option) as a shared database from which + to draw terms that are not defined in the user's own + database. This can be used to provide a baseline database + shared by all users on a system (in the directory) and a private database unique to each + user of the system ($HOME/.spamprobe or + directory). + + + + + Tells &SP; what header to look for previous + score and message digest in. Default is X-SpamProbe. Field + name is not case sensitive. Used by all commands except + receive. + + + + By default &SP; removes &HTML; markup from the + text in emails to help avoid false positives. The + option allows you to override this + behavior and force &SP; to include words from within &HTML; tags + in its word counts. Note that &SP; always counts any + URLs in hrefs within tags whether is + used or not. Use of this option is discouraged. It can + increase the rate of spam detection slightly but unless the + user receives a significant amount of &HTML; emails it also + tends to increase the number of false + positives. + + + + By default &SP; only scans a meaningful + subset of headers from the email message when searching for + words to score. The option allows the + user to specify additional headers to scan. Legal values are + all, nox, + none, or normal. + all scans all headers, nox + scans all headers except those starting with X-, + none does not scan headers, and + normal scans the normal set of + headers. + In addition to those values you can also explicitly add + a header to the list of headers to process by adding the + header name in lower case preceded by a plus sign. Multiple + headers can be specified by using multiple options.
For example, to include only the + From and Received headers + in your train command you could run + &SP; as follows: + spamprobe -Hnone -H+from -H+received train + + To process the normal set of headers but also add the + SpamAssassin header X-Spam-Status you could run &SP; as + follows: + spamprobe -H+x-spam-status train + + + + + Changes the spam probability threshold for + emails from the default (0.7) to + number. The number must be a + value between 0 and 1. Generally the value + should be above 0.5 to avoid a high false positive rate. Lower + numbers tend to produce more false positives while higher + numbers tend to reduce accuracy. + + + + Forces &SP; to use + mbox format for reading emails in + receive mode. Normally &SP; assumes + that the input to receive mode contains a + single message so it doesn't look for message + breaks. + + + + Forces &SP; to treat the entire input as + a single message. This ignores From lines + and Content-Length headers in the input. + + + + + Enables special options by name. Currently + the only special options are: + + + + Causes &SP; to emulate the filtering + algorithm originally outlined in A Plan For + Spam. + + + + Causes &SP; to ignore messages if + they have a Status: header containing a capital D. Some + mail servers use this status to indicate a message that + has been flagged for deletion but has not yet been + purged from the file. + DO NOT use this option with the receive or train + command in your procmailrc file! Doing so could allow + spammers to bypass the filter. This option is meant to + be used with the train-spam and + train-good commands in scripts that + periodically update the database. + + + + Causes &SP; to use its original + scoring algorithm that produces excellent results but + tends to generate scores of either 0 or 1 for all + messages. + + + + Causes &SP; to scan the contents of + suspicious tags for tokens rather than simply throwing + them out.
Currently only font tags are scanned but + other tags may be added to this list in later + versions. + + + + + Causes &SP; to read tokens one per line rather + than processing the input as mail format. This allows + users to completely replace the standard &SP; + tokenizer if they wish and instead use some external + program as a tokenizer. + In this mode &SP; considers a blank line to + indicate the end of one message's tokens and the start + of a new message's tokens. &SP; computes a message + digest based on the lines of text containing the + tokens. + + + The option can be used multiple times and + all requested options will be applied. Note that some options + might conflict with each other in which case the last option + would take precedence. + + + + Changes the maximum number of words per + phrase. Default value is two. Increasing the limit + improves accuracy somewhat but increases database size. + Experiments indicate that increasing beyond two is not worth + the extra cost in space. + + + + Causes &SP; to perform a purge of all + terms with junk count less than or equal 2 after every + number messages are processed. Using this option when + classifying a large collection of spam can prevent the + database from growing overly large at the cost of more + processing time and possible loss of + precision. + + + + Changes the number of times that a single + word/phrase can occur in the top words array used to + calculate the score for each message. Allowing repeats + reduces the number of words overall (since a single word + occupies more than one slot) but allows words which occur + frequently in the message to have a higher weight. Generally + this is changed only for optimization + purposes. + + + + Causes &SP; to treat the input as a + single message and to base its exit code on whether or not + that message was spam. The exit code will be 0 if the + message was spam or 1 if the message was + good. 
+ + + + &SP; maintains an in-memory cache of the + words it has seen in previous messages to reduce disk I/O + and improve performance. By default the cache will contain + the most recently accessed 2,500 terms. This number can be + changed using the option. Using a larger + cache size + will cause &SP; to use more memory and, potentially, to + perform less database I/O. + A value of zero causes &SP; to use 100,000 as the limit which + effectively means that the cache will only be flushed at program + exit (unless you have really enormous mailbox files). The cache + doesn't affect receive, dump, or export but has a significant + impact on the others. + + + + Causes &SP; to write out the top terms + associated with each message in addition to its normal + output. Works with find-good, + find-spam, and + score. + + + + When it appears once on the command line this + option tells + &SP; to write verbose information during processing. When it + appears twice on the command line this option tells &SP; to + write debugging information to stderr. This can be useful for + debugging or for seeing which terms &SP; used to score each + email. + + + + Prints version and copyright information and + then exits. + + + + Changes the number of most significant + words/phrases used by &SP; to calculate the score for + each message. Generally this is changed only for + optimization purposes. + + + + Normally &SP; uses only a fixed number of + top terms (as set by the command line option) when + scoring emails. The option can be used to allow the + array to be extended past the max size if more terms are + available with probabilities <= 0.1 or >= 0.9. + + + + An interesting variation on the scoring + settings. Equivalent to using so that + generally only words with probabilities <= 0.1 or >= 0.9 are + used and word frequencies in the email count heavily towards + the score.
Tests have shown that this setting tends to be + safer (fewer false positives) and have higher recall (proper + classification of spams previously scored as spam) although + its predictive power isn't quite as good as the default + settings. WARNING: This setting might work best with a + fairly large corpus, it has not been tested with a small + corpus so it might be very inaccurate with fewer than 1000 + total messages. + + + + Assume traditional Berkeley mailbox format, + ignoring any Content-Length: fields. + + + + Tells &SP; to ignore any characters with + the most significant bit set to 1 instead of mapping them to + the letter 'z'. + + + + Tells &SP; to store all characters even + if their most significant bit is set to 1. + + + + + + COMMANDS + &SP; recognizes the following commands: + + + spamprobe help [ command ] + With no arguments &SP; lists all of the + valid commands. If one or more commands are specified after + the word help, &SP; will print a more verbose description of + each command. + + + + spamprobe create-db + If no database currently exists &SP; will + attempt to create one and then exit. This can be used to + bootstrap a new installation. Strictly speaking this + command is not necessary since the + train-spam, train-good, + and auto-train commands will also create a + database if none already exists but some users like to create + a database as a separate installation step. + + + + spamprobe create-config + Writes a new configuration file named + spamprobe.hdl into the database + directory (normally $HOME/.spamprobe). + Any existing configuration file will be overwritten so be + sure to make a copy before invoking this command. + + + + spamprobe receive [ filename... ] + Tells &SP; to read its standard input (or + a file specified after the receive command) and score it + using the current databases. 
Once the message has been + scored the message is classified as either spam or non-spam + and its word counts are written to the appropriate database. + The message's score is written to stdout along with a single + word. For example: + +SPAM 0.9999999 595f0150587edd7b395691964069d7af +GOOD 0.0200000 595f0150587edd7b395691964069d7af + + The string of hex digits after the score is the + message's MD5-digest, a 128 bit number + which uniquely identifies the message. The digest is used + by &SP; to recognize messages that it has processed + previously so that it can keep its word counts consistent + if the message is reclassified. + Using the option additionally + lists the terms used to produce the score along with their + counts (number of times they were found in the + message). + + + + spamprobe train [ filename... ] + + Functionally identical to receive except + that the database is only modified if the message was + difficult to classify. In practice + this can reduce the number of database updates to as + little as 10% of messages received. + + + + spamprobe score [ filename... ] + + Similar to receive except that the database is not + modified in any way. + + + + spamprobe summarize [ filename... ] + + Similar to score except that it prints + a short summary and score for each message. This can be + useful when testing. Using the option + additionally lists the terms used to produce the score + along with their counts (number of times they were found + in the message). + + + + spamprobe find-spam [ filename... ] + + Similar to score except that it + prints a short summary and score for each message that is + determined to be spam. This can be useful when testing. + Using the option additionally lists + the terms used to produce the score along with their + counts (number of times they were found in the + message). + + + + spamprobe find-good [ filename... 
] + + Similar to score except that it + prints a short summary and score for each message that is + determined to be good. This can be useful when testing. + Using the option additionally lists + the terms used to produce the score along with their + counts (number of times they were found in the + message). + + + + spamprobe auto-train { SPAM|GOOD + filename ... } ... + + Attempts to efficiently build a database from all of + the named files. You may specify one or more files of each + type. Prior to each set of file names you must include the + word SPAM or GOOD to + indicate what type of mail is contained in the files which + follow on the command line. + The case of the SPAM and + GOOD keywords is important. Any number + of file names can be specified between the keywords. The + command line format is very flexible. You can even use a + find command in backticks to process whole directory trees + of files. For example: + spamprobe auto-train SPAM spams/* GOOD `find hams -type f` + &SP; pre-scans the files to determine how many emails + of each type exist and then trains on hams and spams in a + random sequence that balances the inflow of each type so + that the train command can work most effectively. For + example if you had 400 hams and 400 spams, auto-train will + generally process one spam, then one ham, etc. If you had + 4000 spams and 400 hams then auto-train will generally + process 10 spams, then one ham, etc. + Since this command will likely take a long time to run + it is often desirable to use it with the -v option to see + progress information as the messages are processed. + spamprobe -v auto-train SPAM spams/* GOOD hams/* + + + + + spamprobe good [ filename... ] + + Scans each file (or stdin if no file is specified) and + reclassifies every email in the file as non-spam. The + databases are updated appropriately. Messages previously + classified as good (recognized using their MD5 digest) are + ignored.
Messages previously classified as spam are + reclassified as good. + + + + spamprobe train-good [ filename... ] + + Functionally identical to the good command + except that it only updates the database for messages that + are either incorrectly classified (i.e. classified as + spam) or are difficult to + classify. In practice this can reduce the amount of database + updates to as little as 10% of messages. + + + + spamprobe spam [ filename... ] + + Scans each file (or stdin if no file is specified) and + reclassifies every email in the file as spam. The + databases are updated appropriately. Messages previously + classified as spam (recognized using their MD5 digest of + message ids) are ignored. Messages previously classified + as good are reclassified as spam. + + + + spamprobe train-spam [ filename... ] + + Functionally identical to the spam command + except that it only updates the database for messages that + are either incorrectly classified (i.e. classified as + good) or are difficult to classify. In + practice this can reduce the amount of database updates to as + little as 10% of messages. + + + + spamprobe remove [ filename... ] + + Scans each file (or stdin if no file is specified) and + removes its term counts from the database. Messages which + are not in the database (recognized using their MD5 digest + of message ids) are ignored. + + + + spamprobe cleanup [ junk_count [ max_age ] ] + + Scans the database and removes all terms with + junk_count or less (default 2) + which have not had their counts modified in at least + max_age days (default 7). You + can specify multiple count/age pairs on a single command + line but must specify both a count and an age for all but + the last count. This should be run periodically to keep + the database from growing endlessly.
+ + + + spamprobe purge [ junk_count ] + + Similar to cleanup but forces the immediate deletion + of all terms with total count less than + junk_count (default is 2) no + matter how long it has been since they were modified + (i.e. even if they were just added today). This could be + handy immediately after classifying a large mailbox of + historical spam or good email to make room for the next + batch. + + + + spamprobe purge-terms regex + + Similar to purge except that it removes from the + database all terms which match the specified regular + expression. Be careful with this command because it could + remove many more terms than you expect. Use + dump with the same + regex before running this + command to see exactly what will be deleted. + + + + spamprobe edit-term term good_count spam_count + + Can be used to specifically set the good and spam + counts of a term. Whether this is truly useful is doubtful + but it is provided for completeness sake. + + + + + spamprobe dump [ regex ] + + Prints the contents of the word counts database one + word per line in human readable format with spam + probability, good count, spam count, flags, and word in + columns separated by whitespace. When given, the + regex argument limits output to + matching tokens. + + + + spamprobe tokenize [ filename ] + + Prints the tokens found in the file one word per line + in human readable format with spam probability, good + count, spam count, message count, and word in columns + separated by whitespace. Terms are listed in the order in + which they were encountered in the message. The standard + unix sort command can be used to sort the terms as + desired. + + + + spamprobe export + + Similar to the dump command but + prints the counts and words in a comma separated format + with the words surrounded by double quotes. This can be + more useful for importing into some databases. 
+ + + + spamprobe import + + Reads the specified files which must contain export + data written by the export command. + The terms and counts from this file are added to the + database. This can be used to convert a database from a + prior version. + + + + + + + EXAMPLES + + External Tokenizers + Assuming you have a tokenizer tokenize.pl, in your procmailrc file you could use: + +SCORE=| tokenize.pl | /usr/bin/spamprobe -o tokenized train + + + + + Querying Mailboxes + To list all words from most good to + least good use this command: + +spamprobe tokenize filename | sort -k 1n -k 2nr + + + To list all words from most spammy to + least spammy use this command: + +spamprobe tokenize filename | sort -k 1nr -k 3nr + + + + + Querying The Database + Use spamprobe dump to get a human + readable list of tokens in &SP;'s database. &BDB; sorts terms + alphabetically; piping output into the standard unix + + sort + 1 + command + can be used to sort the terms as desired. + To list all words in &SP;'s database from most good to + least good use this command: + +spamprobe dump | sort -k 1n -k 2nr + + To list all words from most spammy to + least spammy use this command: + +spamprobe dump | sort -k 1nr -k 3nr + + + Optionally you can specify a regular expression. If + specified &SP; will only dump terms matching the + regular expression. For example: + +spamprobe dump 'finance' +spamprobe dump '\\bfinance\\b' +spamprobe dump 'HSubject_.*finance' + + + + + + DATABASE MAINTENANCE + If no precautions are taken, &SP;'s databases will constantly + grow while classifying messages. In order to remove old unused + entries, you should run cleanup on a + regular basis, most easily from + + cron + 1 + .
+ # daily at 00:03 +# remove entries with count <= 2 that haven't +# been touched during the last 2 weeks from +# spamprobe's database +3 0 * * * /usr/bin/spamprobe cleanup 2 14 + + Alternatively you might want to use a much higher count + (1000 in this example) for terms that have not been seen in + roughly six months: + 3 0 * * * /home/brian/bin/spamprobe cleanup 1000 180 2 14 + + Because of the way that &BDB; works the + database file will not actually shrink, but newly added + terms will be able to use the space previously occupied by + any removed terms so that the file's growth should be + significantly slower if this command is used. + To actually shrink the database you can build a new + one using the &BDB; utility programs + + db_dump + 1 + and + + db_load + 1 + or + the &SP; import and + export commands. For example: + +cd ~ +mkdir new.spamprobe +spamprobe export | spamprobe -d ~/new.spamprobe import +mv .spamprobe old.spamprobe +mv new.spamprobe .spamprobe + + + The option can also be used to limit the + rate of growth of the database when importing a large number of + emails. For example if you want to classify 1000 emails and + want &SP; to purge rare terms every 100 messages use a command + such as: + +spamprobe -P 100 good goodmailboxname + + + Using slows down the classification but + can avoid the need to use the + export/import trick. + Note that only makes sense when + classifying a large number of messages. + + You may want to force a particular word to + be very spammy or extremely good: + +spamprobe edit-term xanax 0 1000000 +spamprobe edit-term debian 10000000 0 + + At least pinning good terms tends to help spammers. + + + + + BUGS + This manual page is still a work in progress. In particular + it's lacking a description of which headers are processed with + and how terms are generated from + headers as well as a reference to the regex syntax applicable to + dump and purge-terms commands.
+ + + + + FILES + + + ~/.spamprobe + When not otherwise specified with the + + option, &SP; stores its database files in this directory. + It does not automatically create database + directories except when explicitly asked to by the + command line flag or the + create-db command. If your + home directory is &NFS; mounted, use a different directory + on a local disk, since &BDB; performance suffers badly over + &NFS;. + + + + ~/.spamprobe/spamprobe.hdl + Configuration file for + spamprobe. This file is optional. It + can be initialized with all the default values by the + create-config command. + + + + + + SEE ALSO + + + + + procmail + 1 + + + + + maildrop + 1 + + + + + + + + AUTHOR + &SP; has been written by Brian Burton + <bburton@users.sourceforge.net> and is published under + the &QPL; (Qt Public License). + This manual page was compiled by &dhusername; + &dhemail; from the distributed one for the &debian; system + but may be used by others. Permission is granted to copy, + distribute and/or modify this document under the terms of the + &GPL; version 2. + + + + + --- spamprobe-1.4d.orig/debian/source/format +++ spamprobe-1.4d/debian/source/format @@ -0,0 +1 @@ +1.0 --- spamprobe-1.4d.orig/debian/po/vi.po +++ spamprobe-1.4d/debian/po/vi.po @@ -0,0 +1,74 @@ +# Vietnamese translation for SpamProbe. +# Copyright © 2010 Free Software Foundation, Inc. +# Clytie Siddall , 2005-2010. +# +msgid "" +msgstr "" +"Project-Id-Version: spamprobe 1.4d-7\n" +"Report-Msgid-Bugs-To: spamprobe@packages.debian.org\n" +"POT-Creation-Date: 2011-09-04 21:12+0200\n" +"PO-Revision-Date: 2010-05-13 19:39+0930\n" +"Last-Translator: Clytie Siddall \n" +"Language-Team: Vietnamese \n" +"Language: vi\n" +"MIME-Version: 1.0\n" +"Content-Type: text/plain; charset=UTF-8\n" +"Content-Transfer-Encoding: 8bit\n" +"Plural-Forms: nplurals=1; plural=0;\n" +"X-Generator: LocFactoryEditor 1.8\n" + +#. Type: note +#. 
Description +#: ../templates:2001 +#, fuzzy +#| msgid "Upgrading to Berkeley DB 4.8" +msgid "Upgrading to Berkeley DB 5.1" +msgstr "Đang nâng cấp lên cơ sở dữ liệu Berkeley DB 4.8" + +#. Type: note +#. Description +#: ../templates:2001 +#, fuzzy +#| msgid "" +#| "As of spamprobe 1.4d-6, the database format changed to Berkeley DB 4.8 " +#| "and spamprobe is no longer able to modify databases using an older format." +msgid "" +"As of spamprobe 1.4d-8, the database format changed to Berkeley DB 5.1 and " +"spamprobe is no longer able to modify databases using an older format." +msgstr "" +"Kể từ spamprobe phiên bản 1.4d-6, định dạng cơ sở dữ liệu đã thay đổi thành " +"Berkeley DB 4.8 nên spamprobe không còn có khả năng sửa đổi lại cơ sở dữ " +"liệu theo định dạng cũ." + +#. Type: note +#. Description +#: ../templates:2001 +msgid "" +"Since there is no general way to locate all existing databases, no automatic " +"upgrade is attempted. A manual upgrade path using spamprobe export/import is " +"outlined in the 'DATABASE MAINTENANCE' section of the spamprobe(1) manual " +"page." +msgstr "" +"Vì không có phương pháp chung để tìm mọi cơ sở dữ liệu đã có, không có tiến " +"trình tự động nâng cấp sẽ được thử chạy. Một đường dẫn nâng cấp bằng tay " +"dùng công cụ xuất/nhập của spamprobe được diễn tả trong phần 'DATABASE " +"MAINTENANCE' (duy trì cơ sở dữ liệu) của trang hướng dẫn spamprobe(1)." + +#. Type: note +#. Description +#: ../templates:2001 +msgid "" +"All spamprobe users on this system should be informed of this change and " +"advised to read the README.Debian file." +msgstr "" +"Mọi người dùng spamprobe trên hệ thống này đều nên nhận thông tin về thay " +"đổi này và đọc tập tin Đọc Đi (README.Debian)." + +#. Type: note +#. Description +#: ../templates:2001 +msgid "" +"To avoid this issue at next Berkeley DB migration, converting to another " +"data file format, like 'hash', is recommended. See documentation of the -d " +"option in spamprobe(1) man page." 
+msgstr "" --- spamprobe-1.4d.orig/debian/po/ta.po +++ spamprobe-1.4d/debian/po/ta.po @@ -0,0 +1,72 @@ +# translation of spamprobe.po to TAMIL +# Copyright (C) YEAR THE PACKAGE'S COPYRIGHT HOLDER +# This file is distributed under the same license as the PACKAGE package. +# +# Dr.T.Vasudevan , 2007. +msgid "" +msgstr "" +"Project-Id-Version: spamprobe\n" +"Report-Msgid-Bugs-To: spamprobe@packages.debian.org\n" +"POT-Creation-Date: 2011-09-04 21:12+0200\n" +"PO-Revision-Date: 2007-10-04 12:27+0530\n" +"Last-Translator: Dr.T.Vasudevan \n" +"Language-Team: TAMIL \n" +"Language: \n" +"MIME-Version: 1.0\n" +"Content-Type: text/plain; charset=UTF-8\n" +"Content-Transfer-Encoding: 8bit\n" +"X-Generator: KBabel 1.11.4\n" + +#. Type: note +#. Description +#: ../templates:2001 +#, fuzzy +#| msgid "Upgrading to Berkeley DB 4.6" +msgid "Upgrading to Berkeley DB 5.1" +msgstr "பெர்க்லேய் டிபி 4.6 க்கு மேம்படுத்துதல்" + +#. Type: note +#. Description +#: ../templates:2001 +#, fuzzy +#| msgid "" +#| "As of spamprobe 1.4d-1, the database format changed to Berkeley DB 4.6 " +#| "and spamprobe is no longer able to modify databases using an older format." +msgid "" +"As of spamprobe 1.4d-8, the database format changed to Berkeley DB 5.1 and " +"spamprobe is no longer able to modify databases using an older format." +msgstr "" +"ஸ்பாம்ப்ரோப் 1.4d-1 முதல் தரவுத்தளம் பெர்க்லேய் டிபி 4.6 க்கு மாற்றப்படுகிறது. பழைய " +"முறைமையை பயன்படுத்தும் தரவுத்தளத்தை மாற்ற ஸ்பாம்ப்ரோப் ஆல் இயலாது." + +#. Type: note +#. Description +#: ../templates:2001 +msgid "" +"Since there is no general way to locate all existing databases, no automatic " +"upgrade is attempted. A manual upgrade path using spamprobe export/import is " +"outlined in the 'DATABASE MAINTENANCE' section of the spamprobe(1) manual " +"page." +msgstr "" +"இருப்பில் உள்ள அனைத்து தரவுத்தளங்களையும் கண்டறிதல் அரிதாகையால் தானியங்கி மேம்படுத்தல் " +"முயற்சிக்கப்படவில்லை. 
கைமுறையாக ஸ்பாம்ப்ரோப் ஏற்றுமதி/இறக்குமதி ஐ பயன்படுத்தி மேம்படுத்த " +"வழியானது ஸ்பாம்ப்ரோப்(1) கையேட்டில் தரவுத்தள பராமரிப்பு பகுதியில் உள்ளது." + +#. Type: note +#. Description +#: ../templates:2001 +msgid "" +"All spamprobe users on this system should be informed of this change and " +"advised to read the README.Debian file." +msgstr "" +"இக் கணினியின் அனைத்து ஸ்பாம்ப்ரோப் பயனர்களுக்கும் இந்த மாறுதல் குறித்து தெரிவிக்கப்பட " +"வேண்டும். அவர்களை README.Debian கோப்பை படிக்க சொல்ல வேண்டும்." + +#. Type: note +#. Description +#: ../templates:2001 +msgid "" +"To avoid this issue at next Berkeley DB migration, converting to another " +"data file format, like 'hash', is recommended. See documentation of the -d " +"option in spamprobe(1) man page." +msgstr "" --- spamprobe-1.4d.orig/debian/po/eu.po +++ spamprobe-1.4d/debian/po/eu.po @@ -0,0 +1,82 @@ +# translation of spamprobe_0.9h-2_eu.po to Librezale.org +# translation of spamprobe_0.9h-2_templates.po to Librezale.org +# +# Translators, if you are not familiar with the PO format, gettext +# documentation is worth reading, especially sections dedicated to +# this format, e.g. by running: +# info -n '(gettext)PO Files' +# info -n '(gettext)Header Entry' +# Some information specific to po-debconf are available at +# /usr/share/doc/po-debconf/README-trans +# or http://www.debian.org/intl/l10n/po-debconf/README-trans# +# Developers do not need to manually edit POT or PO files. +# Piarres Beobide , 2005. +# +msgid "" +msgstr "" +"Project-Id-Version: spamprobe_0.9h-2_eu\n" +"Report-Msgid-Bugs-To: spamprobe@packages.debian.org\n" +"POT-Creation-Date: 2011-09-04 21:12+0200\n" +"PO-Revision-Date: 2005-01-17 22:55+0100\n" +"Last-Translator: Piarres Beobide \n" +"Language-Team: Librezale.org \n" +"Language: \n" +"MIME-Version: 1.0\n" +"Content-Type: text/plain; charset=UTF-8\n" +"Content-Transfer-Encoding: 8bit\n" +"X-Generator: KBabel 1.9.1\n" + +#. Type: note +#. 
Description +#: ../templates:2001 +#, fuzzy +#| msgid "Upgrading to Berkeley DB 4.6" +msgid "Upgrading to Berkeley DB 5.1" +msgstr "Berkeley DB 4.6-ra eguneratzen" + +#. Type: note +#. Description +#: ../templates:2001 +#, fuzzy +#| msgid "" +#| "As of spamprobe 1.4d-1, the database format changed to Berkeley DB 4.6 " +#| "and spamprobe is no longer able to modify databases using an older format." +msgid "" +"As of spamprobe 1.4d-8, the database format changed to Berkeley DB 5.1 and " +"spamprobe is no longer able to modify databases using an older format." +msgstr "" +"Spamprobe 1.4d-1 bertsiotik aurrera databasea Berkeley DB 4.6 formatura " +"aldatu da, spmaprobe ez da aurreko bertsioez sortutako databaseak " +"eguneratzeko gai izango aurrerantzean." + +#. Type: note +#. Description +#: ../templates:2001 +msgid "" +"Since there is no general way to locate all existing databases, no automatic " +"upgrade is attempted. A manual upgrade path using spamprobe export/import is " +"outlined in the 'DATABASE MAINTENANCE' section of the spamprobe(1) manual " +"page." +msgstr "" +"Ez dagoenez databaseak gordetzeko bide lehenetsirik, ez da eguneraketa " +"automatikorik egingo. Spamprobe erabiliaz eskuz eguneratzeko modu bat " +"spamprobe(1) manual orrialdeko 'DATABASE MAINTENANCE' atalean aurki dezakezu." + +#. Type: note +#. Description +#: ../templates:2001 +msgid "" +"All spamprobe users on this system should be informed of this change and " +"advised to read the README.Debian file." +msgstr "" +"Sistema honetako spamprobe erabiltzaile guztiak aldaketa honetaz ohartu eta " +"README.Debian fitxategia irakurtzea gonbidatu beharko lirateke." + +#. Type: note +#. Description +#: ../templates:2001 +msgid "" +"To avoid this issue at next Berkeley DB migration, converting to another " +"data file format, like 'hash', is recommended. See documentation of the -d " +"option in spamprobe(1) man page." 
+msgstr "" --- spamprobe-1.4d.orig/debian/po/it.po +++ spamprobe-1.4d/debian/po/it.po @@ -0,0 +1,75 @@ +# Italian (it) translation of debconf templates for spamprobe +# Copyright (C) 2007 Free Software Foundation, Inc. +# This file is distributed under the same license as the spamprobe package. +# Luca Monducci , 2007-2010. +# +msgid "" +msgstr "" +"Project-Id-Version: spamprobe 1.4d-7 italian debconf templates\n" +"Report-Msgid-Bugs-To: spamprobe@packages.debian.org\n" +"POT-Creation-Date: 2011-09-04 21:12+0200\n" +"PO-Revision-Date: 2010-05-15 17:07+0200\n" +"Last-Translator: Luca Monducci \n" +"Language-Team: Italian \n" +"Language: it\n" +"MIME-Version: 1.0\n" +"Content-Type: text/plain; charset=UTF-8\n" +"Content-Transfer-Encoding: 8bit\n" + +#. Type: note +#. Description +#: ../templates:2001 +#, fuzzy +#| msgid "Upgrading to Berkeley DB 4.8" +msgid "Upgrading to Berkeley DB 5.1" +msgstr "Aggiornamento a Berkeley DB 4.8" + +#. Type: note +#. Description +#: ../templates:2001 +#, fuzzy +#| msgid "" +#| "As of spamprobe 1.4d-6, the database format changed to Berkeley DB 4.8 " +#| "and spamprobe is no longer able to modify databases using an older format." +msgid "" +"As of spamprobe 1.4d-8, the database format changed to Berkeley DB 5.1 and " +"spamprobe is no longer able to modify databases using an older format." +msgstr "" +"Dalla versione 1.4d-6 di spamprobe il formato del database è diventato " +"Berkeley DB 4.8 e spamprobe non è più in grado di modificare database nel " +"vecchio formato." + +#. Type: note +#. Description +#: ../templates:2001 +msgid "" +"Since there is no general way to locate all existing databases, no automatic " +"upgrade is attempted. A manual upgrade path using spamprobe export/import is " +"outlined in the 'DATABASE MAINTENANCE' section of the spamprobe(1) manual " +"page." +msgstr "" +"Dato che non esiste un modo per localizzare tutti i database esistenti, non " +"è possibile effettuare un aggiornamento automatico. 
Come effettuare un " +"aggiornamento manuale tramite le operazioni export/import di spamprobe è " +"descritto nella sezione \"MANUTENZIONE DEL DATABASE\" della pagina man di " +"spamprobe(1)." + +#. Type: note +#. Description +#: ../templates:2001 +msgid "" +"All spamprobe users on this system should be informed of this change and " +"advised to read the README.Debian file." +msgstr "" +"Tutti gli utenti di questo sistema che utilizzano spamprobe devono essere " +"informati di questa modifica e devono essere invitati a leggere il file " +"README.Debian." + +#. Type: note +#. Description +#: ../templates:2001 +msgid "" +"To avoid this issue at next Berkeley DB migration, converting to another " +"data file format, like 'hash', is recommended. See documentation of the -d " +"option in spamprobe(1) man page." +msgstr "" --- spamprobe-1.4d.orig/debian/po/sv.po +++ spamprobe-1.4d/debian/po/sv.po @@ -0,0 +1,79 @@ +# Translation of spamprobe debconf template to Swedish +# Copyright (C) 2010 Martin Bagge +# This file is distributed under the same license as the spamprobe package. +# +# Martin Bagge , 2008, 2010, 2011 +msgid "" +msgstr "" +"Project-Id-Version: spamprobe 1.2a-1\n" +"Report-Msgid-Bugs-To: spamprobe@packages.debian.org\n" +"POT-Creation-Date: 2011-09-04 21:12+0200\n" +"PO-Revision-Date: 2011-09-12 01:20+0100\n" +"Last-Translator: Martin Bagge / brother \n" +"Language-Team: Swedish \n" +"Language: sv\n" +"MIME-Version: 1.0\n" +"Content-Type: text/plain; charset=utf-8\n" +"Content-Transfer-Encoding: 8bit\n" +"X-Poedit-Language: Swedish\n" +"X-Poedit-Country: Sweden\n" + +#. Type: note +#. Description +#: ../templates:2001 +msgid "Upgrading to Berkeley DB 5.1" +msgstr "Uppgraderar till Berkeley DB 5.1" + +#. Type: note +#. Description +#: ../templates:2001 +msgid "" +"As of spamprobe 1.4d-8, the database format changed to Berkeley DB 5.1 and " +"spamprobe is no longer able to modify databases using an older format." 
+msgstr "" +"Från och med spamprobe 1.4d-8 har databasformatet ändrats till Berkeley DB " +"5.1 och spamprobe kan inte modifiera existerande databaser i äldre format." + +#. Type: note +#. Description +#: ../templates:2001 +msgid "" +"Since there is no general way to locate all existing databases, no automatic " +"upgrade is attempted. A manual upgrade path using spamprobe export/import is " +"outlined in the 'DATABASE MAINTENANCE' section of the spamprobe(1) manual " +"page." +msgstr "" +"Eftersom det inte finns något standardsätt att hitta alla databaser kommer " +"ingen automatisk uppgradering att göras. En manuell uppgradering med " +"spamprobes export/import är angiven i manualsidan spamprobe(1) under " +"kapitlet 'DATABASE MAINTENANCE'." + +#. Type: note +#. Description +#: ../templates:2001 +msgid "" +"All spamprobe users on this system should be informed of this change and " +"advised to read the README.Debian file." +msgstr "" +"Alla som använder spamprobe på detta systemet bör informeras om förändringen " +"och uppmanas att läsa i filen README.Debian." + +#. Type: note +#. Description +#: ../templates:2001 +msgid "" +"To avoid this issue at next Berkeley DB migration, converting to another " +"data file format, like 'hash', is recommended. See documentation of the -d " +"option in spamprobe(1) man page." +msgstr "" +"För att undvika detta problem nästa gång när Berkeley DB-version ska " +"uppgraderas rekommenderas att byta dataformat, exempelvis till \"hash\". Läs " +"om flaggan -d i manualsidan för spamprobe(1)." + +#~ msgid "" +#~ "Please inform all spamprobe users on your system of this change and to " +#~ "read README.Debian for further changes. Sorry for the inconvenience." +#~ msgstr "" +#~ "Vänligen informera alla användare av spamprobe på ditt system om denna " +#~ "ändring och att läsa README.Debian för fler ändringar. Ursäkta för denna " +#~ "olägenhet."
--- spamprobe-1.4d.orig/debian/po/gl.po +++ spamprobe-1.4d/debian/po/gl.po @@ -0,0 +1,72 @@ +# Galician translation of spamprobe's debconf templates +# This file is distributed under the same license as the spamprobe package. +# Jacobo Tarrio , 2007. +# +msgid "" +msgstr "" +"Project-Id-Version: spamprobe\n" +"Report-Msgid-Bugs-To: spamprobe@packages.debian.org\n" +"POT-Creation-Date: 2011-09-04 21:12+0200\n" +"PO-Revision-Date: 2007-10-02 22:24+0100\n" +"Last-Translator: Jacobo Tarrio \n" +"Language-Team: Galician \n" +"Language: gl\n" +"MIME-Version: 1.0\n" +"Content-Type: text/plain; charset=UTF-8\n" +"Content-Transfer-Encoding: 8bit\n" + +#. Type: note +#. Description +#: ../templates:2001 +#, fuzzy +#| msgid "Upgrading to Berkeley DB 4.6" +msgid "Upgrading to Berkeley DB 5.1" +msgstr "Actualización a Berkeley DB 4.6" + +#. Type: note +#. Description +#: ../templates:2001 +#, fuzzy +#| msgid "" +#| "As of spamprobe 1.4d-1, the database format changed to Berkeley DB 4.6 " +#| "and spamprobe is no longer able to modify databases using an older format." +msgid "" +"As of spamprobe 1.4d-8, the database format changed to Berkeley DB 5.1 and " +"spamprobe is no longer able to modify databases using an older format." +msgstr "" +"A partires de spamprobe 1.4d-1, o formato das bases de datos cambiou a " +"Berkeley DB 4.6, e spamprobe xa non pode modificar as bases de datos que " +"teñan un formato anterior." + +#. Type: note +#. Description +#: ../templates:2001 +msgid "" +"Since there is no general way to locate all existing databases, no automatic " +"upgrade is attempted. A manual upgrade path using spamprobe export/import is " +"outlined in the 'DATABASE MAINTENANCE' section of the spamprobe(1) manual " +"page." +msgstr "" +"Xa que non hai unha maneira xenérica de atopar tódalas bases de datos " +"existentes, non se tenta actualizalas automaticamente. 
Descríbese unha ruta " +"de actualización manual empregando spamprobe export/import na sección " +"\"DATABASE MAINTENANCE\" da páxina de manual de spamprobe(1)." + +#. Type: note +#. Description +#: ../templates:2001 +msgid "" +"All spamprobe users on this system should be informed of this change and " +"advised to read the README.Debian file." +msgstr "" +"Debería informarse deste cambio a tódolos usuarios de spamprobe deste " +"sistema, e debería indicárselles que lean o ficheiro README.Debian." + +#. Type: note +#. Description +#: ../templates:2001 +msgid "" +"To avoid this issue at next Berkeley DB migration, converting to another " +"data file format, like 'hash', is recommended. See documentation of the -d " +"option in spamprobe(1) man page." +msgstr "" --- spamprobe-1.4d.orig/debian/po/nl.po +++ spamprobe-1.4d/debian/po/nl.po @@ -0,0 +1,72 @@ +# Dutch translation of spamprobe debconf templates. +# Copyright (C) 2005-2011 THE PACKAGE'S COPYRIGHT HOLDER +# This file is distributed under the same license as the spamprobe package. +# Luk Claes , 2005. +# Jeroen Schot , 2011. +# +msgid "" +msgstr "" +"Project-Id-Version: spamprobe 1.4d-8\n" +"Report-Msgid-Bugs-To: spamprobe@packages.debian.org\n" +"POT-Creation-Date: 2011-09-04 21:12+0200\n" +"PO-Revision-Date: 2011-10-12 14:39+0200\n" +"Last-Translator: Jeroen Schot \n" +"Language-Team: Debian l10n Dutch \n" +"Language: nl\n" +"MIME-Version: 1.0\n" +"Content-Type: text/plain; charset=utf-8\n" +"Content-Transfer-Encoding: 8bit\n" + +#. Type: note +#. Description +#: ../templates:2001 +msgid "Upgrading to Berkeley DB 5.1" +msgstr "Opwaarderen naar Berkeley DB 5.1" + +#. Type: note +#. Description +#: ../templates:2001 +msgid "" +"As of spamprobe 1.4d-8, the database format changed to Berkeley DB 5.1 and " +"spamprobe is no longer able to modify databases using an older format." 
+msgstr "" +"Vanaf de uitgave van spamprobe 1.4d-8 is de databaseindeling gewijzigd naar " +"Berkeley DB 5.1 en kan spamprobe bestaande databases van een oudere indeling " +"niet meer bewerken." + +#. Type: note +#. Description +#: ../templates:2001 +msgid "" +"Since there is no general way to locate all existing databases, no automatic " +"upgrade is attempted. A manual upgrade path using spamprobe export/import is " +"outlined in the 'DATABASE MAINTENANCE' section of the spamprobe(1) manual " +"page." +msgstr "" +"Aangezien er geen algemene manier is om alle aanwezige databases te vinden, " +"wordt niet geprobeerd deze automatisch bij te werken. U kunt handmatig " +"bijwerken met de export/import-mogelijkheden van spamprobe, zoals geschetst " +"in het hoofdstuk 'DATABASE MAINTENANCE' van de spamprobe(1) manpagina." + +#. Type: note +#. Description +#: ../templates:2001 +msgid "" +"All spamprobe users on this system should be informed of this change and " +"advised to read the README.Debian file." +msgstr "" +"Alle gebruikers van spamprobe op dit computersysteem dienen op de hoogte te " +"worden gebracht van deze verandering en aangeraden te worden het bestand " +"README.Debian door te lezen." + +#. Type: note +#. Description +#: ../templates:2001 +msgid "" +"To avoid this issue at next Berkeley DB migration, converting to another " +"data file format, like 'hash', is recommended. See documentation of the -d " +"option in spamprobe(1) man page." +msgstr "" +"Om dit probleem bij de volgende Berkeley DB-migratie te vermijden wordt u " +"aangeraden om over te stappen naar een andere bestandsindeling, zoals " +"'hash'. Zie de documentatie van de -d optie in de manpagina van spamprobe(1)." --- spamprobe-1.4d.orig/debian/po/cs.po +++ spamprobe-1.4d/debian/po/cs.po @@ -0,0 +1,88 @@ +# +# Translators, if you are not familiar with the PO format, gettext +# documentation is worth reading, especially sections dedicated to +# this format, e.g. 
by running: +# info -n '(gettext)PO Files' +# info -n '(gettext)Header Entry' +# +# Some information specific to po-debconf are available at +# /usr/share/doc/po-debconf/README-trans +# or http://www.debian.org/intl/l10n/po-debconf/README-trans +# +# Developers do not need to manually edit POT or PO files. +# +msgid "" +msgstr "" +"Project-Id-Version: spamprobe\n" +"Report-Msgid-Bugs-To: spamprobe@packages.debian.org\n" +"POT-Creation-Date: 2011-09-04 21:12+0200\n" +"PO-Revision-Date: 2011-10-19 07:08+0200\n" +"Last-Translator: Miroslav Kure \n" +"Language-Team: Czech \n" +"Language: cs\n" +"MIME-Version: 1.0\n" +"Content-Type: text/plain; charset=UTF-8\n" +"Content-Transfer-Encoding: 8bit\n" + +#. Type: note +#. Description +#: ../templates:2001 +msgid "Upgrading to Berkeley DB 5.1" +msgstr "Přechod na Berkeley DB 5.1" + +#. Type: note +#. Description +#: ../templates:2001 +msgid "" +"As of spamprobe 1.4d-8, the database format changed to Berkeley DB 5.1 and " +"spamprobe is no longer able to modify databases using an older format." +msgstr "" +"Od spamprobe verze 1.4d-8 se změnil formát databáze na Berkeley DB 5.1. " +"Změna je tak výrazná, že spamprobe není schopen aktualizovat databáze " +"používající starší formát." + +#. Type: note +#. Description +#: ../templates:2001 +msgid "" +"Since there is no general way to locate all existing databases, no automatic " +"upgrade is attempted. A manual upgrade path using spamprobe export/import is " +"outlined in the 'DATABASE MAINTENANCE' section of the spamprobe(1) manual " +"page." +msgstr "" +"Protože neexistuje žádný obecný postup k nalezení všech stávajících " +"databází, spamprobe se ani nebude pokoušet o automatický převod databází na " +"novější verzi. Ruční postup za pomoci příkazů export/import programu " +"spamprobe je nastíněn v manuálové stránce spamprobe(1) v sekci ÚDRŽBA " +"DATABÁZE." + +#. Type: note +#. 
Description +#: ../templates:2001 +msgid "" +"All spamprobe users on this system should be informed of this change and " +"advised to read the README.Debian file." +msgstr "" +"Všichni uživatelé spamprobe na tomto systému by měli být informováni o této " +"změně a měli byste jim doporučit pročtení souboru README.Debian." + +#. Type: note +#. Description +#: ../templates:2001 +msgid "" +"To avoid this issue at next Berkeley DB migration, converting to another " +"data file format, like 'hash', is recommended. See documentation of the -d " +"option in spamprobe(1) man page." +msgstr "" +"Abyste se vyhnuli tomuto problému při příští změně formátu Berkeley " +"DB, je doporučeno přejít na jiný datový formát, například „hash“. " +"Podrobnosti naleznete v manuálové stránce spamprobe(1) u popisu " +"parametru -d." + +#~ msgid "" +#~ "Please inform all spamprobe users on your system of this change and to " +#~ "read README.Debian for further changes. Sorry for the inconvenience." +#~ msgstr "" +#~ "Informujte o této změně prosím všechny uživatele spamprobe na svém " +#~ "systému a doporučte jim přečíst README.Debian. Omlouváme se za způsobené " +#~ "nepříjemnosti." --- spamprobe-1.4d.orig/debian/po/fi.po +++ spamprobe-1.4d/debian/po/fi.po @@ -0,0 +1,70 @@ +msgid "" +msgstr "" +"Project-Id-Version: spamprobe\n" +"Report-Msgid-Bugs-To: spamprobe@packages.debian.org\n" +"POT-Creation-Date: 2011-09-04 21:12+0200\n" +"PO-Revision-Date: 2007-10-07 22:57+0200\n" +"Last-Translator: Esko Arajärvi \n" +"Language-Team: Finnish \n" +"Language: fi\n" +"MIME-Version: 1.0\n" +"Content-Type: text/plain; charset=utf-8\n" +"Content-Transfer-Encoding: 8bit\n" +"X-Poedit-Language: Finnish\n" +"X-Poedit-Country: FINLAND\n" + +#. Type: note +#. Description +#: ../templates:2001 +#, fuzzy +#| msgid "Upgrading to Berkeley DB 4.6" +msgid "Upgrading to Berkeley DB 5.1" +msgstr "Päivitetään tietokantaan Berkeley DB 4.6" + +#. Type: note +#. 
Description +#: ../templates:2001 +#, fuzzy +#| msgid "" +#| "As of spamprobe 1.4d-1, the database format changed to Berkeley DB 4.6 " +#| "and spamprobe is no longer able to modify databases using an older format." +msgid "" +"As of spamprobe 1.4d-8, the database format changed to Berkeley DB 5.1 and " +"spamprobe is no longer able to modify databases using an older format." +msgstr "" +"Versiosta spamprobe 1.4d-1 lähtien käytettävä tietokannan muoto on Berkeley " +"DB 4.6, eikä spamprobe pysty muokkaamaan vanhempia muotoja käyttäviä " +"tietokantoja." + +#. Type: note +#. Description +#: ../templates:2001 +msgid "" +"Since there is no general way to locate all existing databases, no automatic " +"upgrade is attempted. A manual upgrade path using spamprobe export/import is " +"outlined in the 'DATABASE MAINTENANCE' section of the spamprobe(1) manual " +"page." +msgstr "" +"Koska ei ole olemassa yleistä tapaa paikantaa kaikki olemassa olevat " +"tietokannat, automaattista päivitystä ei yritetä. spamprobe(1) man-ohjesivun " +"osiossa 'DATABASE MAINTENANCE' on esitetty kuinka tehdä päivitys käsin " +"spamproben export/import-ominaisuuden avulla. " + +#. Type: note +#. Description +#: ../templates:2001 +msgid "" +"All spamprobe users on this system should be informed of this change and " +"advised to read the README.Debian file." +msgstr "" +"Tästä muutoksesta tulisi tiedottaa kaikille spamprobea tässä järjestelmässä " +"käyttäville ja kehottaa heitä lukemaan tiedosto README.Debian." + +#. Type: note +#. Description +#: ../templates:2001 +msgid "" +"To avoid this issue at next Berkeley DB migration, converting to another " +"data file format, like 'hash', is recommended. See documentation of the -d " +"option in spamprobe(1) man page." 
+msgstr "" --- spamprobe-1.4d.orig/debian/po/POTFILES.in +++ spamprobe-1.4d/debian/po/POTFILES.in @@ -0,0 +1 @@ +[type: gettext/rfc822deb] templates --- spamprobe-1.4d.orig/debian/po/templates.pot +++ spamprobe-1.4d/debian/po/templates.pot @@ -0,0 +1,59 @@ +# SOME DESCRIPTIVE TITLE. +# Copyright (C) YEAR THE PACKAGE'S COPYRIGHT HOLDER +# This file is distributed under the same license as the PACKAGE package. +# FIRST AUTHOR , YEAR. +# +#, fuzzy +msgid "" +msgstr "" +"Project-Id-Version: PACKAGE VERSION\n" +"Report-Msgid-Bugs-To: spamprobe@packages.debian.org\n" +"POT-Creation-Date: 2011-09-04 21:12+0200\n" +"PO-Revision-Date: YEAR-MO-DA HO:MI+ZONE\n" +"Last-Translator: FULL NAME \n" +"Language-Team: LANGUAGE \n" +"Language: \n" +"MIME-Version: 1.0\n" +"Content-Type: text/plain; charset=CHARSET\n" +"Content-Transfer-Encoding: 8bit\n" + +#. Type: note +#. Description +#: ../templates:2001 +msgid "Upgrading to Berkeley DB 5.1" +msgstr "" + +#. Type: note +#. Description +#: ../templates:2001 +msgid "" +"As of spamprobe 1.4d-8, the database format changed to Berkeley DB 5.1 and " +"spamprobe is no longer able to modify databases using an older format." +msgstr "" + +#. Type: note +#. Description +#: ../templates:2001 +msgid "" +"Since there is no general way to locate all existing databases, no automatic " +"upgrade is attempted. A manual upgrade path using spamprobe export/import is " +"outlined in the 'DATABASE MAINTENANCE' section of the spamprobe(1) manual " +"page." +msgstr "" + +#. Type: note +#. Description +#: ../templates:2001 +msgid "" +"All spamprobe users on this system should be informed of this change and " +"advised to read the README.Debian file." +msgstr "" + +#. Type: note +#. Description +#: ../templates:2001 +msgid "" +"To avoid this issue at next Berkeley DB migration, converting to another " +"data file format, like 'hash', is recommended. See documentation of the -d " +"option in spamprobe(1) man page." 
+msgstr "" --- spamprobe-1.4d.orig/debian/po/da.po +++ spamprobe-1.4d/debian/po/da.po @@ -0,0 +1,72 @@ +# Danish translation for spamprobe. +# Copyright (C) 2010 spamprobe og nedenstående oversættere. +# This file is distributed under the same license as the spamprobe package. +# Morten Brix Pedersen , 2005. +# Joe Hansen (joedalton2@yahoo.dk), 2010, 2011. +# +msgid "" +msgstr "" +"Project-Id-Version: spamprobe\n" +"Report-Msgid-Bugs-To: spamprobe@packages.debian.org\n" +"POT-Creation-Date: 2011-09-04 21:12+0200\n" +"PO-Revision-Date: 2011-10-18 19:25+0200\n" +"Last-Translator: Joe Hansen \n" +"Language-Team: Danish \n" +"Language: da\n" +"MIME-Version: 1.0\n" +"Content-Type: text/plain; charset=UTF-8\n" +"Content-Transfer-Encoding: 8bit\n" + +#. Type: note +#. Description +#: ../templates:2001 +msgid "Upgrading to Berkeley DB 5.1" +msgstr "Opgraderer til Berkeley DB 5.1" + +#. Type: note +#. Description +#: ../templates:2001 +msgid "" +"As of spamprobe 1.4d-8, the database format changed to Berkeley DB 5.1 and " +"spamprobe is no longer able to modify databases using an older format." +msgstr "" +"Fra og med spamprobe 1.4d-8 er databaseformatet ændret til Berkeley DB 5.1, " +"og spamprobe er ikke længere i stand til at ændre databaser, der bruger et " +"ældre format." + +#. Type: note +#. Description +#: ../templates:2001 +msgid "" +"Since there is no general way to locate all existing databases, no automatic " +"upgrade is attempted. A manual upgrade path using spamprobe export/import is " +"outlined in the 'DATABASE MAINTENANCE' section of the spamprobe(1) manual " +"page." +msgstr "" +"Eftersom der ikke er en generel metode, man kan bruge, som finder alle " +"eksisterende databaser, så vil jeg ikke forsøge at opgradere automatisk. En " +"manuel opgraderingsmetode ved hjælp af spamprobe eksport/import er beskrevet " +"i afsnittet »DATABASE MAINTENANCE« i manualen spamprobe(1)." + +#. Type: note +#.
Description +#: ../templates:2001 +msgid "" +"All spamprobe users on this system should be informed of this change and " +"advised to read the README.Debian file." +msgstr "" +"Alle spamprobebrugere på dette system bør informeres om denne ændring og " +"rådes til at læse filen README.Debian." + +#. Type: note +#. Description +#: ../templates:2001 +msgid "" +"To avoid this issue at next Berkeley DB migration, converting to another " +"data file format, like 'hash', is recommended. See documentation of the -d " +"option in spamprobe(1) man page." +msgstr "" +"For at undgå dette problem ved næste Berkeley DB-migrering, så anbefales det " +"at konvertere til et andet datafilformat såsom »hash«. Se dokumentationen " +"for tilvalget -d i manualsiden for spamprobe(1)." + --- spamprobe-1.4d.orig/debian/po/fr.po +++ spamprobe-1.4d/debian/po/fr.po @@ -0,0 +1,76 @@ +# Translation of spamprobe debconf templates to French +# Copyright (C) 2007 Christian Perrier +# This file is distributed under the same license as the spamprobe package. +# +# Christian Perrier , 2007, 2011. +msgid "" +msgstr "" +"Project-Id-Version: \n" +"Report-Msgid-Bugs-To: spamprobe@packages.debian.org\n" +"POT-Creation-Date: 2011-09-04 21:12+0200\n" +"PO-Revision-Date: 2011-09-08 06:05+0200\n" +"Last-Translator: Christian Perrier \n" +"Language-Team: French \n" +"Language: fr\n" +"MIME-Version: 1.0\n" +"Content-Type: text/plain; charset=UTF-8\n" +"Content-Transfer-Encoding: 8bit\n" +"X-Generator: Lokalize 1.2\n" +"Plural-Forms: nplurals=2; plural=(n > 1);\n" + +#. Type: note +#. Description +#: ../templates:2001 +msgid "Upgrading to Berkeley DB 5.1" +msgstr "Mise à jour vers Berkeley DB 5.1" + +#. Type: note +#. Description +#: ../templates:2001 +msgid "" +"As of spamprobe 1.4d-8, the database format changed to Berkeley DB 5.1 and " +"spamprobe is no longer able to modify databases using an older format." 
+msgstr "" +"À partir de la version 1.4d-8 de spamprobe, le format de la base de données " +"est devenu le format Berkeley 5.1. Par conséquent, spamprobe ne peut plus " +"modifier les bases de données existantes si elles utilisent un format " +"antérieur." + +#. Type: note +#. Description +#: ../templates:2001 +msgid "" +"Since there is no general way to locate all existing databases, no automatic " +"upgrade is attempted. A manual upgrade path using spamprobe export/import is " +"outlined in the 'DATABASE MAINTENANCE' section of the spamprobe(1) manual " +"page." +msgstr "" +"Comme il n'existe pas de méthode générique pour rechercher toutes les bases " +"de données existantes, aucune mise à jour automatique ne sera tentée. Une " +"procédure manuelle de mise à jour utilisant les fonctions d'import et export " +"de spamprobe est documentée dans la section « DATABASE " +"MAINTENANCE » (maintenance des bases de données) de la page de manuel de " +"spamprobe(1)." + +#. Type: note +#. Description +#: ../templates:2001 +msgid "" +"All spamprobe users on this system should be informed of this change and " +"advised to read the README.Debian file." +msgstr "" +"Il est nécessaire d'informer tous les utilisateurs de spamprobe sur ce " +"système afin qu'ils consultent le fichier README.Debian." + +#. Type: note +#. Description +#: ../templates:2001 +msgid "" +"To avoid this issue at next Berkeley DB migration, converting to another " +"data file format, like 'hash', is recommended. See documentation of the -d " +"option in spamprobe(1) man page." +msgstr "" +"Pour éviter de recommencer ce processus à chaque changement de version de la " +"base de données Berkeley, il est recommandé d'utiliser un autre format de " +"fichier de données, par exemple « hash ». Vous pouvez consulter la " +"documentation de l'option « -d » dans la page de manuel de spamprobe(1)."
--- spamprobe-1.4d.orig/debian/po/pt.po +++ spamprobe-1.4d/debian/po/pt.po @@ -0,0 +1,74 @@ +# Portuguese translation for spamprobe (debconf) +# Copyright (C) 2007 Pedro Ribeiro +# This file is distributed under the same license as the spamprobe package. +# Pedro Ribeiro , 2007,2011 +# +msgid "" +msgstr "" +"Project-Id-Version: spamprobe_1.4d-10\n" +"Report-Msgid-Bugs-To: spamprobe@packages.debian.org\n" +"POT-Creation-Date: 2011-09-04 21:12+0200\n" +"PO-Revision-Date: 2011-10-18 22:45+0000\n" +"Last-Translator: Pedro Ribeiro \n" +"Language-Team: Portuguese \n" +"Language: pt\n" +"MIME-Version: 1.0\n" +"Content-Type: text/plain; charset=UTF-8\n" +"Content-Transfer-Encoding: 8bit\n" + +#. Type: note +#. Description +#: ../templates:2001 +#| msgid "Upgrading to Berkeley DB 4.8" +msgid "Upgrading to Berkeley DB 5.1" +msgstr "A actualizar para Berkeley DB 5.1" + +#. Type: note +#. Description +#: ../templates:2001 +#| msgid "" +#| "As of spamprobe 1.4d-6, the database format changed to Berkeley DB 4.8 " +#| "and spamprobe is no longer able to modify databases using an older format." +msgid "" +"As of spamprobe 1.4d-8, the database format changed to Berkeley DB 5.1 and " +"spamprobe is no longer able to modify databases using an older format." +msgstr "" +"A partir do spamprobe 1.4d-8, o formato da base de dados mudou para o " +"BerkeleyDB 5.1 e o spamprobe já não pode modificar bases de dados do formato " +"antigo." + +#. Type: note +#. Description +#: ../templates:2001 +msgid "" +"Since there is no general way to locate all existing databases, no automatic " +"upgrade is attempted. A manual upgrade path using spamprobe export/import is " +"outlined in the 'DATABASE MAINTENANCE' section of the spamprobe(1) manual " +"page." +msgstr "" +"Uma vez que não existe uma forma geral de localizar as bases de dados " +"existentes, não será tentada a actualização automática. 
Uma forma de " +"actualização manual usando a exportação/importação de spamprobe está " +"delineada na secção 'DATABASE MAINTENANCE' do manual do spamprobe(1)." + +#. Type: note +#. Description +#: ../templates:2001 +msgid "" +"All spamprobe users on this system should be informed of this change and " +"advised to read the README.Debian file." +msgstr "" +"Todos os utilizadores de spamprobe neste sistema devem ser informados desta " +"modificação e aconselhados a ler o ficheiro README.Debian." + +#. Type: note +#. Description +#: ../templates:2001 +msgid "" +"To avoid this issue at next Berkeley DB migration, converting to another " +"data file format, like 'hash', is recommended. See documentation of the -d " +"option in spamprobe(1) man page." +msgstr "" +"Para evitar este problema numa próxima actualização da Berkeley DB, " +"recomenda-se a conversão para um outro formato de dados como o 'hash'. Veja a " +"documentação da opção -d na página do manual do spamprobe(1)." --- spamprobe-1.4d.orig/debian/po/ja.po +++ spamprobe-1.4d/debian/po/ja.po @@ -0,0 +1,80 @@ +# +# Translators, if you are not familiar with the PO format, gettext +# documentation is worth reading, especially sections dedicated to +# this format, e.g. by running: +# info -n '(gettext)PO Files' +# info -n '(gettext)Header Entry' +# +# Some information specific to po-debconf are available at +# /usr/share/doc/po-debconf/README-trans +# or http://www.debian.org/intl/l10n/po-debconf/README-trans +# +# Developers do not need to manually edit POT or PO files. +# +# +msgid "" +msgstr "" +"Project-Id-Version: spamprobe 1.4d-10\n" +"Report-Msgid-Bugs-To: spamprobe@packages.debian.org\n" +"POT-Creation-Date: 2011-09-04 21:12+0200\n" +"PO-Revision-Date: 2011-10-16 05:52+0900\n" +"Last-Translator: Hideki Yamane \n" +"Language-Team: Japanese \n" +"Language: ja\n" +"MIME-Version: 1.0\n" +"Content-Type: text/plain; charset=UTF-8\n" +"Content-Transfer-Encoding: 8bit\n" + +#. Type: note +#. 
Description +#: ../templates:2001 +msgid "Upgrading to Berkeley DB 5.1" +msgstr "Berkeley DB 5.1 へのアップグレード" + +#. Type: note +#. Description +#: ../templates:2001 +msgid "" +"As of spamprobe 1.4d-8, the database format changed to Berkeley DB 5.1 and " +"spamprobe is no longer able to modify databases using an older format." +msgstr "" +"spamprobe 1.4d-8 のリリースより、データベースの形式が Berkeley DB 5.1 に変わ" +"りました。これにより、spamprobe が以前の形式を使っているデータベースを変更" +"できなくなっています。" + +#. Type: note +#. Description +#: ../templates:2001 +msgid "" +"Since there is no general way to locate all existing databases, no automatic " +"upgrade is attempted. A manual upgrade path using spamprobe export/import is " +"outlined in the 'DATABASE MAINTENANCE' section of the spamprobe(1) manual " +"page." +msgstr "" +"既存のデータベース全てに対処する一般的な方法が無いので、自動的なアップグレー" +"ドは行われません。spamprobe のエクスポート・インポートを使った手動でのアップ" +"グレードについては、spamprobe(1) マニュアルページの 'DATABASE MAINTENANCE' セ" +"クションに概略があります。" + +#. Type: note +#. Description +#: ../templates:2001 +msgid "" +"All spamprobe users on this system should be informed of this change and " +"advised to read the README.Debian file." +msgstr "" +"このシステムを利用しているすべての spamprobe ユーザに対してこの変更について通" +"知し、README.Debian ファイルを読むように勧める必要があります。" + +#. Type: note +#. Description +#: ../templates:2001 +msgid "" +"To avoid this issue at next Berkeley DB migration, converting to another " +"data file format, like 'hash', is recommended. See documentation of the -d " +"option in spamprobe(1) man page." +msgstr "" +"次回、Berkeley DB の移行時にこの問題を避けるには、'hash' などの他のデータ" +"ファイル形式への変換をお勧めします。spamprobe(1) man ページ中にある -d " +"オプションのドキュメントを参照してください。" + --- spamprobe-1.4d.orig/debian/po/ru.po +++ spamprobe-1.4d/debian/po/ru.po @@ -0,0 +1,82 @@ +# translation of ru.po to Russian +# +# Translators, if you are not familiar with the PO format, gettext +# documentation is worth reading, especially sections dedicated to +# this format, e.g.
by running: +# info -n '(gettext)PO Files' +# info -n '(gettext)Header Entry' +# Some information specific to po-debconf are available at +# /usr/share/doc/po-debconf/README-trans +# or http://www.debian.org/intl/l10n/po-debconf/README-trans# +# Developers do not need to manually edit POT or PO files. +# +# Yuri Kozlov , 2005, 2007. +# Yuri Kozlov , 2010, 2011. +msgid "" +msgstr "" +"Project-Id-Version: 1.4d-8\n" +"Report-Msgid-Bugs-To: spamprobe@packages.debian.org\n" +"POT-Creation-Date: 2011-09-04 21:12+0200\n" +"PO-Revision-Date: 2011-09-10 09:43+0400\n" +"Last-Translator: Yuri Kozlov \n" +"Language-Team: Russian \n" +"Language: ru\n" +"MIME-Version: 1.0\n" +"Content-Type: text/plain; charset=UTF-8\n" +"Content-Transfer-Encoding: 8bit\n" +"X-Generator: Lokalize 1.0\n" +"Plural-Forms: nplurals=3; plural=(n%10==1 && n%100!=11 ? 0 : n%10>=2 && n" +"%10<=4 && (n%100<10 || n%100>=20) ? 1 : 2);\n" + +#. Type: note +#. Description +#: ../templates:2001 +msgid "Upgrading to Berkeley DB 5.1" +msgstr "Обновление до Berkeley DB 5.1" + +#. Type: note +#. Description +#: ../templates:2001 +msgid "" +"As of spamprobe 1.4d-8, the database format changed to Berkeley DB 5.1 and " +"spamprobe is no longer able to modify databases using an older format." +msgstr "" +"Начиная с версии spamprobe 1.4d-8, формат базы данных был изменён на " +"Berkeley DB 5.1, и поэтому spamprobe не может вносить изменения в базы " +"данных старого формата." + +#. Type: note +#. Description +#: ../templates:2001 +msgid "" +"Since there is no general way to locate all existing databases, no automatic " +"upgrade is attempted. A manual upgrade path using spamprobe export/import is " +"outlined in the 'DATABASE MAINTENANCE' section of the spamprobe(1) manual " +"page." +msgstr "" +"Так как не существует способа обнаружить все существующие базы, то нет " +"способа автоматического обновления. 
Схема обновления вручную с помощью " +"spamprobe export/import описана в справочной странице spamprobe(1) в разделе " +"«DATABASE MAINTENANCE»." + +#. Type: note +#. Description +#: ../templates:2001 +msgid "" +"All spamprobe users on this system should be informed of this change and " +"advised to read the README.Debian file." +msgstr "" +"Все пользователи spamprobe в вашей системе должны быть проинформированы об " +"этом изменении и посоветуйте им прочитать файл README.Debian." + +#. Type: note +#. Description +#: ../templates:2001 +msgid "" +"To avoid this issue at next Berkeley DB migration, converting to another " +"data file format, like 'hash', is recommended. See documentation of the -d " +"option in spamprobe(1) man page." +msgstr "" +"Чтобы избежать этой проблемы при следующем изменении формата Berkeley DB, " +"рекомендуется перейти на другой формат, например на «hash». Смотрите " +"документацию к параметру -d в справочной странице spamprobe(1)." --- spamprobe-1.4d.orig/debian/po/es.po +++ spamprobe-1.4d/debian/po/es.po @@ -0,0 +1,96 @@ +# spamprobe po-debconf translation to Spanish +# Copyright (C) 2010 Software in the Public Interest +# This file is distributed under the same license as the spamprobe package. 
+# +# Changes: +# - Initial translation +# Omar Campagne , 2010 +# +# - Updates +# TRANSLATOR +# +# Traductores, si no conocen el formato PO, merece la pena leer la +# documentación de gettext, especialmente las secciones dedicadas a este +# formato, por ejemplo ejecutando: +# info -n '(gettext)PO Files' +# info -n '(gettext)Header Entry' +# +# Equipo de traducción al español, por favor lean antes de traducir +# los siguientes documentos: +# +# - El proyecto de traducción de Debian al español +# http://www.debian.org/intl/spanish/ +# especialmente las notas y normas de traducción en +# http://www.debian.org/intl/spanish/notas +# +# - La guía de traducción de po's de debconf: +# /usr/share/doc/po-debconf/README-trans +# o http://www.debian.org/intl/l10n/po-debconf/README-trans +# +msgid "" +msgstr "" +"Project-Id-Version: spamprobe 1.4d-10\n" +"Report-Msgid-Bugs-To: spamprobe@packages.debian.org\n" +"POT-Creation-Date: 2011-09-04 21:12+0200\n" +"PO-Revision-Date: 2011-10-23 12:28+0200\n" +"Last-Translator: Omar Campagne \n" +"Language-Team: Debian l10n Spanish \n" +"Language: \n" +"MIME-Version: 1.0\n" +"Content-Type: text/plain; charset=UTF-8\n" +"Content-Transfer-Encoding: 8bit\n" + +#. Type: note +#. Description +#: ../templates:2001 +msgid "Upgrading to Berkeley DB 5.1" +msgstr "Actualizando a la versión 5.1 de la base de datos de Berkeley" + +#. Type: note +#. Description +#: ../templates:2001 +msgid "" +"As of spamprobe 1.4d-8, the database format changed to Berkeley DB 5.1 and " +"spamprobe is no longer able to modify databases using an older format." +msgstr "" +"A partir de la versión 1.4d-8, el formato de la base de datos cambió al " +"formato de la versión 5.1 de la base de datos de Berkeley, y spamprobe ya no " +"es capaz de modificar bases de datos con un formato anterior a éste." + +#. Type: note +#. Description +#: ../templates:2001 +msgid "" +"Since there is no general way to locate all existing databases, no automatic " +"upgrade is attempted. 
A manual upgrade path using spamprobe export/import is " +"outlined in the 'DATABASE MAINTENANCE' section of the spamprobe(1) manual " +"page." +msgstr "" +"Debido a que no existe una manera general de ubicar todas las bases de datos " +"existentes, no se intenta ninguna actualización automática. Se describe una " +"vía para actualizar, usando las funciones de exportar e importar de " +"spamprobe, en la sección «DATABASE MAINTENANCE» de la página de manual de " +"spamprobe(1)." + +#. Type: note +#. Description +#: ../templates:2001 +msgid "" +"All spamprobe users on this system should be informed of this change and " +"advised to read the README.Debian file." +msgstr "" +"Se debería informar de este cambio a todos los usuarios de spamprobe en el " +"sistema, así como recomendar la lectura del fichero «README.Debian»." + +#. Type: note +#. Description +#: ../templates:2001 +msgid "" +"To avoid this issue at next Berkeley DB migration, converting to another " +"data file format, like 'hash', is recommended. See documentation of the -d " +"option in spamprobe(1) man page." +msgstr "" +"Para evitar este problema durante la siguiente actualización de la base de " +"datos de Berkeley, se recomienda la conversión a otro formato de fichero " +"de datos, como «hash». Consulte la documentación de la opción «-d» en la " +"página de manual de spamprobe." --- spamprobe-1.4d.orig/debian/po/fr.po.save +++ spamprobe-1.4d/debian/po/fr.po.save @@ -0,0 +1,78 @@ +# translation of fr.po to French +# +# Translators, if you are not familiar with the PO format, gettext +# documentation is worth reading, especially sections dedicated to +# this format, e.g. by running: +# info -n '(gettext)PO Files' +# info -n '(gettext)Header Entry' +# +# Some information specific to po-debconf are available at +# /usr/share/doc/po-debconf/README-trans +# or http://www.debian.org/intl/l10n/po-debconf/README-trans +# +# Developers do not need to manually edit POT or PO files. 
+# +msgid "" +msgstr "" +"Project-Id-Version: spamprobe\n" +"Report-Msgid-Bugs-To: \n" +"POT-Creation-Date: 2005-01-10 18:15+0100\n" +"PO-Revision-Date: 2005-01-10 18:17+0100\n" +"Last-Translator: Christian Perrier \n" +"Language-Team: French \n" +"MIME-Version: 1.0\n" +"Content-Type: text/plain; charset=ISO-8859-1\n" +"Content-Transfer-Encoding: 8bit\n" +"X-Generator: KBabel 1.9.1\n" + +#. Type: note +#. Description +#: ../templates:5 +msgid "Upgrading to Berkeley DB 4.2" +msgstr "Mise à jour vers Berkeley DB 4.2" + +#. Type: note +#. Description +#: ../templates:5 +msgid "" +"Starting with released spamprobe 0.9h-1, database format has changed to " +"Berkeley DB 4.2 to the effect that spamprobe will not be able to modify " +"existing databases." +msgstr "" +"À partir de la version publiée 0.9h-1 de spamprobe, le format de la base de " +"données est devenu le format Berkeley 4.2. Par conséquence, spamprobe ne " +"peut plus modifier les bases de données existantes." + +#. Type: note +#. Description +#: ../templates:5 +msgid "" +"Since there is no general way to locate all existing databases, no automatic " +"upgrade is attempted. A manual upgrade path using spamprobe export/import " +"is outlined in spamprobe(1) DATABASE MAINTENANCE." +msgstr "" +"Comme il n'existe pas de méthode générique pour rechercher toutes les bases " +"de données existantes, aucune mise à jour automatique ne sera tentée. Une " +"procédure manuelle de mise à jour utilisant les fonctions d'import et export " +"de spamprobe est documentée dans la section DATABASE MAINTENANCE (maintenance des bases de données) de la page de manuel spamprobe(1)." + +#. Type: note +#. Description +#: ../templates:5 +msgid "" +"Please inform all spamprobe users on your system of this change and to read " +"README.Debian for further changes. Sorry for the inconvenience." +msgstr "Vous devriez informer tous les utilisateurs de spamprobe sur ce système et consulter le fichier README.Debian pour les changements ultérieurs."
+ +#~ msgid " spamprobe(1) DATABASE MAINTENANCE" +#~ msgstr "" +#~ " spamprobe(1), section DATABASE MAINTENANCE\n" +#~ " (MAINTENANCE DE LA BASE DE DONNÉES)" + +#~ msgid "Please inform all spamprobe users on your system." +#~ msgstr "" +#~ "Veuillez informer tous les utilisateurs de spamprobe sur ce système." + +#~ msgid "Sorry for the inconvenience." +#~ msgstr " " + --- spamprobe-1.4d.orig/debian/po/pt_BR.po +++ spamprobe-1.4d/debian/po/pt_BR.po @@ -0,0 +1,76 @@ +# spamprobe Brazilian Portuguese translation +# Copyright (C) 2007 THE spamprobe'S COPYRIGHT HOLDER +# This file is distributed under the same license as the spamprobe package. +# Felipe Augusto van de Wiel (faw) , 2007. +# Adriano Rafael Gomes , 2010, 2011. +# +msgid "" +msgstr "" +"Project-Id-Version: spamprobe 1.4d-8\n" +"Report-Msgid-Bugs-To: spamprobe@packages.debian.org\n" +"POT-Creation-Date: 2011-09-04 21:12+0200\n" +"PO-Revision-Date: 2011-10-15 13:52-0300\n" +"Last-Translator: Adriano Rafael Gomes \n" +"Language-Team: Brazilian Portuguese \n" +"Language: pt_BR\n" +"MIME-Version: 1.0\n" +"Content-Type: text/plain; charset=UTF-8\n" +"Content-Transfer-Encoding: 8bit\n" + +#. Type: note +#. Description +#: ../templates:2001 +#| msgid "Upgrading to Berkeley DB 4.8" +msgid "Upgrading to Berkeley DB 5.1" +msgstr "Atualizando para Berkeley DB 5.1" + +#. Type: note +#. Description +#: ../templates:2001 +#| msgid "" +#| "As of spamprobe 1.4d-6, the database format changed to Berkeley DB 4.8 " +#| "and spamprobe is no longer able to modify databases using an older format." +msgid "" +"As of spamprobe 1.4d-8, the database format changed to Berkeley DB 5.1 and " +"spamprobe is no longer able to modify databases using an older format." +msgstr "" +"A partir do spamprobe 1.4d-8, o formato do banco de dados mudou para o " +"Berkeley DB 5.1 e o spamprobe não é mais capaz de modificar bancos de dados " +"usando o antigo formato." + +#. Type: note +#.
Description +#: ../templates:2001 +msgid "" +"Since there is no general way to locate all existing databases, no automatic " +"upgrade is attempted. A manual upgrade path using spamprobe export/import is " +"outlined in the 'DATABASE MAINTENANCE' section of the spamprobe(1) manual " +"page." +msgstr "" +"Como não há uma forma geral de localizar todos os bancos de dados " +"existentes, não tenta-se realizar uma atualização automática. Um caminho de " +"atualização manual usando exportar/importar do spamprobe é explicado na " +"seção 'DATABASE MAINTENANCE' da página de manual do spamprobe(1)." + +#. Type: note +#. Description +#: ../templates:2001 +msgid "" +"All spamprobe users on this system should be informed of this change and " +"advised to read the README.Debian file." +msgstr "" +"Todos os usuários do spamprobe neste sistema deveriam ser informados dessa " +"mudança e aconselhados a ler o arquivo README.Debian." + +#. Type: note +#. Description +#: ../templates:2001 +msgid "" +"To avoid this issue at next Berkeley DB migration, converting to another " +"data file format, like 'hash', is recommended. See documentation of the -d " +"option in spamprobe(1) man page." +msgstr "" +"Para evitar essa situação na próxima migração do Berkeley DB, é recomendado " +"converter para um outro formato de arquivo de dados, como 'hash'. Veja a " +"documentação da opção -d na página de manual do spamprobe(1)." --- spamprobe-1.4d.orig/debian/po/de.po +++ spamprobe-1.4d/debian/po/de.po @@ -0,0 +1,70 @@ +# Translation of spamprobe debconf templates to German +# Copyright (C) Helge Kreutzmann , 2007, 2010, 2011. +# This file is distributed under the same license as the spamprobe package.
+# +msgid "" +msgstr "" +"Project-Id-Version: spamprobe 1.4d-8\n" +"Report-Msgid-Bugs-To: spamprobe@packages.debian.org\n" +"POT-Creation-Date: 2011-09-04 21:12+0200\n" +"PO-Revision-Date: 2011-09-11 13:17+0200\n" +"Last-Translator: Helge Kreutzmann \n" +"Language-Team: German \n" +"Language: de\n" +"MIME-Version: 1.0\n" +"Content-Type: text/plain; charset=ISO-8859-15\n" +"Content-Transfer-Encoding: 8bit\n" + +#. Type: note +#. Description +#: ../templates:2001 +msgid "Upgrading to Berkeley DB 5.1" +msgstr "Upgrade auf Berkeley DB 5.1" + +#. Type: note +#. Description +#: ../templates:2001 +msgid "" +"As of spamprobe 1.4d-8, the database format changed to Berkeley DB 5.1 and " +"spamprobe is no longer able to modify databases using an older format." +msgstr "" +"Mit Spamprobe 1.4d-8 änderte sich das Datenbankformat auf Berkeley DB 5.1 " +"und Spamprobe ist nicht mehr in der Lage, Datenbanken in älteren Formaten zu " +"bearbeiten." + +#. Type: note +#. Description +#: ../templates:2001 +msgid "" +"Since there is no general way to locate all existing databases, no automatic " +"upgrade is attempted. A manual upgrade path using spamprobe export/import is " +"outlined in the 'DATABASE MAINTENANCE' section of the spamprobe(1) manual " +"page." +msgstr "" +"Da es keinen allgemeingültigen Weg gibt, um alle existierenden Datenbanken " +"aufzuspüren, wird kein automatisches Upgrade versucht. Ein manueller " +"Upgrade-Pfad mittels export/import von Spamprobe wird im Abschnitt DATABASE " +"MAINTENANCE der Handbuchseite von spamprobe(1) skizziert." + +#. Type: note +#. Description +#: ../templates:2001 +msgid "" +"All spamprobe users on this system should be informed of this change and " +"advised to read the README.Debian file." +msgstr "" +"Alle Spamprobe-Benutzer auf diesem System sollten über diese Änderung " +"informiert und angehalten werden, die Datei README.Debian zu lesen." + +#. Type: note +#.
Description +#: ../templates:2001 +msgid "" +"To avoid this issue at next Berkeley DB migration, converting to another " +"data file format, like 'hash', is recommended. See documentation of the -d " +"option in spamprobe(1) man page." +msgstr "" +"Um dieses Problem bei der nächsten Berkeley-DB-Migration zu vermeiden, wird " +"empfohlen, auf ein anderes Datendateiformat, wie hash, zu konvertieren. " +"Lesen Sie die Dokumentation der Option -d in der Handbuchseite von " +"spamprobe(1)."