[imapfilter-devel] Regex matching; trailing CR?

Lefteris Chatzimparmpas lefcha at hellug.gr
Fri Oct 30 00:24:11 EET 2009


On Thu, Oct 29, 2009 at 10:33:58AM -0400, William Faulk wrote:
> I've been arguing with getting imapfilter to work against my Exchange  
> IMAP server for a couple of days now, and I finally figured out what was  
> going wrong.  When I call match_subject (for example), the string that's  
> passed to the regular expression ends in a carriage return (\r, \015),  
> and that makes my regexes that end in '$' fail.
>
> Should imapfilter be passing that CR to the regex engine?  If so, what's  
> the best way to deal with it?  The only thing I've found that works is  
> '\015' immediately before the '$', but I don't know that it will always  
> be there, and if I follow it with a '?', it fails.  I also don't see how  
> to mark my regex as a multiline regex.

You can change this behaviour by adding the following line to the mailbox.lua
file (which should be installed at /usr/share/imapfilter/mailbox.lua), at
line 449:

	results[m] = string.gsub(results[m], "\r\n$", "")

I will probably change this behaviour in the next release to what you are
suggesting.
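
To see what that line does outside of imapfilter, here is a minimal sketch in
plain Lua.  The helper name strip_crlf and the sample string are only for
illustration and are not part of imapfilter; the pattern is the same one as
in the line above.

```lua
-- Hypothetical helper (not part of imapfilter): remove one trailing CRLF.
local function strip_crlf(s)
  -- string.gsub returns the new string and the number of substitutions;
  -- the extra parentheses keep only the string.
  return (string.gsub(s, "\r\n$", ""))
end

-- Sample header line with the CRLF that the server sends (assumed value).
local line = "Subject: Backup on hzsbak0100 - 0\r\n"
local cleaned = strip_crlf(line)

-- Print the last byte before and after: 10 is LF, 48 is the character "0".
print(string.byte(line, -1), string.byte(cleaned, -1))
```

With the CRLF removed, a pattern ending in '$' no longer has to account for
the carriage return.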

You can pass options to the PCRE regex engine by using special syntax inside
the pattern itself, as described in the PCRE documentation.  For example, if
you want the dot (.) to also match the newline character, put the (?s)
option at the start of the pattern, like this:

	foo.INBOX:match_header("(?s)Subject.*Message")
	regex_search("(?s)one.*two", "one \n two")

There are other such modifiers:

	(?i)	caseless
	(?J)	allow duplicate names
	(?m)	multiline
	(?s)	single line (dotall)
	(?U)	default ungreedy (lazy)
	(?x)	extended (ignore white space)
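
A few of these modifiers can be tried out with a sketch like the following.
The function name demo_modifiers and the sample strings are made up for
illustration; regex_search is provided by imapfilter, so the function only
works when called from config.lua.

```lua
-- Hypothetical demo; call demo_modifiers() from config.lua, where
-- imapfilter defines regex_search.
local function demo_modifiers()
  -- (?i): "backup" also matches "BACKUP".
  print(regex_search("(?i)backup", "BACKUP done"))

  -- (?m): ^ and $ also match at line breaks inside the string.
  print(regex_search("(?m)^two$", "one\ntwo\nthree"))

  -- (?x): white space in the pattern is ignored, so it can be spaced out.
  print(regex_search("(?x) one \\s+ two", "one two"))
end
```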

>
> FWIW, here is the code in my config.lua that shows me the CR:
>
>> result = cortina.INBOX:match_subject('^Subject:\\s*Backup.*\\s[01]')
>> headers = cortina.INBOX:fetch_fields({'subject'},result)
>> for i,line in pairs(headers) do
>>   print(line)
>>   reres, cap = regex_search('^Subject:\\s*Backup.*\\s[01](.*)$', line)
>>   if reres then
>>     print 'match'
>>     print('"' .. cap .. '"')
>>     for j=1,string.len(cap) do
>>       print(string.byte(cap,j))
>>     end
>>   else
>>     print 'nomatch'
>>   end
>> end
>
> And here's a segment of the output:
>
>> Subject: Backup on hzsbak0100 - 0
>>
>> match
>> "
>> 13
>
> (As a side note, I think the fact that the trailing quotation mark in  
> the raw print of the regex capture is bizarre.)
>
> -Bitt Faulk

