gs.group.messages.text API

There are three parts to the API provided by the gs.group.messages.text product. The split message code separates the bottom quoting and signatures from the rest of the message. The HTML body code will format the parts of the message, using the matcher code.

Split message

An email message is normally in two parts: the actual body of the message, and then some trailing bottom quoting and signatures. The SplitMessage named tuple represents this duality, while the split_message() function does the actual splitting. Both parts of the message can be fed into the HTMLBody class to generate the markup.

gs.group.messages.text.SplitMessage (:class:`collections.namedtuple`)

The 2-tuple containing the strings representing

  1. The main body of the message (intro) and
  2. The rest of the message, including the bottom-quoting and the footer (remainder).
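
As a rough illustration of the shape of this tuple, the sketch below builds a stand-in with the standard library. The field names intro and remainder come from the description above; the stand-in itself is an assumption made for the example, not the class shipped by the product.

```python
from collections import namedtuple

# Stand-in for illustration only; the real class lives in
# gs.group.messages.text. Field names follow the description above.
SplitMessage = namedtuple('SplitMessage', ['intro', 'remainder'])

parts = SplitMessage(intro='Hello all,\n', remainder='> Earlier post\n')

# A named tuple can be read by field name or unpacked like any 2-tuple.
intro, remainder = parts
```
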
gs.group.messages.text.split_message(messageText, max_consecutive_comment=12, max_consecutive_whitespace=3)

Split the message into main body and the footer.

Parameters:
  • messageText (str) – The text to process.
  • max_consecutive_comment (int) – The maximum number of lines of quoting to allow before snipping.
  • max_consecutive_whitespace (int) – The maximum number of lines that just contain whitespace to allow before snipping.
Returns:

2-tuple, containing the strings representing the main-body of the message, and the footer.

Return type:

SplitMessage

Email messages often contain a footer at the bottom, which identifies the user, and who they work for. However, GroupServer has lovely profiles which do this, so normally we want to snip the footer, to reduce clutter.

In addition, many users write only a short piece of text at the top of the email, while the remainder of the message consists of all the previous posts. This function also snips that bottom quoting.

Originally a ZMI-side script in Presentation/Tofu/MailingListManager/lscripts.
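
The two limits can be pictured with a much-simplified sketch. It is not the algorithm used by split_message(): the function name split_message_sketch, the rule that a quoted line starts with >, and the choice to cut at the start of the first over-long run are all assumptions made for illustration.

```python
from collections import namedtuple

SplitMessage = namedtuple('SplitMessage', ['intro', 'remainder'])


def split_message_sketch(message_text, max_consecutive_comment=12,
                         max_consecutive_whitespace=3):
    """Cut at the start of the first over-long run of quoted or blank lines.

    A simplified stand-in for illustration, not the library's algorithm.
    """
    lines = message_text.splitlines(True)  # keep the line endings
    comment_run = 0
    blank_run = 0
    for i, line in enumerate(lines):
        stripped = line.strip()
        if not stripped:
            blank_run, comment_run = blank_run + 1, 0
        elif stripped.startswith('>'):
            comment_run, blank_run = comment_run + 1, 0
        else:
            comment_run = blank_run = 0
        run = 0
        if comment_run > max_consecutive_comment:
            run = comment_run
        elif blank_run > max_consecutive_whitespace:
            run = blank_run
        if run:
            cut = i - run + 1  # index of the first line of the run
            return SplitMessage(''.join(lines[:cut]), ''.join(lines[cut:]))
    # Nothing to snip: the whole message is the intro.
    return SplitMessage(message_text, '')


text = 'Hi\n' + '> quoted\n' * 3 + 'bye\n'
short = split_message_sketch(text, max_consecutive_comment=2)
```

With the limit lowered to two quoted lines, the three-line quote is snipped into the remainder; with the default limit of 12 the same text would be returned whole as the intro.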

HTML body

The HTMLBody class will format a plain-text message as HTML. The changes that are made include the following.

  • The characters that would cause issues with the XML are escaped. This includes " and ' characters.

  • Each line is placed within a <span> element, with the CSS class set to line.

    <span class="line">Like this</span>
    
  • Lines that start with > but not >From are considered quotes, and given the additional CSS class muted.

    <span class="line muted">&gt; Like this</span>
    
  • The words of the line are marked up by the matcher classes.
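
The first three changes can be sketched with the standard library alone. The helper name markup_line_sketch is hypothetical, and html.escape is used here as a convenient way to escape the quote characters; the class itself may escape them differently. Word-level markup by the matchers is left out.

```python
from html import escape


def markup_line_sketch(line):
    """Escape one line and wrap it in a <span>, muting quoted lines."""
    css = 'line'
    # A leading ">From" is commonly mbox escaping of a body line that
    # begins with "From", so it is not treated as a quote.
    if line.startswith('>') and not line.startswith('>From'):
        css = 'line muted'
    # quote=True escapes " and ' as well as <, > and &.
    return '<span class="{0}">{1}</span>'.format(css, escape(line, quote=True))


plain = markup_line_sketch('Like this')
quoted = markup_line_sketch('> Like this')
```
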

class gs.group.messages.text.HTMLBody(originalText)

The HTML form of a plain-text email body.

Parameters: originalText (str) – The original (plain) text
__iter__()

The marked-up lines in the main body

__unicode__()

The main part of the HTML body, as a Unicode string

__str__()

The main part of the HTML body, as an ASCII string. Non-ASCII characters are replaced with XML entities.
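
Replacing non-ASCII characters with XML entities can be done with a standard codec error handler. The snippet below shows that general technique in Python 3; whether __str__() uses this exact call is an assumption.

```python
text = 'na\u00efve caf\u00e9'

# The "xmlcharrefreplace" error handler swaps each character that ASCII
# cannot represent for a numeric XML character reference.
ascii_bytes = text.encode('ascii', 'xmlcharrefreplace')
```

Here ascii_bytes holds b'na&#239;ve caf&#233;', which is safe to emit in an ASCII-only context while still rendering correctly as HTML.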

markup(line)

Mark up the line, and the words in the line.

Parameters: line (str) – The line to mark up.
Returns: An HTML form of the line: the characters escaped, the words marked up, and the result wrapped in a <span> element.
Return type: str
markup_words(line)

Mark up the words on the line

Parameters: line (str) – The line to mark up
Returns: The line with the words marked up
Return type: str

Matcher

The matcher classes

  • Test that a word matches, and
  • Produce a substitute for the word.

They all inherit from the Matcher class.

class gs.group.messages.text.Matcher(matchRE, subStr, weight=10)

Match a word, by a regular expression, and make a substitution

Parameters:
  • matchRE (str) – The regular expression used to check if there was a match (see re.match())
  • subStr (str) – The string specifying the substitution (see re.sub())
re = None

The regular expression used to make the match. The flags re.I, re.M, and re.U are set.

match(s)

Does the string match the regular expression?

Parameters: s (str) – The string to evaluate
Returns: True if the string matches the regular expression, False otherwise.
Return type: bool
sub(s)

Apply the substitution string to the parts of the string that match the regular expression.

Parameters: s (str) – The string to process
Returns: The new string, with the substitution specified by self.subStr made
Return type: unicode
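
The interface described above can be approximated in a few lines. MatcherSketch is a hypothetical stand-in written for this example, and the underscore-to-italics pattern is invented; only the constructor arguments, the three flags, and the match() and sub() methods are taken from the description.

```python
import re


class MatcherSketch(object):
    """Hypothetical stand-in for the Matcher interface described above."""

    def __init__(self, match_re, sub_str, weight=10):
        # The description says re.I, re.M and re.U are set on the pattern.
        self.re = re.compile(match_re, re.I | re.M | re.U)
        self.subStr = sub_str
        # The meaning of weight is not covered above; it is only stored.
        self.weight = weight

    def match(self, s):
        """True if the string matches the regular expression."""
        return self.re.match(s) is not None

    def sub(self, s):
        """Return the string with the substitution applied."""
        return self.re.sub(self.subStr, s)


# Invented rule for the example: _word_ becomes <i>word</i>.
italic = MatcherSketch(r'^_(\w+)_$', r'<i>\1</i>')
result = italic.sub('_word_') if italic.match('_word_') else '_word_'
```

Testing with match() before calling sub() mirrors the two duties listed for the matcher classes: check the word, then produce its substitute.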

Instances

Four instances of the Matcher class are provided to make the following changes to the email.

  • Words in *asterisk* characters are made bold

    gs.group.messages.text.boldMatcher = <gs.group.messages.text.matcher.Matcher object>

    Turn words within *asterisk* characters into bold elements. This is as close as GroupServer gets to implementing a wiki.

  • Email addresses are made clickable

    gs.group.messages.text.emailMatcher = <gs.group.messages.text.matcher.Matcher object>

    Turn email addresses (person@example.com) into clickable mailto: links. Surrounding text (such as parentheses) is added to the link text, while the address is extracted and used for the link target.

  • Site names starting with www are made clickable.

    gs.group.messages.text.wwwMatcher = <gs.group.messages.text.matcher.Matcher object>

    Turn site names that start with www (www.example.com) into clickable http:// links.

  • URLs (http and https) are made clickable.

    gs.group.messages.text.uriMatcher = <gs.group.messages.text.matcher.URIMatcher object>

    Turn URIs (both http and https) into clickable links. If the link is particularly long (over 64 characters) then small text will be used (<a class="small"). Leading and trailing characters (like parentheses) will be used in the link text while just the URL will be used for the link target.
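
To make the behaviour above concrete, here is a rough sketch of three of the substitutions applied word by word. The regular expressions and function names are invented for this example and are much cruder than anything a real matcher would need; in particular, applying the 64-character limit to the URI rather than to the whole word is an assumption. Input is assumed to be escaped already.

```python
import re

FLAGS = re.I | re.M | re.U

# Hypothetical patterns, written for this example only.
BOLD_RE = re.compile(r'^\*(\w+)\*$', FLAGS)
WWW_RE = re.compile(r'^(www\.[\w.-]+\w)$', FLAGS)
URI_RE = re.compile(r'^(\W*?)(https?://[^\s()<>]+)(\W*)$', FLAGS)


def mark_up_word(word):
    """Return the HTML for one word, or the word itself if nothing matches."""
    if BOLD_RE.match(word):
        return BOLD_RE.sub(r'<b>\1</b>', word)
    if WWW_RE.match(word):
        return WWW_RE.sub(r'<a href="http://\1">\1</a>', word)
    m = URI_RE.match(word)
    if m:
        uri = m.group(2)
        # Assumption: the 64-character limit is measured on the URI.
        css = ' class="small"' if len(uri) > 64 else ''
        # The whole word is the link text; only the URI is the target.
        return '<a{0} href="{1}">{2}</a>'.format(css, uri, word)
    return word


def mark_up_words(line):
    """Mark up each space-separated word on the line."""
    return ' '.join(mark_up_word(w) for w in line.split(' '))


marked = mark_up_words('See *this* at www.example.com')
wrapped = mark_up_word('(http://example.com/a)')
```

Note how the parentheses in the last call stay in the link text while the href holds only the URI, as described for uriMatcher.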