<html>
  <head>
    <meta content="text/html; charset=ISO-8859-1"
      http-equiv="Content-Type">
  </head>
  <body text="#000000" bgcolor="#FFFFFF">
    <div class="moz-cite-prefix">Thanks for the clues! It looks like
      Python may be the winner this time.<br>
      <br>
      <br>
      On 12/14/2012 10:14 AM, Sean Brewer wrote:<br>
    </div>
    <blockquote
cite="mid:CANEHAufCkkT7GRm20ygSA-+QShe=u0iUmn3C52=uXDfAyC8Wqg@mail.gmail.com"
      type="cite">
      <div>If you can export the e-mails easily, the general algorithm is
        something like this:</div>
      <div>
        <div>1. Tokenize the words in the e-mail body.</div>
        <div>2. Remove stop words (a, an, the, etc.). You can find word
          lists, and libraries like NLTK have them built in.</div>
        <div>3. Use a stemming algorithm to reduce word tokens to their
          stem (I think the correct vocabulary is "free morpheme"), e.g.
          convert the token "passing" to "pass".</div>
        <div>4. Rank the resulting stems by frequency.</div>
        <div><br>
        </div>
        <div>That should get you in the neighborhood.</div>
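        <div>The four steps above can be sketched in Python roughly as
          follows. This is a minimal sketch using only the standard
          library, not a definitive implementation: the short STOP_WORDS
          set and the crude suffix-stripping stem function are stand-ins
          for the fuller word lists and real stemmers that libraries like
          NLTK provide, and the names tokenize, stem, tag_counts, and
          mbox_bodies are made up for illustration. It also assumes the
          mail was exported in mbox format and that plain-text parts
          decode as UTF-8.</div>
<pre>
```python
import mailbox
import re
from collections import Counter

# Assumption: a tiny illustrative stop-word list; a real run would use a
# fuller list (for example the one shipped with NLTK).
STOP_WORDS = {
    "a", "an", "the", "and", "or", "of", "to", "in", "is", "it",
    "for", "on", "that", "this", "with", "was", "are", "be",
}

# Assumption: crude suffix rules standing in for a real stemming algorithm.
# Each pair is (suffix to strip, text to put back).
SUFFIX_RULES = (("sses", "ss"), ("ing", ""), ("ed", ""), ("s", ""))


def tokenize(text):
    """Step 1: lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def stem(word):
    """Step 3: apply the first matching suffix rule, e.g. passing to pass."""
    if word.endswith("ss"):
        return word  # leave words like "pass" alone
    for suffix, replacement in SUFFIX_RULES:
        if word.endswith(suffix):
            result = word[: len(word) - len(suffix)] + replacement
            # Skip rules that would leave fewer than three letters.
            if len(result) not in (0, 1, 2):
                return result
    return word


def tag_counts(texts, top=10):
    """Steps 1 to 4: return the `top` most frequent stems with counts."""
    counts = Counter()
    for text in texts:
        # Step 2: drop stop words before stemming.
        kept = [t for t in tokenize(text) if t not in STOP_WORDS]
        counts.update(stem(t) for t in kept)
    return counts.most_common(top)


def mbox_bodies(path):
    """Yield the text/plain parts of every message in an mbox file."""
    for message in mailbox.mbox(path, create=False):
        for part in message.walk():
            if part.get_content_type() == "text/plain":
                payload = part.get_payload(decode=True)
                if payload:
                    # Assumption: treat every body as UTF-8.
                    yield payload.decode("utf-8", errors="replace")
```
</pre>
        <div>For example, tag_counts(mbox_bodies("exported.mbox"), top=50)
          would return the 50 most frequent stems with their counts,
          where "exported.mbox" is a placeholder path. The counts can
          then be used to size the words in the cloud.</div>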
        <br>
        <div class="gmail_quote">On Fri, Dec 14, 2012 at 9:11 AM,
          Matthew Keys <span dir="ltr">&lt;<a moz-do-not-send="true"
              href="mailto:mk6032@yahoo.com" target="_blank">mk6032@yahoo.com</a>&gt;</span>
          wrote:<br>
          <blockquote class="gmail_quote" style="margin:0 0 0
            .8ex;border-left:1px #ccc solid;padding-left:1ex">
            <div>
              <div style="font-size:12pt;font-family:times new roman,new
                york,times,serif">Does anyone know how to create a tag
                cloud based on the body of an email? The Google gods
                point me in the direction of Outlook plugins, but I'm
                looking for something more Linux CLI scriptable; maybe
                something that could parse through an exported
                mailbox/folder.<br>
              </div>
            </div>
            <br>
            _______________________________________________<br>
            Chugalug mailing list<br>
            <a moz-do-not-send="true"
              href="mailto:Chugalug@chugalug.org">Chugalug@chugalug.org</a><br>
            <a moz-do-not-send="true"
              href="http://chugalug.org/cgi-bin/mailman/listinfo/chugalug"
              target="_blank">http://chugalug.org/cgi-bin/mailman/listinfo/chugalug</a><br>
            <br>
          </blockquote>
        </div>
        <br>
      </div>
      <br>
    </blockquote>
    <br>
  </body>
</html>