<html>
<head>
<meta content="text/html; charset=ISO-8859-1"
http-equiv="Content-Type">
</head>
<body text="#000000" bgcolor="#FFFFFF">
<div class="moz-cite-prefix">Thanks for the clues! It looks like
Python may be the winner this time.<br>
<br>
<br>
On 12/14/2012 10:14 AM, Sean Brewer wrote:<br>
</div>
<blockquote
cite="mid:CANEHAufCkkT7GRm20ygSA-+QShe=u0iUmn3C52=uXDfAyC8Wqg@mail.gmail.com"
type="cite">
<div>If you can export the e-mails easily, the general algorithm is
something like this:</div>
<div>
<div>1. Tokenize the words in the e-mail body.</div>
<div>2. Remove stop words (a, an, the, etc. You can find word
lists, and libraries like NLTK have them built in)</div>
<div>3. Use a stemming algorithm to reduce word tokens to their
stem (I think the correct vocabulary is free morpheme), e.g.
convert the token "passing" to "pass".</div>
<div>4. Rank by frequency of result.</div>
<div><br>
</div>
<div>That should get you in the neighborhood.</div>
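<div>The four steps above can be sketched in Python roughly as
follows. This is only a minimal sketch using the standard library:
the stop-word list is a tiny illustrative one, and naive_stem is a
crude suffix stripper standing in for a real stemmer such as the ones
NLTK provides. The names tokenize, naive_stem and tag_counts are
made up for this example.</div>

```python
import re
from collections import Counter

# Tiny illustrative stop-word list (an assumption; NLTK ships a fuller one).
STOP_WORDS = {"a", "an", "the", "and", "or", "of", "to", "in", "is", "it",
              "for", "on", "that", "this", "with", "was", "be", "are", "i"}

# Suffixes tried in order by the naive stemmer below.
SUFFIXES = ("ing", "ed", "es", "s")


def tokenize(text):
    """Step 1: lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def naive_stem(word):
    """Step 3: crude suffix stripping, a stand-in for a real stemmer."""
    for suffix in SUFFIXES:
        # Keep at least three characters so short words stay intact.
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[:-len(suffix)]
    return word


def tag_counts(text, top_n=10):
    """Steps 1-4: tokenize, drop stop words, stem, rank by frequency."""
    stems = [naive_stem(t) for t in tokenize(text) if t not in STOP_WORDS]
    return Counter(stems).most_common(top_n)


if __name__ == "__main__":
    body = "Passing the tests and passing a test"
    for stem, count in tag_counts(body):
        print(stem, count)
```

<div>The (stem, count) pairs are what a tag cloud needs: each stem
becomes a tag, and its count can be scaled to a font size.</div>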
<br>
<div class="gmail_quote">On Fri, Dec 14, 2012 at 9:11 AM,
Matthew Keys <span dir="ltr">&lt;<a moz-do-not-send="true"
href="mailto:mk6032@yahoo.com" target="_blank">mk6032@yahoo.com</a>&gt;</span>
wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0
.8ex;border-left:1px #ccc solid;padding-left:1ex">
<div>
<div style="font-size:12pt;font-family:times new roman,new
york,times,serif">Does anyone know how to create a tag
cloud based on the body of an email? The Google gods
point me in the direction of Outlook plugins, but I'm
looking for something more Linux CLI scriptable; maybe
something that could parse through an exported
mailbox/folder.<br>
</div>
</div>
<br>
_______________________________________________<br>
Chugalug mailing list<br>
<a moz-do-not-send="true"
href="mailto:Chugalug@chugalug.org">Chugalug@chugalug.org</a><br>
<a moz-do-not-send="true"
href="http://chugalug.org/cgi-bin/mailman/listinfo/chugalug"
target="_blank">http://chugalug.org/cgi-bin/mailman/listinfo/chugalug</a><br>
<br>
</blockquote>
</div>
<br>
</div>
<br>
</blockquote>
<br>
</body>
</html>