[Chugalug] tag cloud a mailing list

Matt Keys mk6032 at yahoo.com
Sat Dec 15 09:34:08 UTC 2012


I ran across a few like that, too. I'm a bit confused about the 
difference between a word cloud and a tag cloud. I'm guessing a tag cloud 
presumes that you've attached some form of tag to each text, which 
the code then sorts on, whereas with a word cloud you just point the 
code at a pile of text that has not been tagged or grouped?

On 12/15/2012 03:44 AM, Sean Brewer wrote:
> I ran across this: https://github.com/larsmans/weighwords
>
> It might make what you want to do even easier.
>
> On Sat, Dec 15, 2012 at 2:14 AM, Sean Brewer <seabre986 at gmail.com> wrote:
>
>     Actually, you want to do something called lemmatisation, not
>     stemming. The two are related, but stemming does something
>     slightly different. Lemmatisation does what I described.
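
To make the distinction concrete, here is a toy sketch in plain Python. It is not how NLTK or any real library does it: the suffix rules and the small LEMMAS table are made up for illustration. The point is only that a stemmer chops endings and may leave a non-word, while a lemmatiser looks up the dictionary form.

```python
# Hypothetical, hand-written lemma table; a real lemmatiser consults a full dictionary.
LEMMAS = {"passing": "pass", "studies": "study", "better": "good", "mice": "mouse"}


def toy_stem(word):
    """Chop a common suffix without checking that the result is a real word."""
    for suffix in ("ing", "es", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def toy_lemmatise(word):
    """Look the word up in the table; fall back to the word itself."""
    return LEMMAS.get(word, word)


# Print each word next to its toy stem and its toy lemma.
for word in ("passing", "studies", "mice"):
    print(word, toy_stem(word), toy_lemmatise(word))
```

Both functions agree on "passing", but the stemmer turns "studies" into "studi" and leaves "mice" alone, while the lookup gives "study" and "mouse".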
>
>     I can probably whip up a dirty example with python and nltk.
>
>
>     On Fri, Dec 14, 2012 at 10:14 AM, Sean Brewer <seabre986 at gmail.com> wrote:
>
>         If you can export the e-mails easily, the general algorithm is
>         something like this:
>         1. Tokenize the words in the e-mail body.
>         2. Remove stop words (a, an, the, etc. You can find word
>         lists, and libraries like NLTK have them built in).
>         3. Use a stemming algorithm to reduce each word token to its
>         base form (I think the correct term is free morpheme), e.g.
>         convert the token "passing" to "pass".
>         4. Rank the results by frequency.
>
>         That should get you in the neighborhood.
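
The four steps above could be sketched roughly like this, using only the Python standard library. The stop-word list, the suffix rules, and the function names are all assumptions made for illustration; a real run would more likely use NLTK's stop words and a proper stemmer or lemmatiser.

```python
import re
from collections import Counter

# Tiny illustrative stop-word list (an assumption); real lists are much longer.
STOP_WORDS = {"a", "an", "the", "and", "or", "of", "to", "in", "is", "it", "for", "on", "was"}


def tokenize(text):
    """Step 1: lowercase the text and pull out runs of letters and apostrophes."""
    return re.findall(r"[a-z']+", text.lower())


def crude_stem(token):
    """Step 3: very rough suffix stripping, a stand-in for a real stemmer."""
    for suffix in ("ing", "ed", "es", "s"):
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def rank_terms(text, top_n=10):
    """Steps 2 and 4: drop stop words, stem what is left, rank by frequency."""
    tokens = [t for t in tokenize(text) if t not in STOP_WORDS]
    return Counter(crude_stem(t) for t in tokens).most_common(top_n)


sample = "Passing the test is passing. The tests passed and the test was passing."
print(rank_terms(sample, top_n=2))
```

The (term, count) pairs that come back are what a cloud renderer would use to size each word. The suffix stripping is deliberately crude and will mangle plenty of words.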
>
>         On Fri, Dec 14, 2012 at 9:11 AM, Matthew Keys
>         <mk6032 at yahoo.com> wrote:
>
>             Does anyone know how to create a tag cloud based on the
>             body of an email? The Google gods point me in the
>             direction of Outlook plugins, but I'm looking for
>             something more Linux CLI scriptable; maybe something that
>             could parse through an exported mailbox/folder.
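
For the "parse through an exported mailbox" part, one possible starting point is Python's standard mailbox module, assuming the export is in mbox format (the thread does not say which format is available). The function name below is made up for this sketch.

```python
import mailbox


def iter_plain_text_bodies(mbox_path):
    """Yield the decoded text/plain parts of every message in an mbox file."""
    box = mailbox.mbox(mbox_path, create=False)  # create=False: fail if the file is missing
    try:
        for message in box:
            for part in message.walk():
                if part.get_content_type() != "text/plain":
                    continue
                payload = part.get_payload(decode=True)  # bytes, or None for containers
                if payload is None:
                    continue
                charset = part.get_content_charset() or "utf-8"
                try:
                    yield payload.decode(charset, errors="replace")
                except LookupError:
                    # Unknown charset label: fall back to UTF-8 with replacement.
                    yield payload.decode("utf-8", errors="replace")
    finally:
        box.close()
```

Each yielded string is one message body, ready to feed into whatever does the word counting. HTML-only messages are skipped here, and a Maildir export would need mailbox.Maildir instead.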
>
>             _______________________________________________
>             Chugalug mailing list
>             Chugalug at chugalug.org
>             http://chugalug.org/cgi-bin/mailman/listinfo/chugalug
>
