RE: Big Memory

From: Matt Keys 
------------------------------------------------------
I built a small vmware host in my newegg wishlist that was a lot of bang for
the buck.  My notes on it were "16 cores at 2GHz, 24GB DDR3 1066, 2.5TB disk
in a 4U for under $2000"

 

1x rosewill rsv-l4000 4U case $109

1x asus kgpe-d16 dual socket g34 $399

1x corsair 750w $115

2x amd opteron 6128 (8 cores each) $520

1x g.skill 24gb (6x4gb) ddr3 1333 $239 (out of stock now)

5x 500gb sata WD caviar blacks (deactivated now, was around ~$50/ea)

1x arctic cooling thermal compound $13

 

 

From: chugalug-bounces@chugalug.org [mailto:chugalug-bounces@chugalug.org]
On Behalf Of Eric Wolf
Sent: Tuesday, June 28, 2011 1:32 PM
To: CHUGALUG
Subject: Re: [Chugalug] Big Memory

 

I think I'm going to take the next logical/lazy step and write the index to
SQLite and let the library do the dirty work for me. I'm spending too much
time thinking about this.
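
A minimal sketch of what "write the index to SQLite" could look like, using only Python's standard sqlite3 module. The table name (seen), the column name (key), the helper names, and the batch size are all assumptions made for illustration; nothing in this thread specifies the actual schema or key format.

```python
import sqlite3


def open_index(path):
    """Open (or create) an on-disk index; the schema here is hypothetical."""
    conn = sqlite3.connect(path)
    # PRIMARY KEY gives uniqueness plus a B-tree for fast membership lookups.
    conn.execute("CREATE TABLE IF NOT EXISTS seen (key TEXT PRIMARY KEY)")
    return conn


def add_keys(conn, keys, batch_size=10000):
    """Insert keys in batches; duplicates are ignored, much like set.add()."""
    sql = "INSERT OR IGNORE INTO seen (key) VALUES (?)"
    batch = []
    for key in keys:
        batch.append((key,))
        if len(batch) >= batch_size:
            conn.executemany(sql, batch)
            conn.commit()
            batch = []
    if batch:
        conn.executemany(sql, batch)
        conn.commit()


def contains(conn, key):
    """Membership test, the on-disk counterpart of `key in some_set`."""
    row = conn.execute(
        "SELECT 1 FROM seen WHERE key = ? LIMIT 1", (key,)
    ).fetchone()
    return row is not None


def count(conn):
    """Number of distinct keys stored so far."""
    return conn.execute("SELECT COUNT(*) FROM seen").fetchone()[0]
```

Passing ":memory:" as the path is handy for a quick test; a real run would point at a file on a disk with enough free space. Committing once per batch rather than once per key usually matters a lot for insert throughput.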

 

And yeah, a half TB of RAM seems ridiculous but it's surprisingly doable.
You can build a 1/4 TB RAM machine with parts from NewEgg for under $7K.

 

Figure you guys have been talking about building systems with 1000s of
processors for Bitcoin mining. Makes sense that RAM would work
proportionally as well.

 

We need a "NewEgg Index": What is the phattest machine that can be built
from parts in stock at NewEgg?

 

CPU: How many cores? What speed?

RAM: TBs?

Disk: PBs?

GPU: 10K?

 

The motherboard I was looking at could support 48 CPU cores, 256GB RAM but
the rest gets harder because you wouldn't put too many drives in a single
cabinet (just use NAS) and to get the GPU count up, you are using bus
extenders...

 

Thanks for the input...

 

-Eric


-=--=---=----=----=---=--=-=--=---=----=---=--=-=-
Eric B. Wolf                           720-334-7734






On Tue, Jun 28, 2011 at 11:17 AM, Chad Smith  wrote:

The more I read the more amazed I get...

HALF A TERABYTE OF RAM!!!!

it's like "1.21 JiggaWatts!!!"  (I know it's Gigawatts, but that's not what
the man said.)

- Chad W Smith
"I like a man whose middle name is W." - President George W. Bush - February
10, 2003 bit.ly/gwb-dubya





On Tue, Jun 28, 2011 at 12:09 PM, Aaron welch  wrote:

Hive running on a Cassandra ring would be easier.  That gives you an SQL
interface over a distributed node cluster with linear performance gains from
adding new hosts.

 

http://www.datastax.com/products/brisk

 

-AW

 

On Tue, Jun 28, 2011 at 1:06 PM, Eric Wolf  wrote:

Like I said, I'm being lazy with the code. Map-Reducing the problem is not
lazy.

 

-Eric


-=--=---=----=----=---=--=-=--=---=----=---=--=-=-
Eric B. Wolf                           720-334-7734






On Tue, Jun 28, 2011 at 10:58 AM, Ryan Bales  wrote:

You don't need big memory if you're able to distribute the load with
something like MapReduce. I know GAE supports MapReduce, and I'm sure there
are others.  GAE also supports WSGI, so you're good to go with python.
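
A rough single-machine sketch of the idea behind that suggestion: split the keys into buckets by a stable hash so that each bucket can be deduplicated on its own, then combine the per-bucket results. This is not the GAE MapReduce API; the function names and the bucket count are made up for illustration, and a real job would write buckets to separate files or workers rather than keep them in memory.

```python
import zlib
from collections import defaultdict


def partition(keys, n_buckets):
    """Map step: route each key to a bucket using a stable hash (CRC32)."""
    buckets = defaultdict(list)
    for key in keys:
        bucket_id = zlib.crc32(key.encode("utf-8")) % n_buckets
        buckets[bucket_id].append(key)
    return buckets


def unique_count(keys, n_buckets=4):
    """Reduce step: deduplicate each bucket independently, then sum.

    Equal keys always hash to the same bucket, so the per-bucket
    counts never double-count a key.
    """
    buckets = partition(keys, n_buckets)
    return sum(len(set(bucket)) for bucket in buckets.values())
```

The point of the split is that only one bucket's set has to fit in RAM at a time, which trades extra passes over the data for a smaller peak memory footprint.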

~Ryan Bales






On Tue, Jun 28, 2011 at 11:20 AM, Eric Wolf  wrote:

I'm currently trying to work with a really big data file (473GB) with some
Python code. I'm building an index in RAM in Python with a set. Currently, I
am running out of RAM (and VM) on my system with 8GB of RAM and 12GB of VM. I
have two options: rewrite the code so it's slower but fits in my available
memory or push it out somewhere where I can have the RAM to do the job.
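
One way to gauge how much RAM an in-memory set would need is to measure a small sample and extrapolate. The sketch below assumes string keys and a CPython interpreter; the helper name and the key counts are hypothetical, since the thread does not say how many records the 473GB file holds. sys.getsizeof ignores allocator overhead, so treat the result as a lower bound.

```python
import sys


def estimate_set_bytes(sample_keys, total_keys):
    """Extrapolate the memory footprint of a set from a small sample.

    Counts the set's own hash table plus the size of each key object,
    then scales the per-key average up to total_keys.
    """
    sample = set(sample_keys)
    if not sample:
        return 0
    table_bytes = sys.getsizeof(sample)
    key_bytes = sum(sys.getsizeof(k) for k in sample)
    per_key = (table_bytes + key_bytes) / len(sample)
    return int(per_key * total_keys)


if __name__ == "__main__":
    # Hypothetical numbers: 1,000 sample keys scaled to one billion keys.
    sample = [str(i) for i in range(1000)]
    est = estimate_set_bytes(sample, 1_000_000_000)
    print("estimated GiB: %.1f" % (est / 2**30))
```

Even short string keys cost tens of bytes each in CPython once object headers and the hash table are counted, which is why a set can outgrow RAM well before the raw key data would.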

 

The "slower" bit may end up being a deal breaker because I anticipate the
jobs to take a couple days even working straight from RAM. "Slower" might
mean weeks or months. So I have time to explore finding someplace else to
run this.


So what I need is a platform that provides a reasonably current Python
installation, 512GB of RAM and 2-3TBs of disk.

 

Looking on NewEgg, the biggest system I can build is a 256GB RAM box
starting around $6K. I could build a system with 128GB of RAM and use a
512GB SSD for VM for starting around $5K. The money isn't a deal breaker but
it still doesn't guarantee I can achieve what I need - hours or days instead
of weeks or months.

 

The largest EC2 instance Amazon offers has only 68GB of RAM. I'll probably try
that next just because it's a cheaper way to get out of my 8GB physical
limitation.

 

Cloud is more appealing because I really don't want to have to waste a day
or two building a box (in addition to the purchasing headaches). And I may
not need the system after this project.

 

Are there any other options out there for large memory cloud systems? 

 

-Eric


-=--=---=----=----=---=--=-=--=---=----=---=--=-=-
Eric B. Wolf                           720-334-7734