MBDN-L Archives

October 2014

MBDN-L@LISTSERV.GMU.EDU

From: Maksim Tsvetovat <[log in to unmask]>
Sender: Mason Big Data Network <[log in to unmask]>
Reply-To: Mason Big Data Network <[log in to unmask]>
Date: Sun, 26 Oct 2014 09:51:52 -0700

Spark runs on top of Hadoop and is the next iteration of the software -- I suggest learning it first and foremost. Nobody will be doing raw Hadoop in a year or two.


On Sun, Oct 26, 2014 at 12:20 PM, Sachin Garg <[log in to unmask]>
wrote:

> Thanks for the pointers, Max.
> You guessed it right - we are essentially looking to familiarize
> ourselves with the Big Data processing paradigm and understand the tool
> chain involved - hence the request for a Hadoop cluster.
> We will look at Digital Ocean as well as Spark/Shark.
> However, we would still prefer something within the university - I know
> a shared cluster is there - but where and how to access it remains a
> question.
> Best,
> Sachin
> On Saturday 25 October 2014 06:08 PM, Maksim Tsvetovat wrote:
>> If all you're trying to analyze is 20 gigs you don't need Hadoop. I
>> suggest loading it into an SQL database and crunching on a desktop box.
>> If the point is to set up and use Hadoop, on Digital Ocean you can rent a
>> requisite cluster for about the price of a latte.
>> 
>> Also -- don't use raw Hadoop, use Spark/Shark.
>> 
>> For another fun tool -- Google BigQuery.
>> 
>> 
>> On Sat, Oct 25, 2014 at 5:23 PM, Sachin Garg <[log in to unmask]> wrote:
>> 
>>     Hi,
>> 
>>     We are looking for a Hadoop cluster to analyze airline on-time
>>     arrival data for Dr. Borne's CS695 class.
>> 
>>     Wondering if anyone can point us to a Hadoop cluster on campus to do
>>     this? I remember that we have a shared computing resource in the
>>     university that we can use.
>> 
>>     Best,
>>     Sachin
>> 
>>     -- 
>>     Sachin Garg <[log in to unmask]>
>>     Doctoral Student
>>     School of Policy, Government, and International Affairs
>>     George Mason University, Arlington, VA 22201
>>     Phone: +1-703-993-8647 Cell: +1-703-996-9445
>>     SSRN page: http://ssrn.com/author=690016
>> 
>> 
> -- 
> Sachin Garg <[log in to unmask]>
> Doctoral Student
> School of Policy, Government & Intl. Affairs
> George Mason University
> 3351 Fairfax Drive MS3B1, Arlington, VA 22201, USA
> Phone: +1-703-993-8647  Cell: +1-703-996-9445
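
The suggestion earlier in the thread, loading the data into an SQL database and crunching it on a desktop box, might look roughly like the sketch below. It uses Python's built-in sqlite3 module as a stand-in for whatever SQL database one prefers. The column names (carrier, origin, dest, arr_delay) and the sample rows are hypothetical and only illustrate the shape of the approach; the real on-time data set has its own schema.

```python
import csv
import io
import sqlite3

# Hypothetical sample rows; the real on-time data has its own, wider schema.
SAMPLE_CSV = """carrier,origin,dest,arr_delay
AA,IAD,ORD,12
AA,DCA,ORD,-3
UA,IAD,SFO,45
UA,IAD,DEN,5
"""


def load_flights(conn, csv_file):
    """Create the flights table if needed and bulk-insert rows from a CSV file object."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS flights "
        "(carrier TEXT, origin TEXT, dest TEXT, arr_delay REAL)"
    )
    rows = (
        (r["carrier"], r["origin"], r["dest"], float(r["arr_delay"]))
        for r in csv.DictReader(csv_file)
    )
    conn.executemany("INSERT INTO flights VALUES (?, ?, ?, ?)", rows)
    conn.commit()


def mean_delay_by_carrier(conn):
    """Return a dict mapping each carrier to its mean arrival delay."""
    cur = conn.execute(
        "SELECT carrier, AVG(arr_delay) FROM flights "
        "GROUP BY carrier ORDER BY carrier"
    )
    return dict(cur.fetchall())


# An in-memory database keeps the sketch self-contained; for real data,
# pass a file path to sqlite3.connect and open the actual CSV file instead.
conn = sqlite3.connect(":memory:")
load_flights(conn, io.StringIO(SAMPLE_CSV))
delays = mean_delay_by_carrier(conn)
print(delays)
```

For a data set in the 20 GB range, an on-disk database file and an index on the grouping column would likely be worth adding, though that depends on the queries being run.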
