Thanks for the pointers, Max.
You guessed it right: we are essentially looking to familiarize
ourselves with the Big Data processing paradigm and understand the
toolchain involved, hence the request for a Hadoop cluster.
We will look at Digital Ocean as well as Spark/Shark.
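For concreteness, the kind of computation we have in mind is roughly the
following PySpark sketch; the HDFS path and column indices below are
placeholders of our own, not the actual layout of the on-time CSVs:

    # Rough sketch only; the file path and column indices are placeholders.
    from pyspark import SparkContext

    sc = SparkContext(appName="ontime-arrival")

    lines = sc.textFile("hdfs:///data/ontime/*.csv")
    header = lines.first()
    rows = lines.filter(lambda l: l != header).map(lambda l: l.split(","))

    # Assume column 8 holds the carrier code and column 44 the arrival delay.
    delays = (rows.filter(lambda r: len(r) > 44 and r[44] not in ("", "NA"))
                  .map(lambda r: (r[8], (float(r[44]), 1)))
                  .reduceByKey(lambda a, b: (a[0] + b[0], a[1] + b[1]))
                  .mapValues(lambda s: s[0] / s[1]))  # mean delay per carrier

    for carrier, mean_delay in delays.collect():
        print(carrier, mean_delay)

Nothing more exotic than that; we mainly want to run such a job end to
end on a real cluster.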
However, we would still prefer something within the university. I know
there is a shared cluster, but where it is and how to access it remains
an open question.
Best,
Sachin
On Saturday 25 October 2014 06:08 PM, Maksim Tsvetovat wrote:
> If all you're trying to analyze is 20 gigs, you don't need Hadoop. I
> suggest loading it into an SQL database and crunching on a desktop box.
> If the point is to set up and use Hadoop, you can rent a suitable
> cluster on Digital Ocean for about the price of a latte.
>
> Also -- don't use raw Hadoop, use Spark/Shark.
>
> For another fun tool -- google BigQuery.
>
>
>
> On Sat, Oct 25, 2014 at 5:23 PM, Sachin Garg <[log in to unmask]> wrote:
>
> Hi,
>
> We are looking for a Hadoop cluster to analyze airline on-time arrival
> data for Dr. Borne's CS695 class.
>
> Can anyone point us to a Hadoop cluster on campus that we could use for
> this? I remember that the university has a shared computing resource
> available to us.
>
> Best,
> Sachin
>
> --
> Sachin Garg <[log in to unmask]>
> Doctoral Student
> School of Policy, Government, and International Affairs
> George Mason University, Arlington, VA 22201
> Phone: +1-703-993-8647 Cell: +1-703-996-9445
> SSRN page: http://ssrn.com/author=690016
>
>
--
Sachin Garg <[log in to unmask]>
Doctoral Student
School of Policy, Government & Intl. Affairs
George Mason University
3351 Fairfax Drive MS3B1, Arlington, VA 22201, USA
Phone: +1-703-993-8647 Cell: +1-703-996-9445