Big Data & Hadoop

Since everything moves fast in the IT world, new technologies are often in their 3rd or 4th generation by the time you get a chance to get your hands dirty with them. Big Data has been one of them: an alluring technology that brings massive distributed processing power to large datasets using the famous MapReduce programming model. Apache Hadoop, an open-source implementation modeled on Google's MapReduce and file system papers, scales to massive proportions and has been used by tech giants like Yahoo and Facebook.
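To make the map-reduce idea concrete, here is a minimal single-machine sketch in plain Python of the classic word count. It is not Hadoop's API and nothing here is distributed; the function names (map_phase, shuffle, reduce_phase, word_count) and the sample input are made up for illustration. On a real cluster, the framework would run many map and reduce tasks in parallel and handle the shuffle step itself.

```python
from collections import defaultdict


def map_phase(line):
    """Emit a (word, 1) pair for every word in one line of input."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Group all values by key, the step a framework performs between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Collapse the list of counts for one word into a single total."""
    return key, sum(values)


def word_count(lines):
    """Run map, shuffle and reduce in sequence on an in-memory list of lines."""
    pairs = [pair for line in lines for pair in map_phase(line)]
    return dict(reduce_phase(key, values) for key, values in shuffle(pairs).items())


# Hypothetical sample input, just to exercise the three phases.
counts = word_count(["big data big cluster", "data moves fast"])
print(counts)
```

The point of splitting the work this way is that each map call looks at one line only and each reduce call looks at one key only, so both can be spread across many machines without coordination.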

I decided to start running a Hadoop cluster myself, using the following guide as a starting point.

https://github.com/GoogleCloudPlatform/solutions-google-compute-engine-cluster-for-hadoop/blob/master/README.md

This version installs Hadoop locally but uses Google Compute Engine and Google Cloud Storage, and allows basic scaling/clustering. I started working through the prerequisites on a CentOS 6.4 VM and things were going OK. Then I realized that I needed to go deeper into Hadoop first and run a sample locally before tackling the cloud version. So I went to the following:

http://tecadmin.net/steps-to-install-hadoop-on-centosrhel-6/#

It had simple enough steps to get it installed. Now I am reading Hadoop in Action by Alex Holmes.

