Pig Latin

Discussion in 'OT Technology' started by Peyomp, Nov 26, 2009.

  1. Peyomp

    Peyomp New Member

    Joined:
    Jan 11, 2002
    Messages:
    14,017
    Likes Received:
    0
    Check out Apache Pig: http://hadoop.apache.org/pig/docs/r0.5.0/piglatin_reference.html

    It's a dataflow abstraction for analyzing big data on a Hadoop cluster, similar to SQL but procedural. User Defined Functions are easily defined in Java. You type your procedure into the grunt prompt, and it doesn't execute until you STORE or DUMP the data. The reason: you might be mapping 100TB, and you don't want to execute that until you're sure you've got it right, since it can take a while and use a lot of resources. So you run the ILLUSTRATE command, which samples your data (and generates some of its own where needed to satisfy filters), and when you're happy you fire the job by executing your STORE. You can execute script files as well, for regular jobs.
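    To make the "nothing runs until you store" idea concrete, here is a rough Python analogy. It is not Pig and not how Pig is implemented. The `Relation` class and its `filter`, `foreach`, `illustrate`, and `store` methods are hypothetical names invented for this sketch, and this `illustrate` only takes the first few input records; it does not generate extra data to satisfy filters the way the real command does.

```python
from itertools import islice


class Relation:
    """Hypothetical stand-in for a Pig relation: it records a plan and runs nothing."""

    def __init__(self, source, steps=()):
        self.source = source        # any re-iterable collection of records
        self.steps = tuple(steps)   # pending (kind, function) operations

    def filter(self, predicate):
        # Return a new plan with one more step; no data is touched here.
        return Relation(self.source, self.steps + (("filter", predicate),))

    def foreach(self, func):
        return Relation(self.source, self.steps + (("foreach", func),))

    def _run(self, records):
        # Chain the recorded steps lazily over the given records.
        for kind, func in self.steps:
            if kind == "filter":
                records = filter(func, records)
            else:
                records = map(func, records)
        return records

    def illustrate(self, n=3):
        """Run the plan on only the first n input records (a crude sample)."""
        return list(self._run(islice(self.source, n)))

    def store(self):
        """Run the plan on the full input; this is the expensive step."""
        return list(self._run(self.source))


# Made-up records standing in for a huge log data set.
logs = [{"status": 200}, {"status": 503}, {"status": 500}, {"status": 404}]

# Building the plan does no work yet.
plan = Relation(logs).filter(lambda r: r["status"] >= 500).foreach(lambda r: r["status"])

sample = plan.illustrate(2)   # cheap check on two records
result = plan.store()         # full run, only once the sample looks right
```

    The point of the design is that each call only appends to a plan, so mistakes show up on the small sample before the big job is ever launched.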

    It is hella fun to have a little prompt at your hands that drives many TBs on several racks of machines :big grin:

    You can even use this on your own Apache logs. The nice thing is - no tables, no database, works on flat files. TSV - the future - hah! :) Actually a lot of people use Thrift, but it's not very relational - you do far fewer JOINs in Hadoop.
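    As a small illustration of the "no tables, no database" point, this Python sketch counts requests per status code straight from tab-separated lines. The three-column layout (host, path, status) and the `count_by_status` helper are assumptions made up for the example; real Apache logs would need converting to TSV first, and in Pig the same thing would be a load, a group, and a count.

```python
import csv
import io
from collections import Counter


def count_by_status(tsv_lines):
    """Count requests per HTTP status from tab-separated (host, path, status) rows."""
    reader = csv.reader(tsv_lines, delimiter="\t")
    # Skip short or blank rows rather than failing on them.
    return Counter(row[2] for row in reader if len(row) >= 3)


# Made-up sample standing in for a flat log file on disk.
sample_file = io.StringIO(
    "10.0.0.1\t/index.html\t200\n"
    "10.0.0.2\t/missing\t404\n"
    "10.0.0.1\t/about\t200\n"
)

counts = count_by_status(sample_file)
```

    With a real file you would pass `open("access.tsv", newline="")` instead of the `StringIO` object; the file name here is just a placeholder.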

    For instance, this is how Pig is used at Twitter: http://www.slideshare.net/kevinweil/hadoop-pig-and-twitter-nosql-east-2009
     
  2. CodeX

    CodeX Guest

    pfft... I made a Pig Latin translator in QuickBASIC when I was 7 years old
     
  3. Peyomp

    Peyomp New Member

    Joined:
    Jan 11, 2002
    Messages:
    14,017
    Likes Received:
    0
    :rofl: good one :)
     
  4. Curren$y

    Curren$y New Member

    Joined:
    Aug 18, 2004
    Messages:
    20,275
    Likes Received:
    0
    man i still remember when u bleached ur beard, how long ago was that?
     
