Mapreduce job in HADOOP using Python

Question

philipalex 0 Newbie Poster

12 Years Ago

Hello, Friends,

I am new to Python, and I currently have a requirement to create a MapReduce job in Hadoop using Python. I have searched everywhere but couldn't find any leads.

1) First, I need to read a key/value pair from the Hadoop sequential file.
2) Second, I need to uncompress the value read from the sequential file.

Please help me. Is there any built-in class in Hadoop that can be used from Python to read the sequential file?

python python-2


All 4 Replies

TrustyTony 888 pyMod

12 Years Ago

What is your current knowledge of Hadoop? How does it compare, for example, with this tutorial: http://www.michael-noll.com/tutorials/writing-an-hadoop-mapreduce-program-in-python/

Lucaci Andrew 140 Za s|n · Answer 1 · 2012-02-15T20:46:19+00:00

One way to do this is with dictionaries: you can store the values read from the sequential file in a dictionary, which gives you both the keys and the values.
For more information about dictionaries, see http://docs.python.org/library/stdtypes.html#dict , the Python manual that came with your installer, or the online Python documentation.
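
To make the dictionary idea concrete, here is a minimal sketch. It assumes the records reach the Python script as text lines of the form key<TAB>value, which is how Hadoop Streaming passes records to a script on standard input. The function name parse_records and the sample lines are made up for illustration; in a real job you would pass sys.stdin instead of the sample list.

```python
def parse_records(lines, sep="\t"):
    """Collect key/value pairs from text lines into a dictionary.

    Assumption: each line looks like "key<sep>value". If a key occurs
    more than once, the later value overwrites the earlier one.
    """
    records = {}
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue  # skip blank lines
        # partition splits on the first separator only, so the value
        # may itself contain further separator characters
        key, _, value = line.partition(sep)
        records[key] = value
    return records


# Hypothetical sample input standing in for sys.stdin
sample = ["key1\tvalue1\n", "key2\tvalue2\n", "\n"]
records = parse_records(sample)
for key in sorted(records):
    print("%s -> %s" % (key, records[key]))
```

Note that a dictionary keeps everything in memory, so this only suits inputs small enough to fit; for a large file it is usually better to process one record at a time.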

philipalex 0 Newbie Poster · Answer 2 · 2012-02-15T21:45:09+00:00

What is your current knowledge of Hadoop? Compared for example with http://www.michael-noll.com/tutorials/writing-an-hadoop-mapreduce-program-in-python/

@pyTony: Thank you for your post. I have successfully run the code described in this tutorial, http://www.michael-noll.com/tutorials/writing-an-hadoop-mapreduce-program-in-python/ , in Hadoop, and I am now looking into how to read a sequential file.

philipalex 0 Newbie Poster · Answer 3 · 2012-02-15T21:50:09+00:00

One way to do this is with dictionaries: you can store the values read from the sequential file in a dictionary, which gives you both the keys and the values.
For more information about dictionaries, see http://docs.python.org/library/stdtypes.html#dict , the Python manual that came with your installer, or the online Python documentation.

Thank you very much for your link. I am currently checking it. Could you also please give me some links on reading a binary sequential file in Hadoop? I need to distinguish the key from the compressed binary data. Please help me.
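
As a rough illustration of the "separate the key from the compressed value" step, here is a sketch in Python. It rests on assumptions the thread does not confirm: that each record arrives as a text line key<TAB>value, that the value was compressed with zlib, and that it was base64-encoded so it can travel safely as text. The names decode_value and split_record are hypothetical. If the file uses a different codec or encoding, the decoding step would need to change accordingly.

```python
import base64
import zlib


def decode_value(encoded_value):
    """Undo the assumed encoding: base64 text wrapping zlib-compressed bytes."""
    compressed = base64.b64decode(encoded_value)
    return zlib.decompress(compressed)


def split_record(line, sep="\t"):
    """Split one "key<sep>value" line and decompress the value part."""
    key, _, encoded_value = line.rstrip("\n").partition(sep)
    return key, decode_value(encoded_value)


# Build a sample record the same way the assumed producer would
payload = b"hello hadoop"
encoded = base64.b64encode(zlib.compress(payload)).decode("ascii")
sample_line = "row1\t" + encoded + "\n"

key, value = split_record(sample_line)
print(key)
print(value)
```

The key stays plain text while only the value goes through decoding, which keeps the two parts clearly apart.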
