Hello, Friends,

I am new to Python, and I currently have a requirement to create a MapReduce job in Hadoop using Python. I have searched everywhere but couldn't find any leads.

1) First, I need to read a key/value pair from the Hadoop sequential file.
2) Second, I need to uncompress the value read from the sequential file.

Please help me. Is there any built-in class in Hadoop that can be used from Python to read the sequential file?

All 4 Replies

One way to do this is with dictionaries: you can store the values from the sequential file in a dictionary, and you will then have both the keys and the values.
For more information about dictionaries, visit http://docs.python.org/library/stdtypes.html#dict , or read the Python manual that came with your installer, or the online Python documentation.
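
As a minimal sketch of the dictionary idea, assuming the key/value pairs have already been read as text lines of the form "key<TAB>value" (the function name and the sample data are hypothetical, chosen only for illustration):

```python
def pairs_to_dict(lines):
    """Collect tab-separated "key<TAB>value" lines into a dictionary."""
    records = {}
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue  # skip blank lines
        # partition splits on the first tab only, so the value may contain tabs
        key, _, value = line.partition("\t")
        records[key] = value
    return records


# Hypothetical sample input standing in for records read from the file.
sample = ["user1\tfirst value\n", "user2\tsecond value\n"]
records = pairs_to_dict(sample)
print(records["user1"])
```

Note that a dictionary keeps only the last value seen for each key, so this only fits if the keys are unique.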

Thank you very much for the link. I am checking it now. Could you also please give me some links on reading a binary sequential file in Hadoop? I need to distinguish the key from the compressed binary data. Please help me.
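
One possible direction, offered as a rough sketch rather than a definitive answer: with Hadoop Streaming, a Python mapper receives each record on standard input as a line of text, with the key and value separated by a tab by default. The sketch below assumes two things that the thread does not confirm: that the value reaches the mapper as hexadecimal text, and that the underlying bytes were compressed with zlib. The function names `decode_record` and `run_mapper` are hypothetical.

```python
import zlib


def decode_record(line):
    """Split one "key<TAB>value" line and decompress the value.

    Assumption: the value is zlib-compressed data written as hex text,
    optionally with spaces between the byte pairs.
    """
    key, _, hex_value = line.rstrip("\n").partition("\t")
    compressed = bytes.fromhex(hex_value)  # spaces between pairs are ignored
    return key, zlib.decompress(compressed)


def run_mapper(stream, out):
    """Read records from stream and emit "key<TAB>decompressed length"."""
    for line in stream:
        if not line.strip():
            continue  # skip blank lines
        key, value = decode_record(line)
        out.write("%s\t%d\n" % (key, len(value)))


# Hypothetical sample record built locally so the sketch can be tried without Hadoop.
sample_value = zlib.compress(b"hello hadoop").hex()
key, value = decode_record("row1\t" + sample_value + "\n")
print(key, value)
```

In an actual Streaming job the script would call `run_mapper(sys.stdin, sys.stdout)` instead of the sample lines at the end. Streaming's `-inputformat` option with `SequenceFileAsTextInputFormat` is one way to have Hadoop turn the stored keys and values into text before they reach the mapper. If the file uses Hadoop's own record or block compression, Hadoop decompresses it while reading, so a manual step like the one above would only be needed for compression applied by the application that wrote the values.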
