Hi
I can do this for you.
I have been working in the Big Data domain for more than 3 years and am a Cloudera and IBM certified Hadoop developer, so this is well within my domain.
Well, that was just to introduce myself; now on to the project. We can do the work over multiple milestones/phases.
Before that, I need to know the exact data model of the input data. I assume the schema is roughly CSV, right? Also, what strategy are you using for ingesting data onto AWS: custom tools, or something like Sqoop or Flume?
Also, the description says you need to analyze the data; could you please give me some more details about that?
Also, is there any direct dependency on MapR, or can we use other distributions like Cloudera?
Please feel free to contact me.
Thanks,