Company: SPH
Responsible for the implementation and ongoing administration of Hadoop infrastructure.
Aligning with the systems engineering team to propose and deploy new hardware and software environments required for Hadoop.
Working with data delivery teams to set up new Hadoop users, including configuring and testing HDFS, Hive, Pig, and MapReduce access for each new user.
Cluster maintenance, including the addition and removal of nodes, using tools such as Ganglia, Nagios, Cloudera Manager Enterprise, and Dell OpenManage.
Performance tuning of Hadoop clusters and Hadoop MapReduce routines.
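Node removal in practice is usually a decommission driven by the NameNode's exclude host list plus a refresh. A minimal sketch, assuming the exclude file configured by `dfs.hosts.exclude` lives at `/etc/hadoop/conf/dfs.exclude` and the commands run as the HDFS superuser (both are illustrative assumptions, not details from this posting):

```shell
#!/usr/bin/env bash
# Sketch: decommission a DataNode before removing it from the cluster.
# Hostname and file path below are hypothetical placeholders.
set -euo pipefail

NODE="worker-07.example.internal"            # node being removed (assumed name)
EXCLUDE_FILE="/etc/hadoop/conf/dfs.exclude"  # path configured via dfs.hosts.exclude

# 1. Add the node to the exclude list.
echo "${NODE}" >> "${EXCLUDE_FILE}"

# 2. Tell the NameNode and ResourceManager to re-read their host lists;
#    HDFS then re-replicates the node's blocks onto the remaining nodes.
hdfs dfsadmin -refreshNodes
yarn rmadmin -refreshNodes

# 3. Check cluster state; the node is safe to shut down once it reports
#    "Decommissioned" rather than "Decommission in progress".
hdfs dfsadmin -report
```

Once the report shows the node as Decommissioned, it can be powered off and dropped from the include list.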
Job Requirements
The Essentials:
3-5 years of hands-on experience with the Hadoop ecosystem; experience on the AWS EMR cloud platform is a must
At least 3 years of experience as a Hadoop administrator, including work on more than one cluster and hands-on Hadoop cluster setup experience
Extensive experience working with data warehouses and big data platforms
Oversees implementation and ongoing administration of Big Data infrastructure, specifically EMR
Should be able to scale the cluster up to meet ongoing requirements
Manages Big Data components/frameworks such as Hadoop, Spark, HBase, Hadoop Distributed File System (HDFS), Oozie, Avro, Kafka, Kibana, Hue, Hive, and YARN on the AWS EMR cloud
Analyzes the latest Big Data analytics technologies and their innovative applications in both business intelligence analysis and new offerings
Coordinates with the DevOps and networking teams to propose and enhance the Big Data environment
Handles cluster monitoring and maintenance, and performs POCs of new capabilities on the Hadoop platform
Strong hands-on shell/Linux experience
Able to tune cluster performance and carry out capacity planning
Ensures the data and cluster are secured in accordance with organizational policy
Collaborates with Data Engineers and Data Scientists to deploy their jobs, with continued support for versioning and maintenance
Acts as the point of contact for business and IT end-user escalations
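On EMR, scaling the cluster up typically means raising the instance count of a core or task instance group. A minimal AWS CLI sketch (the cluster ID and instance-group ID are hypothetical placeholders, not values from this posting):

```shell
#!/usr/bin/env bash
# Sketch: resize an EMR task instance group with the AWS CLI.
# j-XXXXXXXXXXXXX and ig-XXXXXXXXXXXXX are placeholder IDs.
set -euo pipefail

CLUSTER_ID="j-XXXXXXXXXXXXX"

# Inspect the cluster's instance groups to find the group to resize.
aws emr list-instance-groups --cluster-id "${CLUSTER_ID}"

# Grow the chosen group to 8 instances; EMR provisions the new nodes
# and joins them to YARN automatically.
aws emr modify-instance-groups \
  --cluster-id "${CLUSTER_ID}" \
  --instance-groups InstanceGroupId=ig-XXXXXXXXXXXXX,InstanceCount=8
```

Shrinking works the same way with a lower InstanceCount; EMR drains the surplus nodes before terminating them.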
Data Engineer Qualifications / Skills:
BS or MS degree in Computer Science or a related technical field
4+ years of experience programming and/or architecting in a back-end language (Java, J2EE, Core Java)
3+ years of experience with non-relational and relational databases (SQL, MySQL, NoSQL, Hadoop, MongoDB, etc.)
Experience with Java-oriented technologies (JBoss, Spring, Spring MVC, Hibernate, REST/SOAP)
3+ years of experience with Spark or the Hadoop ecosystem and similar frameworks