Prospance Infotech Inc is a staffing and recruiting company.
Responsibilities:
· Create new, and maintain existing, Spark jobs written in Scala
· Create new, and maintain existing, Flink jobs written in Scala
· Produce unit and system tests for all code
· Participate in design discussions to improve our existing frameworks
· Define scalable calculation logic for interactive and batch use cases
· Interact with infrastructure and data teams to produce complex analysis across data
Required Qualifications:
· A minimum of 2 years of experience with Scala and/or Java
· A minimum of 5 years of programming experience
· Experience with Hadoop and Spark
· Knowledge of and experience with cloud-based technologies
· Experience in batch or real-time data streaming
· Ability to adapt quickly to conventional big-data frameworks and open-source tools as project demands require
· Knowledge of design strategies for developing scalable, resilient, always-on data lakes
· Strong development/automation skills
· Must be very comfortable with reading and writing Scala code
· An aptitude for analytical problem solving
· Deep knowledge of troubleshooting and tuning Spark applications and Hive scripts to achieve optimal performance
· Good understanding of Hadoop and HDFS architecture and components such as JobTracker, TaskTracker, NameNode, DataNode, and HDFS high availability (HA), as well as the MapReduce programming paradigm
· Experience working with various Hadoop distributions (Cloudera, Hortonworks, MapR, Amazon EMR) to fully implement and leverage new Hadoop features
· Experience in developing Spark applications using Spark RDD, Spark SQL, Spark on YARN, Spark MLlib, and DataFrame APIs
· Experience with real-time data processing and streaming techniques using Spark Streaming and Kafka, including moving data into and out of HDFS and RDBMSs
· Familiarity with open source configuration management and development tools
Preferred Qualifications:
· Hands-on experience with, and production use of, Hadoop/Cassandra, Spark, Flink, and other distributed technologies is a plus
· Other technologies:
o ScalaTest
o Gradle/Maven
o Airflow
o SQL
o AWS