Somesh

Hadoop Developer

======

Summary

Having 13 years of experience in software design and development using the Hadoop Big Data ecosystem and Core Java/J2EE across the Telecommunications, Banking and Finance, and Healthcare domains.

5 of 13 years of IT experience in analyzing, designing, developing, implementing, and testing software applications using HDFS, MapReduce, Pig, Hive, Sqoop, Spark, HBase, MongoDB, Kafka, Azkaban, ZooKeeper, and Oozie.

8 of 13 years of hands-on experience in software design and development using Core Java, Spring Core, Spring AOP, Spring with Hibernate, JDBC, and XML on the UNIX operating system.

Excellent understanding of Hadoop architecture and the different components of a Hadoop cluster (JobTracker, TaskTracker, NameNode, and DataNode).

Involved in Data Ingestion to HDFS from various data sources.

Involved in loading data from LINUX file system to HDFS.

Programmed the processing and generation of big datasets with parallel, distributed algorithms on a cluster using MapReduce, as sketched below.
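
A minimal illustrative sketch of such a MapReduce job, assuming the standard Hadoop Java API; the word-count logic is a hypothetical stand-in for the actual processing:

    import java.io.IOException;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.Mapper;
    import org.apache.hadoop.mapreduce.Reducer;
    import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
    import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

    public class WordCount {
        // Mapper: splits each input line into tokens and emits (token, 1).
        public static class TokenMapper extends Mapper<LongWritable, Text, Text, IntWritable> {
            private static final IntWritable ONE = new IntWritable(1);
            private final Text word = new Text();
            @Override
            protected void map(LongWritable key, Text value, Context context)
                    throws IOException, InterruptedException {
                for (String token : value.toString().split("\\s+")) {
                    if (!token.isEmpty()) {
                        word.set(token);
                        context.write(word, ONE);
                    }
                }
            }
        }

        // Reducer: sums the emitted counts for each token.
        public static class SumReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
            @Override
            protected void reduce(Text key, Iterable<IntWritable> values, Context context)
                    throws IOException, InterruptedException {
                int sum = 0;
                for (IntWritable v : values) sum += v.get();
                context.write(key, new IntWritable(sum));
            }
        }

        public static void main(String[] args) throws Exception {
            Job job = Job.getInstance(new Configuration(), "word count");
            job.setJarByClass(WordCount.class);
            job.setMapperClass(TokenMapper.class);
            job.setCombinerClass(SumReducer.class);
            job.setReducerClass(SumReducer.class);
            job.setOutputKeyClass(Text.class);
            job.setOutputValueClass(IntWritable.class);
            FileInputFormat.addInputPath(job, new Path(args[0]));   // HDFS input directory
            FileOutputFormat.setOutputPath(job, new Path(args[1])); // HDFS output directory
            System.exit(job.waitForCompletion(true) ? 0 : 1);
        }
    }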

Executed Hive commands for reading, writing, and managing large datasets residing in distributed storage using SQL.

Imported and exported data from relational databases and NoSQL databases using Sqoop.

Analyzed large data sets by running Hive queries and Pig scripts.

Good experience in writing Pig scripts.

Involved in optimization of Hive queries.

Analyzed large sets of data by representing them as data flows, applied Pig filters to process the data, and stored the results in HDFS for further processing using Pig Latin scripts.

Automated Sqoop, Hive, and Pig jobs using Oozie scheduling.

Experience in designing and developing POCs using Spark to compare its data processing time with Hive and Pig.

Managed large sets of hosts, coordinating and managing services in a distributed environment using ZooKeeper.

Used the column-oriented HBase NoSQL database for flexibility, high performance, and horizontal scaling of big data.

Wrote an API to store documents using the MongoDB NoSQL database.

Ability to analyze different file formats such as Avro and Parquet.

Good exposure to cluster maintenance.

Involved in configuration and deployment of modules.

Good knowledge of Oozie workflows.

Good knowledge of writing and using user-defined functions in Hive and Pig, as sketched below.
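
A minimal illustrative sketch of a Hive user-defined function, assuming the classic org.apache.hadoop.hive.ql.exec.UDF base class; the upper-casing logic is a hypothetical example:

    import org.apache.hadoop.hive.ql.exec.UDF;
    import org.apache.hadoop.io.Text;

    // Hypothetical UDF: upper-cases a string column value.
    public final class UpperUdf extends UDF {
        public Text evaluate(Text input) {
            if (input == null) return null;
            return new Text(input.toString().toUpperCase());
        }
    }

In Hive, a function like this is packaged in a JAR, registered with ADD JAR and CREATE TEMPORARY FUNCTION, and then used directly in queries.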

Configured & deployed and maintained multi-node Dev and Clusters.

Developed multiple Kafka Producers and Consumers from scratch as per the business requirements.

Responsible for creating, modifying, and deleting topics (Kafka queues) as and when required by the business team.

Developed test cases to benchmark and verify data flow through the Kafka clusters.
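
A minimal illustrative sketch of a Kafka producer of the kind described above; the broker address, topic name, key, and payload are hypothetical placeholders:

    import java.util.Properties;
    import org.apache.kafka.clients.producer.KafkaProducer;
    import org.apache.kafka.clients.producer.ProducerRecord;

    public class ClaimsEventProducer {
        public static void main(String[] args) {
            Properties props = new Properties();
            props.put("bootstrap.servers", "localhost:9092"); // hypothetical broker address
            props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
            props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");
            props.put("acks", "all"); // wait for the full commit for durability

            try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
                // "claims-events" is a hypothetical topic name.
                producer.send(new ProducerRecord<>("claims-events", "claim-123", "{\"amount\": 250.0}"));
                producer.flush();
            }
        }
    }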

Strong experience in Core Java; used concurrent collections and threads to implement high-performance, complex logic (see the sketch below).
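
A minimal illustrative sketch of this concurrency style, combining a fixed thread pool with a ConcurrentHashMap; the records being counted are hypothetical:

    import java.util.Arrays;
    import java.util.concurrent.ConcurrentHashMap;
    import java.util.concurrent.ExecutorService;
    import java.util.concurrent.Executors;
    import java.util.concurrent.TimeUnit;

    public class ParallelCounter {
        public static void main(String[] args) throws InterruptedException {
            ConcurrentHashMap<String, Integer> counts = new ConcurrentHashMap<>();
            ExecutorService pool = Executors.newFixedThreadPool(4);

            for (String record : Arrays.asList("a", "b", "a", "c", "b", "a")) {
                // merge() is atomic on ConcurrentHashMap, so no external locking is needed.
                pool.submit(() -> counts.merge(record, 1, Integer::sum));
            }

            pool.shutdown();
            pool.awaitTermination(1, TimeUnit.MINUTES);
            System.out.println(counts); // e.g. {a=3, b=2, c=1}
        }
    }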

Very strong experience with object-oriented design principles, used to achieve highly cohesive and loosely coupled architectures.

Excellent hands-on experience applying Core Java design patterns to the appropriate problems.

Good experience in XML parsing using the SAX parser.

Very good knowledge of the collections framework for organizing structured data.

Good experience with REST web services for pulling data from different servers through cloud-based APIs, as sketched below.
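
A minimal illustrative sketch of pulling data from a REST endpoint with plain Core Java; the URL is a hypothetical placeholder:

    import java.io.BufferedReader;
    import java.io.InputStreamReader;
    import java.net.HttpURLConnection;
    import java.net.URL;

    public class RestClient {
        public static void main(String[] args) throws Exception {
            // Hypothetical cloud API endpoint.
            URL url = new URL("https://api.example.com/v1/records");
            HttpURLConnection conn = (HttpURLConnection) url.openConnection();
            conn.setRequestMethod("GET");
            conn.setRequestProperty("Accept", "application/json");

            // Read the response body; a non-2xx status would throw here in this simple sketch.
            StringBuilder body = new StringBuilder();
            try (BufferedReader in = new BufferedReader(new InputStreamReader(conn.getInputStream()))) {
                String line;
                while ((line = in.readLine()) != null) body.append(line);
            }
            System.out.println("HTTP " + conn.getResponseCode() + ": " + body);
        }
    }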

Used Spring Core, AOP, and HibernateTemplate to gain the benefits of dependency injection and other features of the Spring Framework.

Demonstrated talent for identifying, scrutinizing, improving, and streamlining complex work processes through highly analytical thinking.

Versed in both agile and waterfall development techniques.

Flexible team player, able to thrive in environments that require effectively prioritizing and juggling multiple concurrent projects.

Proven relationship-builder with strong interpersonal skills.

Extensive exposure to all aspects of the Software Development Life Cycle (SDLC), i.e., requirements definition for customization, prototyping, coding (Java, Hadoop ecosystem), and testing.

Deployed Java tasks that periodically query a database and write the results to a dataset.

Java/J2EE software developer with experience in Core Java and web-based applications, and expertise in reviewing client requirements, prioritizing requirements, and creating project proposals (scope, estimation) and baseline project plans. Extensive experience with the design and development of J2EE-based applications involving technologies such as JavaServer Pages (JSP) and Java Database Connectivity (JDBC).

Passed datasets created by traditional job steps to Java programs, which convert the data to XML, and read and wrote datasets from Java (see the sketch below).
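
A minimal illustrative sketch of a periodic database-querying task of the kind described above, assuming JDBC and a ScheduledExecutorService; the connection string, credentials, query, and interval are hypothetical:

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.ResultSet;
    import java.sql.Statement;
    import java.util.concurrent.Executors;
    import java.util.concurrent.ScheduledExecutorService;
    import java.util.concurrent.TimeUnit;

    public class PeriodicDbQueryTask {
        public static void main(String[] args) {
            ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
            // Run the query every 15 minutes (hypothetical interval).
            scheduler.scheduleAtFixedRate(PeriodicDbQueryTask::queryOnce, 0, 15, TimeUnit.MINUTES);
        }

        private static void queryOnce() {
            // Hypothetical MySQL connection string and credentials.
            String url = "jdbc:mysql://localhost:3306/appdb";
            try (Connection conn = DriverManager.getConnection(url, "user", "password");
                 Statement stmt = conn.createStatement();
                 ResultSet rs = stmt.executeQuery("SELECT id, status FROM claims")) {
                while (rs.next()) {
                    // In the real task the rows fed a dataset; here we just print them.
                    System.out.println(rs.getLong("id") + " -> " + rs.getString("status"));
                }
            } catch (Exception e) {
                e.printStackTrace();
            }
        }
    }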

Technical Skills
Programming Languages / Core Java (Collections, Multithreading, Concurrency API, Strings, Memory Management, Serialization, Thread Executors, Design Principles and Design Patterns), Scala, Servlets and JSP
Web Services/Servers / REST Services, WebLogic and Tomcat
Frameworks / Spring Core, Spring IoC, AOP and HibernateTemplate
Architecture / Object-Oriented Design Principles, Object-Oriented Design Patterns, UML Notations, UMLet, IBM Exceed, HLD and LLD
Big Data/Hadoop Ecosystem / Hadoop, HDFS, MapReduce, Pig, Hive, HBase, Sqoop, ZooKeeper, Spark, Oozie, Azkaban
Scripting Languages / Shell Scripting
Databases / SQL/MySQL, HBase and MongoDB, Hibernate (ORM)
Build Tools / Ant, Maven, Jenkins
Development Tools / Eclipse, Notepad++, NetBeans
Code Repository Tools / GitHub, SVN and ClearCase, Confluence (for project info)
Data Formats / XML, JSON, Avro and Parquet
Methodologies / Agile Scrum (Grooming, Sprint Planning, Daily Stand-up, Review, Demo), Waterfall
Domains / Telecommunications, Banking and Finance, and Healthcare
Certifications
Stream / Program / Certified By
Technical / Cloudera Certified Developer for Apache Hadoop (License: 100-013-905) /
Technical / Sun Certified Java Programmer /
Technical / Java Programmer /
Telecom Domain / 1) Telecom Fundamentals, 2) WCDMA Technology, 3) OSS-RC /
Behavioral / 1) Customer Leadership Interface Program, 2) Negotiation to Win, 3) The Emerging Leader, 4) Emotional Intelligence /
Projects

Project: Fraud Prevention and Detection Expert Analytics

[Nov 2014 – Present]

Client: Anthem, Norfolk, VA

Team Size: 25

Role: Hadoop Solution Architect/Java Developer

Environment: Spark, MapReduce, Pig Scripts, Azkaban, HBase, Java 7, REST Services

Description: Developing predictive models to identify fraudsters by making use of real-time and historical data on medical claims, weather, wages, voice recordings, demographics, attorney costs, and call center notes. Hadoop's capability to store large unstructured data sets in NoSQL databases, combined with MapReduce analysis of that data, helps in the analysis and detection of patterns in the field of fraud detection.

Responsibilities:

  • Involved in writing Spark shell commands to process real-time raw data.
  • Involved in writing MapReduce jobs to apply proprietary algorithms and process data.
  • Involved in configuring the Hadoop cluster environment.
  • Created, configured, and executed Azkaban workflows.
  • Created Azkaban projects and executed them with workflows.
  • Implemented an Ajax API to invoke Azkaban workflows.
  • Created, configured, and executed Jenkins jobs for Azkaban projects.
  • Created and ran Pig scripts for identifying malicious activity.
  • Analyzed large data sets by running Hive queries and Pig scripts to identify various card theft rates.
  • Involved in creating Hive tables, and loading and analyzing data using Hive queries.
  • Involved in loading data from the Linux file system to HDFS.
  • Loaded and transformed large sets of structured data.
  • Involved in running Hadoop jobs to process millions of records of network data.
  • Analyzed the data by performing Hive queries and running Pig scripts to understand claims behavior.
  • Loaded data into HDFS and extracted data from MySQL into HDFS using Sqoop.
  • Loaded processed data into HBase for further actions.
  • Involved in preparing high-level and low-level design documents for the RESTful web services used in this application.
  • SME on big data technologies (HDFS, MapReduce, Hive, Oozie, Spark, Sqoop, HBase) and platform architecture.
  • Evaluated client needs and translated business requirements into functional specifications, thereby onboarding clients onto the Hadoop ecosystem.
  • Worked on designing the MapReduce flow, writing MapReduce scripts, performance tuning, and debugging.
  • Worked on implementing Spark to ingest data in real time and apply transformations in Scala.
  • Involved in creating Hive tables, loading the data, and writing Hive queries that run internally as MapReduce jobs.
  • Figured out data lineage in Hadoop to track down where data is ingested from; sound knowledge of various tools for determining that lineage.
  • Imported data using Sqoop to load data from Oracle to HDFS on a regular basis.
  • Created HBase tables to store variable data formats coming from different portfolios (see the sketch after this list).
  • Implemented custom HBase coprocessors (observers) to implement data notifications.
  • Used the HBase Thrift API to implement real-time analysis on the HDFS system.
  • Developed join data set scripts using Hive join operations.
  • Developed join data set scripts using Pig Latin join operations.
  • Designed and implemented a MapReduce-based large-scale parallel relation-learning system.
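
As referenced in the list above, a minimal illustrative sketch of creating an HBase table and writing a row with the HBase Java client; the table name, column family, row key, and values are hypothetical, and the sketch uses the current TableDescriptorBuilder-style API rather than the older HTableDescriptor API in use at the time:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.TableName;
    import org.apache.hadoop.hbase.client.Admin;
    import org.apache.hadoop.hbase.client.ColumnFamilyDescriptorBuilder;
    import org.apache.hadoop.hbase.client.Connection;
    import org.apache.hadoop.hbase.client.ConnectionFactory;
    import org.apache.hadoop.hbase.client.Put;
    import org.apache.hadoop.hbase.client.Table;
    import org.apache.hadoop.hbase.client.TableDescriptorBuilder;
    import org.apache.hadoop.hbase.util.Bytes;

    public class ClaimsHBaseWriter {
        public static void main(String[] args) throws Exception {
            Configuration conf = HBaseConfiguration.create();
            try (Connection connection = ConnectionFactory.createConnection(conf);
                 Admin admin = connection.getAdmin()) {
                TableName name = TableName.valueOf("claims"); // hypothetical table name
                if (!admin.tableExists(name)) {
                    // Create the table with a single column family "cf".
                    admin.createTable(TableDescriptorBuilder.newBuilder(name)
                            .setColumnFamily(ColumnFamilyDescriptorBuilder.of("cf"))
                            .build());
                }
                try (Table table = connection.getTable(name)) {
                    Put put = new Put(Bytes.toBytes("claim-123")); // hypothetical row key
                    put.addColumn(Bytes.toBytes("cf"), Bytes.toBytes("amount"), Bytes.toBytes("250.0"));
                    table.put(put);
                }
            }
        }
    }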

Project: OSS-RC Analytics [Sep 2012 – Nov 2014]

Client: Ericsson R&D,Athlone, Ireland.

Team Size: 8

Role: Hadoop Java Developer / Big Data Engineer

Environment: JDK 1.7, Solaris OS, UNIX Shell Scripts, XML, Eclipse, ClearCase, HDFS, MapReduce, Pig, Flume, MongoDB, Spark Core

Description: Ericsson Expert Analytics is a telecom analytics solution that measures all perceived customer experiences based on metrics and events from network nodes, probes, devices, OSS/BSS and other sources. Key insights used to deliver tailored offers and customer responses include:

  1. Service level index: Predicts customer satisfaction – based on objective quality and subjective weightings by customer segment – over time, allowing for better targeting of retention or upsell actions.
  2. Subscriber profile: Incorporates usage trends, location patterns, and customer value indicators for a complete understanding of the subscriber and the best actions to take.
  3. End-to-end session record: Correlates experience impacts with granular network and device events, and interprets these events to determine the “most probable cause” and therefore the “best next action” to improve experience or target offers.
  4. Device analytics: Provides in-depth insights about which devices drive profitability, usage, and superior experiences.
  5. OTT application analytics: Offers insights about app usage that inform investment and marketing decisions.

Technical Responsibilities:

  • Involved in requirements gathering, understanding, and grooming with the team.
  • Involved in preparing high-level and low-level design documents for the data connect application.
  • Designed and recommended object-oriented design principles for application use cases.
  • Applied Core Java design patterns wherever required for the user stories.
  • Implemented user stories in Core Java using multithreading and concurrent collections.
  • Created immutable classes as part of user stories whose objects need to traverse the network.
  • Designed and implemented a REST API for data connect users.
  • Imported and exported data into HDFS and Hive using Sqoop.
  • Involved in developing Pig scripts for change data capture and delta record processing between newly arrived data and data already existing in HDFS.
  • Involved in implementing POCs using Spark shell commands to process huge data volumes and compare processing times.
  • Implemented the data ingestion process using Flume sources, sinks, and interceptors.
  • Validated the performance of Hive queries on Spark against running them traditionally on Hadoop.
  • Involved in testing and coordination with the business in user testing.
  • Wrote Spark shell commands to parse the network data and structure it in tabular format to facilitate effective querying of the entities (see the sketch after this list).
  • Involved in creating Hive tables, loading data, and writing queries that run internally as MapReduce jobs.
  • Used Pig to do transformations, event joins, filters, and some pre-aggregations.
  • Involved in processing ingested raw data using MapReduce, Apache Pig, and HBase.
  • Involved in scheduling the Azkaban workflow engine to run multiple workflows and Pig jobs.
  • Used Hive to analyze the partitioned and bucketed data to compute various metrics for reporting.
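
As referenced in the list above, a minimal illustrative sketch of this kind of Spark parsing work using the Java RDD API; the HDFS paths and the comma-separated record layout are hypothetical assumptions:

    import org.apache.spark.SparkConf;
    import org.apache.spark.api.java.JavaPairRDD;
    import org.apache.spark.api.java.JavaRDD;
    import org.apache.spark.api.java.JavaSparkContext;
    import scala.Tuple2;

    public class NetworkEventCounter {
        public static void main(String[] args) {
            SparkConf conf = new SparkConf().setAppName("network-event-counter");
            JavaSparkContext sc = new JavaSparkContext(conf);

            // Hypothetical HDFS path to the raw network event feed.
            JavaRDD<String> lines = sc.textFile("hdfs:///data/raw/network_events");

            // Parse comma-separated lines and count events per network node.
            JavaPairRDD<String, Integer> perNode = lines
                    .map(line -> line.split(","))
                    .filter(fields -> fields.length >= 2)            // drop malformed rows
                    .mapToPair(fields -> new Tuple2<>(fields[0], 1)) // key on node id
                    .reduceByKey(Integer::sum);

            perNode.saveAsTextFile("hdfs:///data/parsed/network_event_counts");
            sc.stop();
        }
    }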

Project: Asset Management [Aug 2011 – Sep 2012]

Client: J.P. Morgan

Team Size: 45

Role: Technical Lead

Environment: JDK 1.6, Windows, Eclipse, ClearCase, Ext JS, RESTful Web Services, Spring Core Module, Pig, Hadoop.

Description: This is a portfolio manager application that helps track customers' investments.

Project: OSS Navigator [April 2010 – Aug 2011]

Client: Ericsson, Athlone, Ireland

Team Size: 15

Role: Java Developer

Environment: JDK 1.5, Solaris OS, UNIX Shell Scripts, Eclipse, ClearCase, Spring, Hibernate

Description: Navigator provides an integrated view of detailed network topology and alarm information. The application serves the operator's process of network monitoring.

The following functions are available:

• A graphical user interface that provides a comprehensive model of the network topology

• Presentation of alarm status in a topology tree, alarm list view, and a hyperbolic view

• Advanced alarm list filtering functionality

• Alarm synchronization between a 3GPP Alarm IRP agent and the Network Surveillance Integrator

• The possibility to launch other OSS-RC applications supporting the network problem resolution process

• The possibility to view and store KPI values from network elements

Project: Bell Mobility [Sep 2009 – Feb 2010]

Client: Bell

Team Size: 10

Role: Java Developer

Environment: JDK 1.5, Solaris OS, UNIX Shell Scripts, Eclipse, CVS, HEAT

Description: Iris is a retail web application used by Bell retail dealers and their sales reps in the store to sell, return, and replace Bell products such as mobiles, TV, Internet, and home phones. It also provides an interface to warranty and repair tracking inventory systems.

Project: Configuration Management Support (CMS) [Jan 2006 – Jul 2009]

Client: Ericsson, Ireland.

Team Size: 4

Role: Java Developer

Environment: JDK 1.5, Solaris OS, UNIX Shell Scripts, XML, Eclipse, ClearCase

Description: Configuration Management Support (CMS) provides configuration support at installation and post-installation of network elements, i.e., installation of NEs after the original installation and after upgrades. At run time, CMS keeps track of the connection status and synchronization status of the network elements and supports keeping the sub-network-specific data consistent, i.e., adjacent cell and area configuration.

Project: LSV+ [Jan 2005 – Nov 2005]

Client: Credit Suisse

Team Size: 8

Role: Java Developer

Environment: JDK 1.4, JSP, Servlets, Struts, Tomcat, Eclipse.

Description: LSV is a direct debiting application developed for managing the bulk debits of the creditor. The debit requests come in the form of a file; the files are submitted using Direct Net and Direct Link. The files are processed in the file gate, and the generated debit requests are submitted to OTP or the host for processing. CORBA services give access to the host.

Academic Profile

Bachelor of Engineering in Electrical and Electronics Engineering, with distinction

Somesh Eskala
