


#FLUME PRO 2.6 UPDATE#

#FLUME PRO 2.6 MAC#
Operating Systems: Linux (CentOS, Ubuntu), Mac OS, Windows
Methodologies: Agile, Waterfall
Service Programming: Zookeeper 3.3.6
Tools: Eclipse, Git, Maven, Tableau
Languages: Java, Python, Scala, UNIX Shell Scripting, SQL, C, C++
Scheduling: Oozie 4.0.x, Falcon
SQL on Hadoop: Hive 0.12, Cloudera Impala 2.0.x
Data Ingestion / ETL Tools: Flume 1.3.x, Sqoop 1.4.4, Storm 0.9, Kafka 0.8
Relational Databases: Oracle 11g/10g/9i, MySQL 5.0, SQL Server
Distribution Based on Hadoop: Cloudera Distribution (CDH4, CM)
Distributed File System: HDFS 2.6.0
Distributed Programming: MapReduce 2.6.x, Pig 0.12, Spark 1.3
Hadoop Library: Mahout, MLlib
NoSQL Databases: HBase 0.98, MongoDB, Cassandra

Works successfully in fast-paced environments, both independently and in collaborative teams.
Knowledge of social networks and graph theory.
Fluent in data mining and machine learning techniques such as classification, clustering, regression, and anomaly detection.
Experienced in Agile and Waterfall methodologies.
Presented data visually using Tableau.
Used Maven as the source build framework.
In-depth understanding of scalable machine learning libraries such as Apache Mahout and MLlib.
Consolidated MapReduce jobs by implementing them in Spark, decreasing data processing time.
Scheduled workflows using the Oozie workflow engine.
Extracted data from log files and pushed it into HDFS using Flume.
Implemented Sqoop jobs to migrate large sets of structured and semi-structured data between HDFS and other data stores such as Hive or an RDBMS.
Used NoSQL databases including HBase, MongoDB, and Cassandra.
Experience integrating various data sources in RDBMSs such as Oracle and SQL Server.
Developed real-time read/write access to very large datasets via HBase.
Wrote ad hoc queries for analyzing data using HiveQL.
Extended Pig and Hive core functionality by writing custom UDFs.
Experience in analyzing data using HiveQL, HBase, and custom MapReduce programs in Java.
Hands-on experience in installing, configuring, and using Hadoop ecosystem components such as Hadoop 2.x, MapReduce 2.x, HDFS, HBase, Oozie, Hive, Kafka, Zookeeper, Spark, Storm, Sqoop, and Flume.
Excellent understanding of Hadoop architecture and its components, such as HDFS, YARN, High Availability, and the MapReduce programming paradigm.
Worked in various domains, including luxury and telecommunications.

Over 5 years of work experience, including 3+ years in Hadoop development and 2+ years as a Data Analyst.
