I have extensive experience working on the Hadoop ecosystem: Hive, Spark, Sqoop, HBase, Redshift, Oozie, Storm, Impala, Kylin, etc. Also MongoDB, Cassandra, Spark SQL, Spark MLlib, and Spark Streaming.
Q1. Please provide one (small/medium) use case of your ETL work in detail.
1. Web Scrapers -> Kafka -> Elasticsearch -> Kibana [~10^7 log events, ~1 PB of data per day]
2. Twitter -> Python Producers -> PySpark on EMR -> Elasticsearch + Redshift -> Tableau [1 GB per day]
3. Radio API -> Kinesis -> Hive on MapReduce -> MySQL [2 GB per day]
4. Web API + Mobile API -> Kafka -> Python Consumers -> Teradata -> Power BI [4 GB per day] (a minimal sketch of the Kafka leg follows below)
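As a rough illustration of the Kafka leg in use case #4, here is a minimal producer/consumer pair using kafka-python. The broker address, topic name, and payload fields are placeholders rather than the production values.

import json
from kafka import KafkaProducer, KafkaConsumer

# Producer side: push API events onto a Kafka topic.
producer = KafkaProducer(
    bootstrap_servers="localhost:9092",  # placeholder broker
    value_serializer=lambda v: json.dumps(v).encode("utf-8"),
)
producer.send("api-events", {"user_id": 42, "action": "click"})
producer.flush()

# Consumer side: read events back, ready to be staged for Teradata.
consumer = KafkaConsumer(
    "api-events",
    bootstrap_servers="localhost:9092",
    value_deserializer=lambda b: json.loads(b.decode("utf-8")),
    auto_offset_reset="earliest",
)
for message in consumer:
    print(message.value)  # a dict like {"user_id": 42, "action": "click"}
    break  # demo only: stop after the first event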
Q2. Suppose there are some X customers who have made purchases. Can you write a SQL query to find the top 5 customers who made the most purchases?
SELECT TOP 5 custid, COUNT(DISTINCT orderid) AS Purchases
FROM orders
GROUP BY custid
ORDER BY Purchases DESC;
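Note that TOP is SQL Server syntax; on MySQL or PostgreSQL you would drop TOP 5 and append LIMIT 5 instead. For completeness, the same computation sketched in pandas, with toy data whose custid/orderid columns mirror the schema assumed above:

import pandas as pd

# Toy orders table; the duplicate orderid 13 shows why DISTINCT matters.
orders = pd.DataFrame({
    "custid":  [1, 1, 2, 2, 2, 3, 4, 5, 6, 6],
    "orderid": [10, 11, 12, 13, 13, 14, 15, 16, 17, 18],
})

# Distinct orders per customer, then keep the five largest counts.
top5 = (
    orders.groupby("custid")["orderid"]
          .nunique()
          .nlargest(5)
          .rename("Purchases")
)
print(top5)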
Q3. What did you use MapReduce for? What did you use Pig for? What did you use Hive for? Where does the data get transformed? / While performing transformations, where is the data?
MapReduce is a framework; I used it in use case #3 above.
I have not worked with Pig, but I understand how it differs from Hive.
I have used Hive in use case #3 above.
In MapReduce, the data is read (mapped) from storage (usually HDFS or an S3 bucket) and reduced (transformed, aggregated, etc.), with intermediate results written to disk between the map and reduce phases. If we run the same job in Spark, those intermediate steps happen in memory.
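As a minimal PySpark sketch of that in-memory flow, roughly mirroring use case #3 (the S3 paths and column names are hypothetical):

from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("in-memory-agg").getOrCreate()

# "Map" step: read raw play records from storage (path is a placeholder).
plays = spark.read.json("s3a://example-bucket/radio-plays/")

# "Reduce" step: aggregate in memory instead of spilling between phases.
daily_counts = (
    plays.groupBy("station_id", "play_date")
         .agg(F.count("*").alias("plays"))
)

daily_counts.write.mode("overwrite").parquet("s3a://example-bucket/daily-plays/")
spark.stop()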