Data Engineer Interview Questions

21,096 data engineer interview questions shared by candidates

Suppose I have records like this:
("a-b", "data1", 1)
("a-c", "data2", 1)
("a-b", "data3", 1)
How can I group and sum them so that I get the following results when the input is a DataStream?
("a-b", ["data1", "data3"], 2)
("a-c", ["data2"], 1)

Data Engineer

Interviewed at Booking.com

4
Mar 3, 2019

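One possible way to reason about this question, as a minimal sketch rather than a definitive answer: key the stream by the first field, then fold each record into per-key state that holds the collected values and a running sum. The question does not name the framework; "DataStream" suggests Apache Flink, where this roughly corresponds to a keyBy on the first field followed by a reduce or aggregate. The pure-Python function below only illustrates the folding logic on a finite list, and the name group_and_sum is hypothetical.

```python
def group_and_sum(records):
    """Fold (key, value, count) records into (key, [values], total) per key."""
    # key -> (collected values, running sum); stands in for per-key stream state
    state = {}
    for key, value, count in records:
        values, total = state.get(key, ([], 0))
        # Build a new list instead of mutating, so each update is a pure fold step
        state[key] = (values + [value], total + count)
    # dicts preserve insertion order, so keys come out in first-seen order
    return [(key, values, total) for key, (values, total) in state.items()]


records = [("a-b", "data1", 1), ("a-c", "data2", 1), ("a-b", "data3", 1)]
result = group_and_sum(records)
print(result)  # [('a-b', ['data1', 'data3'], 2), ('a-c', ['data2'], 1)]
```

On an unbounded stream there is no "final" result per key unless the input is windowed, so a keyed reduce would typically emit an updated tuple for every incoming record instead of one row per key.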

First Round
Q1 Delete duplicates from a table.
Q2 Find duplicate rows in a table.
Q3 Repartition vs. coalesce.
Q4 How are stages created in Spark?
Q5 Write a word count program with Spark.
Q6 Generate the Fibonacci sequence using Python.
Q7 What are generators?
Q8 Spark architecture.
Q9 Questions related to the project.

Techno-managerial Round
Q1 Discussion about the project and experience.
Q2 Write a query to create a table and partition it in Hive.
Q3 What is the directory structure for a partitioned table?
Q4 If you add a new directory with the correct schema to HDFS, will the data be shown in HQL?
Q5 Find the max temperature for a given date range.

Big Data Engineer

Interviewed at Impetus Technologies

3.7
May 23, 2022

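The Fibonacci and generator questions from the first round (Q6 and Q7) can be illustrated together. The following is a minimal sketch of one possible answer, not the answer the interviewers expected: a generator function produces values lazily with yield, keeping its local state between calls, so an infinite sequence can be consumed one item at a time. The names fibonacci and first_ten are illustrative.

```python
from itertools import islice


def fibonacci():
    """Yield the Fibonacci sequence 0, 1, 1, 2, 3, ... lazily, without end."""
    a, b = 0, 1
    while True:
        # Execution pauses here and resumes on the next request for a value
        yield a
        a, b = b, a + b


# islice takes a bounded prefix, so the infinite generator is never exhausted
first_ten = list(islice(fibonacci(), 10))
print(first_ten)  # [0, 1, 1, 2, 3, 5, 8, 13, 21, 34]
```

Because values are computed on demand, memory use stays constant no matter how many terms are consumed, which is the usual argument for generators over building a full list.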


Glassdoor has 21,096 interview questions and reports from data engineer interviews.