Hadoop General Interview Questions

1. How do you achieve row-level security in Hadoop (Hive, Impala, etc.)?
2. Dynamic partitioning in Hive
3. Bulk loading data into a Hive partition
4. SerDe in Hive
5. How many file types are there in Hive? (input format, output format)
6. How do you register a UDF in Hive?
[http://blog.matthewrathbone.com/2013/08/10/guide-to-writing-hive-udfs.html]
7. Bucketing and partitioning logic in Hive
8. ORC format in Hive
9. Why is Pig used?
10. What versions of Pig and Hive have you used?
11. What are a tuple and a bag in Pig?
12. How can you achieve GROUP BY in Pig?
13. Simulate SQL in Pig
14. How do you load data from a Hive partition using Oozie as and when data arrives?
15. How many mappers are initiated when a Hive query runs?
16. What do you do if a cluster is down in a Hadoop environment?
17. Difference between Kerberos and Sentry
18. How does Impala run?
19. What is a data lake?
20. What is the input format of a Hive UDF?
21. How do you process JSON using a SerDe?
22. How can you update a Hive table?
23. How do you create or simulate a view on a Hive table?
24. How can Pig process unstructured data?
25. What are the source systems for your project? (What types of files have you processed?)
26. How do you update a particular column of a particular row?
27. In which version of Hive was the update feature introduced?
28. How do you load JSON and XML data into Hive?
29. How do you implement SCD (slowly changing dimensions) in Hive? (versioning of data)
30. How does Hive run a MapReduce job? (partitioning/clustering)
31. IF .. EXISTS in Hive
32. Is it possible to create a Cartesian join (cross join) in Hive?
33. Can we load data as a view in Hive?
34. What is bucketing?
35. Common errors in Hive
36. How do you create a new user and grant it access in the Hadoop ecosystem?
37. SerDe input format and output format
38. How do you improve performance in Hive?

39. Asynchronous datasets using Oozie
40. HiveServer2 data load
41. Hive tuning
42. Bucketing, partitioning
43. Sqoop connector data load and fetch
44. UDF for extracting DOB in any format from free-form text separated with a | delimiter
45. Load data from unbounded XML in Hive using a SerDe
46. Hive query and data load into a partition
47. MapReduce programming in depth
48. Churn model in R
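
Sketch for question 44 (DOB extraction from |-delimited free-form text). This is only an illustration of the parsing logic in Python, not a Hive UDF itself. The function name extract_dob, the list of accepted date formats, and the day-first reading of numeric dates such as 12/03/1985 are all assumptions; a real answer would agree on the expected formats first.

```python
import re
from datetime import datetime
from typing import Optional

# Assumed set of accepted layouts (day-first for numeric dates); extend as needed.
_DATE_FORMATS = (
    "%Y-%m-%d",
    "%d/%m/%Y",
    "%d-%m-%Y",
    "%d %b %Y",
    "%d %B %Y",
    "%B %d, %Y",
)

# Loose pattern that picks out date-like substrings before strict parsing.
_DATE_CANDIDATE = re.compile(
    r"\d{4}-\d{1,2}-\d{1,2}"
    r"|\d{1,2}[/-]\d{1,2}[/-]\d{4}"
    r"|\d{1,2} [A-Za-z]{3,9} \d{4}"
    r"|[A-Za-z]{3,9} \d{1,2}, \d{4}"
)


def extract_dob(record: str) -> Optional[str]:
    """Return the first parseable date in a '|'-delimited record as YYYY-MM-DD, else None."""
    for field in record.split("|"):
        for match in _DATE_CANDIDATE.finditer(field):
            text = match.group(0)
            # Try each assumed format until one parses the candidate.
            for fmt in _DATE_FORMATS:
                try:
                    return datetime.strptime(text, fmt).date().isoformat()
                except ValueError:
                    continue
    return None
```

Example call: extract_dob("John Smith|born 12/03/1985 in Pune|engineer") normalizes the date to ISO form under the day-first assumption. Records with no recognizable date yield None, which would map naturally to NULL on the Hive side.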