Top 45 Apache Hive Interview Questions - Jul 25, 2022

Q1. How Do You Check If A Particular Partition Exists?

This can be checked with the following query:

SHOW PARTITIONS table_name PARTITION(partitioned_column='partition_value')

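Q2. How Can Hive Avoid MapReduce?

Setting the property hive.exec.mode.local.auto to true lets Hive run small queries in local mode instead of submitting a MapReduce job to the cluster. Simple fetch-style queries can also skip MapReduce altogether, which is governed by hive.fetch.task.conversion.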

Q3. What Are The Default Record And Field Delimiter Used For Hive Text Files?

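The default record delimiter is \n (newline).

The default field delimiters are \001, \002 and \003 (Ctrl-A, Ctrl-B and Ctrl-C), used for fields, collection items and map keys respectively.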

Q4. How Can You Stop A Partition From Being Queried?

By using the ENABLE OFFLINE clause with the ALTER TABLE statement.

Q5. Why Do We Need Hive?

Hive is a tool in the Hadoop ecosystem that provides an interface to organize and query data in a database-like fashion using SQL-like queries. It is suitable for accessing and analyzing data in Hadoop using SQL syntax.

Q6. What Is Bucketing?

The values in a column are hashed into a number of buckets, which is defined by the user. It is a way to avoid too many partitions or nested partitions while still keeping queries efficient.
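
As a rough illustration, a bucketed table might be declared as below. The table name page_views, its columns and the bucket count of 32 are hypothetical and only serve the example.

```sql
-- Hypothetical table: each row is assigned to one of 32 buckets
-- based on a hash of user_id.
CREATE TABLE page_views (
  user_id BIGINT,
  url     STRING,
  view_ts TIMESTAMP
)
CLUSTERED BY (user_id) INTO 32 BUCKETS;
```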

Q7. Which Java Class Handles The Output Record Encoding Into Files Which Result From Hive Queries?

org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat

Q8. What Is The Usefulness Of The DISTRIBUTE BY Clause In Hive?

It controls how the map output is distributed among the reducers. It is useful in the case of streaming data.
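
A minimal sketch of the clause in use, assuming a hypothetical page_views table with user_id, url and view_ts columns:

```sql
-- All rows with the same user_id are sent to the same reducer,
-- and SORT BY orders the rows within each reducer.
SELECT user_id, url, view_ts
FROM page_views
DISTRIBUTE BY user_id
SORT BY user_id, view_ts;
```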

Q9. What Is The Significance Of The Line set hive.mapred.mode = strict;

It sets MapReduce jobs to strict mode, in which queries on partitioned tables cannot run without a WHERE clause. This prevents very large jobs from running for a long time.
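
The effect can be sketched as follows, assuming a hypothetical table logs partitioned by a dt column and a Hive version that honors this property:

```sql
SET hive.mapred.mode = strict;

-- Rejected in strict mode: no filter on the partition column.
-- SELECT * FROM logs;

-- Accepted: the WHERE clause restricts which partitions are scanned.
SELECT * FROM logs WHERE dt = '2022-07-25';
```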

Q10. Is There A Date Data Type In Hive?

Yes. The TIMESTAMP data type stores dates in java.sql.Timestamp format.

Q11. Can We Run Unix Shell Commands From Hive? Give Example?

Yes, using the ! mark just before the command.

For example, !pwd at the hive prompt will print the current directory.

Q12. What Is The Need For Custom SerDe?

Depending on the nature of the data the user has, the built-in SerDe may not match the format of the data. Users then need to write their own Java code to meet their data format requirements.

Q13. Is It Possible To Create Cartesian Join Between 2 Tables, Using Hive?

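Yes. Hive supports CROSS JOIN, which produces a Cartesian product, but such a join is expensive and strict mode (hive.mapred.mode = strict) rejects Cartesian products.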

Q14. As Part Of Optimizing The Queries In Hive, What Should Be The Order Of Table Size In A Join Query?

In a join query the smallest table should be placed in the first position and the largest table in the last position.

Q15. While Loading Data Into A Hive Table Using The LOAD DATA Clause, How Do You Specify It Is An HDFS File And Not A Local File?

By omitting the LOCAL clause in the LOAD DATA statement.
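
The two forms can be compared side by side. The paths and the table name employees below are hypothetical.

```sql
-- Without LOCAL, the path refers to HDFS and the file is moved into the table.
LOAD DATA INPATH '/user/etl/staging/employees.txt' INTO TABLE employees;

-- With LOCAL, the path refers to the local file system and the file is copied.
LOAD DATA LOCAL INPATH '/tmp/employees.txt' INTO TABLE employees;
```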

Q16. What Does The Following Query Do? INSERT OVERWRITE TABLE employees PARTITION (country, state) SELECT ..., se.cnty, se.st FROM staged_employees se;

It creates partitions on the table employees, with partition values coming from the columns in the SELECT clause. It is called a dynamic partition insert.
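
A sketch of a complete dynamic partition insert is shown below. The columns name and salary are hypothetical stand-ins for the elided select list, and the two SET statements are the settings that typically have to be in place first.

```sql
-- Enable dynamic partitioning.
SET hive.exec.dynamic.partition = true;
-- nonstrict mode allows every partition column to be dynamic.
SET hive.exec.dynamic.partition.mode = nonstrict;

-- The last two selected columns supply the partition values, in order.
INSERT OVERWRITE TABLE employees
PARTITION (country, state)
SELECT se.name, se.salary, se.cnty, se.st
FROM staged_employees se;
```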

Q17. What Does /*+ STREAMTABLE(table_name) */ Do?

It is a query hint that tells Hive which table in a join to stream through the reducers while the other tables are buffered in memory. It is a query optimization technique.

Q18. Can The Name Of A View Be Same As The Name Of A Hive Table?

No. The name of a view must be unique compared to all other tables and views present in the same database.

Q19. What Is The Difference Between Like And Rlike Operators In Hive?

The LIKE operator behaves the same way as the regular SQL operator used in select queries.

Example: street_name LIKE '%Chi'

The RLIKE operator uses the more advanced regular expressions that are available in Java.

Example: street_name RLIKE '.*(Chi|Oho).*', which will select any value that has either Chi or Oho in it.
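
Put into full queries, the two operators might look like this, assuming a hypothetical table addresses with a street_name column:

```sql
-- LIKE: SQL wildcard matching; '%Chi' matches values ending in 'Chi'.
SELECT street_name FROM addresses WHERE street_name LIKE '%Chi';

-- RLIKE: Java regular expression; matches values containing 'Chi' or 'Oho'.
SELECT street_name FROM addresses WHERE street_name RLIKE '.*(Chi|Oho).*';
```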

Q20. What Is The Importance Of The .hiverc File?

It is a file containing a list of commands that need to run when the Hive CLI starts, for example setting strict mode to true.

Q21. Give The Command To See The Indexes On A Table?

SHOW INDEX ON table_name

This will list all the indexes created on any of the columns in the table table_name.

Q22. How Can You Delete The DBPROPERTY In Hive?

There is no way to delete the DBPROPERTY.

Q23. Does The Archiving Of Hive Tables Give Any Space Saving In HDFS?

No. It only reduces the number of files, which makes them easier for the namenode to manage.

Q24. What Are The Three Different Modes In Which Hive Can Be Run?

Local mode

Distributed mode

Pseudo-distributed mode

Q25. What Are The Different Types Of Tables Available In Hive?

There are two types: managed tables and external tables. In a managed table both the data and the schema are under the control of Hive, but in an external table only the schema is under the control of Hive.
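
The difference shows up in the DDL. The table names, columns and LOCATION path below are hypothetical.

```sql
-- Managed table: Hive owns the data; DROP TABLE removes data and metadata.
CREATE TABLE sales_managed (id INT, amount DOUBLE);

-- External table: Hive owns only the schema; DROP TABLE leaves the files in place.
CREATE EXTERNAL TABLE sales_external (id INT, amount DOUBLE)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
LOCATION '/data/external/sales';
```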

Q26. Can A Partition Be Archived? What Are The Advantages And Disadvantages?

Yes. A partition can be archived. The advantage is that it decreases the number of files the namenode has to track, and the archived file can still be queried using Hive. The disadvantage is that queries become less efficient and there are no space savings.

Q27. Which Java Class Handles The Input Record Encoding Into Files Which Store The Tables In Hive?

org.apache.hadoop.mapred.TextInputFormat

Q28. What Types Of Costs Are Associated In Creating Index On Hive Tables?

Indexes occupy space, and there is a processing cost in arranging the values of the column on which the index is created.

Q29. What Is A Table Generating Function In Hive?

A table generating function is a function that takes a single column as an argument and expands it into multiple columns or rows. Example: explode()
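
A short sketch of explode() in a query, assuming a hypothetical table orders with an order_id column and an ARRAY<STRING> column named items:

```sql
-- explode() emits one row per array element; LATERAL VIEW joins
-- those rows back to the order they came from.
SELECT o.order_id, item
FROM orders o
LATERAL VIEW explode(o.items) exploded AS item;
```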

Q30. What Is A Generic UDF In Hive?

It is a UDF created using a Java program to serve a specific need not covered by the existing functions in Hive. It can detect the type of the input argument programmatically and provide an appropriate response.

Q31. Can A Table Be Renamed In Hive?

ALTER TABLE table_name RENAME TO new_name

Q32. What Is A Metastore In Hive?

It is a relational database storing the metadata of Hive tables, partitions, Hive databases and so on.

Q33. What Is A Hive Variable? What Do We Use It For?

A Hive variable is a variable created in the Hive environment that can be referenced by Hive scripts. It is used to pass values to Hive queries when the query starts executing.
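
A minimal sketch, assuming a hypothetical variable target_dt and a hypothetical table logs with a dt column:

```sql
-- Define a variable in the hivevar namespace.
SET hivevar:target_dt=2022-07-25;

-- Hive substitutes the variable before the query runs.
SELECT * FROM logs WHERE dt = '${hivevar:target_dt}';
```

The same variable can also be supplied from the command line with the --hivevar option when a script is launched.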

Q34. When You Point A Partition Of A Hive Table To A New Directory, What Happens To The Data?

The data stays in the old location. It has to be moved manually.

Q35. Can Hive Queries Be Executed From Script Files? How?

Using the source command.

Example: hive> source /path/to/file/file_with_query.hql

Q36. What Are Collection Data Types In Hive?

There are three collection data types in Hive.

ARRAY

MAP

STRUCT
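
The three types can be sketched in one table definition. The table employee_profiles and its columns are hypothetical.

```sql
CREATE TABLE employee_profiles (
  name         STRING,
  skills       ARRAY<STRING>,
  phone_by_use MAP<STRING, STRING>,
  address      STRUCT<street:STRING, city:STRING, zip:STRING>
);

-- Arrays are indexed by position, maps by key, structs by field name.
SELECT
  skills[0],
  phone_by_use['home'],
  address.city
FROM employee_profiles;
```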

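Q37. Write A Query To Insert A New Column (new_col INT) Into A Hive Table (htab) At A Position Before An Existing Column (x_col)

Hive has no BEFORE keyword; a column can only be placed FIRST or AFTER another column. Add the column, then move it after the column that currently precedes x_col (shown here as the placeholder prev_col):

ALTER TABLE htab ADD COLUMNS (new_col INT);

ALTER TABLE htab CHANGE COLUMN new_col new_col INT AFTER prev_col;

If x_col is the first column, use FIRST instead of AFTER prev_col.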

Q38. Can We Load Data Into A View?

No. A view cannot be the target of an INSERT or LOAD statement.

Q39. How Do You Specify The Table Creator Name When Creating A Table In Hive?

The TBLPROPERTIES clause is used to add the creator name while creating a table.

It is added like this: TBLPROPERTIES('creator'='Joan')

Q40. If You Omit The OVERWRITE Clause While Loading Data Into A Hive Table, What Happens To Files Which Are New And Files Which Already Exist?

The new incoming files are just added to the target directory and the existing files are simply overwritten. Other files whose names do not match any of the incoming files will continue to exist.

If you add the OVERWRITE clause then all the existing data in the directory will be deleted before new data is written.

Q41. Can We Change The Data Type Of A Column In A Hive Table?

Yes, using the REPLACE COLUMNS option:

ALTER TABLE table_name REPLACE COLUMNS ……

Q42. What Do You Mean By Schema On Read?

The schema is validated against the data when the data is read, and is not enforced when the data is written.

Q43. How Do You List All Databases Whose Name Starts With P?

SHOW DATABASES LIKE 'p.*'

Q44. Is Hive Suitable To Be Used For OLTP Systems? Why?

No. Hive does not provide insert and update at row level, so it is not suitable for OLTP systems.

Q45. What Is The Default Location Where Hive Stores Table Data?

hdfs://namenode_server/user/hive/warehouse



