Top 100+ Greenplum Database Interview Questions And Answers - May 30, 2020

Question 1. What Are Major Differences Between Oracle And Greenplum?

Answer :

Oracle is a general-purpose relational database. Greenplum is a massively parallel processing (MPP) database with a shared-nothing architecture. There are many other differences in functionality and behaviour.

Question 2. What Is Good And Bad About Greenplum, Compared To Oracle?

Answer :

Greenplum is built on top of PostgreSQL. It has a shared-nothing MPP architecture that is well suited to data warehousing environments and to large-scale data analytics.
Oracle is an all-purpose database.

Question 3. How To Find Errors / Fatal From Log Files?

Answer :

Grep for ERROR, FATAL, and SIGSEGV in the pg_log directory.
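The search can be wrapped in a small shell function. This is only a sketch: the function name scan_gp_logs is made up for illustration, and the directory you pass in is an assumption about where your logs live (typically the pg_log directory under the master data directory).

```shell
# List every log line that mentions ERROR, FATAL or SIGSEGV.
# scan_gp_logs is a hypothetical helper name, not a Greenplum utility.
scan_gp_logs() {
    log_dir=$1
    # -r: search files under the directory, -n: show line numbers,
    # -E: extended regular expression with alternation
    grep -rnE 'ERROR|FATAL|SIGSEGV' "$log_dir"
}

# Example (not run here):
#   scan_gp_logs "$MASTER_DATA_DIRECTORY/pg_log" | tail -20
```

Piping the result through tail or less keeps the output manageable on a busy system.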

Question 4. What Is Vacuum And When Should I Run This?

Answer :

VACUUM reclaims storage occupied by deleted tuples. In normal GPDB operation, tuples that are deleted or made obsolete by an update are not physically removed from their table. They remain present on disk until a VACUUM is done. Therefore, it is necessary to run VACUUM periodically, especially on frequently updated tables.

Question 5. What Is Difference Between Vacuum And Vacuum Full?

Answer :

Unless you need to return space to the OS so that other tables or other parts of the system can use that space, you should use VACUUM rather than VACUUM FULL.

VACUUM FULL is only needed when you have a table that is mostly dead rows, that is, the vast majority of its contents have been deleted. Even then, there is no point using VACUUM FULL unless you urgently need that disk space back for other things or you expect that the table will never again grow to its past size. Do not use it for table optimization or periodic maintenance, as it is counterproductive.

Question 6. What Is Analyze And How Frequently Should I Run It?

Answer :

ANALYZE collects statistics about the contents of tables in the database, and stores the results in the system table pg_statistic. Subsequently, the query planner uses these statistics to help determine the most efficient execution plans for queries.

It is a good idea to run ANALYZE periodically, or just after making major changes to the contents of a table. Accurate statistics help the query planner choose the most appropriate query plan, and thereby improve the speed of query processing. A common strategy is to run VACUUM and ANALYZE once a day during a low-usage time of day.
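A daily maintenance job along these lines could be sketched as follows. The function name vacuum_analyze_sql, the table names, and the database name mydb are all placeholders, not anything Greenplum ships.

```shell
# Emit one VACUUM ANALYZE statement per table name given as an argument.
# vacuum_analyze_sql is a hypothetical helper used only for illustration.
vacuum_analyze_sql() {
    for tbl in "$@"; do
        printf 'VACUUM ANALYZE %s;\n' "$tbl"
    done
}

# Example (not run here): feed the statements to psql during a quiet period,
# for instance from a crontab entry that runs this script at night.
#   vacuum_analyze_sql sales.orders sales.items | psql -d mydb
```

Generating the statements separately from running them makes it easy to review what will be executed before it touches the database.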

Question 7. What Are Resource Queues?

Answer :

Resource queues are used to manage the Greenplum Database workload. All users and queries can be prioritized using resource queues.

Question 8. What Is Gp_toolkit?

Answer :

gp_toolkit is a database schema that contains many tables, views, and functions for better managing Greenplum Database while the database is up. In earlier 3.x versions it was called gp_jetpack.

Question 9. How To Generate Ddl For A Table?

Answer :

Use the pg_dump utility to generate DDL.

Example:

pg_dump -t njonna.accounts -s -f ddl_accounts.sql

Where:

-f ddl_accounts.sql is the output file.
-t njonna.accounts is the table name, with schema njonna.
-s dumps only the schema, no data.
Question 10. What Are The Tools Available In Greenplum To Take Backups And Restore Them?

Answer :

For non-parallel backups:

Use the PostgreSQL utilities (pg_dump and pg_dumpall for backup, and pg_restore for restore).
Another useful command for getting data out of the database is COPY <TABLE> TO <File>.
For parallel backups:

Use gp_dump and gpcrondump for backups, and gp_restore for the restore process.
Question 11. How Do I Clone My Production Database To Preprod / Qa Environment?

Answer :

If Prod and QA are on the same GPDB cluster, use CREATE DATABASE <Clone_DBname> TEMPLATE <Source_DB>.
If Prod and QA are on different clusters, use the backup and restore utilities.
Question 12. What Is Difference Between Pg_dump And Gp_dump?

Answer :

pg_dump - Non-parallel backup utility. You need a large file system, because the backup is created on the master node only.
gp_dump - Parallel backup utility. The backup is created on the master and segment file systems.
Question 13. What Is Gpcrondump?

Answer :

A wrapper utility for gp_dump, which can be called directly or from a crontab entry.
Example: gpcrondump -x <database_name>
Question 14. What Are The Backup Options Available At Os Level?

Answer :

Solaris: ZFS snapshots at the file system level.
All OS: gpcrondump / gp_dump.
Question 15. My Sql Query Is Running Very Slow, It Was Running Fine Yesterday What Should I Do?

Answer :

Check that your connection to the Greenplum cluster is still good if you are using a remote client. You can do this by running the SQL locally on the GP cluster.
Check that the system tables and user tables involved are not bloated or skewed. Read the jetpack or Greenplum toolkit documentation about how to do this.
Check with your DBA that the Greenplum interconnect is still performing correctly.
This can be done by checking for dropped packets on the interconnect with "netstat -i" and by running gpcheckperf. It is also possible that a segment is experiencing hardware problems, which can be found in the output of dmesg or in /var/log/messages* (Linux) and /var/adm/messages* (Solaris).

Question 16. How To Turn On Timing, And Check How Much Time A Query Takes To Execute?

Answer :

You can turn on timing per session before you run your SQL with the psql \timing command.
You can run EXPLAIN ANALYZE against your SQL statement to get the timing.
Question 17. How To Check If My Session Queries Are Running Or Waiting On Locks?

Answer :

Check the "waiting" column in pg_stat_activity and the "granted" column in pg_locks for any object-level locks.
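A query against pg_locks for requests that are still waiting might look like the sketch below. The function name waiting_locks_sql and the database name mydb are placeholders, and the selection of columns is an assumption about what is useful to see.

```shell
# Print a query that lists lock requests that have not been granted yet.
# waiting_locks_sql is a hypothetical helper; it only prints SQL text.
waiting_locks_sql() {
    cat <<'SQL'
SELECT locktype, relation::regclass AS relation, pid, mode
FROM pg_locks
WHERE NOT granted;
SQL
}

# Example (not run here):
#   waiting_locks_sql | psql -d mydb
```

The pid values returned can then be matched against pg_stat_activity to see which sessions are blocked.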

Question 18. What Kind Of Locks Should We Focus On In An Mpp System When The System Is Slow / Hung?

Answer :

Locks that are held for a very long time and that multiple other queries are also waiting for.

Question 19. How Do I Monitor User Activity History In Greenplum Database?

Answer :

Use Greenplum Performance Monitor (gpperfmon), which has a GUI to monitor and query performance history.

Question 20. What Is Greenplum Performance Monitor And How To Install?

Answer :

It is a monitoring tool that collects statistics on system and query performance and builds historical data.

Question 21. When The Client Connects Does He Connect To The Master Or Segment Node?

Answer :

Master.

Question 22. Can You Explain The Process Of Data Migration From Oracle To Greenplum?

Answer :

There are many ways. The simplest steps are: unload the data into CSV files, create tables in the Greenplum database corresponding to the Oracle tables, create an external table, start gpfdist pointing to the external table location, and load the data into Greenplum. You can also use the gpload utility, which creates the external table at runtime.
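The gpfdist route can be sketched as below. Everything specific here is a placeholder chosen for illustration: the host name etlhost, port 8081, directory /data/staging, the external table name ext_load, and the helper name load_sql. The target table is assumed to exist already.

```shell
# Step 1 (not run here): serve the unloaded CSV files over gpfdist.
#   gpfdist -d /data/staging -p 8081 &

# Step 2: generate the SQL that defines the external table and loads the target.
# load_sql is a hypothetical helper; it only prints SQL text.
load_sql() {
    target=$1      # existing Greenplum table, e.g. sales.accounts
    csv_file=$2    # file name relative to the gpfdist directory
    cat <<SQL
CREATE EXTERNAL TABLE ext_load (LIKE $target)
LOCATION ('gpfdist://etlhost:8081/$csv_file')
FORMAT 'CSV';
INSERT INTO $target SELECT * FROM ext_load;
DROP EXTERNAL TABLE ext_load;
SQL
}

# Example (not run here):
#   load_sql sales.accounts accounts.csv | psql -d mydb
```

Because every segment pulls rows from gpfdist in parallel, this usually loads much faster than a COPY through the master.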

Question 23. Which Command Would You Use To Backup A Database?

Answer :

gp_dump, gpcrondump, pg_dump, pg_dumpall, COPY

Question 24. How Would You Go About Query Tuning?

Answer :

Look at the query plan.
Look at the statistics of the table or tables in the query.
Look at the table distribution keys and joins in the query.
Look at the network performance.
Look at the resource queues.
Look at the interconnect performance.
Look at the join order of tables in the query.
Look at the query itself, i.e. whether it can be written in a more efficient way.

Question 25. What Would You Do When A User Or Users Are Complaining That A Particular Query Is Running Slow?

Answer :

Look at the query plan.
Look at the statistics of the table or tables in the query.
Look at the table distribution keys and joins in the query.
Look at the network performance.
Look at the resource queues.
Look at the interconnect performance.
Look at the join order of tables in the query.
Look at the query itself, i.e. whether it can be written in a more efficient way.
Question 26. What Would You Do To Gather Statistics In The Database? As Well As Reclaim The Space?

Answer :

Statistics are gathered with ANALYZE. To reclaim space, the options are VACUUM FULL or CTAS: A VACUUM FULL will reclaim all expired row space, but it is a very expensive operation and may take an unacceptably long time to finish on large, distributed Greenplum Database tables. If you do get into a situation where the free space map has overflowed, it may be more timely to recreate the table with a CREATE TABLE AS statement and drop the old table. A VACUUM FULL is not recommended in Greenplum Database.
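The CREATE TABLE AS approach could be sketched like this. The helper name rebuild_table_sql, the _new suffix, and the example table and column names are assumptions made for illustration. Note that a table rebuilt this way does not carry over indexes, privileges, or dependent views, so those would need to be recreated.

```shell
# Print SQL that rebuilds a bloated table with CREATE TABLE AS and then
# refreshes its statistics. rebuild_table_sql is a hypothetical helper.
rebuild_table_sql() {
    tbl=$1        # schema-qualified table, e.g. sales.orders
    dist_col=$2   # distribution key column, e.g. order_id
    cat <<SQL
BEGIN;
CREATE TABLE ${tbl}_new AS SELECT * FROM $tbl DISTRIBUTED BY ($dist_col);
DROP TABLE $tbl;
ALTER TABLE ${tbl}_new RENAME TO ${tbl##*.};
COMMIT;
ANALYZE $tbl;
SQL
}

# Example (not run here):
#   rebuild_table_sql sales.orders order_id | psql -d mydb
```

Running the swap inside one transaction means other sessions never see a moment where the table is missing.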

Question 27. How Would You Implement Compression, And What Are The Possible Compression Types?

Answer :

There are two types of in-database compression available in the Greenplum Database for append-only tables:

Table-level compression is applied to an entire table.
Column-level compression is applied to a specific column. You can apply different column-level compression algorithms to different columns.
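The two types might be declared as in the sketch below. The table names, column names, and compression settings are placeholders, the helper name compression_examples_sql is made up, and the column-level ENCODING clause is assumed to be available in your Greenplum version (it applies to column-oriented append-only tables).

```shell
# Print example DDL for table-level and column-level compression.
# compression_examples_sql is a hypothetical helper; it only prints SQL text.
compression_examples_sql() {
    cat <<'SQL'
-- Table-level: one setting applies to the whole table.
CREATE TABLE sales_ao (id int, note text)
WITH (appendonly=true, compresstype=zlib, compresslevel=5)
DISTRIBUTED BY (id);

-- Column-level: each column names its own algorithm.
CREATE TABLE sales_co (
    id   int  ENCODING (compresstype=rle_type),
    note text ENCODING (compresstype=zlib, compresslevel=5)
)
WITH (appendonly=true, orientation=column)
DISTRIBUTED BY (id);
SQL
}

# Example (not run here):
#   compression_examples_sql | psql -d mydb
```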
Question 28. If You Configure Your System With Master And Segment Nodes, Where Would The Data Reside?

Answer :

Segment nodes.

Question 29. When A User Submits A Query, Where Does It Run In Master Or Segment Nodes?

Answer :

Segment nodes. The master parses and plans the query, then dispatches it to the segments for execution.

Question 30. What Is The Location Of Pg_hba/logfile/master_data_directory?

Answer :

cd $MASTER_DATA_DIRECTORY - Master directory.
This is the location of pg_hba.conf and postgresql.conf, along with other GPDB internal directories.
cd $MASTER_DATA_DIRECTORY/pg_log - Master database log file location.
Question 31. How To See The Value Of A GUC?

Answer :

Connect to the GPDB database using psql and query the catalog, or use the SHOW command.

Example:

gpdb# select name,setting from pg_settings where name='GUC';
or
gpdb# show <GUC_NAME>;

Question 32. How To Check Db Version And Version At Init Db?

Answer :

To check the version:

psql> select version();
or
postgres --gp-version
To check the gp version at install:
psql> select * from gp_version_at_initdb;

Question 33. How To Create A Password-Free Trusted Env Between All The Segment Hosts?

Answer :

Use gpssh-exkeys:
gpssh-exkeys -h hostname1 -h hostname2 .. -h hostnameN

Question 34. How To Add New User To The Database?

Answer :

Use the createuser utility to create users.

You can also use SQL commands at the psql prompt to create users.

For example: CREATE USER or ROLE ….

Question 35. How To Manage pg_hba.conf?

Answer :

The pg_hba.conf file of the master instance controls client access and authentication for your Greenplum system. Check the Greenplum Administrator's Guide for instructions on adding or changing the contents of this file.

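Question 36. How To Update postgresql.conf And Reload It?

Answer :

In GP 4.0, use the gpconfig utility to change postgresql.conf parameters.

In 3.x versions, manually change the parameters in postgresql.conf.

After a change, run gpstop -u to reload the configuration files without stopping the system. Some parameters only take effect after a full restart.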

Question 37. How To Run Gpcheckperf Io/netperf?

Answer :

Create a directory that has free space and is common to all hosts.

For the network I/O test, for each NIC:

gpcheckperf -f seg_host_file_nic-1 -r N -d /data/gpcheckperf > seg_host_file_nic_1.out
gpcheckperf -f seg_host_file_nic-2 -r N -d /data/gpcheckperf > seg_host_file_nic_2.out
For disk I/O:

gpcheckperf -f seg_host_file_nic-1 -r ds -D -d /data/gpdb_p1 -d /data/gpdb_p2 -d /data/gpdb_m1 -d /data/gpdb_m2
Question 38. How To Start/Stop Db In Admin Mode?

Answer :

Admin mode: Running gpstart with the -R option starts the database in admin mode, also called restricted mode, where only superusers can connect to the database.

Utility mode: Utility mode allows you to connect to individual segments only. When the system is started with gpstart -m, for example, you can connect to the master instance only:

PGOPTIONS='-c gp_session_role=utility' psql

Question 39. How To See Primary To Mirror Mapping?

Answer :

The following catalog query lists the configuration by content ID, from which you can figure out the primary and mirror for each content.

gpdb=# select * from gp_configuration order by content;

Note: starting from GPDB 4.x, the gp_segment_configuration table is used instead.

gpdb=# select * from gp_segment_configuration order by dbid;

Question 40. How To Add Mirrors To The Array?

Answer :

The gpaddmirrors utility configures mirror segment instances for an existing Greenplum Database system that was initially configured with primary segment instances only.

 

 

Question 41. How To Recover An Invalid Segment?

Answer :

Use the gprecoverseg tool, which will recognize which segments need recovery and will initiate recovery.

3.3.x:

Without the "-F" option - Files are compared first, differences are found, and only the differing files are synched (the first stage could last a long time if there are too many files in the data directory).
With the "-F" option - The entire data directory is resynched.
4.0.x:

Without the "-F" option - The change tracking log is sent and applied to the mirror.
With the "-F" option - The entire data directory is resynched.
Question 42. How To Re-sync A Standby?

Answer :

Use this option if you already have a standby master configured, and just want to resynchronize the data between the primary and backup master host. The Greenplum system catalog tables will not be updated.

# gpinitstandby -n (resynchronize)

Question 43. How To Delete A Standby?

Answer :

To remove the currently configured standby master host from your Greenplum Database system, run the following command on the master only:

# gpinitstandby -r

Question 44. What Is Gpdetective And How Do I Run It In Greenplum?

Answer :

The gpdetective utility collects information from a running Greenplum Database system and creates a bzip2-compressed tar output file. This output file helps with the diagnosis of Greenplum Database errors or system failures. For more information, check the help.

gpdetective --help

Question 45. How To Run Gpcheckcat?

Answer :

The gpcheckcat tool is used to check catalog inconsistencies between the master and segments. It can be found in the $GPHOME/bin/lib directory:
Usage: gpcheckcat <option> [dbname]
-?
-B parallel : number of worker threads
-g dir : generate SQL to rectify catalog corruption, put it in dir
-h host : DB host name
-p port : DB port number
-P passwd : DB password
-o : check OID consistency
-U uname : DB user name
-v : verbose

Example: gpcheckcat gpdb >gpcheckcat_gpdb_logfile.log

Question 46. What Is The Procedure To Get Rid Of Mirror Segments?

Answer :

There are no utilities available to remove mirrors from Greenplum. You need to make sure all primary segments are good, then you can remove the mirror configuration from gp_configuration in 3.x.

Question 47. Why Do We Need Gpstop -m And Gpstart -m?

Answer :

The gpstart -m command allows you to start the master only and none of the data segments. It is used primarily by support to get system-level information and configuration. An end user would not regularly, or even normally, use it.

Question 48. Gpstart Failed What Should I Do?

Answer :

Check the gpstart log file in ~gpadmin/gpAdminLogs/gpstart_yyyymmdd.log

Take a look at the pg start log file for more information in $MASTER_DATA_DIRECTORY/pg_log/startup.log

Question 49. Where Can I Get Help On Postgres Psql Commands?

Answer :

In a psql session:

"\?" - for all psql session help.

"\h <SQL Command>" - for any SQL syntax help.

Question 50. How To Delete/drop An Existing Database In Greenplum?

Answer :

gpdb=# \h DROP DATABASE

Command: DROP DATABASE

Description: remove a database

Syntax: DROP DATABASE [ IF EXISTS ] name

Also check the dropdb utility:

$GPHOME/bin/dropdb --help

dropdb removes a PostgreSQL database.

Usage:

dropdb [OPTION]… DBNAME



