ETL Testing Interview Questions and Answers
Q1. What is ETL Testing?
Ans: ETL stands for Extract-Transform-Load, and it is the process by which data is loaded from a source system into a data warehouse. Data is extracted from an OLTP database, transformed to match the data warehouse schema, and loaded into the data warehouse database.
Q2. Why is this ETL manner used?
Ans: Data has grow to be the vital part of all kinds of agencies and operations. Because statistics is so critical to a a success enterprise, terrible overall performance or erroneous technique can value money and time. Therefore, ETL testing is designed to ensure that the data processing is accomplished inside the expected manner for the business/corporation to get the benefit out of it.
Q3. What are Full load and Incremental (or Refresh) load?
Initial Load: The process of populating all of the data warehouse tables for the very first time.
Full Load: All of the data is loaded at a stretch, depending on the volume. It erases the contents of the tables and reloads them with fresh data.
Incremental Load: Applying only the changes made since the previous load, as and when necessary, on a predefined schedule.
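The difference between the two load strategies above can be sketched in SQL. This is a minimal illustration using SQLite; the table names (src_orders, dw_orders) are made up for the example.

```python
import sqlite3

# Hypothetical source and target tables; names are illustrative only.
con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE src_orders (order_id INTEGER PRIMARY KEY, amount REAL);
    CREATE TABLE dw_orders  (order_id INTEGER PRIMARY KEY, amount REAL);
    INSERT INTO src_orders VALUES (1, 10.0), (2, 20.0);
""")

def full_load(con):
    # Full load: erase the target and reload everything from the source.
    con.execute("DELETE FROM dw_orders")
    con.execute("INSERT INTO dw_orders SELECT * FROM src_orders")

def incremental_load(con):
    # Incremental load: apply only the rows not yet present in the target.
    con.execute("""
        INSERT INTO dw_orders
        SELECT * FROM src_orders
        WHERE order_id NOT IN (SELECT order_id FROM dw_orders)
    """)

full_load(con)
con.execute("INSERT INTO src_orders VALUES (3, 30.0)")  # new delta row
incremental_load(con)
rows = con.execute("SELECT COUNT(*) FROM dw_orders").fetchone()[0]
```

In practice the incremental delta is usually identified by a timestamp or change-data-capture mechanism rather than a key lookup, but the loading pattern is the same.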
Q4. Where exactly are ETL systems used?
Ans: ETL systems are used by organizations to integrate data from multiple sources. These software systems are key components in ensuring that your company processes its data correctly, allowing your business to run smoothly and without interruption.
Q5. What tools have you used for ETL testing?
Data access tools, e.g., TOAD, WinSQL, AQT, etc. (used to inspect the contents of tables)
ETL tools, e.g., Informatica, DataStage.
Test management tools, e.g., Test Director, Quality Center, etc. (to maintain requirements, test cases, defects, and the traceability matrix)
Q6. Explain what Cubes and OLAP Cubes are.
Ans: Cubes are data processing units built from the fact tables and dimensions of the data warehouse. They provide multi-dimensional analysis.
OLAP stands for Online Analytical Processing. An OLAP cube stores large volumes of data in multi-dimensional form for reporting purposes. It consists of facts, called measures, categorized by dimensions.
Q7. What are an ETL tester's roles and responsibilities?
Ans: Requires in-depth knowledge of the ETL tools and processes.
Needs to write SQL queries for the various scenarios given during the testing phase.
Test components of the ETL data warehouse.
Execute backend data-driven tests.
Create, design, and execute test cases, test plans, and test harnesses.
Identify problems and offer solutions for potential issues.
Approve requirements and design specifications.
Test data transfers and flat files.
Write SQL queries for various scenarios, such as count tests.
Should be able to carry out different types of checks, such as primary key and default checks, and keep a check on the other functionality of the ETL process.
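Two of the checks listed above, the primary key check and the default check, can be expressed as simple SQL queries. This is a minimal sketch using SQLite with a made-up table (dw_customer) seeded with deliberately bad rows.

```python
import sqlite3

# Hypothetical target table, seeded with a duplicate key and a NULL.
con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE dw_customer (cust_id INTEGER, status TEXT);
    INSERT INTO dw_customer VALUES (1, 'ACTIVE'), (2, NULL), (2, 'NEW');
""")

# Primary-key check: cust_id should be unique in the loaded table.
dupes = con.execute("""
    SELECT cust_id FROM dw_customer GROUP BY cust_id HAVING COUNT(*) > 1
""").fetchall()

# Default check: no NULLs should survive where a default value is expected.
missing_default = con.execute(
    "SELECT COUNT(*) FROM dw_customer WHERE status IS NULL"
).fetchone()[0]
```

Both queries returning empty/zero results would mean the load passed; here they surface the duplicate key 2 and the one NULL status.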
Q8. What is ETL testing in comparison with database testing?
Ans: ETL testing:
Verifies whether data is moved as expected.
Verifies whether the counts in the source and target match.
Verifies whether data is transformed as expected.
Verifies that primary-foreign key relationships are preserved during the ETL.
Verifies for duplication in the loaded data.
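The source-to-target count reconciliation mentioned above is usually the first check run after a load. A minimal sketch using SQLite, with made-up table names (src_sales, tgt_sales):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE src_sales (id INTEGER, amount REAL);
    CREATE TABLE tgt_sales (id INTEGER, amount REAL);
    INSERT INTO src_sales VALUES (1, 5.0), (2, 7.5), (3, 9.0);
    INSERT INTO tgt_sales SELECT * FROM src_sales;  -- simulated ETL load
""")

# Count reconciliation: row counts must match between source and target.
src_count = con.execute("SELECT COUNT(*) FROM src_sales").fetchone()[0]
tgt_count = con.execute("SELECT COUNT(*) FROM tgt_sales").fetchone()[0]
counts_match = (src_count == tgt_count)
```

A mismatch here points to truncation, rejected rows, or duplication somewhere in the pipeline.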
Q9. What is the difference between ETL tools and OLAP tools?
Ans: An ETL tool is meant for extracting data from legacy systems and loading it into a specified database, with some process of cleansing the data.
Ex: Informatica, DataStage, etc.
OLAP tools are meant for reporting: in OLAP, data is available in a multi-dimensional model, so you can write simple queries to extract data from the database.
Ex: Business Objects, Cognos, etc.
Q10. Explain what a factless fact schema is and what Measures are.
Ans: A fact table without measures is known as a factless fact table. It can be used to view the number of occurring events. For example, it can record an event such as the employee count in a company.
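Since a factless fact table has no numeric columns, the only "measure" you can derive from it is a count of event rows. A minimal sketch in SQLite, with a made-up attendance table:

```python
import sqlite3

con = sqlite3.connect(":memory:")
# A factless fact table holds only foreign keys -- no numeric measures.
con.executescript("""
    CREATE TABLE fact_attendance (employee_id INTEGER, date_id INTEGER);
    INSERT INTO fact_attendance VALUES
        (1, 20240101), (2, 20240101), (1, 20240102);
""")

# The only derivable measure is COUNT(*) per dimension value.
per_day = dict(con.execute("""
    SELECT date_id, COUNT(*) FROM fact_attendance GROUP BY date_id
""").fetchall())
```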
Q11. What is an ODS (Operational Data Store)?
ODS: Operational Data Store.
The ODS sits between the staging area and the data warehouse. The data in the ODS is at a low level of granularity.
Once data is populated in the ODS, aggregated data can be loaded into the EDW through the ODS.
Q12. What is a staging area? Do we need it? What is the purpose of a staging area?
Ans: Data staging is a collection of processes used to prepare source system data for loading into a data warehouse. Staging includes the following steps:
Source data extraction, data transformation (restructuring),
Data transformation (data cleansing, value transformations),
Surrogate key assignments.
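The last staging step, surrogate key assignment, can be sketched as follows. This is a simplified in-memory illustration; real implementations keep the key map in a dimension table, and the natural-key values here are made up.

```python
# Staging sketch: assign stable integer surrogate keys to natural keys.
def assign_surrogate_keys(rows, key_map, next_key=1):
    """Map each natural key to a surrogate key, reusing known mappings."""
    out = []
    for natural_key, value in rows:
        if natural_key not in key_map:
            key_map[natural_key] = next_key  # first sighting: new key
            next_key += 1
        out.append((key_map[natural_key], natural_key, value))
    return out, next_key

key_map = {}
staged, next_key = assign_surrogate_keys(
    [("CUST-A", 10), ("CUST-B", 20), ("CUST-A", 30)], key_map)
```

Note that the repeated natural key "CUST-A" receives the same surrogate key both times, which is exactly the property warehouse loads depend on.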
Q13. What is a 3-tier data warehouse?
Ans: Most data warehouses are considered to be three-tier systems, which is essential to their structure. The first layer is where the data lands: the collection point where data from outside sources is compiled. The second layer is called the 'integration layer'; this is where the stored data is transformed to meet organizational needs. The third layer is called the 'dimension layer', and is where the transformed data is stored for internal use.
Q14. What is the difference between data mining and data warehousing?
Ans: Data warehousing comes before the mining process. It is the act of gathering data from various outside sources and organizing it into one specific location: the warehouse. Data mining is when that data is analyzed and used as information for making decisions.
Q15. What is partitioning, and what are some types of partitioning?
Ans: Partitioning is when an area of data storage is subdivided to improve performance. Think of it as an organizational tool: if all of your collected data sits in one large area without organization, the tools used for reading it will have a harder time locating the data they need to analyze. Partitioning your warehouse creates an organizational structure that makes finding and reading data easier and faster.
Two types of partitioning are round-robin partitioning and hash partitioning. Round-robin partitioning is when the data is evenly distributed among all partitions, so the number of rows in each partition is roughly the same. Hash partitioning is when the server applies a hash function to create partition keys that group the data.
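The two partitioning schemes above can be sketched in a few lines. This is an illustrative toy, not how a real database implements partitioning:

```python
def round_robin_partition(rows, n):
    """Distribute rows evenly across n partitions, one at a time in turn."""
    parts = [[] for _ in range(n)]
    for i, row in enumerate(rows):
        parts[i % n].append(row)
    return parts

def hash_partition(rows, key, n):
    """Route rows with the same partition key to the same partition."""
    parts = [[] for _ in range(n)]
    for row in rows:
        parts[hash(key(row)) % n].append(row)
    return parts

rows = [("east", 1), ("west", 2), ("east", 3), ("north", 4)]
rr = round_robin_partition(rows, 2)
hp = hash_partition(rows, key=lambda r: r[0], n=2)

# With hash partitioning, every "east" row lands in one partition.
east_parts = {i for i, part in enumerate(hp) for r in part if r[0] == "east"}
```

Round-robin guarantees balance but scatters related rows; hash partitioning guarantees co-location of rows sharing a key, which is what aggregations per key need.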
Q16. What process exactly is involved in ETL testing?
Ans: The ETL process allows a business to gather important data from different source systems, validate and transform it to fit its needs and models, and then store it in a data warehouse for analytics, forecasts, and other kinds of reports for daily use. In a world of digital business, it is a crucial part of running an effective and efficient operation.
Q17. What are the different ETL testing operations?
Ans: ETL testing includes the following:
Verify whether the data is transformed correctly according to business requirements.
Verify that the projected data is loaded into the data warehouse without any truncation or data loss.
Make sure that the ETL application reports invalid data and replaces it with default values.
Make sure that data loads within the expected time frame, to ensure scalability and performance.
Q18. Explain what tracing level is and what the types are.
Ans: Tracing level is the amount of data stored in the log files. Tracing level can be classified as Normal or Verbose. Normal level logs session information in a summarized way, while Verbose logs information for each and every row processed.
Q19. Explain what Grain of Fact is.
Ans: Grain of fact can be defined as the level at which the fact information is stored. It is also known as fact granularity.
Q20. What is database testing?
Ans: Database testing involves different steps compared to data warehouse testing:
Database testing is done using a smaller scale of data, normally with OLTP (Online Transaction Processing) type databases.
In database testing, data is normally injected from uniform sources.
We generally perform only CRUD (Create, Read, Update, Delete) operations in database testing.
Normalized databases are used in DB testing.
Q21. Why is ETL testing needed?
Ans: To verify that the data being transferred from one system to the other follows the pattern defined by the business (the requirements).
Q22. What are snapshots? What are materialized views, and where do we use them? What is a materialized view log?
Snapshots are read-only copies of a master table.
They are located on a remote node and are refreshed periodically to reflect the changes made to the master table.
They are replicas of tables.
Views are built using the attributes of one or more tables.
A view based on a single table can be updated, whereas a view based on multiple tables cannot.
Materialized View:
A materialized view is a precomputed table that has aggregated or joined data from fact tables and dimension tables.
To put it simply, a materialized view is an aggregate table.
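The "aggregate table" idea above can be demonstrated directly. SQLite has no materialized views, so this sketch precomputes the aggregate into a plain table, which is exactly what a materialized view amounts to; the fact and dimension tables are made up:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE fact_sales  (product_id INTEGER, amount REAL);
    CREATE TABLE dim_product (product_id INTEGER, category TEXT);
    INSERT INTO fact_sales  VALUES (1, 10.0), (1, 5.0), (2, 8.0);
    INSERT INTO dim_product VALUES (1, 'BOOKS'), (2, 'TOYS');
""")

# Precompute the fact/dimension join and aggregation into a table,
# so reporting queries read it directly instead of re-joining.
con.execute("""
    CREATE TABLE mv_sales_by_category AS
    SELECT d.category, SUM(f.amount) AS total
    FROM fact_sales f JOIN dim_product d ON f.product_id = d.product_id
    GROUP BY d.category
""")
totals = dict(con.execute("SELECT category, total FROM mv_sales_by_category"))
```

In databases with real materialized view support (e.g. Oracle), the materialized view log is what lets the database refresh such a table incrementally instead of rebuilding it.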
Q23. What is partitioning? Explain Round-Robin and Hash partitioning.
Partitioning is subdividing the transactions in order to improve performance.
Increasing the number of partitions enables the Informatica Server to create multiple connections to various sources.
The partition types are as follows:
Round-Robin partitioning: Data is distributed evenly by Informatica among all partitions. This partitioning is used where the number of rows to process in each partition is approximately the same.
Hash partitioning: The Informatica server applies a hash function to the partitioning keys to group data among partitions. It is used where it must be ensured that rows with the same partitioning key are processed in the same partition.
Q24. What are the differences between Connected and Unconnected lookups?
Ans: Connected Lookup:
Participates in the mapping data flow.
Can return multiple values.
Can be connected to other transformations and returns a value.
Unconnected Lookup:
Is used when the lookup function is called from an expression transformation, i.e., when the lookup is not part of the main data flow.
Returns only one output port.
Cannot be connected to another transformation.
Unconnected lookups are reusable.
Q25. How to fine-tune mappings?
Ans: The following are the steps to fine-tune mappings:
Use the filter condition in the source qualifier instead of a Filter transformation.
Use persistent and shared caches in the Lookup transformation.
Use the Aggregator transformation with sorted input, grouped by the appropriate ports.
Use operators in expressions instead of functions.
Increase the cache size and commit interval.
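The first tuning step, filtering in the source qualifier rather than in a downstream Filter transformation, amounts to pushing the condition into the extraction query. A minimal sketch in SQLite with a made-up orders table:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE src_orders (id INTEGER, status TEXT);
    INSERT INTO src_orders VALUES (1, 'OPEN'), (2, 'CLOSED'), (3, 'OPEN');
""")

# Slow pattern: extract everything, then filter inside the pipeline.
all_rows = con.execute("SELECT id, status FROM src_orders").fetchall()
filtered_late = [r for r in all_rows if r[1] == 'OPEN']

# Faster pattern: push the condition into the source query itself,
# so unneeded rows never leave the database.
filtered_early = con.execute(
    "SELECT id, status FROM src_orders WHERE status = 'OPEN'").fetchall()
```

Both produce the same result set, but the second moves far fewer rows out of the source, which is the whole point of the tuning advice.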