
Top 100+ Data Warehouse ETL Toolkit Interview Questions And Answers - May 29, 2020


Question 1. What Is Etl?

Answer :

ETL stands for Extraction, Transformation, and Loading.
ETL tools offer developers an interface for designing source-to-target mappings, transformations, and job control parameters.
* Extraction
Take data from an external source and move it to the warehouse pre-processor database.
* Transformation
The Transform Data task allows point-to-point generating, modifying, and transforming of data.
* Loading
The Load Data task adds records to a database table in a warehouse.
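The three phases above can be sketched in plain Python, using in-memory lists in place of a real source system and warehouse table. The function names and row layout are illustrative assumptions, not Informatica APIs:

```python
def extract(source_rows):
    """Extraction: pull raw rows from the source into a staging list."""
    return list(source_rows)

def transform(rows):
    """Transformation: clean and reshape each row for the target."""
    return [{"id": r["id"], "name": r["name"].strip().upper()} for r in rows]

def load(rows, target_table):
    """Loading: append the transformed rows to the target table."""
    target_table.extend(rows)
    return len(rows)

source = [{"id": 1, "name": " alice "}, {"id": 2, "name": "bob"}]
warehouse = []
loaded = load(transform(extract(source)), warehouse)
```

A real ETL tool adds scheduling, error handling, and metadata on top of this same extract-transform-load pipeline.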

Question 2. What Is A Three Tier Data Warehouse?

Answer :

A data warehouse can be thought of as a three-tier system in which a middle system provides usable data in a secure way to end users. On either side of this middle system are the end users and the back-end data stores.

Question 3. What Is The Metadata Extension?

Answer :

Informatica allows end users and partners to extend the metadata stored in the repository by associating information with individual objects in the repository. For example, when you create a mapping, you can store your contact information with the mapping. You associate information with repository metadata using metadata extensions.

Informatica Client applications can contain the following types of metadata extensions:
Vendor-defined: Third-party application vendors create vendor-defined metadata extensions. You can view and change the values of vendor-defined metadata extensions, but you cannot create, delete, or redefine them.

User-defined: You create user-defined metadata extensions using PowerCenter/PowerMart. You can create, edit, delete, and view user-defined metadata extensions. You can also change the values of user-defined extensions.

Question 4. Can We Override A Native Sql Query Within Informatica? Where Do We Do It? How Do We Do It?

Answer :

Yes, we can override a native SQL query in the Source Qualifier and Lookup transformations.

In the Lookup transformation we can find "SQL Override" in the lookup properties. By using this option we can do it.

Question 5. How Can We Use Mapping Variables In Informatica? Where Do We Use Them?

Answer :

Yes. We can use mapping variables in Informatica.

The Informatica server saves the value of a mapping variable to the repository at the end of a session run and uses that value the next time we run the session.
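The persistence behavior can be sketched as follows, with a plain dict standing in for the Informatica repository that stores the variable's final value after each session. The variable name `$$LAST_MAX_ID` and the row data are illustrative assumptions:

```python
repository = {}

def run_session(repository, new_rows):
    """Run one session, resuming from the variable saved by the last run."""
    # Start from the value saved by the previous session run (default 0).
    last_max_id = repository.get("$$LAST_MAX_ID", 0)
    # Process only rows beyond the saved variable value.
    processed = [r for r in new_rows if r > last_max_id]
    # At session end, the server writes the final value back to the repository.
    if processed:
        repository["$$LAST_MAX_ID"] = max(processed)
    return processed

first = run_session(repository, [1, 2, 3])
second = run_session(repository, [2, 3, 4, 5])  # only 4 and 5 are new
```

This is the same pattern that makes mapping variables useful for incremental extraction: the value persists between runs without any manual bookkeeping.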

Question 6. What Are Snapshots? What Are Materialized Views & Where Do We Use Them? What Is A Materialized View Log?

Answer :

Snapshots are read-only copies of a master table located on a remote node, which are periodically refreshed to reflect changes made to the master table. Snapshots are mirrors or replicas of tables.

Views are built using the columns from one or more tables. A single-table view can be updated, but a view over multiple tables cannot be updated.

A view can be updated/deleted/inserted into if it has only one base table; if the view is based on columns from more than one table, then insert, update, and delete are not possible.
Materialized view:
A pre-computed table comprising aggregated or joined data from fact and possibly dimension tables. Also known as a summary or aggregate table.

Question 7. Can Informatica Load Heterogeneous Targets From Heterogeneous Sources?

Answer :

No, in Informatica 5.2 and earlier.
Yes, in Informatica 6.1 and later.

Question 8. What Is Etl Process ?How Many Steps Etl Contains Explain With Example?

Answer :

ETL is the extraction, transformation, and loading process: you extract data from the source, apply the business rules on it, and then load it into the target.
The steps are:
1. Define the source (create the ODBC connection to the source DB).
2. Define the target (create the ODBC connection to the target DB).
3. Create the mapping (you can apply the business rules here by adding transformations, and define how the data flow will move from the source to the target).
4. Create the session (a set of instructions that runs the mapping).
5. Create the workflow (instructions that run the session).

Question 9. What Is Full Load & Incremental Or Refresh Load?

Answer :

Full Load: completely erasing the contents of one or more tables and reloading them with fresh data.
Incremental Load: applying ongoing changes to one or more tables based on a predefined schedule.
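The two strategies can be contrasted on a toy target table (a dict keyed by primary key); this is a minimal sketch, not tied to any specific ETL tool, and the row shape is an assumption:

```python
def full_load(target, source_rows):
    """Truncate the target and reload everything from the source."""
    target.clear()
    target.update({r["id"]: r for r in source_rows})

def incremental_load(target, changed_rows):
    """Apply only the changed rows (inserts/updates) to the target."""
    for r in changed_rows:
        target[r["id"]] = r

target = {}
full_load(target, [{"id": 1, "amt": 10}, {"id": 2, "amt": 20}])
incremental_load(target, [{"id": 2, "amt": 25}, {"id": 3, "amt": 30}])
```

A full load is simpler but reprocesses everything; an incremental load touches only the delta, which is why it is preferred on large tables.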

Question 10. Is There Any Way To Read The MS Excel Data Directly Into Informatica? Like Is There Any Possibility To Take An Excel File As Target?

Answer :

We can't directly import the Excel file into Informatica.
We have to define the Microsoft Excel ODBC driver on our system and define names in the Excel sheet by defining ranges; then in Informatica, open the folder and use Sources -> Import from Database -> select the Excel ODBC driver -> Connect -> select the Excel sheet name.

Question 11. What Is A Staging Area? Do We Need It? What Is The Purpose Of A Staging Area?

Answer :

Data staging is simply a set of processes used to prepare source system data for loading a data warehouse. Staging includes the following steps:
Source data extraction,
Data transformation (restructuring),
Data transformation (data cleansing, value changes),
Surrogate key assignments.

Question 12. How Do We Call Shell Scripts From Informatica?

Answer :

Specify the full path of the shell script in the "Post-session properties" of the session/workflow.

Question 13. What Is The Difference Between Power Center & Power Mart?

Answer :

PowerCenter - has the ability to organize repositories into a data mart domain and share metadata across repositories.
PowerMart - only a local repository can be created.

Question 14. Can We Lookup A Table From Source Qualifier Transformation, i.e. Unconnected Lookup?

Answer :

You cannot perform a lookup from a Source Qualifier directly. However, you can override the SQL in the Source Qualifier to join with the lookup table to perform the lookup.

Question 15. Do We Need An Etl Tool? When Do We Go For The Tools In The Market?

Answer :

ETL Tool:
It is used to Extract (E) data from multiple source systems (like RDBMS, flat files, mainframes, SAP, XML, etc.), Transform (T) it based on business requirements, and Load (L) it into target locations (like tables, files, and so on).
Need for an ETL tool:
An ETL tool is generally required when data is scattered across different systems (like RDBMS, flat files, mainframes, SAP, XML, and so on).

Question 16. What Is Informatica Metadata And Where Is It Stored?

Answer :

Informatica metadata is data about data, which is stored in Informatica repositories.

Question 17. Techniques Of Error Handling - Ignore , Rejecting Bad Records To A Flat File , Loading The Records And Reviewing Them (default Values)

Answer :

Records can be rejected either by the database, due to a constraint key violation, or by the Informatica server when writing data into the target table. These rejected records can be found in the bad files folder, where a reject file is created for each session. We can check why a record has been rejected. This bad file contains a row indicator in the first column and a column indicator in the second column.

The column indicators are of four kinds:
D - valid data,
O - overflowed data,
N - null data,
T - truncated data.
Depending on these indicators we can make changes to load the data correctly into the target.
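Parsing one line of such a reject ("bad") file can be sketched as follows. The comma-separated layout and the 0-3 row-indicator codes here are simplifying assumptions for illustration, not the exact Informatica file format; D/O/N/T are the per-column status codes listed above:

```python
# Assumed row-operation codes (first column of the reject file).
ROW_INDICATORS = {"0": "insert", "1": "update", "2": "delete", "3": "reject"}

# Per-column status codes described in the answer above.
INDICATOR_CODES = {
    "D": "valid data",
    "O": "overflowed data",
    "N": "null data",
    "T": "truncated data",
}

def classify_bad_record(line):
    """Split a reject-file line into row operation, column status, and fields."""
    row_ind, col_ind, *fields = line.rstrip("\n").split(",")
    return (ROW_INDICATORS.get(row_ind, "unknown"),
            INDICATOR_CODES.get(col_ind, "unknown"),
            fields)

op, status, fields = classify_bad_record("3,N,101")
```

Scanning the bad file with a helper like this makes it easy to count, for example, how many rows were rejected because of null key values.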

Question 18. What Are The Various Methods Of Getting Incremental Records Or Delta Records From The Source Systems?

Answer :

One foolproof method is to maintain a field called 'Last Extraction Date' and then impose a condition in the code saying 'current_extraction_date > last_extraction_date'.
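The 'Last Extraction Date' pattern can be sketched like this; the row layout and field names are illustrative assumptions:

```python
from datetime import date

def extract_delta(rows, last_extraction_date):
    """Return only the rows modified after the previous extraction date."""
    return [r for r in rows if r["modified"] > last_extraction_date]

rows = [
    {"id": 1, "modified": date(2020, 5, 1)},
    {"id": 2, "modified": date(2020, 5, 20)},
]
# Only rows changed since the last run (May 10) are extracted.
delta = extract_delta(rows, last_extraction_date=date(2020, 5, 10))
```

After a successful load, the job would save the current extraction date so the next run picks up from there.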

Question 19. What Are The Different Versions Of Informatica?

Answer :

Here are a few popular versions of Informatica:
Informatica PowerCenter 4.1,
Informatica PowerCenter 5.1,
Informatica PowerCenter 6.1.2,
Informatica PowerCenter 7.1.2,
Informatica PowerCenter 8.1,
Informatica PowerCenter 8.5,
Informatica PowerCenter 8.6.

Question 20. What Is ODS (Operational Data Store)?

Answer :

ODS - Operational Data Store.
The ODS comes between the staging area and the Data Warehouse. The data in the ODS is at a low level of granularity.
Once data is populated in the ODS, aggregated data can be loaded into the EDW through the ODS.

Question 21. What Is Latest Version Of Power Center / Power Mart?

Answer :

The Latest Version is 7.2

Question 22. What Are The Various Tools?

Answer :

- Cognos Decision Stream
- Oracle Warehouse Builder
- Business Objects XI (Extreme Insight)
- SAP Business Warehouse
- SAS Enterprise ETL Server

Question 23. Compare Etl & Manual Development?

Answer :

ETL - The process of extracting data from multiple sources (e.g. flat files, XML, COBOL, SAP, and so forth) is much simpler with the help of tools.
Manual - Loading data from anything other than flat files and Oracle tables needs more effort.
ETL - High and clear visibility of the logic.
Manual - Complex and not so user-friendly visibility of the logic.
ETL - Contains metadata, and changes can be done easily.
Manual - No metadata concept, and changes need more effort.
ETL - Error handling, log summaries, and load progress make life easier for the developer and maintainer.
Manual - Needs maximum effort from a maintenance point of view.
ETL - Can handle historic data very well.
Manual - As data grows, the processing time degrades.

These are a few differences between manual and ETL development.

Question 24. When Do We Analyze The Tables? How Do We Do It?

Answer :

The ANALYZE statement allows you to validate and compute statistics for an index, table, or cluster. These statistics are used by the cost-based optimizer when it calculates the most efficient plan for retrieval. In addition to its role in statement optimization, ANALYZE also helps in validating object structures and in managing space in your system. You can choose the following operations: COMPUTE, ESTIMATE, and DELETE. Early versions of Oracle7 produced unpredictable results when the ESTIMATE operation was used. It is best to compute your statistics.

Question 25. How Do You Calculate Fact Table Granularity?

Answer :

Granularity is the level of detail at which the fact table is described; for example, if we are doing time analysis, the granularity may be day-based, month-based, or year-based.

Question 26. What Are The Modules In Power Mart?

Answer :

1. PowerMart Designer
2. Server
3. Server Manager
4. Repository
5. Repository Manager

Question 27. If A Flat File Contains 1000 Records How Can I Get First And Last Records Only?

Answer :

By using the Aggregator transformation with the FIRST and LAST functions we can get the first and last records.
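The FIRST/LAST idea can be sketched in plain Python: read all records and keep only the first and last one. With an Informatica Aggregator you would use the FIRST() and LAST() aggregate functions instead; this sketch just shows the effect:

```python
def first_and_last(records):
    """Return a list holding only the first and last records."""
    records = list(records)
    if not records:
        return []
    return [records[0], records[-1]]

# A generator standing in for a 1000-record flat file.
result = first_and_last(f"rec{i}" for i in range(1, 1001))
```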

Question 28. Let's Suppose We Have Some 10,000 Odd Records In The Source System And We Load Them Into The Target. How Do We Ensure That All 10,000 Records Loaded To The Target Don't Contain Any Garbage Values?

Answer :

We can do LTRIM, RTRIM in the expression, or check for nulls, and then insert the records.
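A minimal sketch of that cleansing step: trim leading/trailing whitespace (the LTRIM/RTRIM part) and reject rows whose key fields are null before they reach the target. The field names are illustrative assumptions:

```python
def clean_row(row, required=("id", "name")):
    """Return a trimmed copy of the row, or None if a key field is null/empty."""
    cleaned = {k: v.strip() if isinstance(v, str) else v
               for k, v in row.items()}
    # Reject the row if any required field is missing after trimming.
    if any(cleaned.get(k) in (None, "") for k in required):
        return None
    return cleaned

good = clean_row({"id": "7", "name": "  Alice  "})
bad = clean_row({"id": "8", "name": None})
```

Rows that come back as `None` would go to a reject file for review rather than into the target table.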

Question 29. How Do We Extract Sap Data Using Informatica? What Is Abap? What Are Idocs?

Answer :

SAP data can be loaded into Informatica in the form of flat files.
Condition:
The Informatica Source Qualifier column sequence must match the SAP source file.

Question 30. What Is The Difference Between Joiner And Lookup ?

Answer :

A Joiner is used to join two or more tables to retrieve data from the tables (similar to joins in SQL).
A Lookup is used to check and compare a source table and a target table (similar to a correlated sub-query in SQL).

Question 31. What Are The Various Test Procedures Used To Check Whether The Data Is Loaded In The Backend, Performance Of The Mapping, And Quality Of The Data Loaded In Informatica.

Answer :

The best technique is to take the help of the Debugger, where we can monitor each and every process of the mappings and how data is loading, based on condition breakpoints.

Question 32. What Is The Difference Between Etl Tool And Olap Tools ?

Answer :

An ETL tool is meant for extracting data from legacy systems and loading it into a specific database with some process of cleansing the data.
Ex: Informatica, DataStage, etc.
OLAP is meant for reporting purposes. In OLAP, data is available in a multidimensional model, so that you can write simple queries to extract data from the database.
Ex: Business Objects, Cognos, etc.

Question 33. What Are Active Transformation / Passive Transformations?

Answer :

An active transformation can change the number of rows that pass through it (decrease or increase the row count).
A passive transformation cannot change the number of rows that pass through it.
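The distinction can be sketched with two toy transformations: a filter (active) may change the row count, while an expression (passive) transforms each row one-to-one. The function names are illustrative, not Informatica APIs:

```python
def filter_transform(rows, predicate):
    """Active: may output fewer rows than it receives."""
    return [r for r in rows if predicate(r)]

def expression_transform(rows, fn):
    """Passive: always outputs exactly one row per input row."""
    return [fn(r) for r in rows]

rows = [1, 2, 3, 4]
active_out = filter_transform(rows, lambda r: r % 2 == 0)   # row count changes
passive_out = expression_transform(rows, lambda r: r * 10)  # row count preserved
```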

Question 34. What Are The Different Lookup Methods Used In Informatica?

Answer :

1. Connected lookup
2. Unconnected lookup

A connected lookup receives input from the pipeline, sends output to the pipeline, and can return any number of values. It does not contain a return port.

An unconnected lookup can return only one column. It contains a return port.
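The two lookup styles can be sketched with a dict standing in for the lookup table: an unconnected lookup behaves like a function call returning a single value, while a connected lookup sits in the pipeline and can attach several columns per row. The table contents and field names are illustrative assumptions:

```python
LOOKUP_TABLE = {10: {"name": "Books", "region": "EU"},
                20: {"name": "Games", "region": "US"}}

def unconnected_lookup(key):
    """Called like a function from an expression; returns one value
    (the return port)."""
    return LOOKUP_TABLE.get(key, {}).get("name")

def connected_lookup(rows):
    """Sits in the pipeline; can attach any number of looked-up columns."""
    return [{**r, **LOOKUP_TABLE.get(r["cat_id"], {})} for r in rows]

single = unconnected_lookup(10)
enriched = connected_lookup([{"order": 1, "cat_id": 20}])
```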

Question 35. What Are Parameter Files ? Where Do We Use Them?

Answer :

A parameter file defines the values for parameters and variables used in a workflow, worklet, or session.
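A sketch of what such a file looks like and how a tool might read it. The section header loosely follows the `[folder.WF:workflow.ST:session]` convention; the parameter names and values here are illustrative assumptions:

```python
# A hypothetical parameter file for one session.
PARAM_FILE = """\
[MyFolder.WF:wf_daily_load.ST:s_load_orders]
$$LOAD_DATE=2020-05-29
$DBConnection_Target=DW_PROD
"""

def parse_parameter_file(text):
    """Return {section: {parameter: value}} from parameter-file text."""
    sections, current = {}, None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith("[") and line.endswith("]"):
            current = line[1:-1]          # new section header
            sections[current] = {}
        elif current and "=" in line:
            key, value = line.split("=", 1)
            sections[current][key] = value
    return sections

params = parse_parameter_file(PARAM_FILE)
```

At session start, the values for the matching section would be bound to the mapping's parameters and variables.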

Question 36. What Are The Various Transformation Available?

Answer :

Aggregator Transformation
Expression Transformation
Filter Transformation
Joiner Transformation
Lookup Transformation
Normalizer Transformation
Rank Transformation
Router Transformation
Sequence Generator Transformation
Stored Procedure Transformation
Sorter Transformation
Update Strategy Transformation
XML Source Qualifier Transformation
Advanced External Procedure Transformation
External Transformation
Question 37. How To Determine What Records To Extract?

Answer :

When addressing a table, some dimension key must reflect the need for a record to get extracted. Mostly it will be from the time dimension (e.g. Date >= 1st of current month) or a transaction flag (e.g. Order Invoiced Status). A foolproof method would be adding an archive flag to the record, which gets reset when the record changes.

Question 38. What Are Snapshots? What Are Materialized Views & Where Do We Use Them? What Does A Materialized View Do?

Answer :

A materialized view is a view in which the data is also stored in a temporary table. That is, with the normal View concept in the DB we only store the query, and when we call the view it extracts the data from the DB. But in a materialized view, the data is stored in a temporary table.

Question 39. Give Some Popular Tools?

Answer :

Popular Tools:
IBM WebSphere Information Integration (Ascential DataStage)
Ab Initio
Informatica
Talend

Question 40. Give Some Etl Tool Functionalities?

Answer :

While the choice of a database and a hardware platform is a must, the choice of an ETL tool is highly recommended, but it is not a must. When you evaluate ETL tools, it pays to look for the following characteristics:

Functional capability: This includes both the 'transformation' piece and the 'cleansing' piece. In general, typical ETL tools are either geared toward having strong transformation capabilities or having strong cleansing capabilities, but they are seldom very strong in both. As a result, if you know your data is going to be dirty coming in, make sure your ETL tool has strong cleansing capabilities. If you know there are going to be a lot of different data transformations, it then makes sense to pick a tool that is strong in transformation.

Ability to read directly from your data source: For each organization, there is a different set of data sources. Make sure the ETL tool you pick can connect directly to your source data.

Metadata support: The ETL tool plays a key role in your metadata because it maps the source data to the destination, which is an important piece of the metadata. In fact, some organizations have come to rely on the documentation of their ETL tool as their metadata source. As a result, it is very important to pick an ETL tool that works with your overall metadata strategy.

Question 41. How To Fine Tune The Mappings?

Answer :

1. Use the filter condition in the Source Qualifier instead of using a Filter transformation.
2. Use persistent and shared caches in Lookup transformations.
3. In Aggregator transformations, use sorted input and group-by ports.
4. In expressions, use operators instead of functions.
5. Increase the cache size.
6. Increase the commit interval.

Question 42. Where Do We Use Connected And Un Connected Lookups

Answer :

If only one return port is needed, then we can go for an unconnected lookup. More than one return port is not possible with an unconnected lookup. If multiple return ports are needed, go for a connected lookup.

Question 43. What Are The Various Tools? - Name A Few.

Answer :

- Abinitio
- DataStage
- Informatica
- Cognos Decision Stream
- Oracle Warehouse Builder
- Business Objects XI (Extreme Insight)
- SAP Business Warehouse
- SAS Enterprise ETL Server



