Informatica Interview Questions and Answers
Q1. What is Informatica?
Ans: Informatica is a software development company that offers data integration products. It provides products for ETL, data masking, data quality, data replication, data virtualization, master data management, and so on.
Informatica PowerCenter, its ETL/data integration tool, is the most widely used of these products, and in common usage the name Informatica refers to the Informatica PowerCenter tool for ETL.
Informatica PowerCenter is used for data integration. It offers the capability to connect to and fetch data from different heterogeneous sources and to process that data.
For instance, you can connect to both a SQL Server database and an Oracle database and integrate the data into a third system.
The latest version of Informatica PowerCenter available is 9.6.0. PowerCenter is available in several different editions.
Famous clients using Informatica PowerCenter as a data integration tool include the U.S. Air Force, Allianz, Fannie Mae, ING, and Samsung. The popular tools available in the market in competition with Informatica are IBM DataStage, Oracle OWB, Microsoft SSIS and Ab Initio.
Informatica supports all the steps of the Extract, Transform and Load process, and nowadays it is also used as an integration tool. Informatica is an easy-to-use tool with a simple visual interface, similar to forms in Visual Basic. You simply drag and drop different objects (known as transformations) and design the process flow for data extraction, transformation and load.
These process flow diagrams are called mappings. Once a mapping is made, it can be scheduled to run as and when required. In the background, the Informatica server takes care of fetching data from the source, transforming it, and loading it into the target systems/databases.
Q2. Why do we need Informatica?
Ans: Informatica comes into the picture wherever we have a data system available and we need to perform certain operations on the data at the back end, such as cleansing or enriching the data based on a certain set of rules, or simply loading bulk data from one system to another.
Informatica provides a rich set of features, such as row-level operations on data, integration of data from multiple structured, semi-structured or unstructured systems, and scheduling of data operations. It also maintains metadata, so information about the process and the data operations is preserved as well.
Q3. What are the advantages of Informatica?
Ans: Informatica has some advantages over other data integration systems. A few of the advantages are:
It is faster than the other available systems.
You can easily monitor your jobs with the Informatica Workflow Monitor.
It has made data validation, iteration and project development easier than before.
If you experience failed jobs, it is simple to identify the failure and recover from it. The same applies to jobs that are running slowly.
It is a GUI tool, and coding in a graphical tool is generally quicker than hand-coded scripting.
It can communicate with all major data sources (mainframe, RDBMS, flat files, XML, VSAM, SAP, etc.).
It can handle very large volumes of data effectively.
Mappings, extraction rules, cleansing rules, transformation rules, aggregation logic and loading rules are kept in separate objects in an ETL tool, so a change to any one object has minimal impact on the other objects.
Objects (transformation rules) are reusable.
Informatica has different “adapters” for extracting data from packaged ERP applications (such as SAP or PeopleSoft).
Skilled resources are readily available in the market.
It can run on both Windows and Unix environments.
Q4. In what real-world situations can Informatica be used?
Ans: Informatica has a wide variety of applications, covering areas such as:
Q5. What are some examples of Informatica ETL programs?
Ans: Some basic Informatica programs are:
Mappings: A mapping is designed in the Designer. It defines all the ETL processes. Data is read from the original sources by mappings before the transformation logic is applied to the data that was read. The transformed data is later written to the targets.
Workflows: The runtime ETL processes are defined by a collection of different tasks called a workflow. Workflows are designed in the Workflow Manager.
Tasks: A task is a set of actions, commands, or functions that is executable. How an ETL process behaves during runtime can be defined through a series of different tasks.
Q6. Which development components of Informatica have the highest usage?
Ans: There are many development components in Informatica. However, these are the most widely used of them:
Expression: Used to transform data with row-level functions.
Lookup: Extensively used to join and enrich data.
Sorter and Aggregator: The right tools for sorting data and aggregating it.
Java transformation: The choice of developers when they need to invoke variables, Java methods, third-party APIs and integrated Java packages.
Source Qualifier: Many people use this component to convert source data types to the equivalent Informatica data types.
Transaction Control: If you want to create transactions and have absolute control over rollbacks and commits, expect this component to bail you out.
Q7. What are the uses of ETL tools?
Ans: ETL tools are quite different from other tools. They are used for performing actions such as:
Extracting data from any sources, such as database tables or files. Some of the notable sources from which data is received include SAP solutions, Teradata, and web services.
Transforming the data received from the different sources in an organized way.
Loading the relevant data into a data warehouse or other target system.
Q8. This questions & answers part includes questions and answers about:
Types of data warehouses
Snowflake schema
Sub-questions in Q8:
Q1. Compare Informatica & DataStage
Criteria | Informatica | DataStage
GUI for development & monitoring | PowerCenter Designer, Repository Manager, Workflow Designer, Workflow Manager | DataStage Designer, Job Sequence Designer and Director
Data integration solution | Step-by-step solution | Project-based integration solution
Data transformation | Good | Excellent
Q2. Define Enterprise Data Warehousing.
Ans: When the data of an organization is developed at a single point of access, it is known as enterprise data warehousing.
Q3. Differentiate between a database and a data warehouse.
Ans: A database holds a collection of useful data that is small in size compared to a data warehouse, whereas a data warehouse holds every type of data, whether it is useful or not, and the data is extracted as per the requirements of the customer.
Q4. What do you understand by the term domain?
Ans: A domain is the term for the environment in which all interlinked relationships and nodes are undertaken by a single organizational point.
Q5. Differentiate between a repository server and a powerhouse.
Ans: A repository server mainly ensures repository reliability and uniformity, while a powerhouse server tackles the execution of the many processes between the factors of the server's database repository.
Q6. In the Informatica Workflow Manager, how many repositories can be created?
Ans: It mainly depends upon the number of ports we require, but in general there can be any number of repositories.
Q7. Write the benefits of partitioning a session.
Ans: The main benefit of partitioning a session is to improve the server's throughput and efficiency. Another benefit is that it implements the single sequences within the session.
Q8. How can we create indexes after completing the load process?
Ans: With the help of a command task at the session level, we can create indexes after the load process.
Q9. Define sessions in Informatica ETL.
Ans: A session is a set of instructions that is required to transform data from a source to a target.
Q10. How many sessions can we have in one group?
Ans: We can have any number of sessions, but it is advisable to have a smaller number of sessions in a batch, as that makes migration easier.
Q11. Differentiate between a mapping parameter and a mapping variable.
Ans: Values that change during the session's execution are known as mapping variables, whereas values that do not change during the session's execution are known as mapping parameters.
Q12. What are the features of complex mapping?
Ans: The features of complex mapping are:
Large numbers of transformations
Elaborate requirements
Compound business logic
Q13. How can we identify whether a mapping is correct or not without connecting a session?
Ans: With the help of the debugging option, we can identify whether a mapping is correct or not without connecting a session.
Q14. Can we use a mapping parameter or variable developed in one mapping in another reusable transformation?
Ans: Yes, we can use a mapping parameter or variable in another reusable transformation, because it doesn't have any mapplet.
Q15. What is the use of the aggregator cache file?
Ans: If more memory is needed, the aggregator provides extra cache files for keeping the transformation values. It also keeps the intermediate values that are there in the local buffer memory.
Q16. What is lookup transformation?
Ans: The transformation that has access rights to an RDBMS is known as a lookup transformation.
Q17. What do you understand by the term role-playing dimension?
Ans: The dimensions that are used for playing diverse roles while remaining in the same database domain are known as role-playing dimensions.
Q18. How can we access repository reports without SQL or other transformations?
Ans: We can access repository reports by using the Metadata Reporter. There is no need to use SQL or other transformations, as it is a web app.
Q19. Write the types of metadata that are stored in the repository.
Ans: The types of metadata stored in the repository are: target definitions, source definitions, mapplets, mappings and transformations.
Q20. What is code page compatibility?
Ans: When data is transferred from one code page to another, provided that both code pages have the same character sets, no data loss can occur.
Q21. How can we validate all mappings in the repository simultaneously?
Ans: At a time we can validate only one mapping, so mappings cannot be validated simultaneously.
Q22. Define aggregator transformation.
Ans: It is different from the expression transformation, in which we do calculations row by row; here we can do aggregate calculations on sets of rows, such as averages, sums, etc.
Q23. What is expression transformation?
Ans: It is used for performing non-aggregated calculations. We can test conditional statements before the output results are passed to the target tables.
Q24. Define filter transformation.
Ans: Filter transformation is a way of filtering rows in a mapping. It has all input/output ports, and only the rows that match the filter condition can pass through the filter.
Q25. Define joiner transformation.
Ans: It combines two associated heterogeneous sources located in different places, while a source qualifier transformation can combine data emerging from a common source.
Q26. What do you mean by lookup transformation?
Ans: Lookup transformation is used for looking up data in a relational table through a mapping. We can use multiple lookup transformations in a mapping.
Q27. How can we use union transformation?
Ans: It is a multiple input group transformation that is used to combine data from different sources.
Q28. Define incremental aggregation.
Ans: Incremental aggregation is performed whenever a session is developed for a mapping aggregate; each session run applies only the incremental source changes to the saved aggregate values instead of recomputing them from scratch.
Q29. Differentiate between a connected lookup and an unconnected lookup.
Ans: In a connected lookup, inputs are taken directly from the various transformations in the pipeline, which is why it is called a connected lookup. An unconnected lookup doesn't take inputs directly from the various transformations; instead, it can be used in any transformation and can be invoked as a function using a :LKP expression.
The differences are illustrated in the table below:
Connected Lookup | Unconnected Lookup
Participates in the data flow and receives input directly from the pipeline | Receives input values from the result of a :LKP expression in another transformation
Can use both dynamic and static cache | Cache can NOT be dynamic
Can return multiple column values (output ports) | Can return only one column value, i.e. one output port
Caches all lookup columns | Caches only the lookup output ports used in the lookup conditions and the return port
Supports user-defined default values (i.e. the value to return when the lookup condition is not satisfied) | Does not support user-defined default values
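The function-call nature of an unconnected lookup can be pictured outside Informatica with a small sketch. This is an analogy only, not Informatica code; the cache contents and names are invented for the example:

```python
# An in-memory lookup cache: a keyed copy of the lookup table.
lookup_cache = {101: "Gold", 102: "Silver"}   # illustrative customer tiers

def lkp_tier(cust_id):
    """Unconnected-style lookup: invoked like a function (as a :LKP
    expression would be) and returning a single value, or None when
    the lookup condition is not satisfied."""
    return lookup_cache.get(cust_id)

lkp_tier(101)  # "Gold"
lkp_tier(103)  # None: no match, and no user-defined default
```

A connected lookup, by contrast, would sit in the pipeline itself and could expose several output ports at once.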
Q30. Define mapplet.
Ans: A mapplet is a reusable object that is created using the Mapplet Designer.
Q31. What is reusable transformation?
Ans: This transformation is used multiple times in mappings. Because it is stored as metadata, it is kept separate from the mappings that use the transformation.
Q32. Define update strategy.
Ans: Whenever a row needs to be updated or inserted based on some sequence, the update strategy is used. In this case, the condition should be specified beforehand so that the processed row can be flagged as update or insert.
Q33. Explain the scenario that compels the Informatica server to reject files.
Ans: When the server encounters DD_Reject in an update strategy transformation, it sends the rows to the reject file.
Q34. What is a surrogate key?
Ans: It is a substitute for the natural primary key. It is a unique identifier for each row in the table.
Q35. Write the prerequisite tasks to achieve session partitioning.
Ans: In order to perform session partitioning, one needs to configure the session to partition the source data and then install the Informatica server machine with multiple CPUs.
Q36. Which files are created on the Informatica server during session runs?
Ans: Files such as the errors log, bad file, workflow log and session log are created during session runs.
Q37. Define a session task.
Ans: It is a set of instructions that guides the PowerCenter server about how and when to move data from sources to targets.
Q38. Define command task.
Ans: This task allows one or more shell commands in Unix, or DOS commands in Windows, to run during the workflow.
Q39. Explain standalone command task.
Ans: This task can be used anywhere in the workflow to run shell commands.
Q40. What are pre- and post-session shell commands?
Ans: A command task can be called as the pre- or post-session shell command for a session task. One can run it as a pre-session command, a post-session success command, or a post-session failure command.
Q41. What is a predefined event?
Ans: A predefined event is a file-watch event. It waits for a specific file to arrive at a specific location.
Q42. Define user-defined event.
Ans: A user-defined event is a flow of tasks in the workflow. Events can be created and then raised as the need arises.
Q43. Define workflow.
Ans: The group of instructions that tells the server how to implement tasks is known as a workflow.
Q44. Write the different tools in the Workflow Manager.
Ans: The different tools in the Workflow Manager are:
Q45. Name another tool for scheduling purposes other than the Workflow Manager and pmcmd.
Ans: 'CONTROL-M' is a third-party tool for scheduling purposes apart from the Workflow Manager.
Q46. Define OLAP (On-Line Analytical Processing).
Ans: It is a process by which multi-dimensional analysis takes place.
Q47. Name the different types of OLAP. Write an example.
Ans: The different types of OLAP are ROLAP, HOLAP and DOLAP.
Q48. Define worklet.
Ans: A worklet is the term used when workflow tasks are collected in a group. It includes the timer, decision, command, event wait, etc.
Q49. Write the use of the Target Designer.
Ans: With the help of the Target Designer, we can create target definitions.
Q50. Where can we find the throughput option in Informatica?
Ans: We can find the throughput option in the Workflow Monitor.
Right-click on the session, then click on Get Run Properties; we can find this option under Source/Target Statistics.
Q51. Define target load order.
Ans: It is specified on the basis of the source qualifiers in a mapping. If there are many source qualifiers attached to various targets, we can define the order in which Informatica loads data into the targets.
Q9. What can we do to improve the performance of the Informatica Aggregator transformation?
Ans: Aggregator performance improves dramatically if records are sorted before being passed to the aggregator and the "Sorted Input" option under the aggregator properties is checked. The record set should be sorted on the columns that are used in the Group By operation. It is often a good idea to sort the record set at the database level, e.g. inside a source qualifier transformation, unless there is a chance that the already sorted records from the source qualifier can become unsorted again before reaching the aggregator.
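The effect of sorted input can be sketched outside Informatica: when rows arrive pre-sorted on the group-by columns, each group can be aggregated and emitted as soon as its last row passes, so only one group is held in memory at a time. A minimal Python sketch with invented field names:

```python
from itertools import groupby
from operator import itemgetter

def aggregate_sorted(rows, group_key, value_key):
    """Sum value_key per group, assuming rows are pre-sorted on group_key.

    Like an Aggregator with Sorted Input checked: each group is emitted
    as soon as its last row is seen, so only one group is cached at a time.
    groupby gives wrong results on unsorted input, just as the session
    fails when Sorted Input is checked but the data is not sorted.
    """
    return [
        (k, sum(r[value_key] for r in grp))
        for k, grp in groupby(rows, key=itemgetter(group_key))
    ]

rows = [
    {"dept": "A", "sales": 10},
    {"dept": "A", "sales": 20},
    {"dept": "B", "sales": 5},
]
aggregate_sorted(rows, "dept", "sales")  # [("A", 30), ("B", 5)]
```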
Q10. What are the different lookup caches?
Ans: Informatica lookups can be cached or un-cached (no cache), and a cached lookup can be either static or dynamic. A static cache is one that is not modified once it is built; it remains the same during the session run. A dynamic cache, on the other hand, is refreshed during the session run by inserting or updating records in the cache based on the incoming source data.
By default, the Informatica cache is a static cache. A lookup cache can also be classified as persistent or non-persistent, based on whether Informatica retains the cache even after the completion of the session run or deletes it.
Q11. How can we update a record in a target table without using Update Strategy?
Ans: A target table can be updated without using 'Update Strategy'. For this, we need to define the key of the target table at the Informatica level, and then we need to connect the key and the field we want to update in the mapping target. At the session level, we should set the target property to "Update as Update" and check the "Update" check-box. Let's assume we have a target table "Customer" with the fields "Customer ID", "Customer Name" and "Customer Address".
Suppose we want to update "Customer Address" without an Update Strategy. Then we have to define "Customer ID" as the primary key at the Informatica level, and we will have to connect the Customer ID and Customer Address fields in the mapping. If the session properties are set correctly as described above, then the mapping will only update the customer address field for all matching customer IDs.
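The behaviour described above, an update keyed on Customer ID with non-matching rows left untouched, can be sketched in plain Python. The helper and its field names are invented for illustration; this is an analogy for "Update as Update", not Informatica code:

```python
def update_target(target, updates, key="customer_id", field="address"):
    """Update one field for rows whose key matches an incoming record,
    mimicking 'Update as Update' with a key defined at Informatica level.
    Incoming records with no matching key are ignored, not inserted."""
    index = {row[key]: row for row in target}
    for upd in updates:
        if upd[key] in index:              # only matching keys are updated
            index[upd[key]][field] = upd[field]
    return target

target = [{"customer_id": 1, "address": "Old St"},
          {"customer_id": 2, "address": "Main St"}]
updates = [{"customer_id": 1, "address": "New St"},
           {"customer_id": 3, "address": "Elm St"}]  # no match: ignored
update_target(target, updates)
# customer 1 now has "New St"; customer 3 is not inserted
```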
Q12. What are the new features of Informatica 9.x Developer?
Ans: From an Informatica developer's perspective, some of the new features in Informatica 9.x are as follows:
A Lookup can now be configured as an active transformation: it can return multiple rows on a successful match.
You can now write a SQL override on an un-cached lookup as well. Previously you could do it only on a cached lookup.
You can control the size of your session log. In a real-time environment, you can limit the session log by file size or by time.
Database deadlock resilience: this feature ensures that your session does not immediately fail if it encounters a database deadlock; it will now retry the operation. You can configure the number of retry attempts.
Q13. What are the advantages of using Informatica as an ETL tool over Teradata?
Ans: First up, Informatica is a data integration tool, while Teradata is an MPP database with some scripting (BTEQ) and fast data movement (mLoad, FastLoad, Parallel Transporter, etc.) capabilities.
Informatica over Teradata:
Metadata repository for the organization’s ETL ecosystem.
Informatica jobs (sessions) can be arranged logically into worklets and workflows in folders.
Leads to an ecosystem which is easier to maintain and quicker for architects and analysts to analyze and enhance.
Job monitoring and recovery:
Easy to monitor jobs using Informatica Workflow Monitor.
Easier to identify and recover in case of failed jobs or slow running jobs.
Ability to restart from failure row /
Informatica MarketPlace: a one-stop shop for lots of tools and accelerators to make the SDLC faster and improve application support.
Plenty of developers in the market with varying skill levels and expertise
Lots of connectors to various databases, including support for Teradata mLoad, tPump, FastLoad and Parallel Transporter, in addition to the regular (and slow) ODBC drivers. Some 'exotic' connectors may need to be procured and hence could cost extra. Examples: PowerExchange for Facebook, Twitter, etc., which source data from such social media sources.
Surrogate key generation through shared sequence generators inside Informatica could be faster than generating them inside the database.
If the company decides to move away from Teradata to another solution, then vendors like Infosys can execute migration projects to move the data, and change the ETL code to work with the new database quickly, accurately and efficiently using automated solutions.
Pushdown optimization can be used to process the data in the database.
Ability to code ETL such that processing load is balanced between ETL server and the database box – useful if the database box is ageing and/or in case the ETL server has a fast disk/ large enough memory & CPU to outperform the database in certain tasks.
Ability to publish processes as web services.
Teradata over Informatica:
Cheaper (initially) – No initial ETL tool license costs (which can be significant), and lower OPEX costs as one doesn’t need to pay for yearly support from Informatica Corp.
Great choice if all the data to be loaded are available as structured files – which can then be processed inside the database after an initial stage load.
Good choice for a lower complexity ecosystem
Only Teradata developers or resources with good ANSI/Teradata SQL / BTEQ knowledge required to build and enhance the system.
Q14. What is the Informatica ETL tool?
Ans: The Informatica ETL tool is the market leader in data integration and data quality services. Informatica is a successful ETL and EAI tool with significant industry coverage. ETL refers to extract, transform, load. Data integration tools are different from other software platforms and languages.
They have no inbuilt feature for building a user interface where the end user can see the transformed data. The Informatica ETL tool PowerCenter has the capability to manage, integrate and migrate enterprise data.
Q15. What is Informatica PowerCenter?
Ans: Informatica PowerCenter is one of the enterprise data integration products developed by Informatica Corporation. Informatica PowerCenter is an ETL tool used for extracting data from the source, transforming it, and loading the data into the target. The extraction part involves understanding, analyzing and cleaning the source data.
Transformation part involves cleaning of the data more precisely and modifying the data as per the business requirements.
The loading part involves assigning the dimensional keys and loading into the warehouse.
Q16. What is an Expression Transformation in Informatica?
Ans: An expression transformation in Informatica is a common Powercenter mapping transformation. It is used to transform data passed through it one record at a time. The expression transformation is passive and connected. Within an expression, data can be manipulated, variables created, and output ports generated. We can write conditional statements within output ports or variables to help transform data according to our business requirements.
Q17. What is the need of an ETL tool?
Ans: The problem comes with traditional programming languages where you need to connect to multiple sources and you have to handle errors. For this you have to write complex code. ETL tools provide a ready-made solution for this. You don’t need to worry about handling these things and can concentrate only on coding the requirement part.
Q18. What is meant by active and passive transformation?
Ans: An active transformation is one that performs any of the following actions:
Changes the number of rows between the transformation input and output. Example: Filter transformation.
Changes the transaction boundary by defining commit or rollback points. Example: Transaction Control transformation.
Changes the row type. Example: Update Strategy is active because it flags rows for insert, delete, update or reject.
On the other hand, a passive transformation is one that does not change the number of rows that pass through it. Example: Expression transformation.
Q19. What is the difference between Router and Filter?
Ans: The following differences can be noted:
A Router transformation divides the incoming records into multiple groups based on conditions; such groups can be mutually inclusive (different groups may contain the same record). A Filter transformation restricts or blocks the incoming record set based on one given condition.
A Router transformation itself does not block any record: if a record does not match any of the routing conditions, it is routed to the default group. A Filter transformation does not have a default group: if a record does not match the filter condition, the record is blocked.
A Router acts like a CASE..WHEN statement in SQL (or a switch..case statement in C), while a Filter acts like a WHERE condition in SQL.
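The Router-versus-Filter semantics can be mimicked in a few lines of Python. This is only an analogy, not Informatica code; the group names and conditions are invented:

```python
def filter_rows(rows, cond):
    """Filter transformation: rows failing the single condition are blocked."""
    return [r for r in rows if cond(r)]

def route_rows(rows, groups):
    """Router transformation: each row is tested against every group
    condition (groups may overlap); rows matching none go to DEFAULT."""
    out = {name: [] for name in groups}
    out["DEFAULT"] = []
    for r in rows:
        matched = False
        for name, cond in groups.items():
            if cond(r):
                out[name].append(r)
                matched = True
        if not matched:
            out["DEFAULT"].append(r)
    return out

rows = [1, 2, 3, 4]
routed = route_rows(rows, {"EVEN": lambda r: r % 2 == 0,
                           "SMALL": lambda r: r < 3})
# 2 lands in both EVEN and SMALL (groups are mutually inclusive),
# 3 matches nothing and goes to DEFAULT; a Filter would simply drop it
```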
Q20. Under what condition may selecting Sorted Input in an aggregator fail the session?
Ans: If the input data is not sorted correctly, the session will fail.
Also, even if the input data is properly sorted, the session may fail if the sort-order ports and the group-by ports of the aggregator are not in the same order.
Q21. Why is Sorter an Active Transformation?
Ans: This is because we can select the "distinct" option in the sorter property.
When the Sorter transformation is configured to treat output rows as distinct, it assigns all ports as part of the sort key. The Integration Service discards duplicate rows compared during the sort operation. The number of Input Rows will vary as compared with the Output rows and hence it is an Active transformation.
Q22. What is the difference between Static and Dynamic Lookup Cache?
Ans: We can configure a Lookup transformation to cache the underlying lookup table. In case of static or read-only lookup cache the Integration Service caches the lookup table at the beginning of the session and does not update the lookup cache while it processes the Lookup transformation.
In case of dynamic lookup cache the Integration Service dynamically inserts or updates data in the lookup cache and passes the data to the target. The dynamic cache is synchronized with the target.
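The difference can be sketched as a small cache class. This is purely illustrative, not the Integration Service's implementation:

```python
class DynamicLookupCache:
    """Sketch of a dynamic lookup cache: rows not found are inserted
    into the cache as data flows through, keeping it in sync with the
    target. A static cache would skip the insert and never change."""
    def __init__(self, initial):
        self.cache = dict(initial)      # built once at session start
    def lookup(self, key, row):
        if key not in self.cache:
            self.cache[key] = row       # dynamic behaviour: insert
            return None                 # miss: row can be routed as insert
        return self.cache[key]          # hit: row can be routed as update

cache = DynamicLookupCache({1: "Alice"})
cache.lookup(2, "Bob")     # miss: "Bob" is inserted into the cache
cache.lookup(2, "Bobby")   # now a hit: returns the cached "Bob"
```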
Q23. What is the difference between STOP and ABORT options in Workflow Monitor?
Ans: When we issue the STOP command on the executing session task, the Integration Service stops reading data from source. It continues processing, writing and committing the data to targets. If the Integration Service cannot finish processing and committing data, we can issue the abort command.
In contrast ABORT command has a timeout period of 60 seconds. If the Integration Service cannot finish processing and committing data within the timeout period, it kills the DTM process and terminates the session.
Q24. How to delete duplicate rows using Informatica
Scenario 1: Duplicate rows are present in relational database
Suppose we have Duplicate records in Source System and we want to load only the unique records in the Target System eliminating the duplicate rows. What will be the approach?
Assuming that the source system is a Relational Database, to eliminate duplicate records, we can check the Distinct option of the Source Qualifier of the source table and load the target accordingly.
But what if the source is a flat file? Then how can we remove the duplicates from flat file source?
Scenario 2: Deleting duplicate rows / selecting distinct rows for FLAT FILE sources
Here, since the source system is a flat file, you will not be able to select the Distinct option in the source qualifier, as it is disabled for flat file source tables. Hence the next approach is to use a Sorter transformation and check its Distinct option. When we select the Distinct option, all the columns will be selected as keys, in ascending order by default.
Deleting Duplicate Record Using Informatica Aggregator
Another way to handle duplicate records in a source batch run is to use an Aggregator transformation, with the Group By checkbox checked on the ports carrying the duplicate data. Here you have the flexibility to select either the last or the first of the records with duplicate column values.
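The first-or-last choice the Aggregator gives can be sketched in Python (the helper name and records are invented for the example):

```python
def dedupe(rows, key, keep="first"):
    """Remove duplicates on `key`, keeping the first or last occurrence,
    like an Aggregator grouping on the duplicate-carrying port.
    Relies on dict preserving insertion order (Python 3.7+)."""
    seen = {}
    for row in rows:
        k = row[key]
        if keep == "last" or k not in seen:
            seen[k] = row
    return list(seen.values())

rows = [{"id": 1, "v": "a"}, {"id": 2, "v": "b"}, {"id": 1, "v": "c"}]
dedupe(rows, "id")                # keeps {"id": 1, "v": "a"}
dedupe(rows, "id", keep="last")   # keeps {"id": 1, "v": "c"}
```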
Q25. Loading Multiple Target Tables Based on Conditions
Suppose we have some serial numbers in a flat file source. We want to load the serial numbers in two target files one containing the EVEN serial numbers and the other file having the ODD ones.
After the Source Qualifier place a Router Transformation. Create two Groups namely EVEN and ODD, with filter conditions as:
MOD(SERIAL_NO,2)=0 and MOD(SERIAL_NO,2)=1
respectively. Then output the two groups into two flat file targets.
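The two Router groups behave like a simple even/odd split, which can be sketched as follows (the function is illustrative, not generated Informatica code):

```python
def split_even_odd(serials):
    """Route serial numbers into EVEN and ODD groups, mirroring the two
    Router groups with MOD(SERIAL_NO,2)=0 and MOD(SERIAL_NO,2)=1."""
    even = [s for s in serials if s % 2 == 0]
    odd = [s for s in serials if s % 2 == 1]
    return even, odd

even, odd = split_even_odd([10, 7, 4, 3])
# even -> [10, 4], odd -> [7, 3]
```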
Q26. Normalizer Related Questions
Suppose in our Source Table we have data as given below:
Student Name Maths Life Science Physical Science
Sam 100 70 80
John 75 100 85
Tom 80 100 85
We want to load our Target Table as:
Student Name Subject Name Marks
Sam Maths 100
Sam Life Science 70
Sam Physical Science 80
John Maths 75
John Life Science 100
John Physical Science 85
Tom Maths 80
Tom Life Science 100
Tom Physical Science 85
Describe your approach.
Here, to convert the columns to rows, we have to use the Normalizer transformation, followed by an Expression transformation to decode the column taken into consideration.
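What the Normalizer does to each wide source row can be sketched in Python: one input row fans out into one output row per subject column. The helper and output field names are invented for the example:

```python
def normalize(row, id_field, value_fields):
    """Turn one wide row into several narrow rows, as the Normalizer
    does: one output row per occurrence of the repeating column."""
    return [
        {"student": row[id_field], "subject": f, "marks": row[f]}
        for f in value_fields
    ]

row = {"name": "Sam", "Maths": 100, "Life Science": 70, "Physical Science": 80}
normalize(row, "name", ["Maths", "Life Science", "Physical Science"])
# three output rows, one per subject
```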
Name the transformations that convert one row to many rows, i.e. increase the input-to-output row count. Also, what is the name of its reverse transformation?
Normalizer as well as Router transformations are active transformations that can increase the number of output rows relative to input rows. The reverse, collapsing many rows into one, is done by the Aggregator transformation.
Suppose we have a source table and we want to load three target tables based on source rows such that first row moves to first target table, second row in second target table, third row in third target table, fourth row again in first target table so on and so forth. Describe your approach.
We can clearly understand that we need a Router transformation to route or filter the source data to the three target tables. Now the question is what the filter conditions will be. First of all, we need an Expression transformation in which we have all the source table columns, along with another input/output port, say SEQ_NUM, which gets a sequence number for each source row from the NextVal port of a Sequence Generator with start value 0 and increment 1. The filter conditions for the three router groups will then be:
MOD(SEQ_NUM,3)=1 connected to 1st target table
MOD(SEQ_NUM,3)=2 connected to 2nd target table
MOD(SEQ_NUM,3)=0 connected to 3rd target table
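The Sequence Generator plus Router combination can be sketched in plain Python as follows (names are illustrative, not Informatica syntax):

```python
def route_round_robin(rows, n_targets=3):
    """Attach a sequence number to each row (the Expression +
    Sequence Generator step), then route on MOD(seq_num, 3)
    just like the three Router group conditions."""
    targets = [[] for _ in range(n_targets)]
    for seq_num, row in enumerate(rows, start=1):
        targets[(seq_num - 1) % n_targets].append(row)  # seq 1 -> target 1, seq 2 -> target 2, ...
    return targets

t1, t2, t3 = route_round_robin(["row1", "row2", "row3", "row4", "row5"])
```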
Q27. Loading Multiple Flat Files using one mapping
Suppose we have ten source flat files of the same structure. How can we load all the files into the target database in a single batch run using a single mapping?
After we create a mapping to load data into the target database from a flat file source, we move on to the session properties for the Source Qualifier. To load a set of source files we create a list file, say final.txt, containing the names of the source flat files (ten in our case), and set the Source filetype option to Indirect. Then we point to final.txt with a fully qualified path through the Source file directory and Source filename properties.
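The Indirect source filetype behaves roughly like the following Python sketch: the Integration Service reads the list file first, then processes each flat file it names in turn (file names here are made up for the example):

```python
import os
import tempfile

def read_indirect(list_file):
    """Read the indirect list file, then read every source file it
    names and concatenate their rows, as the Integration Service
    does when Source filetype = Indirect."""
    rows = []
    with open(list_file) as lf:
        for name in lf:
            name = name.strip()
            if name:
                with open(name) as f:
                    rows.extend(line.rstrip("\n") for line in f)
    return rows

# Build two sample source files plus the indirect list file final.txt.
d = tempfile.mkdtemp()
for i, data in enumerate(["a1\na2\n", "b1\n"], start=1):
    with open(os.path.join(d, f"src{i}.txt"), "w") as f:
        f.write(data)
with open(os.path.join(d, "final.txt"), "w") as f:
    f.write(os.path.join(d, "src1.txt") + "\n" + os.path.join(d, "src2.txt") + "\n")

rows = read_indirect(os.path.join(d, "final.txt"))
```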
Q28. Aggregator Transformation Related Questions
Q1. How can we implement Aggregation operation without using an Aggregator Transformation in Informatica?
We will use the very basic property of the Expression transformation: through variable ports, an expression can access values carried over from the previously processed row as well as the currently processed row. A simple combination of Sorter, Expression and Filter transformations is then enough to achieve aggregation at the Informatica level.
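A rough plain-Python equivalent of the Sorter + Expression + Filter pattern (a sketch of the logic, not Informatica syntax; the port and column names are made up):

```python
def aggregate_sum(rows, key, value):
    """Sort on the group key (Sorter), carry a running total the way
    variable ports carry previous-row values (Expression), and pass
    only the final row of each group (Filter)."""
    out = []
    rows = sorted(rows, key=lambda r: r[key])
    prev_key, running = None, 0
    for r in rows:
        if r[key] == prev_key:
            running += r[value]          # same group: accumulate
        else:
            if prev_key is not None:
                out.append({key: prev_key, value: running})  # group changed: emit total
            prev_key, running = r[key], r[value]
    if prev_key is not None:
        out.append({key: prev_key, value: running})          # emit the final group
    return out

totals = aggregate_sum(
    [{"DEPT": "A", "SAL": 10}, {"DEPT": "B", "SAL": 5}, {"DEPT": "A", "SAL": 20}],
    "DEPT", "SAL",
)
```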
Suppose in our Source Table we have data as given below:
Student Name | Subject Name | Marks
Sam | Maths | 100
Tom | Maths | 80
Sam | Physical Science | 80
John | Maths | 75
Sam | Life Science | 70
John | Life Science | 100
John | Physical Science | 85
Tom | Life Science | 100
Tom | Physical Science | 85
We want to load our Target Table as:
Student Name | Maths | Life Science | Physical Science
Sam | 100 | 70 | 80
John | 75 | 100 | 85
Tom | 80 | 100 | 85
Describe your approach.
Here our scenario is to convert many rows to one row per student, and the transformation that helps us achieve this is the Aggregator.
We will sort the source data based on STUDENT_NAME ascending followed by SUBJECT ascending.
Now, with STUDENT_NAME as the GROUP BY port, the output subject columns are populated as:
MATHS: MAX(MARKS, SUBJECT = 'Maths')
LIFE_SC: MAX(MARKS, SUBJECT = 'Life Science')
PHY_SC: MAX(MARKS, SUBJECT = 'Physical Science')
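The conditional-aggregate pivot can be sketched in plain Python as follows (illustrative only; in the mapping the conditions live inside the Aggregator's output port expressions):

```python
def pivot_marks(rows, subjects):
    """Group by STUDENT_NAME and, for each subject column, keep
    MAX(MARKS) over the rows whose SUBJECT matches -- the effect of
    MAX(MARKS, SUBJECT = '<subject>') in the Aggregator."""
    out = {}
    for r in rows:
        rec = out.setdefault(r["STUDENT_NAME"], {s: None for s in subjects})
        s, m = r["SUBJECT"], r["MARKS"]
        if rec[s] is None or m > rec[s]:
            rec[s] = m
    return out

rows = [
    {"STUDENT_NAME": "Sam", "SUBJECT": "Maths", "MARKS": 100},
    {"STUDENT_NAME": "Sam", "SUBJECT": "Life Science", "MARKS": 70},
]
pivoted = pivot_marks(rows, ["Maths", "Life Science"])
```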
Q29. Revisiting Source Qualifier Transformation
Q2. What is a Source Qualifier? What are the tasks we can perform using an SQ, and why is it an ACTIVE transformation?
A Source Qualifier is an Active and Connected Informatica transformation that reads the rows from a relational database or flat file source.
We can configure the SQ to join [Both INNER as well as OUTER JOIN] data originating from the same source database.
We can use a source filter to reduce the number of rows the Integration Service queries.
We can specify a number for sorted ports and the Integration Service adds an ORDER BY clause to the default SQL query.
We can choose the Select Distinct option for relational databases, and the Integration Service adds a SELECT DISTINCT clause to the default SQL query.
Also, we can write a Custom/User-Defined SQL query which will override the default query in the SQ by changing the default settings of the transformation properties.
Also we have the option to write Pre as well as Post SQL statements to be executed before and after the SQ query in the source database.
Since the transformation provides us with the Select Distinct property, the Integration Service can add a SELECT DISTINCT clause to the default SQL query, which in turn affects the number of rows returned by the database to the Integration Service; hence it is an Active transformation.
Q3. What happens to a mapping if we alter the datatypes between Source and its corresponding Source Qualifier?
Ans: The Source Qualifier transformation displays the transformation datatypes. The transformation datatypes determine how the source database binds data when the Integration Service reads it. If we alter the datatypes in the Source Qualifier transformation, or the datatypes in the source definition and Source Qualifier transformation do not match, the Designer marks the mapping as invalid when we save it.
Q4. Suppose we have used the Select Distinct and the Number Of Sorted Ports property in the SQ and then we add Custom SQL Query. Explain what will happen.
Ans: Whenever we add a Custom SQL or SQL override query, it overrides the User-Defined Join, Source Filter, Number of Sorted Ports, and Select Distinct settings in the Source Qualifier transformation. Hence only the user-defined SQL query will be fired against the database, and all the other options will be ignored.
Q5. Describe the situations where we will use the Source Filter, Select Distinct and Number Of Sorted Ports properties of Source Qualifier transformation.
Ans: Source Filter option is used basically to reduce the number of rows the Integration Service queries so as to improve performance.
Select Distinct option is used when we want the Integration Service to select unique values from a source, filtering out unnecessary data earlier in the data flow, which might improve performance.
Number Of Sorted Ports option is used when we want the source data to be in a sorted fashion so as to use the same in some following transformations like Aggregator or Joiner, those when configured for sorted input will improve the performance.
Q6. What will happen if the SELECT list COLUMNS in the Custom override SQL Query and the OUTPUT PORTS order in SQ transformation do not match?
Ans: A mismatch, or changing the order of the list of selected columns relative to the connected transformation output ports, may result in session failure.
Q7. What happens if in the Source Filter property of the SQ transformation we include the keyword WHERE, say, WHERE CUSTOMERS.CUSTOMER_ID > 1000?
Ans: We use the source filter to reduce the number of source records. If we include the string WHERE in the source filter, the Integration Service fails the session.
Q8. Describe the scenarios in which we go for a Joiner transformation instead of a Source Qualifier transformation.
Ans: To join source data from heterogeneous sources, or to join flat files, we use the Joiner transformation. Use the Joiner transformation when we need to join the following types of sources:
Join data from different relational databases.
Join data from different flat files.
Join relational sources and flat files.
Q9. What is the maximum number we can use in Number Of Sorted Ports for a Sybase source system?
Ans: Sybase supports a maximum of 16 columns in an ORDER BY clause. So if the source is Sybase, do not sort more than 16 columns.
Q10. Suppose we have two Source Qualifier transformations SQ1 and SQ2 connected to target tables TGT1 and TGT2 respectively. How do you ensure TGT2 is loaded after TGT1?
Ans: If we have multiple Source Qualifier transformations connected to multiple targets, we can designate the order in which the Integration Service loads data into the targets.
In the Mapping Designer, we need to configure the Target Load Plan based on the Source Qualifier transformations in the mapping to specify the required loading order.
Q11. Suppose we have one Source Qualifier transformation that populates two target tables. How do you ensure TGT2 is loaded after TGT1?
Ans: In the Workflow Manager, we can configure Constraint based load ordering for a session. The Integration Service then orders the target load on a row-by-row basis: for every row generated by an active source, the Integration Service loads the corresponding transformed row first into the primary key table, then into the foreign key table.
Hence if we have one Source Qualifier transformation that provides data for multiple target tables having primary and foreign key relationships, we go for Constraint based load ordering.
Q30. Revisiting Filter Transformation
Q1. What is a Filter Transformation and why is it an Active one?
Ans: A Filter transformation is an Active and Connected transformation that can filter rows in a mapping.
Only the rows that meet the Filter Condition pass through the Filter transformation to the next transformation in the pipeline. TRUE and FALSE are the implicit return values from any filter condition we set. If the filter condition evaluates to NULL, the row is treated as FALSE.
The numeric equivalent of FALSE is 0 (zero) and any non-zero value is the equivalent of TRUE.
As an ACTIVE transformation, the Filter transformation may change the number of rows passed through it. A filter condition returns TRUE or FALSE for each row that passes through the transformation, depending on whether the row meets the specified condition. Only rows that return TRUE pass through this transformation. Discarded rows do not appear in the session log or reject files.
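The TRUE/FALSE/NULL semantics above can be sketched in plain Python (with None standing in for NULL; names are illustrative):

```python
def filter_rows(rows, condition):
    """Filter transformation semantics: a row passes only when the
    condition evaluates to a non-zero value; NULL (None here) counts
    as FALSE and the row is silently discarded."""
    passed = []
    for r in rows:
        result = condition(r)
        if result is not None and result != 0:  # NULL -> FALSE, 0 -> FALSE
            passed.append(r)
    return passed

rows = [{"SAL": 100}, {"SAL": 0}, {"SAL": None}]
kept = filter_rows(rows, lambda r: r["SAL"])  # only non-zero, non-NULL SAL passes
```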
Q2. What is the difference between the Source Qualifier transformation's Source Filter and the Filter transformation?
Ans:
The Source Qualifier transformation filters rows as they are read from the source, whereas the Filter transformation filters rows from within the mapping.
The Source Qualifier transformation can only filter rows from relational sources, whereas the Filter transformation filters rows coming from any type of source system at the mapping level.
The Source Qualifier limits the row set extracted from the source, whereas the Filter transformation limits the row set sent to a target.
The Source Qualifier reduces the number of rows used throughout the mapping and hence provides better performance. To maximize session performance with the Filter transformation, place it as close to the sources in the mapping as possible, to filter out unwanted data early in the flow of data from sources to targets.
The filter condition in the Source Qualifier transformation can only use standard SQL, as it runs in the database, whereas the Filter transformation can define a condition using any statement or transformation function that returns either a TRUE or FALSE value.
Q31. Revisiting Joiner Transformation
Q1. What is a Joiner Transformation and why is it an Active one?
Ans: A Joiner is an Active and Connected transformation used to join source data from the same source system, or from two related heterogeneous sources residing in different locations or file systems.
The Joiner transformation joins two sources with at least one matching column. The Joiner transformation uses a condition that matches one or more pairs of columns between the two sources.
The two input pipelines consist of a master pipeline and a detail pipeline, or a master and a detail branch. The master pipeline ends at the Joiner transformation, while the detail pipeline continues to the target.
In the Joiner transformation, we must configure the transformation properties, namely the Join Condition, the Join Type and the Sorted Input option, to improve Integration Service performance.
The join condition contains ports from both input sources that must match for the Integration Service to join two rows. Depending on the type of join selected, the Integration Service either adds the row to the result set or discards the row.
The Joiner transformation produces result sets based on the join type, condition, and input data sources. Hence it is an Active transformation.
Q2. State the limitations where we cannot use a Joiner in the mapping pipeline.
Ans: The Joiner transformation accepts input from most transformations. However, the following are the limitations:
A Joiner transformation cannot be used when either of the input pipelines contains an Update Strategy transformation.
A Joiner transformation cannot be used if we connect a Sequence Generator transformation directly before the Joiner transformation.
Q3. Out of the two input pipelines of a Joiner, which one will you set as the master pipeline?
Ans: During a session run, the Integration Service compares each row of the master source against the detail source. The master and detail sources need to be configured for optimal performance.
To improve performance for an unsorted Joiner transformation, use the source with fewer rows as the master source. The fewer unique rows in the master, the fewer iterations of the join comparison occur, which speeds the join process.
When the Integration Service processes an unsorted Joiner transformation, it reads all master rows before it reads the detail rows. The Integration Service blocks the detail source while it caches rows from the master source. Once the Integration Service reads and caches all master rows, it unblocks the detail source and reads the detail rows.
To improve performance for a sorted Joiner transformation, use the source with fewer duplicate key values as the master source.
When the Integration Service processes a sorted Joiner transformation, it blocks data based on the mapping configuration and it stores fewer rows in the cache, increasing performance.
Blocking logic is possible if the master and detail input to the Joiner transformation originate from different sources. Otherwise, it does not use blocking logic; instead, it stores more rows in the cache.
Q4. What are the different types of joins available in the Joiner Transformation?
Ans: In SQL, a join is a relational operator that combines data from multiple tables into a single result set. The Joiner transformation is similar to an SQL join except that data can originate from different types of sources.
The Joiner transformation supports the following types of joins: Normal, Master Outer, Detail Outer and Full Outer.
Note: A normal or master outer join performs faster than a full outer or detail outer join.
Q5. Define the various Join Types of the Joiner Transformation.
Ans: In a normal join, the Integration Service discards all rows of data from the master and detail source that do not match, based on the join condition.
A master outer join keeps all rows of data from the detail source and the matching rows from the master source. It discards the unmatched rows from the master source.
A detail outer join keeps all rows of data from the master source and the matching rows from the detail source. It discards the unmatched rows from the detail source.
A full outer join keeps all rows of data from both the master and detail sources.
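The four join types can be sketched in plain Python as follows (a simplified illustration assuming a single join column and unique master keys, not how the Joiner caches are actually implemented):

```python
def joiner(master, detail, key, join_type="normal"):
    """Cache the master rows first, then compare each detail row
    against the cache, keeping or discarding unmatched rows
    according to the join type."""
    cache = {m[key]: m for m in master}
    out, matched = [], set()
    for d in detail:
        m = cache.get(d[key])
        if m is not None:
            out.append({**m, **d})            # matched master/detail pair
            matched.add(d[key])
        elif join_type in ("master outer", "full outer"):
            out.append(d)                      # keep unmatched detail rows
    if join_type in ("detail outer", "full outer"):
        out.extend(m for k, m in cache.items() if k not in matched)  # keep unmatched master rows
    return out

master = [{"ID": 1, "NAME": "m1"}, {"ID": 2, "NAME": "m2"}]
detail = [{"ID": 1, "CITY": "d1"}, {"ID": 3, "CITY": "d3"}]
```

With this data, a normal join yields one matched row, the outer joins add back the unmatched detail or master rows, and a full outer join keeps everything.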
Q6. Describe the impact of the number of join conditions and the join order in a Joiner Transformation.
Ans: We can define one or more conditions based on equality between the specified master and detail sources. Both ports in a condition must have the same datatype.
If we need to use ports in the join condition with non-matching datatypes, we must convert the datatypes so that they match. The Designer validates the datatypes in a join condition.
Additional ports in the join condition increase the time necessary to join the two sources.