
Informatica Q&A




How can you recognise whether or not newly added rows in the source get inserted into the target?

In a Type 2 mapping we have three options to recognise newly added rows: version number, flag value, and effective date range.

You can also check the session log: take about 10 new records, run the workflow, then look into the session log, where you can find the affected, applied, and rejected row counts.

Alternatively, add a timestamp column to the target table. If the target table timestamp is updated, it means the record was updated or inserted by the workflow.

What is the difference between Informatica 7.0 and 8.0?

The major difference is that Informatica 8.x is built on a service-oriented architecture (SOA), whereas Informatica 7.x uses a client-server architecture.

In addition, 8.x released more advanced transformations, such as the Java and SQL transformations.

Differences between Normalization and the Normalizer transformation.

Normalizer: a transformation used mainly for COBOL sources; it changes rows into columns and columns into rows. Normalization: a database design technique for removing redundancy and inconsistency.

Performance tuning in Informatica?

The goal of performance tuning is to optimize session performance so that sessions run within the available load window for the Informatica server. You can increase session performance in the following ways.

Network: the performance of the Informatica server is related to network connections. Data generally moves across a network at less than 1 MB per second, whereas a local disk moves data five to twenty times faster. Network connections therefore often affect session performance, so avoid unnecessary network hops.

Flat files: if your flat files are stored on a machine other than the Informatica server, move those files to the machine on which the Informatica server runs.

Relational data sources: minimize the connections to sources, targets, and the Informatica server to improve session performance. Moving the target database onto the server system may improve session performance.

Staging areas: if you use staging areas, you force the Informatica server to perform multiple data passes. Removing staging areas may improve session performance.

You can run multiple Informatica servers against the same repository. Distributing the session load across multiple Informatica servers may improve session performance.


Running the Informatica server in ASCII data movement mode improves session performance, because ASCII data movement mode stores a character value in one byte, while Unicode mode takes two bytes to store a character.

If a session joins multiple source tables in one Source Qualifier, optimizing the query may improve performance. Also, single-table SELECT statements with an ORDER BY or GROUP BY clause may benefit from optimization such as adding indexes.

We can improve session performance by configuring the network packet size, which controls how much data crosses the network at one time. To do this, go to the Server Manager, choose the server, and configure the database connections.

If your target contains key constraints and indexes, they slow the loading of data. To improve session performance in this case, drop the constraints and indexes before you run the session and rebuild them after the session completes.

Running parallel sessions by using concurrent batches also reduces the time needed to load the data, so concurrent batches may increase session performance.

Partitioning the session improves session performance by creating multiple connections to sources and targets and loading data in parallel pipelines.

In some cases, if a session contains an Aggregator transformation, you can use incremental aggregation to improve session performance.

Avoid transformation errors to improve session performance.

If the session contains a Lookup transformation, you can improve session performance by enabling the lookup cache.

If your session contains a Filter transformation, create that Filter transformation nearer to the sources, or use a filter condition in the Source Qualifier.

Aggregator, Rank, and Joiner transformations often decrease session performance, because they must group data before processing it. To improve session performance in this case, use the sorted ports/sorted input option.

Increasing the temporary database space also improves performance.

How do you handle decimal places while importing a flat file into Informatica?

While importing the flat file definition, just specify the scale for the numeric data type. In the mapping, the flat file source supports only the number datatype (not decimal or integer). The Source Qualifier associated with that source will have a decimal data type for that number port: source (number datatype port) -> SQ (decimal datatype). Integer is not supported, hence decimal takes care of it.

Alternatively, import the field as a string and then use an expression to convert it, so that truncation of decimal places in the source itself is avoided.

What is the use of incremental aggregation? Explain me in brief with an example.

When using incremental aggregation, you apply captured changes in the source to aggregate calculations in a session. If the source changes incrementally and you can capture changes, you can configure the session to process those changes. This allows the Integration Service to update the target


incrementally, rather than forcing it to process the entire source and recalculate the same data each time you run the session.

Consider using incremental aggregation in the following circumstances: You can capture new source data. Use incremental aggregation when you can capture new source data each time you run the session. Use a Stored Procedure or Filter transformation to process new data.

Incremental changes do not significantly change the target. Use incremental aggregation when the changes do not significantly change the target. If processing the incrementally changed source alters more than half the existing target, the session may not benefit from using incremental aggregation. In this case, drop the table and recreate the target with complete source data.

Note: Do not use incremental aggregation if the mapping contains percentile or median functions. The Integration Service uses system memory to process these functions in addition to the cache memory you configure in the session properties. As a result, the Integration Service does not store incremental aggregation values for percentile and median functions in disk caches.

What is the target load order?

You specify the target load order based on the source qualifiers in a mapping. If you have multiple source qualifiers connected to multiple targets, you can designate the order in which the Informatica server loads data into the targets.

How do we do unit testing in Informatica? How do we load data in Informatica?

Unit testing is of two types: 1. Quantitative testing, 2. Qualitative testing.

Steps: 1. First validate the mapping. 2. Create a session on the mapping and then run the workflow. Once the session has succeeded, right-click on the session and go to the statistics tab. There you can see how many source rows were applied, how many rows were loaded into the targets, and how many rows were rejected. This is called quantitative testing. Once rows are successfully loaded, we go for qualitative testing.

Steps: 1. Take the DATM (the document where all business rules are mapped to the corresponding source columns) and check whether the data is loaded into the target table according to the DATM. If any data is not loaded according to the DATM, go and check the code and rectify it. This is called qualitative testing. This is what a developer does in unit testing.

What is the PowerCenter repository?


A PowerCenter repository is a data dictionary where we store object definitions: 1. folders, 2. source and target definitions, 3. transformation rules, 4. mappings, 5. sessions, 6. workflows and scheduling, 7. user permissions and privileges. It is also a type of relational database where we store metadata. The repository is created at installation time by Informatica administrators.

What is the difference between a static and a dynamic cache? Please explain with an example.

Static: once the data is cached, it will not change; for example, an unconnected lookup uses a static cache. Dynamic: the cache is updated to reflect updates in the table (or source) it refers to (e.g. a connected lookup).

While using a static cache in a lookup, we can use all operators such as =, <, >, etc. when giving the condition in the condition tab,

but when using a dynamic cache we can only use the = operator.

How does the Informatica server sort string values in the Rank transformation?

When the Informatica server runs in Unicode data movement mode, it uses the sort order configured in the session properties.

We can run the Informatica server either in Unicode data movement mode or ASCII data movement mode. Unicode mode: the server sorts the data according to the sort order configured in the session. ASCII mode: the server sorts the data in binary order.

Explain the Informatica server architecture.

When we start a workflow, the Load Manager accepts it and hands the session to the dispatcher, which starts the DTM process with several threads. The first is the reader thread: a subprogram that uses the source table and source connection to read the source data from the source database. The second part is shared memory: the data extracted by the reader is stored in shared memory, in an area called the staging area. The writer thread then collects the data from shared memory and uses the target table and target connection to load the data into the target database.

How can you improve session performance in the Aggregator transformation?

Use sorted input.

One way is supplying sorted input to the Aggregator transformation. In situations where sorted input cannot be supplied, we need to configure the data cache and index cache at the session/transformation level to allocate more space to support aggregation.

Is the Sorter an active or passive transformation? What happens if we uncheck the distinct option in the Sorter? Will it then be an active or passive transformation?

The Sorter is an active transformation. If you don't check the distinct option, it can be considered a passive transformation, because it is the distinct option that eliminates duplicate records and thereby changes the number of rows.

With an Update Strategy, which gives more performance: a target table or a flat file? Why?

Pros: loading, sorting, and merging operations will be faster, as there is no index concept and data will be in ASCII mode. Cons: there is no concept of updating existing records in a flat file, and as there are no indexes, lookup speed will be lower.

What is the difference between stop and abort?

Stop: if the session you want to stop is part of a batch, you must stop the batch; if the batch is part of a nested batch, stop the outermost batch.

Abort: you can issue the abort command; it is similar to the stop command except that it has a 60-second timeout. If the server cannot finish processing and committing data within 60 seconds, it kills the session.

How can you create or import a flat file definition into the Warehouse Designer?

You cannot create or import a flat file definition into the Warehouse Designer directly. Instead, you must analyze the file in the Source Analyzer, then drag it into the Warehouse Designer. When you drag a flat file source definition into the Warehouse Designer workspace, the Warehouse Designer creates a relational target definition, not a file definition. If you want to load to a file, configure the session to write to a flat file. When the Informatica server runs the session, it creates and loads the flat file.

How many types of dimensions are available in Informatica?


The types of dimensions available are: 1. Junk dimension, 2. Degenerate dimension, 3. Conformed dimension.

When we create a target as a flat file and the source as Oracle, how can I specify the first row as the column names in the flat file?

Use a pre-SQL statement, but this is a hardcoded method: if you change the column names or put extra columns in the flat file, you will have to change the insert statement.

You can also achieve this by changing the setting in the Informatica Repository Manager to display the column headings. The only disadvantage is that it will be applied to all the files generated by that server.

IIF(CUME(1) = 1, 'col1,col2,col3,col4' || CHR(10) || TO_CHAR(col1), TO_CHAR(col1))

In version 8.2 of Informatica, the session properties have an option called "Header Options". Use that to get the field names as the first row in the target flat file.

How can we use the pmcmd command in a workflow or to run a session?

pmcmd> startworkflow -f foldername workflowname

What is the difference between filter and lookup transformation?

1) The Filter transformation is an active transformation and the Lookup is a passive transformation.

2) The Filter transformation is used to filter rows based on a condition, and a Lookup is used to look up data in a flat file or a relational table, view, or synonym.

How do you configure a mapping in Informatica?

You should configure the mapping with the least number of transformations and expressions to do the most amount of work possible. You should minimize the amount of data moved by deleting unnecessary links between transformations. For transformations that use data cache (such as Aggregator, Joiner, Rank, and Lookup transformations), limit connected input/output or output ports. Limiting the number of connected input/output or output ports reduces the amount of data the transformations store in the data cache.

Which tasks can be performed at port level (using one specific port)?

I think an unconnected Lookup or an Expression transformation can be used on a single port for a row.

What is the difference between a mapplet and a reusable transformation?


Mapplet: one or more transformations. Reusable transformation: only one transformation.

In other words, a mapplet is a set of transformations that is reusable,

while a reusable transformation is a single transformation that is reusable.

When do you use an unconnected lookup and a connected lookup?

What is the difference between a dynamic and a static lookup? Why and when do we use these types of lookups (i.e. dynamic and static)?

In a static lookup cache, you cache all the lookup data at the start of the session; with an uncached lookup, you query the database to get the lookup value for each record that needs the lookup. Building a static lookup cache adds to session start-up time, but it saves time overall because Informatica does not need to connect to your database every time it needs a lookup. Depending on how many rows in your mapping need a lookup, you can decide on this. Also remember that a static cache eats up space, so select only those columns which are needed.

Unconnected Lookup: physically unconnected from other transformations; no data flow arrows lead to or from an unconnected Lookup. Lookup data is called from the point in the mapping that needs it, so there are fewer lookups. The lookup function can be set within any transformation that supports expressions.

An unconnected lookup is used if we need to use the same lookup transformation multiple times within the same mapping; a connected lookup is used if we need a lookup transformation only once in a mapping.

Difference between dynamic and static lookup: 1. With a dynamic lookup, the Integration Service queries the lookup source once and builds a cache that can be inserted into/updated based on the availability of new records and changes in the source rows before loading to the target table. With a static lookup, the cache is built by the Integration Service when a row from the source first requests the lookup; the Integration Service does not insert into/update the lookup cache based on new rows/changes from the source. 2. A static lookup can be either connected or unconnected, whereas a dynamic lookup can only be connected. 3. A static lookup does not consume as much memory as a dynamic lookup.

A dynamic lookup will be used if there is a chance of getting a new record and a change to that record as 2 different rows in a single session run. Example: row 1 creates customer c1's data; row 2 changes the profile of the same customer c1. Here, row 1 should be inserted into the target and row 2 should be updated in the target. A static lookup will be used if there is no chance of the above case.


How many types of facts are there, and what are they?

There are: Factless facts: facts without any measures. Additive facts: fact data that can be added/aggregated. Non-additive facts: facts that cannot be added. Semi-additive facts: only a few columns of data can be added. Periodic facts: store only one row per transaction that happened over a period of time. Accumulating facts: store a row for the entire lifetime of an event.

What are the output files that the Informatica server creates during a session run?

Informatica server log: the Informatica server (on Unix) creates a log for all status and error messages (default name: pm.server.log). It also creates an error log for error messages. These files are created in the Informatica home directory.

Session log file: the Informatica server creates a session log file for each session. It writes information about the session into the log file, such as the initialization process, the creation of SQL commands for the reader and writer threads, the errors encountered, and the load summary. The amount of detail in the session log file depends on the tracing level that you set.

Session detail file: this file contains load statistics for each target in the mapping. Session details include information such as the table name and the number of rows written or rejected. You can view this file by double-clicking on the session in the Monitor window.

Performance detail file: this file contains information known as session performance details, which helps you identify where performance can be improved. To generate this file, select the performance detail option in the session property sheet.

Reject file: this file contains the rows of data that the writer does not write to the targets.

Control file: the Informatica server creates a control file and a target file when you run a session that uses the external loader. The control file contains information about the target flat file, such as the data format and loading instructions for the external loader.

Post-session email: post-session email allows you to automatically communicate information about a session run to designated recipients. You can create two different messages: one if the session completes successfully, the other if the session fails.

Indicator file: if you use a flat file as a target, you can configure the Informatica server to create an indicator file. For each target row, the indicator file contains a number indicating whether the row was marked for insert, update, delete, or reject.

Output file: if a session writes to a target file, the Informatica server creates the target file based on the file properties entered in the session property sheet.

Cache files: when the Informatica server creates a memory cache, it also creates cache files.

The Informatica server creates index and data cache files for the following transformations: Aggregator, Joiner, Rank, and Lookup.

Can anyone explain error handling in Informatica with examples, so that it will be easy to explain in an interview?

Go to the session log file; there we will find information regarding the session initialization process, the errors encountered, and the load summary. By examining the errors encountered during the session run, we can resolve them.

There is one file called the bad file, which generally has the format *.bad and contains the records rejected by the Informatica server. There are two kinds of indicators: one for the rows and one for the columns. The row indicator signifies what operation was going to take place (i.e. insertion, deletion, update, etc.). The column indicators contain information about why the column was rejected (such as violation of a not-null constraint, value error, overflow, etc.). If one rectifies the errors in the data present in the bad file and then reloads the data into the target, the table will contain only valid data.

What is parameter file?

When you start a workflow, you can optionally enter the directory and name of a parameter file. The Informatica Server runs the workflow using the parameters in the file you specify.

For UNIX shell users, enclose the parameter file name in single quotes:

-paramfile '$PMRootDir/myfile.txt'

For Windows command prompt users, the parameter file name cannot have beginning or trailing spaces. If the name includes spaces, enclose the file name in double quotes:

-paramfile "$PMRootDir\my file.txt"


Note: When you write a pmcmd command that includes a parameter file located on another machine, use the backslash (\) with the dollar sign ($). This ensures that the machine where the variable is defined expands the server variable.

pmcmd startworkflow -uv USERNAME -pv PASSWORD -s SALES:6258 -f east -w wSalesAvg -paramfile '$PMRootDir/myfile.txt'

Discuss the advantages and disadvantages of the star and snowflake schemas.

In a STAR schema there is no relation between any two dimension tables, whereas in a SNOWFLAKE schema there is a possible relation between the dimension tables.

What is source qualifier transformation?

When you add a relational or a flat file source definition to a mapping, you need to connect it to a Source Qualifier transformation. The Source Qualifier transformation represents the records that the Informatica server reads when it runs a session.

The SQ transformation is automatically generated to read data from the source tables into the Informatica Designer.

In dimensional modeling, is the fact table normalized or denormalized, in the case of a star schema and in the case of a snowflake schema?

In dimensional modeling:

Star schema: a single fact table is surrounded by a group of dimension tables comprised of de-normalized data.

Snowflake schema: a single fact table is surrounded by a group of dimension tables comprised of normalized data.

The star schema (sometimes referred to as a star join schema) is the simplest data warehouse schema, consisting of a single "fact table" with a compound primary key, with one segment for each "dimension" and with additional columns of additive, numeric facts. The star schema makes multi-dimensional database (MDDB) functionality possible using a traditional relational database. Because relational databases are the most common data management system in organizations today, implementing multi-dimensional views of data using a relational database is very appealing. Even if you are using a specific MDDB solution, its sources are likely relational databases. Another reason for using a star schema is its ease of understanding. Fact tables in a star schema are mostly in third normal form (3NF), but dimension tables are in de-normalized second normal form (2NF). If you normalize the dimension tables, they look like snowflakes (see snowflake schema) and the same problems of relational databases arise: you need complex queries, and business users cannot easily understand the meaning of the data. Although query performance may be improved by advanced DBMS technology and hardware, highly normalized tables make reporting difficult and applications complex.

The snowflake schema is a more complex data warehouse model than a star schema, and is a type of star schema. It is called a snowflake schema because the diagram of the schema resembles a snowflake. Snowflake schemas normalize dimensions to eliminate redundancy. That is, the dimension data is grouped into multiple tables instead of one large table. For example, a product dimension table in a star schema might be normalized into a products table, a product-category table, and a product-manufacturer table in a snowflake schema. While this saves space, it increases the number of dimension tables and requires more foreign key joins. The result is more complex queries and reduced query performance.
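As an illustration of that normalization difference, here is a minimal SQL sketch using hypothetical product-dimension tables:

-- Star schema: one de-normalized dimension table
CREATE TABLE dim_product (
    product_key       INTEGER PRIMARY KEY,
    product_name      VARCHAR(100),
    category_name     VARCHAR(100),  -- repeated for every product in the category
    manufacturer_name VARCHAR(100)   -- repeated as well
);

-- Snowflake schema: the same dimension normalized into three tables
CREATE TABLE dim_manufacturer (
    manufacturer_key  INTEGER PRIMARY KEY,
    manufacturer_name VARCHAR(100)
);
CREATE TABLE dim_category (
    category_key  INTEGER PRIMARY KEY,
    category_name VARCHAR(100)
);
CREATE TABLE dim_product_sf (
    product_key      INTEGER PRIMARY KEY,
    product_name     VARCHAR(100),
    category_key     INTEGER REFERENCES dim_category (category_key),
    manufacturer_key INTEGER REFERENCES dim_manufacturer (manufacturer_key)
);

A query against the snowflake version needs two extra joins to recover the category and manufacturer names, which is exactly the query-performance trade-off described above.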

Difference between Rank and Dense Rank?

Rank:
1
2 <-- 2nd position
2 <-- 3rd position
4
5

The same rank is assigned to equal totals/numbers, and the next rank follows the position. Golf usually ranks this way.

Dense Rank:
1
2 <-- 2nd position
2 <-- 3rd position
3
4

The same ranks are assigned to equal totals/numbers/names, and the next rank follows in serial order.
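The same behaviour can be reproduced with the standard SQL analytic functions; a minimal sketch, assuming a hypothetical sales table:

SELECT emp_name,
       total_sales,
       RANK()       OVER (ORDER BY total_sales DESC) AS rnk,        -- skips positions after ties
       DENSE_RANK() OVER (ORDER BY total_sales DESC) AS dense_rnk   -- leaves no gaps after ties
FROM   sales;   -- hypothetical table and column names

For totals 100, 90, 90, 80, RANK yields 1, 2, 2, 4 while DENSE_RANK yields 1, 2, 2, 3.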

What is the Update Strategy transformation?

The model you choose constitutes your update strategy, how to handle changes to existing rows. In PowerCenter and PowerMart, you set your update strategy at two different levels:

Within a session: when you configure a session, you can instruct the Informatica server to either treat all rows in the same way (for example, treat all rows as inserts), or use instructions coded into the session mapping to flag rows for different database operations.

Within a mapping: you use the Update Strategy transformation to flag rows for insert, delete, update, or reject.

What is the difference between constraint-based load ordering and a target load plan?

Constraint-based load ordering

example:


Table 1---Master

Table 2---Detail

If the data in Table 1 is dependent on the data in Table 2, then Table 2 should be loaded first. In such cases, to control the load order of the tables, we need some conditional loading, which is nothing but constraint-based loading.

In Informatica this feature is implemented by just one check box at the session level.

Constraint-based loading specifies the order in which data loads into the targets based on key constraints.

A target load plan defines the order in which data is extracted from the source qualifiers.

What is the default join that source qualifier provides?

Inner equi join.

(A cross join is sometimes given as an answer, but that only results when no join condition is specified; the default the Source Qualifier provides is an inner equijoin.)

How can we partition a session in Informatica?

The Informatica PowerCenter Partitioning option optimizes parallel processing on multiprocessor hardware by providing a thread-based architecture and built-in data partitioning. GUI-based tools reduce the development effort necessary to create data partitions and streamline ongoing troubleshooting and performance tuning tasks, while ensuring data integrity throughout the execution process. As the amount of data within an organization expands and real-time demand for information grows, the PowerCenter Partitioning option enables hardware and applications to provide outstanding performance and jointly scale to handle large volumes of data and users.

What is the difference between the IIF and DECODE functions?

You can use nested IIF statements to test multiple conditions. The following example tests for various conditions and returns 0 if sales is zero or negative:

IIF( SALES > 0, IIF( SALES < 50, SALARY1, IIF( SALES < 100, SALARY2, IIF( SALES < 200, SALARY3, BONUS))), 0 )


You can use DECODE instead of IIF in many cases. DECODE may improve readability. The following shows how you can use DECODE instead of IIF:

DECODE( TRUE,

SALES > 0 AND SALES < 50, SALARY1,

SALES > 49 AND SALES < 100, SALARY2,

SALES > 99 AND SALES < 200, SALARY3,

SALES > 199, BONUS)

The DECODE function can be used in a SQL statement, whereas an IIF statement cannot be used in a SQL statement.

What is the difference between connected and unconnected stored procedures?

Unconnected:

The unconnected Stored Procedure transformation is not connected directly to the flow of the mapping. It either runs before or after the session, or is called by an expression in another transformation in the mapping.

Connected:

The flow of data through a mapping in connected mode also passes through the Stored Procedure transformation. All data entering the transformation through the input ports affects the stored procedure. You should use a connected Stored Procedure transformation when you need data from an input port sent as an input parameter to the stored procedure, or the results of a stored procedure sent as an output parameter to another transformation.

What are the main advantages and purpose of using the Normalizer transformation in Informatica?

The Normalizer transformation is used mainly with COBOL sources, where most of the time data is stored in de-normalized format. Also, the Normalizer transformation can be used to create multiple rows from a single row of data.

Differences between connected and unconnected lookup?

Connected lookup: 1) Receives input values directly from the pipeline. 2) You can use a dynamic or static cache. 3) The cache includes all lookup columns used in the mapping. 4) Supports user-defined default values.

Unconnected lookup: 1) Receives input values from the result of a :LKP expression in another transformation. 2) You can use a static cache only. 3) The cache includes all lookup output ports in the lookup condition and the lookup/return port. 4) Does not support user-defined default values.

What are the join types in the Joiner transformation?

Normal (default), master outer, detail outer, and full outer.

What are the methods for creating reusable transformations?

In 2 ways: 1. Using the Transformation Developer tool. 2. Converting a non-reusable transformation into a reusable transformation in a mapping. Restriction: the Source Qualifier transformation is not supported as a reusable transformation.

If you have four lookup tables in the workflow, how do you troubleshoot to improve performance?

There are many ways to improve a mapping which has multiple lookups.

1) We can create an index on the lookup table if we have permissions (staging area).

2) Divide the lookup mapping into two: (a) dedicate one to inserts (source - target: these are new rows; only the new rows come into the mapping, so the process is fast); (b) dedicate the second one to updates (source = target: these are existing rows; only the rows that already exist come into the mapping).

3) We can increase the cache size of the lookup.

How does the Informatica server increase session performance through partitioning the source?

For relational sources, the Informatica server creates multiple connections, one for each partition of a single source, and extracts a separate range of data for each connection. The Informatica server reads multiple partitions of a single source concurrently. Similarly, for loading, the Informatica server creates multiple connections to the target and loads partitions of data concurrently. For XML and file sources, the Informatica server reads multiple files concurrently. For loading the data, the Informatica server creates a separate file for each partition (of a source file). You can choose to merge the targets.


To achieve the session partition what are the necessary tasks you have to do?

Configure the session to partition the source data. Install the Informatica server on a machine with multiple CPUs.

Why did you use an Update Strategy in your application?

Update Strategy is used to drive the data to be inserted, updated, or deleted depending upon some condition. You can do this at the session level too, but there you cannot define any condition. For example, if you want to do both update and insert in one mapping, you create two flows and make one insert and one update depending upon some condition. Refer to Update Strategy in the Transformation Guide for more information.

What is the difference between partitioning of relational targets and partitioning of file targets?

Partitioning can be done on both relational and flat file targets.

Informatica supports following partitions

1.Database partitioning

2.RoundRobin

3.Pass-through

4.Hash-Key partitioning

5.Key Range partitioning

All of these are applicable for relational targets. For flat file targets, only database partitioning is not applicable.

Informatica supports N-way partitioning. You can just specify the name of the target file and create the partitions; the rest will be taken care of by the Informatica session.

How can you work with a remote database in Informatica? Did you work directly by using remote connections?

To work with a remote data source, you need to connect to it with remote connections. But it is not preferable to work with that remote source directly via remote connections; instead, bring that source onto the local machine where the Informatica server resides. If you work directly with the remote source, session performance will decrease because only a small amount of data can be passed across the network in a given time.


What is Datadriven?

The Informatica server follows instructions coded into Update Strategy transformations within the session mapping to determine how to flag records for insert, update, delete, or reject. If you do not choose the Data Driven option setting, the Informatica server ignores all Update Strategy transformations in the mapping.

If the Data Driven option is selected in the session properties, the server follows the instructions in the Update Strategy transformation in the mapping; otherwise, it follows the instructions specified in the session.

If a session fails after loading 10,000 records into the target, how can you load the records from the 10,001st record when you run the session next time?

The Informatica server has three methods for recovering sessions. Use 'perform recovery' to load the records from the point where the session failed.

Why did you use stored procedure in your ETL Application?

The usage of stored procedures has the following advantages:

1) Checks the status of the target database.

2) Drops and recreates indexes.

3) Determines if enough space exists in the database.

4) Performs a specialized calculation.

What are the joiner caches?

When a Joiner transformation occurs in a session, the Informatica server reads all the records from the master source and builds index and data caches based on the master rows. After building the caches, the Joiner transformation reads records from the detail source and performs the joins.

What are the basic needs to join two sources in a source qualifier?

The two sources should have a primary and foreign key relationship. The two sources should have matching data types.


Basic needs to join two sources using a Source Qualifier:

1) Both sources should be in the same database. 2) They should have at least one column in common with the same data type.
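For example, assuming two hypothetical tables ORDERS and CUSTOMERS in the same database, the default query the Source Qualifier generates has roughly this shape:

SELECT orders.order_id, orders.amount, customers.customer_name
FROM   orders, customers
WHERE  orders.customer_id = customers.customer_id  -- join on the common, same-typed key column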

In a scenario I have col1, col2, col3, with values 1,x,y and 2,a,b underneath, and I want the output in the form col1, col2 with rows 1,x and 1,y and 2,a and 2,b. What is the procedure?

Use a Normalizer: create two ports, making the first port occurs = 1 and the second occurs = 2; two output ports are created, and connect them to the target.

How can you improve the performance of the Aggregator transformation?

We can improve the Aggregator performance in the following ways:

1. Send sorted input.

2. Increase the Aggregator cache size, i.e. the index cache and data cache.

3. Give the transformation only the input/output you need, i.e. reduce the number of input and output ports.

Use a Sorter transformation to sort the input, enable the sorted input option in the Aggregator properties, and filter the records before aggregation.

How do you create a single Lookup transformation using multiple tables?

Write an override SQL query and adjust the ports as per the SQL query.

By writing a SQL override and specifying the joins in the SQL override.
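A minimal sketch of such a lookup SQL override, assuming hypothetical EMP and DEPT tables; the ports defined in the Lookup transformation must match the columns returned here:

SELECT e.emp_id    AS EMP_ID,
       e.emp_name  AS EMP_NAME,
       d.dept_name AS DEPT_NAME   -- column pulled in from the second table
FROM   emp e, dept d
WHERE  e.dept_id = d.dept_id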


On one day I load 10 rows into my target, and on the next day I get 10 more rows to be added to my target, out of which 5 are updated rows. How can I send them to the target? How can I insert and update the records?

We can do this by identifying the granularity of the target table. We can then use a CRC external procedure to compare the newly generated CRC number with the old one, and if they do not match, update the row.
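For comparison, the same insert-else-update logic expressed as a plain SQL MERGE, with hypothetical table and column names:

-- Upsert: update the rows that already exist, insert the new ones
MERGE INTO target_customers t
USING staging_customers s
   ON (t.customer_id = s.customer_id)
WHEN MATCHED THEN
    UPDATE SET t.customer_name = s.customer_name,  -- the 5 changed rows take this path
               t.city          = s.city
WHEN NOT MATCHED THEN
    INSERT (customer_id, customer_name, city)      -- the 5 new rows take this path
    VALUES (s.customer_id, s.customer_name, s.city);

In a mapping, the equivalent is a lookup on the target key feeding an Update Strategy transformation that flags each row DD_INSERT or DD_UPDATE.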

In which conditions can we not use a Joiner transformation (limitations of the Joiner transformation)?

You cannot use a Joiner transformation when: both pipelines begin with the same original data source; both input pipelines originate from the same Source Qualifier transformation; both input pipelines originate from the same Normalizer transformation; both input pipelines originate from the same Joiner transformation; either input pipeline contains an Update Strategy transformation; or either input pipeline contains a connected or unconnected Sequence Generator transformation.

What are the differences between the Joiner transformation and the Source Qualifier transformation?

You can join heterogeneous data sources with a Joiner transformation, which we cannot achieve with a Source Qualifier transformation. You need matching keys to join two relational sources in a Source Qualifier transformation, whereas the Joiner does not need key relationships defined in the database. Two relational sources must come from the same data source for a Source Qualifier, whereas the Joiner can join relational sources coming from different sources.

The Joiner transformation can be used to join tables from heterogeneous (different) sources, but we still need a common key from both tables. If we join two tables without a common key, we will end up with a Cartesian join. The Joiner can be used to join tables from different source systems, whereas the Source Qualifier can be used to join tables in the same database.

We definitely need a common key to join two tables, no matter whether they are in the same database or in different databases.

What is the difference between a view and a materialized view?

Materialized views are schema objects that can be used to summarize, precompute, replicate, and distribute data. E.g. to construct a data warehouse.

A materialized view provides indirect access to table data by storing the results of a query in a separate schema object, unlike an ordinary view, which does not take up any storage space or contain any data.
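A minimal Oracle-style sketch of the difference, with hypothetical names:

-- Ordinary view: stores only the query text, no data
CREATE VIEW v_sales_summary AS
SELECT region, SUM(amount) AS total_amount
FROM   sales
GROUP  BY region;

-- Materialized view: stores the query results and is refreshed on demand
CREATE MATERIALIZED VIEW mv_sales_summary
REFRESH COMPLETE ON DEMAND
AS
SELECT region, SUM(amount) AS total_amount
FROM   sales
GROUP  BY region;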


How can I transform rows to columns?

We can do this through the Normalizer transformation.

1. We can use the Normalizer transformation, or 2. use the PIVOT function in Oracle.

This is a scenario in which the source has 2 columns:

10 A
10 A
20 C
30 D
40 E
20 C

and there should be 2 targets, one to show the duplicate values and another target for the distinct rows:

T1 (duplicates)    T2 (distinct)
10 A               10 A
20 C               20 C
                   30 D
                   40 E

Which transformations can be used to load the data into the targets?

Step 1: sort the source data based on the key column.

Step 2: in an Expression transformation, flag repeats of the previous key using a variable port:

Flag = IIF(col1 = prev_col1, 'Y', 'N')
prev_col1 = col1

Step 3: Router: 1. for duplicate records, condition: Flag = 'Y'; 2. for distinct records, condition: Flag = 'N'.

How to recover sessions in concurrent batches?

If multiple sessions in a concurrent batch fail, you might want to truncate all targets and run the batch again. However, if a session in a concurrent batch fails and the rest of the sessions complete successfully, you can recover the session as a standalone session. To recover a session in a concurrent batch:


1. Copy the failed session using Operations > Copy Session. 2. Drag the copied session outside the batch to make it a standalone session. 3. Follow the steps to recover a standalone session. 4. Delete the standalone copy.

What are the two types of processes that Informatica uses to run a session?

Load Manager process: starts the session, creates the DTM process, and sends post-session email when the session completes. DTM process: creates threads to initialize the session, read, write, and transform data, and handle pre- and post-session operations.

How do you import an Oracle sequence into Informatica?

Create a procedure and reference the sequence inside the procedure; finally, call the procedure in Informatica with the help of a Stored Procedure transformation.
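A minimal PL/SQL sketch of that approach, with hypothetical names; the function is then called from a Stored Procedure transformation in the mapping:

CREATE SEQUENCE customer_seq START WITH 1 INCREMENT BY 1;

CREATE OR REPLACE FUNCTION get_customer_key RETURN NUMBER
IS
    v_key NUMBER;
BEGIN
    SELECT customer_seq.NEXTVAL INTO v_key FROM dual;  -- next surrogate key value
    RETURN v_key;
END;
/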

Can you start a session inside a batch individually?

We can start a required session individually only in the case of a sequential batch; in the case of a concurrent batch we cannot do this.

What are the types of lookup caches?

1) Static cache, 2) Dynamic cache, 3) Persistent cache, 4) Reusable cache, 5) Shared cache.

What is pushdown optimization in PowerCenter 8.x? Explain with an example.

Use pushdown optimization to push transformation logic to the source or target database. The Integration Service analyzes the transformation logic, mapping, and session configuration to determine the transformation logic it can push to the database. At run time, the Integration Service executes any SQL statement generated against the source or target tables, and it processes any transformation logic that it cannot push to the database. Select one of the following values:

- None. The Integration Service does not push any transformation logic to the database.

- To Source. The Integration Service pushes as much transformation logic as possible to the source database.

- To Target. The Integration Service pushes as much transformation logic as possible to the target database.

- Full. The Integration Service pushes as much transformation logic as possible to both the source database and target database.

- $$PushdownConfig. The $$PushdownConfig mapping parameter allows you to run the same session with different pushdown optimization configurations at different times. For more information about configuring the $$PushdownConfig mapping parameter and parameter file, see Using the $$PushdownConfig Mapping Parameter.

How do you check the source for the latest records that are to be loaded into the target? I.e., I loaded some records yesterday; today the file has been populated with some more records, so how do I find the records populated today?

a) Create a lookup to the target table from the Source Qualifier, based on the primary key.

b) Use an expression to evaluate the primary key returned from the target lookup (for a new source record, the lookup's primary key port for the target table should return null). Trap this with DECODE and proceed.
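The same new-record check can be sketched in SQL against the target, with hypothetical names:

-- New source rows are those whose primary key does not yet exist in the target
SELECT s.*
FROM   src_customers s
LEFT   JOIN tgt_customers t
       ON s.customer_id = t.customer_id
WHERE  t.customer_id IS NULL;  -- lookup returned null, so the row has not been loaded yet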

In which circumstances does the Informatica server create reject files?

When it encounters DD_Reject in an Update Strategy transformation, when a row violates a database constraint, or when a field in a row is truncated or overflows.

What is a batch? Describe the types of batches.

A batch is a group of sessions; different batches group different sets of sessions to run together.

There are two types of batches: 1. Concurrent, 2. Sequential.

concurrent and sequential

What is the method of loading 5 flat files having the same structure to a single target, and which transformations can I use?

Two methods: 1. Write all the files in one directory and then use the file list concept (don't forget to set the source file type to Indirect in the session). 2. Use a Union transformation to combine the multiple input files into a single target.

What is the procedure to load the fact table? Explain in detail.

Based on the requirements for your fact table, choose the sources and data and transform them based on your business needs. For the fact table, you need a primary key, so use a Sequence Generator transformation to generate a unique key and pipe it to the target (fact) table along with the foreign keys from the source tables.

Can you use the mapping parameters or variables created in one mapping in any other reusable transformation?

Yes, because a reusable transformation is not contained within any mapplet or mapping.

What are variable ports? List two situations when they can be used.

A variable port acts as a local variable for that transformation: we can do further calculations with it, but we cannot pass it to the next transformation. An output port can be passed to the next transformation, but we cannot do further calculations within output ports.

What are the target options in a session for the Update Strategy transformation?

Insert, Delete, Update (Update as update, Update as insert, Update else insert), and Truncate table.

Why do you use repository connectivity?

Each time you edit or schedule a session, the Informatica server communicates directly with the repository to check whether or not the session and users are valid. All the metadata of sessions and mappings is stored in the repository.

What are the scheduling options to run a session?

A session can be scheduled to run at a given time or interval, or you can manually run the session. The different scheduling options are:


Run only on demand: the server runs the session only when the user starts it explicitly.
Run once: the Informatica server runs the session only once, at a specified date and time.
Run every: the Informatica server runs the session at regular intervals, as configured.
Customized repeat: the Informatica server runs the session at the dates and times specified in the Repeat dialog box.

What is change data capture?

Change data capture (CDC) is a set of software design patterns used to determine the data that has changed in a database so that action can be taken using the changed data.
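A common timestamp-based CDC sketch in SQL, under the assumption that the source table carries a last-update timestamp and that the previous run time is tracked, e.g. in a mapping variable (here called $$LastRunTime for illustration):

-- Pull only the rows changed since the previous run (hypothetical names)
SELECT *
FROM   src_orders
WHERE  last_update_ts > TO_DATE('$$LastRunTime', 'YYYY-MM-DD HH24:MI:SS');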

There are 3 depts in the dept table, one with 100 people, the 2nd with 5, and the 3rd with some 30 or so. I want to display those deptno where more than 10 people exist.

This can be done with a simple SQL query (e.g. in Oracle).

If you want to perform it through Informatica, fire the same query in the SQL override of the Source Qualifier transformation and make a simple pass-through mapping.
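A sketch of that query, assuming the standard EMP table with a DEPTNO column:

-- Departments having more than 10 employees
SELECT   deptno, COUNT(*) AS emp_count
FROM     emp
GROUP BY deptno
HAVING   COUNT(*) > 10;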

Otherwise, you can also do it inside Informatica with an Aggregator transformation (grouping by deptno and counting the rows) followed by a Filter or Router transformation whose condition keeps groups with a count greater than 10.

How will you create a header and footer in the target using Informatica?

If the focus is on flat files, this can be set in the file properties while creating a mapping, or at the session level in the session properties.

What is meant by EDW?

EDW is an Enterprise Data Warehouse, i.e. a centralised DW for the whole organization.

This is the Inmon approach, which relies on having a single centralised warehouse, whereas the Kimball approach says to have separate data marts for each vertical/department.

Advantages of having an EDW:

1. Global view of the data.

2. The same point of source of data for all the users across the organization.

3. The ability to perform consistent analysis on a single data warehouse.

The trade-off is the time it takes to develop an EDW, and the management required to build a centralised database.

What is a worklet? What is the use of a worklet, and in which situations can we use it?

A set of workflow tasks is called a worklet.

Workflow tasks include:

1) Timer, 2) Decision, 3) Command, 4) Event Wait, 5) Event Raise, 6) Email, etc.


We use worklets in different situations where the same group of tasks needs to be reused.

How can you work with a remote database in Informatica? Did you work directly by using remote connections?

You can work with a remote database,

But you have to

Configure FTP

Connection details

IP address

User authentication.

What is data merging, data cleansing, sampling?

Cleansing: to identify and remove redundancy and inconsistency.

Sampling: simply sampling the data by sending a subset of the data from source to target.

Data merging: the process of combining data with similar structures into a single output.

Data cleansing: the process of identifying and rectifying inconsistent and inaccurate data into consistent and accurate data.

Data sampling: the process of taking a sample by sending a subset of the data from source to target.

What is the exact meaning of domain?

In general usage, a domain is a complete body of information on a particular subject area, like the sales domain, the telecom domain, etc.


The PowerCenter domain is the fundamental administrative unit in PowerCenter. The domain supports the administration of the distributed services. A domain is a collection of nodes and services that you can group in folders based on administration ownership.

I have a requirement wherein the column values in a table (Table A) should appear in rows of the target table (Table B), i.e. converting columns to rows. Is it possible through Informatica? If so, how?

If the data in the tables is as follows:

Table A (key_1 char(3)), values:
1
2
3

Table B (bkey_a char(3), bcode char(1)), values:
1 T
1 A
1 G
2 A
2 T
2 L
3 A

and the output required is:

1, T, A
2, A, T, L
3, A

then the SQL query in the Source Qualifier should be:

SELECT key_1,
       MAX(DECODE(bcode, 'T', bcode, NULL)) t_code,
       MAX(DECODE(bcode, 'A', bcode, NULL)) a_code,
       MAX(DECODE(bcode, 'L', bcode, NULL)) l_code
FROM   a, b
WHERE  a.key_1 = b.bkey_a
GROUP  BY key_1

What are the active and passive transformations?


Transformations can be active or passive. An active transformation can change the number of rows that pass through it, such as a Filter transformation that removes rows that do not meet the filter condition.

A passive transformation does not change the number of rows that pass through it, such as an Expression transformation that performs a calculation on data and passes all rows through the transformation.

In an active transformation, the number of output rows can differ from the number of input rows; in a passive transformation, the number of output rows equals the number of input rows.

What are the tasks that the Source Qualifier performs?

Join data originating from the same source database. You can join two or more tables with primary-foreign key relationships by linking the sources to one Source Qualifier.

Filter records when the Informatica server reads source data. If you include a filter condition, the Informatica server adds a WHERE clause to the default query.

Specify an outer join rather than the default inner join. If you include a user-defined join, the Informatica server replaces the join information specified by the metadata in the SQL query.

Specify sorted ports. If you specify a number for sorted ports, the Informatica server adds an ORDER BY clause to the default SQL query.

Select only distinct values from the source. If you choose Select Distinct, the Informatica server adds a SELECT DISTINCT statement to the default SQL query.

Create a custom query to issue a special SELECT statement for the Informatica server to read source data. For example, you might use a custom query to perform aggregate calculations or execute a stored procedure.

What is a transaction?

A transaction is a logical unit of work that comprises one or more SQL statements executed by a single user.

Can we run a group of sessions without using the Workflow Manager?

Yes, it is possible to run the group of sessions using the pmcmd command, without using the Workflow Manager. That is the answer to the best of my knowledge.

Can you tell me how to go about SCDs and their types? Where do we use them mostly?

The "Slowly Changing Dimension" problem is a common one particular to data warehousing. In a nutshell, this applies to cases where the attribute for a record varies over time. We give an example below: Christina is a customer with ABC Inc. She first lived in Chicago, Illinois. So, the original entry in the customer lookup table has the following record:

Customer Key  Name       State
1001          Christina  Illinois

At a later date, she moved to Los Angeles, California in January 2003. How should ABC Inc. now modify its customer table to reflect this change? This is the "Slowly Changing Dimension" problem. There are in general three ways to solve this type of problem, and they are categorized as follows:

In Type 1 Slowly Changing Dimension, the new information simply overwrites the original information. In other words, no history is kept. After Christina moved from Illinois to California, the new information replaces the old record, and we have the following table:

Customer Key  Name       State
1001          Christina  California

Advantages: this is the easiest way to handle the Slowly Changing Dimension problem, since there is no need to keep track of the old information. Disadvantages: all history is lost; by applying this methodology, it is not possible to trace back in history. For example, in this case, the company would not be able to know that Christina lived in Illinois before. Usage: about 50% of the time. When to use Type 1: when it is not necessary for the data warehouse to keep track of historical changes.

In Type 2 Slowly Changing Dimension, a new record is added to the table to represent the new information. Therefore, both the original and the new record will be present. The new record gets its own primary key. After Christina moved from Illinois to California, we add the new information as a new row in the table:

Customer Key  Name       State
1001          Christina  Illinois
1005          Christina  California

Advantages: this allows us to accurately keep all historical information. Disadvantages: this will cause the size of the table to grow fast; in cases where the number of rows for the table is very high to start with, storage and performance can become a concern, and it necessarily complicates the ETL process. Usage: about 50% of the time. When to use Type 2: when it is necessary for the data warehouse to track historical changes.

In Type 3 Slowly Changing Dimension, there will be two columns to indicate the particular attribute of interest, one indicating the original value and one indicating the current value. There will also be a column that indicates when the current value becomes active. To accommodate Type 3, we now have the columns Customer Key, Name, Original State, Current State, and Effective Date. After Christina moved from Illinois to California, the original information gets updated (assuming the effective date of change is January 15, 2003):

Customer Key  Name       Original State  Current State  Effective Date
1001          Christina  Illinois        California     15-JAN-2003

Advantages: this does not increase the size of the table, since the new information is updated in place, and it allows us to keep some part of history. Disadvantages: Type 3 will not be able to keep all history where an attribute is changed more than once; for example, if Christina later moves to Texas on December 15, 2003, the California information will be lost. Usage: Type 3 is rarely used in actual practice. When to use Type 3: only when it is necessary to track historical changes that will occur for a finite number of times.

How can we join tables if they have no primary and foreign key relationship and no matching port to join?

Without a common column or a common data type, we can join two sources using dummy ports, as sketched after this answer.

1. Add one dummy port in each of the two sources.

2. In an Expression transformation, assign the constant '1' to each dummy port.

3. Use a Joiner transformation to join the sources using the dummy ports in the join condition.

hope this will help.
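In SQL terms, the dummy-port join is just a cross join; a sketch with hypothetical tables:

-- Joining two unrelated sources on a constant dummy key
SELECT a.*, b.*
FROM   source_a a
JOIN   source_b b
  ON   1 = 1;   -- both dummy ports carry the constant 1, so every row pairs with every row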

If your workflow is running slow in Informatica, where do you start troubleshooting and what are the steps you follow?


When the workflow is running slowly, you have to find the bottlenecks, checking in this order:

target

source

mapping

session

system

How do you delete duplicate rows in a flat file source? Is there any option in Informatica?

Use a Sorter transformation; it has a "distinct" option, so make use of it.

What is the Rank transformation? Where can we use this transformation?

The Rank transformation is used to find the top or bottom ranked rows. For example, if we have a sales table in which many employees sell the same product and we need to find the first 5 or 10 employees selling the most products, we can go for a Rank transformation.

Can anybody write a session parameter file which will change the source and targets for every session, i.e. different sources and targets for each session run?

You are supposed to define a parameter file, and then in the parameter file you can define two parameters, one for the source and one for the target.

Give it like this, for example:

$Src_file = c:\program files\informatica\server\bin\abc_source.txt

$tgt_file = c:\targets\abc_targets.txt

Then go and define the parameter file:

[folder_name.WF:workflow_name.ST:s_session_name]
$Src_file=c:\program files\informatica\server\bin\abc_source.txt
$tgt_file=c:\targets\abc_targets.txt

If it's a relational DB, you can even give an overridden SQL at the session level as a parameter. Make sure the SQL is in a single line.


How do you retrieve the records from a reject file? Explain with syntax or an example.

During the execution of the workflow, all the rejected rows will be stored in bad files (under the directory where your Informatica server is installed, e.g. C:\Program Files\Informatica PowerCenter 7.1\Server). These bad files can be imported as a flat file source, and then through a direct mapping we can load the records in the desired format.

Can you start a batch within a batch?

You cannot. If you want to start a batch that resides within a batch, create a new independent batch and copy the necessary sessions into the new batch.

What is the exact use of the 'Online' and 'Offline' server connect options while defining a workflow in the Workflow Monitor? (The system hangs with the 'Online' server connect option; Informatica is installed on a personal laptop.)

When the repository is up and the PMServer is also up, the Workflow Monitor will always connect online.

When the PMServer is down and the repository is still up, we are prompted for an offline connection, with which we can just monitor the workflows.
