
Checking In and Out Versioned Repository Objects

When you work with versioned objects, you must check out an object if you want to change it, and save it when you want to commit the changes to the repository. You must check in the object to allow other users to make changes to it. Checking in an object adds a new numbered version to the object history.

Checking In Objects

You commit changes to the repository by checking in objects. When you check in an object, the repository creates a new version of the object and assigns it a version number. The repository increments the version number by one each time it creates a new version.

To check in an object from the Workflow Manager workspace, select the object or objects and click Versioning > Check in. If you are checking in multiple objects, you can choose to apply the same comment to all objects.

If you want to check out or check in scheduler objects in the Workflow Manager, you can run an object query to search for them. You can also check out a scheduler object in the Scheduler Browser window when you edit the object. However, you must run an object query to check in the object.

If you want to check out or check in session configuration objects in the Workflow Manager, you can run an object query to search for them. You can also check out objects from the Session Config Browser window when you edit them.

You also can check out and check in session configuration and scheduler objects from the Repository Manager.

Viewing and Comparing Versioned Repository Objects

You can view and compare versions of objects in the Workflow Manager. If an object has multiple versions, you can find the versions of the object in the View History window. In addition to comparing versions of an object in a window, you can view the various versions of an object in the workspace to graphically compare them.

Use the following rules and guidelines when you view older versions of objects in the workspace:

  • You cannot simultaneously view multiple versions of composite objects, such as workflows and worklets.
  • Older versions of a composite object might not include the child objects that were used when the composite object was checked in. If you open a composite object that includes a child object version that is purged from the repository, the preceding version of the child object appears in the workspace as part of the composite object. For example, you might want to view version 5 of a workflow that originally included version 3 of a session, but version 3 of the session is purged from the repository. When you view version 5 of the workflow, version 2 of the session appears as part of the workflow.
  • You cannot view older versions of sessions if they reference deleted or invalid mappings, or if they do not have a session configuration.

Opening an Older Version of an Object

When you view an older version, the version number appears as a prefix before the object name. You can simultaneously view multiple versions of a non-composite object in the workspace.

To open an older version of an object in the workspace:

1.  In the workspace or Navigator, select the object and click Versioning > View History.

2.  Select the version you want to view in the workspace and click Tools > Open in Workspace.


Comparing Two Versions of an Object

You can compare two versions of an object through the workspace, Navigator, or the View History window.

To compare two versions of an object:

1.  In the workspace or Navigator, select an object and click Versioning > View History.

2.  Select the versions you want to compare and click Compare > Selected Versions.

-or-

Select a version and click Compare > Previous Version to compare a version of the object with the previous version.

The Diff Tool appears.

 

Workflow Tasks

You can create the following types of tasks in the Workflow Manager:

  • Assignment. Assigns a value to a workflow variable. For more information, see Working with the Assignment Task.
  • Command. Specifies a shell command to run during the workflow. For more information, see Command Task.
  • Control. Stops or aborts the workflow. For more information about the Control task, see Control Task.
  • Decision. Specifies a condition to evaluate. For more information, see Working with the Decision Task.
  • Email. Sends email during the workflow. For more information about the Email task, see Sending Email.
  • Event-Raise. Notifies the Event-Wait task that an event has occurred. For more information, see Working with the Event Task.
  • Event-Wait. Waits for an event to occur before executing the next task. For more information, see Working with the Event Task.
  • Session. Runs a mapping you create in the Designer. For more information about the Session task, see Sessions.
  • Timer. Waits for a timed event to trigger. For more information, see Workflow Schedules.


Workflow Manager Tools

To create a workflow, you first create tasks such as a session, which contains the mapping you build in the Designer. You then connect tasks with conditional links to specify the order of execution for the tasks you created. The Workflow Manager consists of three tools to help you develop a workflow:
  • Task Developer. Use the Task Developer to create tasks you want to run in the workflow.
  • Workflow Designer. Use the Workflow Designer to create a workflow by connecting tasks with links. You can also create tasks in the Workflow Designer as you develop the workflow.
  • Worklet Designer. Use the Worklet Designer to create a worklet.

Workflow Manager Overview

In the Workflow Manager, you define a set of instructions called a workflow to execute mappings you build in the Designer. Generally, a workflow contains a session and any other task you may want to perform when you run a session. Tasks can include a session, email notification, or scheduling information. You connect each task with links in the workflow.

You can also create a worklet in the Workflow Manager. A worklet is an object that groups a set of tasks. A worklet is similar to a workflow, but without scheduling information. You can run a batch of worklets inside a workflow. 

After you create a workflow, you run the workflow in the Workflow Manager and monitor it in the Workflow Monitor.

Performance Tuning Overview

The goal of performance tuning is to optimize session performance by eliminating performance bottlenecks. To tune session performance, first identify a performance bottleneck, eliminate it, and then identify the next performance bottleneck until you are satisfied with the session performance. You can use the test load option to run sessions when you tune session performance.

If you tune all the bottlenecks, you can further optimize session performance by increasing the number of pipeline partitions in the session. Adding partitions can improve performance by utilizing more of the system hardware while processing the session.

Because determining the best way to improve performance can be complex, change one variable at a time, and time the session both before and after the change. If session performance does not improve, you might want to return to the original configuration.

Complete the following tasks to improve session performance:

1. Optimize the target. Enables the Integration Service to write to the targets efficiently.
2. Optimize the source. Enables the Integration Service to read source data efficiently.
3. Optimize the mapping. Enables the Integration Service to transform and move data efficiently.
4. Optimize the transformation. Enables the Integration Service to process transformations in a mapping efficiently.
5. Optimize the session. Enables the Integration Service to run the session more quickly.
6. Optimize the grid deployments. Enables the Integration Service to run on a grid with optimal performance.
7. Optimize the PowerCenter components. Enables the Integration Service and Repository Service to function optimally.
8. Optimize the system. Enables PowerCenter service processes to run more quickly.

Decimal

When a session runs without high precision, the Integration Service converts decimal values to doubles. The transformation Decimal datatype supports precision of up to 28 digits, while the Double datatype supports precision of up to 15 digits. Therefore, precision loss occurs if the decimal value has a precision greater than 15 digits.

For example, you have a mapping with Decimal (20,0) that passes the number 40012030304957666903. If the session does not run with high precision, the Integration Service converts the decimal value to double and passes 4.00120303049577 x 10^19.

To ensure precision of up to 28 digits, use the Decimal datatype and enable high precision in the session properties. When you run a session with high precision, the Integration Service processes decimal values as Decimal. Precision loss does not occur in a calculation unless the result produces a value with precision greater than 28 digits. In this case, the Integration Service stores the result as a double.
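The effect is easy to reproduce outside PowerCenter. The following Python sketch is an illustration only, not Integration Service code; it contrasts a 28-digit decimal context with a double for the 20-digit value from the example above:

# Minimal sketch: a 20-digit value survives as a Decimal but loses precision as a double.
from decimal import Decimal, getcontext

getcontext().prec = 28                  # match the 28-digit precision of the transformation Decimal datatype
value = "40012030304957666903"          # the 20-digit value from the example above

as_decimal = Decimal(value)             # exact
as_double = float(value)                # a double keeps only about 15 significant digits

print(as_decimal)                       # 40012030304957666903
print(as_double)                        # the rounded double, roughly 4.0012 x 10^19
print(Decimal(as_double) - as_decimal)  # non-zero: the digits lost in the conversion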

Bigint

In calculations that can produce decimal values, the Integration Service processes bigint values as doubles or decimals. When a session contains a calculation that can produce decimal values and runs without high precision, the Integration Service converts bigint values to doubles before it performs the calculation. The transformation Double datatype supports precision of up to 15 digits, while the Bigint datatype supports precision of up to 19 digits. Therefore, precision loss can occur in calculations that produce bigint values with precision of more than 15 digits.

For example, an Expression transformation contains the following calculation:

POWER( BIGINTVAL, EXPVAL )

Before it performs the calculation, the Integration Service converts the inputs to the POWER function to double values. If the BIGINTVAL port contains the bigint value 9223372036854775807, the Integration Service converts this value to 9.22337203685478e+18, losing the last four digits of precision. If the EXPVAL port contains the value 1.0 and the result port is a bigint, this calculation produces a row error since the result, 9223372036854780000, exceeds the maximum bigint value.
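A plain-Python sketch of the same conversion (an illustration only, not Integration Service code) shows why the error occurs; the exact rounded value can differ slightly by platform:

# Sketch of the bigint-to-double conversion described above: the 19-digit bigint
# keeps only about 15 significant digits as a double.
bigint_val = 9223372036854775807        # maximum bigint value
exp_val = 1.0

as_double = float(bigint_val)           # rounds to 9223372036854775808.0
result = as_double ** exp_val           # what POWER(BIGINTVAL, EXPVAL) effectively computes

print(int(result))                      # 9223372036854775808
print(int(result) > bigint_val)         # True: the result exceeds the maximum bigint value,
                                        # which is why the row error occurs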

When you use a bigint value in a calculation that can produce decimal values and you run the session with high precision, the Integration Service converts the bigint values to decimals. The transformation Decimal datatype supports precision of up to 28 digits. Therefore, precision loss does not occur in a calculation unless the result produces a value with precision greater than 28 digits. In this case, the Integration Service stores the result as a double.

High Precision Data Overview

High precision data determines how large numbers are represented with greater accuracy. The precision attributed to a number includes the scale of the number. For example, the value 11.47 has a precision of 4 and a scale of 2. Large numbers can lose accuracy because of rounding when used in a calculation that produces an overflow. Incorrect results may arise because of a failure to truncate the high precision data.

High precision data values have greater accuracy. Enable high precision if you require accurate values.

You enable high precision on the properties tab of the session. The Integration Service processes high precision data differently for bigint and decimal values.

Target-Based Commits

During a target-based commit session, the Integration Service commits rows based on the number of target rows and the key constraints on the target table. The commit point depends on the following factors:
  • Commit interval. The number of rows you want to use as a basis for commits. Configure the target commit interval in the session properties.
  • Writer wait timeout. The amount of time the writer waits before it issues a commit. Configure the writer wait timeout in the Integration Service setup.
  • Buffer blocks. Blocks of memory that hold rows of data during a session. You can configure the buffer block size in the session properties, but you cannot configure the number of rows the block holds.

When you run a target-based commit session, the Integration Service may issue a commit before, on, or after the configured commit interval. The Integration Service uses the following process to issue commits:

  • When the Integration Service reaches a commit interval, it continues to fill the writer buffer block. When the writer buffer block fills, the Integration Service issues a commit.
  • If the writer buffer fills before the commit interval, the Integration Service writes to the target, but waits to issue a commit. It issues a commit when one of the following conditions is true:
    - The writer is idle for the amount of time specified by the Integration Service writer wait timeout option.
    - The Integration Service reaches the commit interval and fills another writer buffer.

Note: When you choose target-based commit for a session containing an XML target, the Workflow Manager disables the On Commit session property on the Transformations view of the Mapping tab.
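The interaction of the commit interval, the writer buffer block, and the writer wait timeout can be summarized as a decision rule. The following Python pseudocode is only a sketch of the process described above; the function and parameter names are assumptions, not Integration Service internals:

# Hedged sketch of the target-based commit decision described above. Illustrative only.
def should_commit(rows_since_last_commit, commit_interval,
                  buffer_block_just_filled, writer_idle_seconds, writer_wait_timeout):
    interval_reached = rows_since_last_commit >= commit_interval
    if interval_reached and buffer_block_just_filled:
        # The commit interval was reached and the writer buffer block has now filled.
        return True
    if buffer_block_just_filled:
        # The block filled before the interval: rows are written to the target,
        # but the commit waits for the interval or for the writer wait timeout.
        return False
    if writer_idle_seconds >= writer_wait_timeout:
        # The writer has been idle long enough: issue the deferred commit.
        return True
    return False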

Commit Points Overview

A commit interval is the interval at which the Integration Service commits data to targets during a session. The commit point can be a factor of the commit interval, the commit interval type, and the size of the buffer blocks. The commit interval is the number of rows you want to use as a basis for the commit point. The commit interval type is the type of rows that you want to use as a basis for the commit point. You can choose between the following commit types:
  • Target-based commit. The Integration Service commits data based on the number of target rows and the key constraints on the target table. The commit point also depends on the buffer block size, the commit interval, and the Integration Service configuration for writer timeout.
  • Source-based commit. The Integration Service commits data based on the number of source rows. The commit point is the commit interval you configure in the session properties.
  • User-defined commit. The Integration Service commits data based on transactions defined in the mapping properties. You can also configure some commit and rollback options in the session properties.
Source-based and user-defined commit sessions have partitioning restrictions. If you configure a session with multiple partitions to use source-based or user-defined commit, you can choose pass-through partitioning at certain partition points in a pipeline.

Running Full Pushdown Optimization Sessions

To use full pushdown optimization, the source and target databases must be in the same relational database management system. When you run a session configured for full pushdown optimization, the Integration Service analyzes the mapping from the source to the target or until it reaches a downstream transformation it cannot push to the target database. It generates and executes SQL statements against the source or target based on the transformation logic it can push to the database.

When you run a session with large quantities of data and full pushdown optimization, the database server must run a long transaction. Consider the following database performance issues when you generate a long transaction:
  • A long transaction uses more database resources.
  • A long transaction locks the database for longer periods of time. This reduces database concurrency and increases the likelihood of deadlock.
  • A long transaction increases the likelihood of an unexpected event.
To minimize database performance issues for long transactions, consider using source-side or target-side pushdown optimization.

Running Target-Side Pushdown Optimization Sessions

When you run a session configured for target-side pushdown optimization, the Integration Service analyzes the mapping from the target to the source or until it reaches an upstream transformation it cannot push to the target database. It generates an INSERT, DELETE, or UPDATE statement based on the transformation logic for each transformation it can push to the target database. The Integration Service processes the transformation logic up to the point that it can push the transformation logic to the database. Then, it executes the generated SQL on the target database.

Source-Side Pushdown Optimization Sessions

When you run a session configured for source-side pushdown optimization, the Integration Service analyzes the mapping from the source to the target or until it reaches a downstream transformation it cannot push to the source database. 

The Integration Service generates and executes a SELECT statement based on the transformation logic for each transformation it can push to the database. Then, it reads the results of this SQL query and processes the remaining transformations.

Pushdown Optimization Types

You can configure the following types of pushdown optimization:
  • Source-side pushdown optimization. The Integration Service pushes as much transformation logic as possible to the source database.
  • Target-side pushdown optimization. The Integration Service pushes as much transformation logic as possible to the target database.
  • Full pushdown optimization. The Integration Service attempts to push all transformation logic to the target database. If the Integration Service cannot push all transformation logic to the database, it performs both source-side and target-side pushdown optimization.

Pushdown Optimization Overview

You can push transformation logic to the source or target database using pushdown optimization. When you run a session configured for pushdown optimization, the Integration Service translates the transformation logic into SQL queries and sends the SQL queries to the database. The source or target database executes the SQL queries to process the transformations. 

The amount of transformation logic you can push to the database depends on the database, transformation logic, and mapping and session configuration. The Integration Service processes all transformation logic that it cannot push to a database. 

Use the Pushdown Optimization Viewer to preview the SQL statements and mapping logic that the Integration Service can push to the source or target database. You can also use the Pushdown Optimization Viewer to view the messages related to pushdown optimization. 

For example, in a mapping that builds a new item ID from a store number and the source item ID, the Integration Service generates an INSERT SELECT statement to retrieve the ID, name, and description values from the source table, create new item IDs, and insert the values into the ITEM_ID, ITEM_NAME, and ITEM_DESC columns in the target table. It concatenates the store number 5419, an underscore, and the original ITEM ID to get the new item ID.
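As a rough illustration (not the literal statement the Integration Service generates), the pushed-down SQL for that example might resemble the string below. The table names ITEMS_SRC and ITEMS_TGT and the Oracle-style || concatenation are assumptions; the actual syntax depends on the source and target database:

# Illustrative approximation of the INSERT ... SELECT described above. Names are assumed.
pushdown_sql = """
INSERT INTO ITEMS_TGT (ITEM_ID, ITEM_NAME, ITEM_DESC)
SELECT '5419' || '_' || ITEMS_SRC.ITEM_ID,   -- store number, underscore, original item ID
       ITEMS_SRC.ITEM_NAME,
       ITEMS_SRC.ITEM_DESC
FROM   ITEMS_SRC
"""
print(pushdown_sql)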


Hash User Keys Partition Type

In hash user keys partitioning, the Integration Service uses a hash function to group rows of data among partitions based on a user-defined partition key. You choose the ports that define the partition key:

SQ->FIL->SORT->AGG->TGT

When you specify hash auto-keys partitioning in the preceding mapping, the Sorter transformation receives rows of data grouped by the sort key, such as ITEM_DESC. If the item description is long, and you know that each item has a unique ID number, you can specify hash user keys partitioning at the Sorter transformation and select ITEM_ID as the hash key. This might improve the performance of the session since the hash function usually processes numerical data more quickly than string data.

If you select hash user keys partitioning at any partition point, you must specify a hash key. The Integration Service uses the hash key to distribute rows to the appropriate partition according to group.

For example, if you specify key range partitioning at a Source Qualifier transformation, the Integration Service uses the key and ranges to create the WHERE clause when it selects data from the source. Therefore, you can have the Integration Service pass all rows that contain customer IDs less than 135000 to one partition and all rows that contain customer IDs greater than or equal to 135000 to another partition.

If you specify hash user keys partitioning at a transformation, the Integration Service uses the key to group data based on the ports you select as the key. For example, if you specify ITEM_DESC as the hash key, the Integration Service distributes data so that all rows that contain items with the same description go to the same partition. 
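Conceptually, the grouping works like hashing the key and taking the result modulo the number of partitions, so that equal key values always map to the same partition. The Python sketch below is an analogy only; the hash function, the sample rows, and num_partitions are assumptions, not Integration Service behavior:

# Conceptual sketch of hash-key partitioning: equal key values land in the same partition.
def partition_for(key_value, num_partitions):
    return hash(key_value) % num_partitions

rows = [
    {"ITEM_ID": 101, "ITEM_DESC": "flashlight"},
    {"ITEM_ID": 102, "ITEM_DESC": "compass"},
    {"ITEM_ID": 101, "ITEM_DESC": "flashlight"},   # same key as the first row -> same partition
]

for row in rows:
    print(row["ITEM_ID"], "-> partition", partition_for(row["ITEM_ID"], num_partitions=3))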

To specify the hash key, select the partition point on the Partitions view of the Mapping tab, and click Edit Keys. This displays the Edit Partition Key dialog box. The Available Ports list displays the connected input and input/output ports in the transformation. To specify the hash key, select one or more ports from this list, and then click Add. 

To rearrange the order of the ports that define the key, select a port in the Selected Ports list and click the up or down arrow.

Hash Auto-Keys Partition Type

Use hash auto-keys partitioning at or before Rank, Sorter, Joiner, and unsorted Aggregator transformations to ensure that rows are grouped properly before they enter these transformations.

The following mapping flow shows a mapping with hash auto-keys partitioning. The Integration Service distributes rows to each partition according to group before they enter the Sorter and Aggregator transformations:

SQ->FIL->SORT->AGG->TGT

In this mapping, the Sorter transformation sorts items by item description. If items with the same description exist in more than one source file, rows for the same item can end up in different partitions. Without hash auto-keys partitioning, the Aggregator transformation might calculate the average cost and price for each item incorrectly.

To prevent errors in the cost and price calculations, set a partition point at the Sorter transformation and set the partition type to hash auto-keys. When you do this, the Integration Service redistributes the data so that all items with the same description reach the Sorter and Aggregator transformations in a single partition.

Configuring Partition-Level Attributes

When you use dynamic partitioning, the Integration Service defines the partition-level attributes for each partition it creates at run time. It names the file and directory attributes based on session-level attribute names that you define in the session properties. 

For example, you define the session reject file name as accting_detail.bad. When the Integration Service creates partitions at run time, it creates a reject file for each partition, such as accting_detail1.bad, accting_detail2.bad, accting_detail3.bad.
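The naming pattern in that example can be sketched as follows. The helper below is illustrative only; it simply mirrors the accting_detail.bad example by inserting the partition number before the file extension:

# Hedged sketch: derive partition-level reject file names from the session-level name.
import os

def partition_file_names(session_file_name, num_partitions):
    stem, ext = os.path.splitext(session_file_name)
    return [f"{stem}{i}{ext}" for i in range(1, num_partitions + 1)]

print(partition_file_names("accting_detail.bad", 3))
# ['accting_detail1.bad', 'accting_detail2.bad', 'accting_detail3.bad']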

Using Dynamic Partitioning with Partition Types

The following rules apply to using dynamic partitioning with different partition types:

  • Pass-through partitioning. If you change the number of partitions at a partition point, the number of partitions in each pipeline stage changes. If you use pass-through partitioning with a relational source, the session runs in one partition in the stage.
  • Key range partitioning. You must define a closed range of numbers or date keys to use dynamic partitioning. The keys must be numeric or date datatypes. Dynamic partitioning does not scale partitions with key range partitioning on relational targets.
  • Database partitioning. When you use database partitioning, the Integration Service creates session partitions based on the source database partitions. Use database partitioning with Oracle and IBM DB2 sources.
  • Hash auto-keys, hash user keys, or round-robin. Use hash user keys, hash auto-keys, and round-robin partition types to distribute rows with dynamic partitioning. Use hash user keys and hash auto-keys partitioning when you want the Integration Service to distribute rows to the partitions by group. Use round-robin partitioning when you want the Integration Service to distribute rows evenly to partitions.

Rules and Guidelines for Dynamic Partitioning

Use the following rules and guidelines with dynamic partitioning:

  1. Dynamic partitioning uses the same connection for each partition.
  2. You cannot use dynamic partitioning with XML sources and targets.
  3. You cannot use dynamic partitioning with the Debugger.
  4. Sessions that use SFTP fail if you enable dynamic partitioning.
  5. When you set dynamic partitioning to a value other than disabled, and you manually partition the session on the Mapping tab, you invalidate the session.
  6. The session fails if you use a parameter other than $DynamicPartitionCount to set the number of partitions.

The following dynamic partitioning configurations cause a session to run with one partition:
  • You override the default cache directory for an Aggregator, Joiner, Lookup, or Rank transformation. The Integration Service partitions a transformation cache directory when the default is $PMCacheDir.
  • You override the Sorter transformation default work directory. The Integration Service partitions the Sorter transformation work directory when the default is $PMTempDir.
  • You use an open-ended range of numbers or date keys with a key range partition type.
  • You use datatypes other than numbers or dates as keys in key range partitioning.
  • You use key range relational target partitioning.
  • You create a user-defined SQL statement or a user-defined source filter.
  • You set dynamic partitioning to the number of nodes in the grid, and the session does not run on a grid.
  • You use pass-through relational source partitioning.
  • You use dynamic partitioning with an Application Source Qualifier.
  • You use SDK or PowerConnect sources and targets with dynamic partitioning.


Configuring Dynamic Partitioning

Configure dynamic partitioning on the Config Object tab of session properties. Configure dynamic partitioning using one of the following methods:

  • Disabled. Do not use dynamic partitioning. Defines the number of partitions on the Mapping tab.
  • Based on number of partitions. Sets the partitions to a number that you define in the Number of Partitions attribute. Use the $DynamicPartitionCount session parameter, or enter a number greater than 1.
  • Based on number of nodes in grid. Sets the partitions to the number of nodes in the grid running the session. If you configure this option for sessions that do not run on a grid, the session runs in one partition and logs a message in the session log.
  • Based on source partitioning. Determines the number of partitions using database partition information. The number of partitions is the maximum of the number of partitions at the source. For Oracle sources that use composite partitioning, the number of partitions is the maximum of the number of subpartitions at the source.
  • Based on number of CPUs. Sets the number of partitions equal to the number of CPUs on the node that prepares the session. If the session is configured to run on a grid, dynamic partitioning sets the number of partitions equal to the number of CPUs on the node that prepares the session multiplied by the number of nodes in the grid (see the sketch after this list).
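The following Python sketch summarizes how each method above resolves to a partition count. It paraphrases the list; the function and parameter names are assumptions, not PowerCenter code:

# Hedged summary of the dynamic partitioning methods listed above. Illustrative only.
def resolve_partition_count(method, configured_number=None, grid_nodes=1,
                            source_partition_counts=None,
                            cpus_on_preparing_node=1, runs_on_grid=False):
    if method == "disabled":
        return None                                   # partitions come from the Mapping tab
    if method == "number of partitions":
        return configured_number                      # $DynamicPartitionCount or a number greater than 1
    if method == "nodes in grid":
        return grid_nodes if runs_on_grid else 1      # one partition (plus a log message) off a grid
    if method == "source partitioning":
        return max(source_partition_counts)           # maximum number of partitions at the source
    if method == "number of CPUs":
        nodes = grid_nodes if runs_on_grid else 1
        return cpus_on_preparing_node * nodes         # CPUs on the preparing node x nodes in the grid
    raise ValueError(f"unknown method: {method}")

print(resolve_partition_count("number of CPUs", cpus_on_preparing_node=8,
                              grid_nodes=3, runs_on_grid=True))   # 24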

Dynamic Partitioning

If the volume of data grows or you add more CPUs, you might need to adjust partitioning so the session run time does not increase. When you use dynamic partitioning, you can configure the partition information so the Integration Service determines the number of partitions to create at run time.

The Integration Service scales the number of session partitions at run time based on factors such as source database partitions or the number of nodes in a grid.

If any transformation in a stage does not support partitioning, or if the partition configuration does not support dynamic partitioning, the Integration Service does not scale partitions in the pipeline. The data passes through one partition.

Complete the following tasks to scale session partitions with dynamic partitioning:

  • Set the partitioning. The Integration Service increases the number of partitions based on the partitioning method you choose.
  • Set session attributes for dynamic partitions. You can set session attributes that identify source and target file names and directories. The session uses the session attributes to create the partition-level attributes for each partition it creates at run time.
  • Configure partition types. You can edit partition points and partition types using the Partitions view on the Mapping tab of session properties.

Note: Do not configure dynamic partitioning for a session that contains manual partitions. If you set dynamic partitioning to a value other than disabled and you manually partition the session, the session is invalid.

Partition Types

When you configure the partitioning information for a pipeline, you must define a partition type at each partition point in the pipeline. The partition type determines how the Integration Service redistributes data across partition points.

The Integration Service creates a default partition type at each partition point. If you have the Partitioning option, you can change the partition type. The partition type controls how the Integration Service distributes data among partitions at partition points. You can create different partition types at different points in the pipeline.

You can define the following partition types in the Workflow Manager:

  • Database partitioning: The Integration Service queries the IBM DB2 or Oracle database system for table partition information. It reads partitioned data from the corresponding nodes in the database. You can use database partitioning with Oracle or IBM DB2 source instances on a multi-node tablespace. You can use database partitioning with DB2 targets.
  • Hash auto-keys: The Integration Service uses a hash function to group rows of data among partitions. The Integration Service groups the data based on a partition key. The Integration Service uses all grouped or sorted ports as a compound partition key. You may need to use hash auto-keys partitioning at Rank, Sorter, and unsorted Aggregator transformations.
  • Hash user keys: The Integration Service uses a hash function to group rows of data among partitions. You define the number of ports to generate the partition key.
  • Key range: With key range partitioning, the Integration Service distributes rows of data based on a port or set of ports that you define as the partition key. For each port, you define a range of values. The Integration Service uses the key and ranges to send rows to the appropriate partition, as shown in the sketch after this list. Use key range partitioning when the sources or targets in the pipeline are partitioned by key range.
  • Pass-through: In pass-through partitioning, the Integration Service processes data without redistributing rows among partitions. All rows in a single partition stay in the partition after crossing a pass-through partition point. Choose pass-through partitioning when you want to create an additional pipeline stage to improve performance, but do not want to change the distribution of data across partitions.
  • Round-robin: The Integration Service distributes data evenly among all partitions. Use round-robin partitioning where you want each partition to process approximately the same number of rows.
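To complement the key range entry above, here is a minimal sketch of range-based routing. The 135000 boundary echoes the earlier customer ID example; the ranges, helper function, and sample values are illustrative assumptions:

# Hedged sketch of key range partitioning: a row goes to the partition whose range contains its key.
key_ranges = [
    (0, 135000),          # partition 0: customer IDs below 135000
    (135000, 10**9),      # partition 1: customer IDs of 135000 and above
]

def key_range_partition(customer_id):
    for partition, (start, end) in enumerate(key_ranges):
        if start <= customer_id < end:
            return partition
    raise ValueError(f"no range contains {customer_id}")

print(key_range_partition(99214))    # 0
print(key_range_partition(135000))   # 1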

Partitioning Multiple Input Group Transformations

The master thread creates a reader and transformation thread for each pipeline in the target load order group. A target load order group has multiple pipelines when it contains a transformation with multiple input groups.

When you connect more than one pipeline to a multiple input group transformation, the Integration Service maintains the transformation threads or creates a new transformation thread depending on whether or not the multiple input group transformation is a partition point:

  • Partition point does not exist at multiple input group transformation: When a partition point does not exist at a multiple input group transformation, the Integration Service processes one thread at a time for the multiple input group transformation and all downstream transformations in the stage.
  • Partition point exists at multiple input group transformation: When a partition point exists at a multiple input group transformation, the Integration Service creates a new pipeline stage and processes the stage with one thread for each partition. The Integration Service creates one transformation thread for each partition regardless of the number of output groups the transformation contains.

Number of Partitions

The number of threads that process each pipeline stage depends on the number of partitions. A partition is a pipeline stage that executes in a single reader, transformation, or writer thread. The number of partitions in any pipeline stage equals the number of threads in that stage.

You can define up to 64 partitions at any partition point in a pipeline. When you increase or decrease the number of partitions at any partition point, the Workflow Manager increases or decreases the number of partitions at all partition points in the pipeline. The number of partitions remains consistent throughout the pipeline. If you define three partitions at any partition point, the Workflow Manager creates three partitions at all other partition points in the pipeline. In certain circumstances, the number of partitions in the pipeline must be set to one.

Increasing the number of partitions or partition points increases the number of threads. Therefore, increasing the number of partitions or partition points also increases the load on the node. If the node contains enough CPU bandwidth, processing rows of data in a session concurrently can increase session performance. However, if you create a large number of partitions or partition points in a session that processes large amounts of data, you can overload the system.

The number of partitions you create equals the number of connections to the source or target. If the pipeline contains a relational source or target, the number of partitions at the source qualifier or target instance equals the number of connections to the database. If the pipeline contains file sources, you can configure the session to read the source with one thread or with multiple threads.

For example, if a mapping has four pipeline stages and you define three partitions, the master thread creates three threads at each pipeline stage, for a total of 12 threads, as illustrated below.
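The arithmetic is simply one thread per partition per stage. The short Python sketch below assumes a four-stage pipeline like the example; the stage names are illustrative:

# Illustrative thread count: one thread per partition in each pipeline stage.
stages = ["reader", "transformation stage 1", "transformation stage 2", "writer"]
partitions = 3

threads = [(stage, p) for stage in stages for p in range(1, partitions + 1)]
print(len(threads))       # 12 threads in total
for stage, p in threads:
    print(f"{stage} thread for partition {p}")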

The Integration Service runs the partition threads concurrently. When you run a session with multiple partitions, the threads run as follows:

  • The reader threads run concurrently to extract data from the source.
  • The transformation threads run concurrently in each transformation stage to process data. The Integration Service redistributes data among the partitions at each partition point.
  • The writer threads run concurrently to write data to the target.

Partition Points

By default, the Integration Service sets partition points at various transformations in the pipeline. Partition points mark thread boundaries and divide the pipeline into stages. A stage is a section of a pipeline between any two partition points. When you set a partition point at a transformation, the new pipeline stage includes that transformation.

When you add a partition point, you increase the number of pipeline stages by one. Similarly, when you delete a partition point, you reduce the number of stages by one. Partition points mark the points in the pipeline where the Integration Service can redistribute data across partitions. 

For example, if you place a partition point at a Filter transformation and define multiple partitions, the Integration Service can redistribute rows of data among the partitions before the Filter transformation processes the data. The partition type you set at this partition point controls the way in which the Integration Service passes rows of data to each partition.

Partitioning Attributes

You can set the following attributes to partition a pipeline:

  • Partition points: Partition points mark thread boundaries and divide the pipeline into stages. The Integration Service redistributes rows of data at partition points.
  • Number of partitions: A partition is a pipeline stage that executes in a single thread. If you purchase the Partitioning option, you can set the number of partitions at any partition point. When you add partitions, you increase the number of processing threads, which can improve session performance.
  • Partition types: The Integration Service creates a default partition type at each partition point. If you have the Partitioning option, you can change the partition type. The partition type controls how the Integration Service distributes data among partitions at partition points.

Understanding Pipeline Partitioning Overview

You create a session for each mapping you want the Integration Service to run. Each mapping contains one or more pipelines. A pipeline consists of a source qualifier and all the transformations and targets that receive data from that source qualifier. When the Integration Service runs the session, it can achieve higher performance by partitioning the pipeline and performing the extract, transformation, and load for each partition in parallel.

A partition is a pipeline stage that executes in a single reader, transformation, or writer thread. The number of partitions in any pipeline stage equals the number of threads in the stage. By default, the Integration Service creates one partition in every pipeline stage.

Active Transformation in Informatica

I see many people still confused about the definition of an active transformation. Here is the exact definition:

An active transformation can perform any of the following actions:
  • Change the number of rows that pass through the transformation. For example, the Filter transformation is active because it removes rows that do not meet the filter condition. All multi-group transformations are active because they might change the number of rows that pass through the transformation.
  • Change the transaction boundary. For example, the Transaction Control transformation is active because it defines a commit or roll back transaction based on an expression evaluated for each row.
  • Change the row type. For example, the Update Strategy transformation is active because it flags rows for insert, delete, update, or reject.

Stored Procedure Transformation Overview

A Stored Procedure transformation is an important tool for populating and maintaining databases. Database administrators create stored procedures to automate tasks that are too complicated for standard SQL statements.

A stored procedure is a pre-compiled collection of Transact-SQL, PL/SQL, or other database procedural statements and optional flow control statements, similar to an executable script. Stored procedures are stored and run within the database. You can run a stored procedure with the EXECUTE SQL statement in a database client tool, just as you can run SQL statements. Unlike standard SQL, however, stored procedures allow user-defined variables, conditional statements, and other powerful programming features.

The stored procedure must exist in the database before you create a Stored Procedure transformation, and it can exist in a source, target, or any database with a valid connection to the Integration Service.
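Outside of PowerCenter, the same procedure could be invoked directly from a database client, as the EXECUTE example above suggests. The Python sketch below uses cx_Oracle as one possible driver; the credentials, connect string, and the procedure name GET_ITEM_COUNT are placeholders, not anything defined in this document:

# Hedged sketch: calling an existing stored procedure from a Python database client.
import cx_Oracle

connection = cx_Oracle.connect("scott", "tiger", "dbhost/orclpdb")  # placeholder credentials and DSN
cursor = connection.cursor()

out_count = cursor.var(int)                       # bind variable for an OUT parameter
cursor.callproc("GET_ITEM_COUNT", [out_count])    # hypothetical procedure name
print(out_count.getvalue())

cursor.close()
connection.close()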