You can change data flow properties such as Execute only once, Cache type, Use database links, and Degree of parallelism.
Step 1 − To change the properties of a data flow, right-click the data flow and choose Properties.
You can set the following properties for a data flow.
Sr. No. | Properties & Description |
---|---|
1 | Execute only once − When you specify that a data flow should execute only once, a batch job never re-executes that data flow after it completes successfully. The exception is a data flow contained in a work flow that is a recovery unit that re-executes and has not completed successfully elsewhere outside the recovery unit. It is recommended that you do not mark a data flow as Execute only once if a parent work flow is a recovery unit. |
2 | Use database links − Database links are communication paths between one database server and another. They allow local users to access data on a remote database, which can be on the local or a remote computer and of the same or a different database type. |
3 | Degree of parallelism − Degree of parallelism (DOP) is a data flow property that defines how many times each transform within the data flow replicates to process a parallel subset of data. |
4 | Cache type − You can cache data to improve the performance of operations such as joins, groups, sorts, filtering, lookups, and table comparisons. You select the value for the Cache type option in the data flow Properties window. |
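The idea behind Degree of parallelism can be illustrated with a small Python sketch. It is only a conceptual model, not how the Designer or the job engine is implemented: the `partition`, `transform`, and `run_with_dop` names and the sample rows are hypothetical. The sketch splits the input into `dop` subsets, runs one copy of the transform per subset in parallel, and merges the results.

```python
from concurrent.futures import ThreadPoolExecutor


def partition(rows, dop):
    """Split rows into `dop` round-robin subsets (hypothetical helper)."""
    return [rows[i::dop] for i in range(dop)]


def transform(subset):
    """Hypothetical transform: upper-case the name column of each row."""
    return [{**row, "name": row["name"].upper()} for row in subset]


def run_with_dop(rows, dop):
    """Run one replica of the transform per subset, then merge the outputs."""
    subsets = partition(rows, dop)
    with ThreadPoolExecutor(max_workers=dop) as pool:
        # Each subset is handled by its own replica of the transform.
        results = list(pool.map(transform, subsets))
    return [row for subset in results for row in subset]


rows = [{"id": i, "name": f"customer{i}"} for i in range(6)]
output = run_with_dop(rows, dop=3)
```

With `dop=3`, the six rows are processed as three subsets of two rows each. The merged output contains every input row, but not necessarily in the original order.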
Step 2 − Change properties such as Execute only once, Degree of parallelism, and Cache type.
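The effect of Execute only once can be pictured with the following Python sketch. It is a simplified model under stated assumptions: the `JobRun` class and the data flow names are hypothetical, and the recovery-unit exception described in the table is deliberately not modeled.

```python
class JobRun:
    """Tracks which data flows have completed within one batch job run."""

    def __init__(self):
        self.completed = set()
        self.log = []

    def run_data_flow(self, name, execute_only_once=False):
        """Run a data flow unless it is marked execute-once and already done."""
        if execute_only_once and name in self.completed:
            self.log.append(f"skip {name}")
            return False
        self.log.append(f"run {name}")
        self.completed.add(name)  # assume the data flow completed successfully
        return True


job = JobRun()
# Marked as Execute only once: the second call is skipped.
job.run_data_flow("DF_LoadCustomers", execute_only_once=True)
job.run_data_flow("DF_LoadCustomers", execute_only_once=True)
# Not marked: the data flow runs every time it is called.
job.run_data_flow("DF_LoadOrders")
job.run_data_flow("DF_LoadOrders")
```

In this model, a data flow that appears more than once in the same job does its work only on the first call when the property is set.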
A data flow can extract or load data directly using the following objects −
Source objects − Source objects define the sources from which you extract or read data.
Target objects − Target objects define the targets to which you load or write data.
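The source-to-target pattern can be sketched in Python as follows. This is an analogy only, assuming a delimited flat file as both source and target. The `read_source` and `write_target` helpers and the sample data are hypothetical and are not part of the product.

```python
import csv
import io


def read_source(text):
    """Source object: read rows from a delimited flat file (given as text)."""
    return list(csv.DictReader(io.StringIO(text)))


def write_target(rows, fieldnames):
    """Target object: write rows to a delimited flat file (returned as text)."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


source_text = "id,country\n1,DE\n2,US\n3,DE\n"

# Extract from the source, apply a simple filter, and load into the target.
rows = read_source(source_text)
german_rows = [row for row in rows if row["country"] == "DE"]
target_text = write_target(german_rows, ["id", "country"])
```

Here the filter plays the role of a transform placed between the source and the target.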
The following types of source objects can be used, and each type has its own access method.
Source object | Description | Access method |
---|---|---|
Table | A file formatted with columns and rows as used in relational databases | Direct or through adapter |
Template table | A template table that has been created and saved in another data flow (used in development) | Direct |
File | A delimited or fixed-width flat file | Direct |
Document | A file with an application-specific format (not readable by SQL or XML parser) | Through adapter |
XML file | A file formatted with XML tags | Direct |
XML message | Used as a source in real-time jobs | Direct |
The following target objects can be used, and each type has its own access method.
Target object | Description | Access method |
---|---|---|
Table | A file formatted with columns and rows as used in relational databases | Direct or through adapter |
Template table | A table whose format is based on the output of the preceding transform (used in development) | Direct |
File | A delimited or fixed-width flat file | Direct |
Document | A file with an application-specific format (not readable by SQL or XML parser) | Through adapter |
XML file | A file formatted with XML tags | Direct |
XML template file | An XML file whose format is based on the preceding transform output (used in development, primarily for debugging data flows) | Direct |