The performance of an ETL job depends on the system on which the Data Services software is installed, the number of data movements involved, and similar factors.
Several other factors contribute to the performance of an ETL task. They are −
Source Database − The source database should be tuned to execute SELECT statements quickly. This can be done by increasing the size of database I/O operations, increasing the size of the shared buffer to cache more data, and disallowing parallel execution for small tables, where it adds more overhead than benefit.
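The sketch below shows how such settings might be inspected from a script before a run. It assumes a PostgreSQL source reached through psycopg2; the connection string and parameter names are illustrative, and other databases expose equivalent knobs under different names.

```python
# Inspect source-database settings relevant to SELECT performance.
# Assumes a PostgreSQL source via psycopg2; DSN values are illustrative.
import psycopg2

conn = psycopg2.connect("host=src-db dbname=sales user=etl password=secret")
with conn, conn.cursor() as cur:
    # Shared buffer cache that serves repeated SELECTs without disk reads.
    cur.execute("SHOW shared_buffers;")
    print("shared_buffers =", cur.fetchone()[0])

    # Parallel workers help large scans but add startup overhead for
    # small tables, which is why parallelism is often limited for them.
    cur.execute("SHOW max_parallel_workers_per_gather;")
    print("max_parallel_workers_per_gather =", cur.fetchone()[0])
conn.close()
```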
Source Operating System − The source operating system should be configured to read data from the disks quickly. Set the read-ahead protocol to 64 KB.
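On Linux, read-ahead is commonly set per block device with blockdev, which counts in 512-byte sectors, so 64 KB corresponds to 128 sectors. A minimal sketch, assuming a Linux source host and an illustrative device path (run as root):

```python
# Set and verify the read-ahead size for the disk holding the source data.
import subprocess

DEVICE = "/dev/sda"  # illustrative; substitute the actual source-data disk

# 64 KB = 128 sectors of 512 bytes each
subprocess.run(["blockdev", "--setra", "128", DEVICE], check=True)

result = subprocess.run(["blockdev", "--getra", DEVICE],
                        capture_output=True, text=True, check=True)
print(f"read-ahead for {DEVICE}: {result.stdout.strip()} sectors")
```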
Target Database − The target database must be configured to perform INSERT and UPDATE operations quickly.
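One common way to speed up target-side writes is to batch rows so the database handles one multi-row statement instead of one statement per row. A minimal sketch, assuming a PostgreSQL target via psycopg2; the table, columns, and DSN are illustrative:

```python
# Batch INSERTs into the target to cut round trips and per-statement overhead.
import psycopg2
from psycopg2.extras import execute_values

rows = [(1, "alpha"), (2, "beta"), (3, "gamma")]  # stand-in for ETL output

conn = psycopg2.connect("host=tgt-db dbname=dw user=etl password=secret")
with conn, conn.cursor() as cur:
    # One multi-row INSERT per batch instead of one statement per row.
    execute_values(cur, "INSERT INTO stage_orders (id, name) VALUES %s", rows)
conn.close()
```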
Target Operating System − The target operating system should be configured to write data to the disks quickly. You can turn on asynchronous I/O to make the input/output operations as fast as possible.
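On Linux, the kernel caps concurrent asynchronous I/O requests via fs.aio-max-nr, and databases that write through AIO need this limit high enough. A minimal sketch of checking and raising it, assuming a Linux target host (the new value is illustrative; raising it requires root):

```python
# Check and raise the Linux asynchronous I/O request limit.
import subprocess

with open("/proc/sys/fs/aio-max-nr") as f:
    print("current fs.aio-max-nr =", f.read().strip())

# Illustrative new limit; persist it in /etc/sysctl.conf to survive reboots.
subprocess.run(["sysctl", "-w", "fs.aio-max-nr=1048576"], check=True)
```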
Network − The network bandwidth should be sufficient to transfer the data from the source to the target system.
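Whether a link is sufficient comes down to simple arithmetic: volume divided by usable bandwidth. The figures below are illustrative assumptions, not measurements:

```python
# Rough estimate of transfer time for one load window.
data_gb = 200        # volume to move, gigabytes (assumed)
link_gbps = 1        # nominal link speed, gigabits per second (assumed)
efficiency = 0.7     # fraction left after protocol overhead and contention

seconds = (data_gb * 8) / (link_gbps * efficiency)
print(f"~{seconds / 3600:.1f} hours to move {data_gb} GB over a {link_gbps} Gbps link")
```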
BODS Repository Database − Tuning the repository database can also improve the performance of BODS jobs.
Monitor Sample Rate − If you are processing a large data set in an ETL job, set the Monitor Sample Rate to a higher value to reduce the number of I/O calls to the log file, thereby improving performance.
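The effect is easy to see numerically: a monitor entry is written roughly every N rows, so raising the rate shrinks the number of log writes proportionally. The row count and rates below are illustrative:

```python
# Higher Monitor Sample Rate => fewer monitor-log writes for the same rows.
rows_processed = 50_000_000

for sample_rate in (1_000, 50_000):  # illustrative rates, in rows per sample
    log_writes = rows_processed // sample_rate
    print(f"sample rate {sample_rate:>6}: ~{log_writes:,} I/O calls to the log file")
```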
You can also exclude the Data Services logs from the virus scan if a virus scanner is configured on the Job Server, as scanning them can cause performance degradation.
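How the exclusion is added depends on the scanner. A minimal sketch, assuming Windows Defender on the Job Server and an illustrative log directory (run from an elevated prompt):

```python
# Exclude the Data Services log directory from Windows Defender scanning.
import subprocess

LOG_DIR = r"C:\ProgramData\SAP BusinessObjects\Data Services\log"  # illustrative path

subprocess.run(
    ["powershell", "-Command", f"Add-MpPreference -ExclusionPath '{LOG_DIR}'"],
    check=True,
)
```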
Job Server OS − In Data Services, one data flow in a job initiates one ‘al_engine’ process, which in turn initiates four threads. For maximum performance, consider a design that runs one ‘al_engine’ process per CPU at a time. The Job Server OS should be tuned so that the threads are spread across all the available CPUs.
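That rule of thumb reduces to a quick calculation on the Job Server. The sketch below takes the four-threads-per-engine figure from the text; everything else is plain illustration:

```python
# Size concurrent data flows: one al_engine process per CPU, four threads each.
import os

cpus = os.cpu_count() or 1
concurrent_data_flows = cpus               # one al_engine process per CPU
total_threads = concurrent_data_flows * 4  # four threads per al_engine

print(f"CPUs available: {cpus}")
print(f"Run up to {concurrent_data_flows} data flows concurrently "
      f"(~{total_threads} threads spread across the CPUs)")
```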