Third Generation ETL: Delivering the Best Performance (Part 2)
(Continued from Page 1)
Generated Code. The code generated by the ETL tool needs to be native SQL code. A tool that generates generic SQL limits itself to a small fraction of the capabilities of the RDBMS. As mentioned earlier, database vendors have improved SQL with their own transformation functions to a point where all required data transformations can be performed using SQL. A truly flexible tool must be able to take advantage of all of these functions.
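To illustrate the difference between generic and native SQL, here is a minimal sketch of a generator that emits each vendor's own function for one transformation (concatenating the values of a column within a group). The names `STRING_AGG_TEMPLATES` and `generate_string_agg` are hypothetical and do not come from any particular ETL product; the SQL functions themselves (`LISTAGG`, `STRING_AGG`, `GROUP_CONCAT`) are the native aggregates of Oracle, PostgreSQL, and MySQL respectively.

```python
# Hypothetical per-dialect templates for one transformation: concatenating
# the values of a column within each group. Each entry uses that vendor's
# native aggregate function rather than a lowest-common-denominator form.
STRING_AGG_TEMPLATES = {
    "oracle": "LISTAGG({col}, '{sep}') WITHIN GROUP (ORDER BY {col})",
    "postgresql": "STRING_AGG({col}, '{sep}' ORDER BY {col})",
    "mysql": "GROUP_CONCAT({col} ORDER BY {col} SEPARATOR '{sep}')",
}


def generate_string_agg(dialect, col, sep=","):
    """Return the native SQL expression for the given dialect (illustrative only)."""
    try:
        template = STRING_AGG_TEMPLATES[dialect]
    except KeyError:
        raise ValueError(f"no native template for dialect {dialect!r}") from None
    return template.format(col=col, sep=sep)


if __name__ == "__main__":
    for name in STRING_AGG_TEMPLATES:
        print(name, "->", generate_string_agg(name, "product_name"))
```

A real tool would cover far more functions and would also handle quoting and escaping; the point here is only that the same logical transformation maps to different native syntax on each platform.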
Likewise, optimizing the generated code must be easy. Generating SQL is not enough to guarantee good performance, because the code generated by the ETL tool cannot be optimal in every case. It is thus important that optimizations be implemented as reusable components, so that any improvement carries over to new projects.
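One way to picture a reusable optimization component is a load strategy defined once as a template and applied to many mappings. The sketch below is an assumption for illustration only: `INCREMENTAL_INSERT` and `render_load` are hypothetical names, and the SQL is a plain "insert only the rows that are missing" pattern rather than the output of any specific tool.

```python
from string import Template

# Hypothetical reusable strategy: insert only the source rows whose key is
# not already present in the target. Tuning this one template improves every
# mapping that uses it.
INCREMENTAL_INSERT = Template(
    "INSERT INTO $target ($columns) "
    "SELECT $columns FROM $source s "
    "WHERE NOT EXISTS (SELECT 1 FROM $target t WHERE t.$key = s.$key)"
)


def render_load(target, source, columns, key, strategy=INCREMENTAL_INSERT):
    """Render the SQL for one mapping from a shared strategy template."""
    return strategy.substitute(
        target=target,
        source=source,
        columns=", ".join(columns),
        key=key,
    )


if __name__ == "__main__":
    print(render_load("dw_customer", "stg_customer", ["id", "name"], "id"))
```

Because the strategy is passed in as a parameter, a project could swap in a differently tuned template for a given platform without touching the individual mappings.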
Platform. Not every database fits every need. Some industries or companies have their preferred RDBMS vendor; some applications require a specific technology. The ETL tool should respect these choices, even to the point of letting the organization decide where the metadata and project repository are stored. This not only improves overall maintainability, but also enables organizations to take advantage of in-house expertise, which favors better performance because the environment is more homogeneous.
ETL or ELT?
The third generation of ETL products introduces a new acronym: ELT (Extract, Load, and Transform) in place of ETL (Extract, Transform, and Load). With the ELT approach, data is transformed on the target after being loaded. If the target database is powerful enough, it can perform all transformations, optimizing both performance and investment.
In such a case, no unnecessary data is transferred over the network. This approach takes full advantage of the power of the RDBMS while retaining the flexibility to revert to a more traditional ETL architecture whenever that is needed. A good third generation ETL tool can implement ELT using any database.
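The ELT flow can be sketched in a few lines. This example is an illustration under stated assumptions, not a reference implementation: it uses Python's built-in `sqlite3` module with an in-memory database as a stand-in for the target RDBMS, and the table names (`stg_orders`, `customer_totals`) and the function `run_elt` are hypothetical. Raw rows are loaded untouched into a staging table, and the cleansing and aggregation then run as a single set-based SQL statement inside the database.

```python
import sqlite3


def run_elt(rows):
    """Load raw rows into staging, then transform them inside the database."""
    conn = sqlite3.connect(":memory:")  # stand-in for the target RDBMS
    conn.execute("CREATE TABLE stg_orders (customer TEXT, amount_cents INTEGER)")
    conn.execute(
        "CREATE TABLE customer_totals (customer TEXT PRIMARY KEY, total REAL)"
    )

    # Extract + Load: the raw data goes to the target as-is.
    conn.executemany("INSERT INTO stg_orders VALUES (?, ?)", rows)

    # Transform: cleansing and aggregation are done by the target's SQL engine.
    conn.execute(
        """
        INSERT INTO customer_totals (customer, total)
        SELECT UPPER(TRIM(customer)), SUM(amount_cents) / 100.0
        FROM stg_orders
        GROUP BY UPPER(TRIM(customer))
        """
    )
    conn.commit()

    result = conn.execute(
        "SELECT customer, total FROM customer_totals ORDER BY customer"
    ).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    sample = [(" acme ", 1250), ("Acme", 750), ("globex", 500)]
    print(run_elt(sample))
```

In a traditional ETL architecture the trimming, upper-casing, and summing would happen in a separate engine before the load; here the only work done outside the database is moving the raw rows.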
Data Access Technologies
There are numerous ways to extract and load data: each RDBMS provides specific technologies and utilities in addition to the industry-standard drivers.