Thursday, March 22, 2012

Anyone else finding this painful?

Although I am sure I will grow to love this new product, I can honestly say that every day has been a painful experience. Most of this, once again, is the learning curve of the new product. Worse, each accomplishment leaves us feeling as if we are doing things the wrong way.

Here is an example... We are populating lookup tables from a staging environment. This is pretty straightforward: insert the records that don't already exist in the destination. Unfortunately, the only way we could get this to work properly, short of just writing T-SQL for everything, was to use a Lookup transformation and redirect the error rows into the table. Although we have run numerous tests against it with no problems, it certainly doesn't let you walk away with a very comfortable feeling, and it certainly isn't a choice I would recommend. Are others having this same struggle and discomfort?
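
For what it's worth, the plain T-SQL we were trying to avoid writing for every table would look something like this (table and column names are made up for illustration):

-- Insert staging rows that don't already exist in the destination.
-- Hypothetical names; you would need one of these per lookup table.
INSERT INTO dbo.LookupColor (ColorCode, ColorName)
SELECT s.ColorCode, s.ColorName
FROM staging.Color AS s
WHERE NOT EXISTS (SELECT 1
                  FROM dbo.LookupColor AS d
                  WHERE d.ColorCode = s.ColorCode);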

Select aspirin
into Body
from Ailments
where condition like '%SSIS%'

-Krusty

Krusty,

Don't fret. That's a legitimate approach and indeed is the commonly accepted one. Fair enough, it doesn't look right that good rows are coming down the Error output, but don't worry - it's not a problem.

-Jamie

|||I feel your pain, but Jamie is right: this is a pretty common technique. I was not aware of the Slowly Changing Dimension Wizard, which does a variation of the same thing with a twist I haven't figured out yet. It can be found in the data flow toolbox under the same name and might be worth a look sometime. For now, we do it exactly the way you are doing it - I did not realize there was a Slowly Changing Dimension transform, so I wrote this data flow on my own, whereas the Wizard builds it for you.|||

Note that the SCD uses a LOOKUP under the covers, so it's basically the same thing with a few bells and whistles.

-Jamie

|||

Thanks for the comments; it is somewhat comforting to know that others are using the same approach. I guess after being comfortable in the DTS 2000 world for so long, it is tough to walk into an environment that puts you back at the beginning.

-Krusty

PS: Thanks for all the blogging, Jamie... between here and there I have managed to make some progress.

|||

Hi,

Don't you find that the Slowly Changing Dimension transform works a little too slowly?

Sure, it saves you tons of code, but I would love it to be faster.

Philippe

|||Hey Krusty - I tend to agree with you; there should be an easier and cleaner way to accomplish your task. The techniques described above do work, but they feel like a hack. There should be a transform or something that allows you to "insert where not exists". This is a fairly common scenario in my experience.|||

An Upsert destination adapter would be fantastic: insert the row if it's not there, update it otherwise.
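
Until then, the pattern boils down to two set-based statements; a rough sketch, with hypothetical table names:

-- Update rows that already exist in the destination...
UPDATE d
SET d.ProductName = s.ProductName
FROM dbo.DimProduct AS d
INNER JOIN staging.Product AS s ON s.ProductCode = d.ProductCode;

-- ...then insert the rows that aren't there yet.
INSERT INTO dbo.DimProduct (ProductCode, ProductName)
SELECT s.ProductCode, s.ProductName
FROM staging.Product AS s
WHERE NOT EXISTS (SELECT 1
                  FROM dbo.DimProduct AS d
                  WHERE d.ProductCode = s.ProductCode);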

-Jamie

|||

There would need to be more metadata entered and persisted in the package to support that type of functionality (the unique identifiers that determine whether a row is "new" or already exists) so that SSIS could "know" which operation to perform. Solende's Warehouse Workbench has exactly this functionality in the GUI. I loved it.

Of course, when the decision is to update, it should also compare the actual data in each column of the source against the target, since none of the data may actually have changed. In that case you would probably not want to perform an update on the target for no real reason, as in many systems that would also touch a timestamp column or some other indicator of when rows were last updated. Many systems that have such a column also have logic to make sure an update is performed only when the data has actually changed.
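
A rough sketch of that guard in T-SQL (hypothetical names; nullable columns would also need NULL-safe comparisons, e.g. wrapping both sides in ISNULL):

-- Update only rows whose data actually differs, so unchanged rows
-- don't get their LastUpdated stamp touched for no reason.
UPDATE d
SET d.CustomerName = s.CustomerName,
    d.City = s.City,
    d.LastUpdated = GETDATE()
FROM dbo.Customer AS d
INNER JOIN staging.Customer AS s ON s.CustomerId = d.CustomerId
WHERE d.CustomerName <> s.CustomerName
   OR d.City <> s.City;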

Ken

|||I think it is a very common scenario. My current SSIS project is pulling updated/new rows from Oracle and transferring them to staging tables. I played with the Lookup transforms, but balked at writing basically the same Data Flow stuff many times over for many different tables.

In the end I just wrote minimal Data Flows for the transfer to staging, then produced update/insert T-SQL stored in files and used a ForEach loop to iterate over them. It means that as I implement new table transfers, I can just drop new script files into a particular folder (location configurable with a variable) and know they will be executed in the loop. I took the idea from the AdventureWorks DataWarehouse SSIS example.
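
Each script file is self-contained - essentially the upsert pattern sketched above, once per table. One hypothetical file might look like:

-- e.g. Orders.sql, dropped into the scripts folder and
-- executed by the ForEach loop.
UPDATE d
SET d.OrderStatus = s.OrderStatus
FROM dbo.Orders AS d
INNER JOIN stg.Orders AS s ON s.OrderId = d.OrderId;

INSERT INTO dbo.Orders (OrderId, OrderStatus)
SELECT s.OrderId, s.OrderStatus
FROM stg.Orders AS s
WHERE NOT EXISTS (SELECT 1
                  FROM dbo.Orders AS d
                  WHERE d.OrderId = s.OrderId);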

- Jerzy|||That is a very interesting idea... I will have to take a look at that option. Thanks.
