2012年3月22日星期四

Anyone else have problems with Sort/Merge in For Each loop?

Just wondering if I am an isolated case or just doing stuff other people don't. Ive created around a dozen packages for importing various types of data into a datawarehouse and some of the packages have random crashes. Over time I have been trying to tie together what the crashing ones have in common and now believe I have completely isolated it: Sort + Merge inside a dataflow in a for each loop. The more sorts and merges in the flow the more likely the problem is to occur.

This week I created 2 identical packages to import some XML... one used some gnarly complicated substrings and such to extract some data and add it as derived columns while the other used the XML correctly that produced 2 streams (one being meta data - other being detail data) and sorted + merged them. The gnarly substring package could run all night (processes about a file every 2 or 3 seconds so you can do the math) while the merged data flows crashes somewhere between 1 and 4 hrs of running - usually right around 3 hrs (also processes about 1 file every 2 seconds). The file it crashes on wil be completly random and not related to file size - I've had crashes on files with as little as 1 row of data and 1 row of metadata and as big as 1 row of metadata and 100,000+ rows of data. Another more complicated package I created has 4 merges (and thus 8 sorts) and it will crash anywhere between 5 minutes and 20 minutes of running.

The crash is usually one where the whole process just shuts down with a memory dump but occasionally I have gotten errors about out of memory or warnings about threads leaking buffers.

I am trying to do some "research" to see if others have this problem. I've spent 1.5 months writing all ETL jobs in SSIS and its gotten to the point management has asked me to explore other platforms. I do have a case open with Microsoft on the package with 4 merges - but we've spent weeks now just trying to get debugging tools to get them some useful info when the process crashes. I am running SP1 and kb91822.

Anyone else?

FYI to anyone that comes across this thread - msft identified a bug in the sort object (stack heap corruption) and now has a hotfix. If you run into this same situation please contact them (until the patch is released to the public)

没有评论:

发表评论