Thursday, March 8, 2012
Any SQL experts?
Hi, I have an SP that was working fine (took about 10 mins to process).
I've added a section of the SP below for reference. Here is an overview of
my table structure (simplified).
EMPLOYEE TABLE
employeeid
name
IsActive
EMPLOYEE HISTORY TABLE
employeeid
name
IsActive
FromDate -- this signifies the start date the current employee record info
was valid.
ToDate -- this signifies the end date the current employee record info was
valid.
PROCESS TABLE
processid
name
startdate
enddate
Basically, this is what the SP is meant to do: when the SP runs, it will
only carry out processes where the current date is between the startdate
and enddate of the process record, and it should only process employees who
were active between those dates. I can find backdated info about an
employee in the history table. However, if there have been no changes to an
employee (e.g. a new joiner), there will be no records in the history
table, so I have to read the current status from the master employee table.
Hope all this makes sense.
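(For reference, here is a rough sketch of what the simplified tables look
like -- the column types below are only illustrative, not the real
definitions:)

    CREATE TABLE employee (
        employeeid int          NOT NULL PRIMARY KEY,
        name       varchar(100) NOT NULL,
        IsActive   bit          NOT NULL
    );

    CREATE TABLE employeehistory (
        employeeid int          NOT NULL REFERENCES employee (employeeid),
        name       varchar(100) NOT NULL,
        IsActive   bit          NOT NULL,
        FromDate   datetime     NOT NULL,  -- start of the period this row was valid
        ToDate     datetime     NOT NULL   -- end of the period this row was valid
    );

    CREATE TABLE process (
        processid  int          NOT NULL PRIMARY KEY,
        name       varchar(100) NOT NULL,
        startdate  datetime     NOT NULL,
        enddate    datetime     NOT NULL
    );

    -- one of the indexes I tried adding for the history lookups
    CREATE INDEX ix_employeehistory_emp_todate
        ON employeehistory (employeeid, ToDate);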
OLD CODE:
    -- this works fairly quickly, but is not correct as it's not looking
    -- at historic employee data
    (employees.IsActive = 1)
NEW CODE:
    (
        -- if a history record exists for the current employee, then read the info
        (
            (SELECT TOP 1 eh.IsActive
             FROM   employeehistory eh
             WHERE  eh.todate > process.enddate
               AND  eh.employeeid = employee.employeeid
             ORDER BY eh.todate) = 1
        )
        OR
        -- if no history record exists for the current employee, then read the
        -- current employee data in the master employee table
        (
            NOT EXISTS (SELECT *
                        FROM   employeehistory eh2
                        WHERE  eh2.employeeid = employee.employeeid)
            AND (employees.IsActive = 1)
        )
    )
The changed section has increased the process time massively. Furthermore,
it is doing a lot of work in tempdb and hence using up all the hard disk
space (and eventually fails). The SP is processing approx 500,000 records.
Does anyone know why this would happen, and what I can do to improve the
performance? What is it writing to tempdb? I've tried adding indexes to the
date/employee columns in the history table, but it doesn't help.|||Well,
iterating over 500,000 records will take a long time. You don't provide the
code of your SP, so we don't know how the iteration is taking place, which
makes it hard to give you any relevant help; however, using a UNION instead
of a cursor, for example, might be a solution in your case. Finally, I
really don't understand why you have put an ORDER BY in a subquery.
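As a rough sketch of the UNION idea (untested, and only meant to show the
shape of the rewrite -- @processid is a placeholder for however you identify
the process row):

    SELECT e.employeeid
    FROM   employee e
    CROSS JOIN process p
    WHERE  p.processid = @processid
      AND  1 = (SELECT TOP 1 eh.IsActive
                FROM   employeehistory eh
                WHERE  eh.employeeid = e.employeeid
                  AND  eh.todate > p.enddate
                ORDER BY eh.todate)
    UNION ALL
    SELECT e.employeeid
    FROM   employee e
    CROSS JOIN process p
    WHERE  p.processid = @processid
      AND  NOT EXISTS (SELECT * FROM employeehistory eh
                       WHERE eh.employeeid = e.employeeid)
      AND  e.IsActive = 1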
Sylvain Lafontaine, ing.
MVP - Technologies Virtual-PC
E-mail: http://cerbermail.com/?QugbLEWINF
"amy" <amy@.nospam.com> wrote in message
news:uyBhOeCUGHA.4452@.TK2MSFTNGP12.phx.gbl...
> Hi, I have an SP that was working fine (took about 10 mins to process).
> I've added a section of the SP below for reference. Here is an overview
> of my table structure (simplified).
> EMPLOYEE TABLE
> employeeid
> name
> IsActive
> EMPLOYEE HISTORY TABLE
> employeeid
> name
> IsActive
> FromDate -- this signifies the start date the current employee record info
> was valid.
> ToDate -- this signifies the end date the current employee record info was
> valid.
> PROCESS TABLE
> processid
> name
> startdate
> enddate
> Basically, this is what the SP is meant to do...... when the sp runs, it
> will only carry out processes where the current date is between startdate
> and enddate of the process record. And it should only process employees
> who were active between these dates. I can find out backdated info about
> an employee in the history table. However, if there have been no changes
> to an employee (ie: new joiner), then there will be no records in the
> history table so I have to read the current status from the master
> employee table.
> Hope all this makes sense.
>
> OLD CODE:
> -- this works fairly quickly, but is not correct as
> it's not looking at historic employee data
> (employees.IsActive = 1)
>
> NEW CODE:
> (
> --if a history record exists for
> current employee, the read the info
> (
> (select top
> 1 IsActive
> from
> employeehistory eh
> where
> eh.todate > process.enddate
> and
> employeeid = employee.employeeid
> order by
> eh.enddate) = 1
> )
> OR
>
> --if no history record exists
> for current employee, then read current employee
> --data in master employee table
> (
> not exists
> (select *
> from
> employeehistory
> where
> employeeid = employee.employeeid
> )
> AND
> (employees.IsActive = 1)
> )
> )
>
> The changed section has increased the process time massively.
> Furthermore, it is doing a lot of work on tempdb and hence using up all
> the harddisk space (and eventually fails). The SP is processing approx
> 500,000 records. Does anyone know why this would happen, and what I can do
> to improve the performace. What is it writing to tempdb? I've tried
> adding indexes to the date/employee columns in the history table, but it
> doesn't help.
>
>|||Thanks for your response. There are no cursors being used, just
straightforward selects/joins. The full SP is very long; I have only included
the part that has changed and is making tempdb grow massively.
The ORDER BY is required because I need to find the first instance of the
employee history record after the process date.
E.g., if the process date is 3rd Feb and the employee history is:
empid  name  IsActive  fromdate   todate
2      tom   0         20th Feb   20th Mar
2      tom   1         1st Feb    20th Feb
2      tom   0         29th Jan   1st Feb
2      tom   1         1st Jan    29th Jan
then I need to find the state of the employee as at 3rd Feb. I do this by
finding the first record where the todate > 3rd Feb.
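In case it helps, this is roughly the lookup I mean for a single employee
(@empid and @asofdate are just placeholders):

    SELECT TOP 1 eh.IsActive
    FROM   employeehistory eh
    WHERE  eh.employeeid = @empid
      AND  eh.todate > @asofdate
    ORDER BY eh.todate;   -- earliest todate after the as-of date

With the sample rows above and @asofdate = 3rd Feb, this returns the
1st Feb - 20th Feb row, i.e. IsActive = 1.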
"Sylvain Lafontaine" <sylvain aei ca (fill the blanks, no spam please)>
wrote in message news:OIqBtGEUGHA.5468@.TK2MSFTNGP14.phx.gbl...
> Well, iterating over 500000 records will take a long time. You don't
> provide the code of your SP; so we don't know how the iteration is taking
> place and make it hard to give you any relevant help; however, using an
> UNION instead of a Cursor for example might be a solution in your case.
> Finally, I really don't understand why you have put an Order By in a
> subquery.
> --
> Sylvain Lafontaine, ing.
> MVP - Technologies Virtual-PC
> E-mail: http://cerbermail.com/?QugbLEWINF
>
> "amy" <amy@.nospam.com> wrote in message
> news:uyBhOeCUGHA.4452@.TK2MSFTNGP12.phx.gbl...
>|||amy
It's hard to suggest anything without seeing the whole code and understanding
all the business requirements.
I'd start by looking at the execution plan to see whether or not the optimizer
is able to use the indexes.
Aaron has a great article at his web site
http://www.aspfaq.com/show.asp?id=2446
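For example, something like this in Query Analyzer will show whether the
indexes are being used and how much I/O the problem query generates (just
paste the SELECT from the SP into the marked spots):

    SET SHOWPLAN_TEXT ON;
    GO
    -- the problem SELECT goes here
    GO
    SET SHOWPLAN_TEXT OFF;
    GO

    SET STATISTICS IO ON;
    SET STATISTICS TIME ON;
    -- run the problem SELECT here and check the messages output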
"amy" <amy@.nospam.com> wrote in message
news:O46x64GUGHA.3192@.TK2MSFTNGP09.phx.gbl...
> Thanks for your response. There are no cursors being used, just straight
> forward selects/joins. The full SP is very long, I have only included the
> part that has changed and is making the tempdb grow massively.
> The order by is required because I need to find the 1st instance of the
> employee history record after the process date.
> Eg:
> If the process date is 3rd feb and the employee history is:
> empid name IsActive fromdate todate
> 2 tom 0 20th feb 20th march
> 2 tom 1 1st feb 20th feb
> 2 tom 0 29th jan 1st feb
> 2 tom 1 1st jan 29th jan
> then i need to find the state of the employee as at 3rd feb. i do this by
> finding the first record where the enddate > 3rd feb.
>
> "Sylvain Lafontaine" <sylvain aei ca (fill the blanks, no spam please)>
> wrote in message news:OIqBtGEUGHA.5468@.TK2MSFTNGP14.phx.gbl...
>|||[Reposted, as posts from outside msnews.microsoft.com do not seem to make
it in.]
amy (amy@.nospam.com) writes:
> The order by is required because I need to find the 1st instance of the
> employee history record after the process date.
> Eg:
> If the process date is 3rd feb and the employee history is:
> empid name IsActive fromdate todate
> 2 tom 0 20th feb 20th march
> 2 tom 1 1st feb 20th feb
> 2 tom 0 29th jan 1st feb
> 2 tom 1 1st jan 29th jan
> then i need to find the state of the employee as at 3rd feb. i do this by
> finding the first record where the enddate > 3rd feb.
The standard idiom is something like:

    SELECT eh.IsActive
    FROM   (SELECT * FROM process WHERE processid = @processid) AS p
    CROSS JOIN employee e
    JOIN   employeehistory eh
      ON   eh.employeeid = e.employeeid
     AND   eh.fromdate = (SELECT MAX(eh2.fromdate)
                          FROM   employeehistory eh2
                          WHERE  eh2.employeeid = eh.employeeid
                            AND  eh2.fromdate <= p.enddate)
Unfortunately, this is not going to perform very well. I guess it is
not possible for you to change the tables, but for this sort of operation
it can be far more efficient to have one row per employee and day, even
if it takes up a lot more disk space.
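Just to illustrate that idea (the table and column names here are my own
invention):

    CREATE TABLE employeestatusbyday (
        employeeid int      NOT NULL,
        statusdate datetime NOT NULL,
        IsActive   bit      NOT NULL,
        CONSTRAINT pk_employeestatusbyday PRIMARY KEY (employeeid, statusdate)
    );

    -- the "as of" lookup then becomes a plain equality join
    -- (assuming the dates are stored without a time part):
    SELECT d.IsActive
    FROM   employeestatusbyday d
    JOIN   process p ON p.enddate = d.statusdate
    WHERE  p.processid  = @processid
      AND  d.employeeid = @employeeid;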
It helps if you include CREATE TABLE and CREATE INDEX statements for your
tables. Sample data as INSERT statements is also good, as that helps to
test the logic of a query.
Erland Sommarskog, SQL Server MVP, esquel@.sommarskog.se
Books Online for SQL Server 2005 at
http://www.microsoft.com/technet/prodtechnol/sql/2005/downloads/books.mspx
Books Online for SQL Server 2000 at
http://www.microsoft.com/sql/prodinfo/previousversions/books.mspx|||Hi Amy
Check out http://www.aspfaq.com/etiquette.asp?id=5006 on how to post useful
DDL and example data. You don't say what should happen if the employee was
only active for part of the period. The following will return employees who
were active at the start of the period, or who are currently active (but not
necessarily an employee at the time of the period), which may be a starting
place.
SELECT
    e.employeeid,
    e.name
FROM [EMPLOYEE TABLE] e
JOIN [EMPLOYEE HISTORY TABLE] h ON h.employeeid = e.employeeid
    AND h.IsActive = 1 AND h.FromDate <= @FromDate AND h.ToDate >= @FromDate
UNION ALL
SELECT
    e.employeeid,
    e.name
FROM [EMPLOYEE TABLE] e
WHERE NOT EXISTS (SELECT * FROM [EMPLOYEE HISTORY TABLE] h
                  WHERE h.employeeid = e.employeeid)
    AND e.IsActive = 1
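If instead you want employees who were active at any point during the period,
the history branch could use an overlap test along these lines (again only a
sketch, with @FromDate/@ToDate as the period start and end):

SELECT e.employeeid, e.name
FROM [EMPLOYEE TABLE] e
JOIN [EMPLOYEE HISTORY TABLE] h ON h.employeeid = e.employeeid
    AND h.IsActive = 1
    AND h.FromDate <= @ToDate    -- row starts before the period ends
    AND h.ToDate   >= @FromDate  -- row ends after the period starts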
John
"amy" wrote:
> Hi, I have an SP that was working fine (took about 10 mins to process).
> I've added a section of the SP below for reference. Here is an overview of
> my table structure (simplified).
> EMPLOYEE TABLE
> employeeid
> name
> IsActive
> EMPLOYEE HISTORY TABLE
> employeeid
> name
> IsActive
> FromDate -- this signifies the start date the current employee record info
> was valid.
> ToDate -- this signifies the end date the current employee record info was
> valid.
> PROCESS TABLE
> processid
> name
> startdate
> enddate
> Basically, this is what the SP is meant to do: when the SP runs, it will
> only carry out processes where the current date is between the startdate
> and enddate of the process record, and it should only process employees who
> were active between those dates. I can find backdated info about an
> employee in the history table. However, if there have been no changes to an
> employee (e.g. a new joiner), there will be no records in the history
> table, so I have to read the current status from the master employee table.
> Hope all this makes sense.
>
> OLD CODE:
>     -- this works fairly quickly, but is not correct as it's not looking
>     -- at historic employee data
>     (employees.IsActive = 1)
>
> NEW CODE:
>     (
>         -- if a history record exists for the current employee, then read the info
>         (
>             (SELECT TOP 1 eh.IsActive
>              FROM   employeehistory eh
>              WHERE  eh.todate > process.enddate
>                AND  eh.employeeid = employee.employeeid
>              ORDER BY eh.todate) = 1
>         )
>         OR
>         -- if no history record exists for the current employee, then read the
>         -- current employee data in the master employee table
>         (
>             NOT EXISTS (SELECT *
>                         FROM   employeehistory eh2
>                         WHERE  eh2.employeeid = employee.employeeid)
>             AND (employees.IsActive = 1)
>         )
>     )
>
> The changed section has increased the process time massively. Furthermore,
> it is doing a lot of work in tempdb and hence using up all the hard disk
> space (and eventually fails). The SP is processing approx 500,000 records.
> Does anyone know why this would happen, and what I can do to improve the
> performance? What is it writing to tempdb? I've tried adding indexes to the
> date/employee columns in the history table, but it doesn't help.
>
>
>|||Depending on the circumstances, joining on a sub-query can result in
performance issues.
Instead of doing this:
    not exists (select * from employeehistory
                where employeeid = employee.employeeid)
Consider doing this:
    left join employeehistory
        on employeehistory.employeeid = employee.employeeid
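Written out in full, the anti-join version would look something like this
(just a sketch; the IS NULL test is what makes it equivalent to NOT EXISTS):

    SELECT e.employeeid
    FROM   employee e
    LEFT JOIN employeehistory eh ON eh.employeeid = e.employeeid
    WHERE  eh.employeeid IS NULL   -- keep only employees with no history rows
      AND  e.IsActive = 1;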
Also consider inserting the relevant transactions from employeehistory into a
temporary table and joining with that.
You can use the Display Estimated Execution Plan feature of Query Analyzer
to determine what lookups and indexes are used and compare different
versions of a SQL statement:
http://msdn.microsoft.com/library/d...1_5pde.asp
"amy" <amy@.nospam.com> wrote in message
news:uyBhOeCUGHA.4452@.TK2MSFTNGP12.phx.gbl...
> Hi, I have an SP that was working fine (took about 10 mins to process).
> I've added a section of the SP below for reference. Here is an overview
> of my table structure (simplified).
> EMPLOYEE TABLE
> employeeid
> name
> IsActive
> EMPLOYEE HISTORY TABLE
> employeeid
> name
> IsActive
> FromDate -- this signifies the start date the current employee record info
> was valid.
> ToDate -- this signifies the end date the current employee record info was
> valid.
> PROCESS TABLE
> processid
> name
> startdate
> enddate
> Basically, this is what the SP is meant to do...... when the sp runs, it
> will only carry out processes where the current date is between startdate
> and enddate of the process record. And it should only process employees
> who were active between these dates. I can find out backdated info about
> an employee in the history table. However, if there have been no changes
> to an employee (ie: new joiner), then there will be no records in the
> history table so I have to read the current status from the master
> employee table.
> Hope all this makes sense.
>
> OLD CODE:
> -- this works fairly quickly, but is not correct as
> it's not looking at historic employee data
> (employees.IsActive = 1)
>
> NEW CODE:
> (
> --if a history record exists for
> current employee, the read the info
> (
> (select top
> 1 IsActive
> from
> employeehistory eh
> where
> eh.todate > process.enddate
> and
> employeeid = employee.employeeid
> order by
> eh.enddate) = 1
> )
> OR
>
> --if no history record exists
> for current employee, then read current employee
> --data in master employee table
> (
> not exists
> (select *
> from
> employeehistory
> where
> employeeid = employee.employeeid
> )
> AND
> (employees.IsActive = 1)
> )
> )
>
> The changed section has increased the process time massively.
> Furthermore, it is doing a lot of work on tempdb and hence using up all
> the harddisk space (and eventually fails). The SP is processing approx
> 500,000 records. Does anyone know why this would happen, and what I can do
> to improve the performace. What is it writing to tempdb? I've tried
> adding indexes to the date/employee columns in the history table, but it
> doesn't help.
>
>
Any samples or demos of Microsoft Office 2007 working with SQL Server 2005 Analysis Services?
Hi, all experts here,
As we know, SQL Server 2005 Analysis Services has add-ins (such as the data mining and OLAP browsing add-ins) for Microsoft Office 2007. Could any experts here tell me whether there is a demo built on them that we could have a look at?
Thanks a lot and I am looking forward to hearing from you shortly.
With best regards,
Yours sincerely,
There are some virtual labs here:
http://msdn2.microsoft.com/en-us/virtuallabs/aa740409.aspx
Is that the kind of thing you're looking for?
Chris
|||Hi, Chris,
Thanks a lot for the link. It's been very helpful.
I am looking for demos that show Microsoft's integrated BI offering leveraging SQL Server 2005, Office 2007 and ProClarity, since these products integrate closely with one another. It would be very helpful if we could have a look at any demos that show this integration across the products together.
Thanks again.
With best regards,
Yours sincerely,
|||Hello Helen. You will see more of that integration in PerformancePoint analytics.
The last version of ProClarity as an analytical platform is 6.3, recently released. In this release they have dropped the Business Reporter, so the integration is weaker than before.
But wait for PerformancePoint, or sign up for the beta, and you will see more about ProClarity's future.
HTH
Thomas Ivarsson
|||Hi, Thomas,
Thanks a lot. But does the PerformancePoint beta have the full features of the final release?
With best regards,
Yours sincerely,
|||Hi Helen. You would have to ask the development team about that.
Full features are normally presented about 1-2 months before the release.
Kind regards
Thomas
|||Hi, Thomas,
Thanks a lot for your help.
With best regards,
Yours sincerely,
Any Reporting Service Log experts out there?
We are working on developing some statistical reports to help us better
understand how our users are (or are not) leveraging our Reporting Services
implementation. We have been writing execution log data to a database, based
on the Microsoft-provided SSIS package that pulls data from the Report Server
database, and I've discovered a hole in the information that I'm hoping
someone can help me fix.
We use data-driven subscriptions pretty heavily, which of course require
cached data credentials. In the "ExecutionLogs" table, these entries appear
as being requested by "System", and the user shown is our proxy account. If you
review the "ReportServerService_..." log on the report server, you can see
the actual detail surrounding the processing of the subscription, but I can't
find a way to correlate these entries back to the ExecutionLog table. The
"ExecutionLogId" in the ExecutionLog table doesn't reference any of the
uniqueidentifiers that you see in the text ReportServerService log.
My end goal is to be able to update the User column in the ExecutionLog
table with the user who was the actual recipient of the report as identified
in the ReportServerService log file.
Anyone tackle this yet or have any ideas as to how it might be accomplished?|||I'll give you a big hint.. :)
Remember, you are not limited to just select statements in your datasets for
your reports.
I have a dataset that does INSERTS for my data driven reports... (since it
is data driven, you KNOW who the users are that are going to get the
reports)
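For example (purely illustrative -- the table, column and parameter names are
made up), the dataset query behind the data-driven report can log the
recipient as a side effect before returning the report data:

    -- hypothetical audit table
    CREATE TABLE dbo.ReportDeliveryAudit (
        ReportName    sysname       NOT NULL,
        Recipient     nvarchar(256) NOT NULL,
        ExecutionTime datetime      NOT NULL DEFAULT GETDATE()
    );

    -- dataset query: @ReportName and @Recipient would be report parameters
    -- fed by the data-driven subscription
    INSERT INTO dbo.ReportDeliveryAudit (ReportName, Recipient)
    VALUES (@ReportName, @Recipient);

    SELECT @ReportName AS ReportName, @Recipient AS Recipient;
    -- ...plus whatever columns the report actually needs to show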
cheers!
=-Chris
"KS" <ks@.community.nospam> wrote in message
news:C44736CF-304A-4478-98E9-01BDA5A7C0B2@.microsoft.com...
> We are working on developing some statistical reports to help us to better
> undertand how our users are (or are not) leveraging our Reporting Services
> implementation. We have been writing execution log data to a database
> based
> on the Msft provided SSIS package that pulls data from the Report Server
> database and I've discovered a hole in the information that I'm hoping
> someone can help me fix.
> We use data drive subscriptions pretty heavily which of course require
> cached data credentials. In the "ExecutionLogs" table, these entries
> appear
> as being requested by "System" and the user shows our proxy account. If
> you
> review the "ReportServerService_..." log on the report server, you can
> see
> the actual detail surrounding the processing of the subscription, but I
> can't
> find a way to correlate these entries back to the ExecutionLog table. The
> "ExecutionLogId" in the ExecutionLog table doesn't reference any of the
> uniqueidentifiers that you see in the text ReportServerService log.
> My end goal is to be able to update the User column in the ExecutionLog
> table with the user who was the actual recipient of the report as
> identified
> in the ReportServerService log file.
> Anyone tackle this yet or have any ideas as to how it might be
> accomplished?|||Thanks for the hint, Chris.
Unless I'm missing something, however, that still doesn't provide you with
the ability to cross reference the actual report delivery with the
ExecutionLog record to be able to access the other metrics that are being
captured (TimeDataRetrieval, TimeProcessing, TimeRendering, ByteCount,
RowCount...)
I need to find a way to accurately identify a specific data-driven
subscription execution with the corresponding ExecutionLog entry.
Any other ideas? Like I said, all the necessary info is in
ReportServerServices_xxx.log, but I don't see how I can accurately tie that
back to the appropriate ExecutionLogId in the ExecutionLog table. The text
file includes a number of uniqueidentifiers, but none of them match up to
the ExecutionLog. I can't go by exact time either, as the text file is not
precise enough in the event of batch data-driven subscription processing.
"Chris Conner" wrote:
> I'll give you a big hint.. :)
> Remember, you are not limited to just select statements in your datasets for
> your reports.
> I have a dataset that does INSERTS for my data driven reports... (since it
> is data driven, you KNOW who the users are that are going to get the
> reports)
> cheers!
> =-Chris
>
> "KS" <ks@.community.nospam> wrote in message
> news:C44736CF-304A-4478-98E9-01BDA5A7C0B2@.microsoft.com...
> > We are working on developing some statistical reports to help us to better
> > undertand how our users are (or are not) leveraging our Reporting Services
> > implementation. We have been writing execution log data to a database
> > based
> > on the Msft provided SSIS package that pulls data from the Report Server
> > database and I've discovered a hole in the information that I'm hoping
> > someone can help me fix.
> >
> > We use data drive subscriptions pretty heavily which of course require
> > cached data credentials. In the "ExecutionLogs" table, these entries
> > appear
> > as being requested by "System" and the user shows our proxy account. If
> > you
> > review the "ReportServerService_..." log on the report server, you can
> > see
> > the actual detail surrounding the processing of the subscription, but I
> > can't
> > find a way to correlate these entries back to the ExecutionLog table. The
> > "ExecutionLogId" in the ExecutionLog table doesn't reference any of the
> > uniqueidentifiers that you see in the text ReportServerService log.
> >
> > My end goal is to be able to update the User column in the ExecutionLog
> > table with the user who was the actual recipient of the report as
> > identified
> > in the ReportServerService log file.
> >
> > Anyone tackle this yet or have any ideas as to how it might be
> > accomplished?
>
>|||Hmm... this is a nice challenge.
Let's try this - forget the log for the moment - I know I could look this
up - but I'm not at a report server at the moment - does the report
Globals!ExecutionTime match the time stored in the ExecutionLog? I mean, you
know the report name ...
Here is what I was thinking:
select c.name, b.executiontime
from reportserver.dbo.catalog c
inner join reportserver.dbo.executionlog ex
    on (c.ItemID = ex.ReportID)
inner join BobTable b
    on (c.name = b.ReportName)
where b.executiontime between ex.TimeStart and ex.TimeEnd
Where "BobTable" is your table that you store the report name and report
execution time when the report runs.
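Something along these lines, purely as a sketch (BobTable and its columns are
of course made-up names):

    CREATE TABLE dbo.BobTable (
        ReportName    sysname       NOT NULL,
        Recipient     nvarchar(256) NULL,
        ExecutionTime datetime      NOT NULL
    );

    -- populated from the report's own dataset; @ReportName, @Recipient and
    -- @ExecutionTime would be report parameters (with ExecutionTime fed from
    -- Globals!ExecutionTime)
    INSERT INTO dbo.BobTable (ReportName, Recipient, ExecutionTime)
    VALUES (@ReportName, @Recipient, @ExecutionTime);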
=-Chris
"KS" <ks@.community.nospam> wrote in message
news:2E0EE652-8A08-4D89-9DB3-50EB8B367FF6@.microsoft.com...
> Thanks for the hint, Chris.
> Unless I'm missing something, however, that still doesn't provide you with
> the ability to cross reference the actual report delivery with the
> ExecutionLog record to be able to access the other metrics that are being
> captured (TimeDataRetrieval, TimeProcessing, TimeRendering, ByteCount,
> RowCount...)
> Need to find a way to accurately identify a specific data driven
> subscription execution with the corresponding ExecutionLog entry.
> Any other ideas? Like I said, all the necessary info is in
> ReportServerServices_xxx.log, but I don't see how I can accurately tie
> that
> back to the appropriate ExecutionLogId in the ExecutionLog table. The
> text
> file inlcudes a number of uniqueidentifiers, but none of which match up to
> the ExecutionLog. Can't go by exact time either as the text file is not
> precise enough in the event of batch data driven subscription processing.
> "Chris Conner" wrote:
>> I'll give you a big hint.. :)
>> Remember, you are not limited to just select statements in your datasets
>> for
>> your reports.
>> I have a dataset that does INSERTS for my data driven reports... (since
>> it
>> is data driven, you KNOW who the users are that are going to get the
>> reports)
>> cheers!
>> =-Chris
>>
>> "KS" <ks@.community.nospam> wrote in message
>> news:C44736CF-304A-4478-98E9-01BDA5A7C0B2@.microsoft.com...
>> > We are working on developing some statistical reports to help us to
>> > better
>> > undertand how our users are (or are not) leveraging our Reporting
>> > Services
>> > implementation. We have been writing execution log data to a database
>> > based
>> > on the Msft provided SSIS package that pulls data from the Report
>> > Server
>> > database and I've discovered a hole in the information that I'm hoping
>> > someone can help me fix.
>> >
>> > We use data drive subscriptions pretty heavily which of course require
>> > cached data credentials. In the "ExecutionLogs" table, these entries
>> > appear
>> > as being requested by "System" and the user shows our proxy account.
>> > If
>> > you
>> > review the "ReportServerService_..." log on the report server, you can
>> > see
>> > the actual detail surrounding the processing of the subscription, but I
>> > can't
>> > find a way to correlate these entries back to the ExecutionLog table.
>> > The
>> > "ExecutionLogId" in the ExecutionLog table doesn't reference any of the
>> > uniqueidentifiers that you see in the text ReportServerService log.
>> >
>> > My end goal is to be able to update the User column in the ExecutionLog
>> > table with the user who was the actual recipient of the report as
>> > identified
>> > in the ReportServerService log file.
>> >
>> > Anyone tackle this yet or have any ideas as to how it might be
>> > accomplished?
>>