Hi, I have an SP that was working fine (took about 10 mins to process).
I've added a section of the SP below for reference. Here is an overview of
my table structure (simplified).
EMPLOYEE TABLE
employeeid
name
IsActive
EMPLOYEE HISTORY TABLE
employeeid
name
IsActive
FromDate -- this signifies the start date the current employee record info
was valid.
ToDate -- this signifies the end date the current employee record info was
valid.
PROCESS TABLE
processid
name
startdate
enddate
Basically, this is what the SP is meant to do...... when the sp runs, it
will only carry out processes where the current date is between startdate
and enddate of the process record. And it should only process employees who
were active between these dates. I can find out backdated info about an
employee in the history table. However, if there have been no changes to an
employee (ie: new joiner), then there will be no records in the history
table so I have to read the current status from the master employee table.
Hope all this makes sense.
OLD CODE:
-- this works fairly quickly, but is not correct as it's
not looking at historic employee data
(employees.IsActive = 1)
NEW CODE:
(
--if a history record exists for current
employee, the read the info
(
(select top
1 IsActive
from
employeehistory eh
where
eh.todate > process.enddate
and
employeeid = employee.employeeid
order by
eh.enddate) = 1
)
OR
--if no history record exists
for current employee, then read current employee
--data in master employee table
(
not exists
(select *
from
employeehistory
where
employeeid = employee.employeeid
)
AND
(employees.IsActive
= 1)
)
)
The changed section has increased the process time massively. Furthermore,
it is doing a lot of work on tempdb and hence using up all the harddisk
space (and eventually fails). The SP is processing approx 500,000 records.
Does anyone know why this would happen, and what I can do to improve the
performace. What is it writing to tempdb? I've tried adding indexes to the
date/employee columns in the history table, but it doesn't help.Well, iterating over 500000 records will take a long time. You don't
provide the code of your SP; so we don't know how the iteration is taking
place and make it hard to give you any relevant help; however, using an
UNION instead of a Cursor for example might be a solution in your case.
Finally, I really don't understand why you have put an Order By in a
subquery.
Sylvain Lafontaine, ing.
MVP - Technologies Virtual-PC
E-mail: http://cerbermail.com/?QugbLEWINF
"amy" <amy@.nospam.com> wrote in message
news:uyBhOeCUGHA.4452@.TK2MSFTNGP12.phx.gbl...
> Hi, I have an SP that was working fine (took about 10 mins to process).
> I've added a section of the SP below for reference. Here is an overview
> of my table structure (simplified).
> EMPLOYEE TABLE
> employeeid
> name
> IsActive
> EMPLOYEE HISTORY TABLE
> employeeid
> name
> IsActive
> FromDate -- this signifies the start date the current employee record info
> was valid.
> ToDate -- this signifies the end date the current employee record info was
> valid.
> PROCESS TABLE
> processid
> name
> startdate
> enddate
> Basically, this is what the SP is meant to do...... when the sp runs, it
> will only carry out processes where the current date is between startdate
> and enddate of the process record. And it should only process employees
> who were active between these dates. I can find out backdated info about
> an employee in the history table. However, if there have been no changes
> to an employee (ie: new joiner), then there will be no records in the
> history table so I have to read the current status from the master
> employee table.
> Hope all this makes sense.
>
> OLD CODE:
> -- this works fairly quickly, but is not correct as
> it's not looking at historic employee data
> (employees.IsActive = 1)
>
> NEW CODE:
> (
> --if a history record exists for
> current employee, the read the info
> (
> (select top
> 1 IsActive
> from
> employeehistory eh
> where
> eh.todate > process.enddate
> and
> employeeid = employee.employeeid
> order by
> eh.enddate) = 1
> )
> OR
>
> --if no history record exists
> for current employee, then read current employee
> --data in master employee table
> (
> not exists
> (select *
> from
> employeehistory
> where
> employeeid = employee.employeeid
> )
> AND
> (employees.IsActive = 1)
> )
> )
>
> The changed section has increased the process time massively.
> Furthermore, it is doing a lot of work on tempdb and hence using up all
> the harddisk space (and eventually fails). The SP is processing approx
> 500,000 records. Does anyone know why this would happen, and what I can do
> to improve the performace. What is it writing to tempdb? I've tried
> adding indexes to the date/employee columns in the history table, but it
> doesn't help.
>
>|||Thanks for your response. There are no cursors being used, just straight
forward selects/joins. The full SP is very long, I have only included the
part that has changed and is making the tempdb grow massively.
The order by is required because I need to find the 1st instance of the
employee history record after the process date.
Eg:
If the process date is 3rd feb and the employee history is:
empid name IsActive fromdate todate
2 tom 0 20th feb 20th march
2 tom 1 1st feb 20th feb
2 tom 0 29th jan 1st feb
2 tom 1 1st jan 29th jan
then i need to find the state of the employee as at 3rd feb. i do this by
finding the first record where the enddate > 3rd feb.
"Sylvain Lafontaine" <sylvain aei ca (fill the blanks, no spam please)>
wrote in message news:OIqBtGEUGHA.5468@.TK2MSFTNGP14.phx.gbl...
> Well, iterating over 500000 records will take a long time. You don't
> provide the code of your SP; so we don't know how the iteration is taking
> place and make it hard to give you any relevant help; however, using an
> UNION instead of a Cursor for example might be a solution in your case.
> Finally, I really don't understand why you have put an Order By in a
> subquery.
> --
> Sylvain Lafontaine, ing.
> MVP - Technologies Virtual-PC
> E-mail: http://cerbermail.com/?QugbLEWINF
>
> "amy" <amy@.nospam.com> wrote in message
> news:uyBhOeCUGHA.4452@.TK2MSFTNGP12.phx.gbl...
>|||amy
It's hard to suggest without seeing the whole code and understand all
business requirements
I'd start looking at an execution plan , whether or not the optimizer ia
available to use indexes
Aaron has a great article at his web site
http://www.aspfaq.com/show.asp?id=2446
"amy" <amy@.nospam.com> wrote in message
news:O46x64GUGHA.3192@.TK2MSFTNGP09.phx.gbl...
> Thanks for your response. There are no cursors being used, just straight
> forward selects/joins. The full SP is very long, I have only included the
> part that has changed and is making the tempdb grow massively.
> The order by is required because I need to find the 1st instance of the
> employee history record after the process date.
> Eg:
> If the process date is 3rd feb and the employee history is:
> empid name IsActive fromdate todate
> 2 tom 0 20th feb 20th march
> 2 tom 1 1st feb 20th feb
> 2 tom 0 29th jan 1st feb
> 2 tom 1 1st jan 29th jan
> then i need to find the state of the employee as at 3rd feb. i do this by
> finding the first record where the enddate > 3rd feb.
>
> "Sylvain Lafontaine" <sylvain aei ca (fill the blanks, no spam please)>
> wrote in message news:OIqBtGEUGHA.5468@.TK2MSFTNGP14.phx.gbl...
>|||[Reposted, as posts from outside msnews.microsoft.com does not seem to make
it in.]
amy (amy@.nospam.com) writes:
> The order by is required because I need to find the 1st instance of the
> employee history record after the process date.
> Eg:
> If the process date is 3rd feb and the employee history is:
> empid name IsActive fromdate todate
> 2 tom 0 20th feb 20th march
> 2 tom 1 1st feb 20th feb
> 2 tom 0 29th jan 1st feb
> 2 tom 1 1st jan 29th jan
> then i need to find the state of the employee as at 3rd feb. i do this by
> finding the first record where the enddate > 3rd feb.
The standard idiom is something like:
SELECT eh.issactive
FROM (SELECT * FROM process WHERE processid = @.processid)
CROSS JOIN (employees e
JOIN employeehistory eh
ON e.employessid = eh.empolyeeid
AND e.startdate = (SELECT MAX(eh2.employeedate)
FROM employeehistory eh2
WHERE eh2.empoloyeeid = eh.employessid
AND e.employeedate <= p.processdate)
Unforteunately, this is not going to perform very well. I guess it is
not possible for you change the tables, but for this sort of operation,
it can be far more effecient to have one row per employee and day, even
if it takes up a lot more disk space.
It helps if you include CREATE TABLE and CREATE INDEX statements for your
tables. Also sample data as INSERT statements with sample data is good,
as that helps to test the logic of a query.
Erland Sommarskog, SQL Server MVP, esquel@.sommarskog.seBooks Online for SQL
Server 2005
athttp://www.microsoft.com/technet/prodtechnol/sql/2005/downloads/books.mspx
Books Online for SQL Server 2000
athttp://www.microsoft.com/sql/prodinfo/previousversions/books.mspx|||Hi Amy
Check out http://www.aspfaq.com/etiquette.asp?id=5006 on how to post useful
DDL and example data. You don't say what should happen if the employee was
only active for part of the period. This will say if the employee was active
at the start of the period or currently active (but not necessarily an
employee at the time of the period) which may be a starting place.
SELECT
e.employeeid
e.name
[EMPLOYEE TABLE] e
JOIN [EMPLOYEE HISTORY TABLE] h ON h.employeeid = e.employeeid AND
h.IsActive = 1 AND ((h.FromDate <= @.FromDate AND h.ToDate >= @.FromDate)
UNION ALL
SELECT
e.employeeid
e.name
[EMPLOYEE TABLE] e
WHERE NOT EXISTS ( SELECT * FROM [EMPLOYEE HISTORY TABLE] h WHERE
h.employeeid = e.employeeid )
AND e.IsActive = 1
John
"amy" wrote:
> Hi, I have an SP that was working fine (took about 10 mins to process).
> I've added a section of the SP below for reference. Here is an overview o
f
> my table structure (simplified).
> EMPLOYEE TABLE
> employeeid
> name
> IsActive
> EMPLOYEE HISTORY TABLE
> employeeid
> name
> IsActive
> FromDate -- this signifies the start date the current employee record info
> was valid.
> ToDate -- this signifies the end date the current employee record info was
> valid.
> PROCESS TABLE
> processid
> name
> startdate
> enddate
> Basically, this is what the SP is meant to do...... when the sp runs, it
> will only carry out processes where the current date is between startdate
> and enddate of the process record. And it should only process employees w
ho
> were active between these dates. I can find out backdated info about an
> employee in the history table. However, if there have been no changes to a
n
> employee (ie: new joiner), then there will be no records in the history
> table so I have to read the current status from the master employee table.
> Hope all this makes sense.
>
> OLD CODE:
> -- this works fairly quickly, but is not correct as it
's
> not looking at historic employee data
> (employees.IsActive = 1)
>
> NEW CODE:
> (
> --if a history record exists for curre
nt
> employee, the read the info
> (
> (select to
p
> 1 IsActive
> from
> employeehistory eh
> where
> eh.todate > process.enddate
> and
> employeeid = employee.employeeid
> order by
> eh.enddate) = 1
> )
> OR
>
> --if no history record exists
> for current employee, then read current employee
> --data in master employee tabl
e
> (
> not exists
> (select *
> from
> employeehistory
> where
> employeeid = employee.employeeid
> )
> AND
> (employees
.IsActive
> = 1)
> )
> )
>
> The changed section has increased the process time massively. Furthermore
,
> it is doing a lot of work on tempdb and hence using up all the harddisk
> space (and eventually fails). The SP is processing approx 500,000 records
.
> Does anyone know why this would happen, and what I can do to improve the
> performace. What is it writing to tempdb? I've tried adding indexes to t
he
> date/employee columns in the history table, but it doesn't help.
>
>
>|||Depending on the circumstances, joining on a sub-query can result in
performance issues.
Instead of doing this:
not exists (select * from employeehistory where employeeid =
employee.employeeid)
Consider doing this:
left join employeehistory on employeehistory.employeeid =
employee.employeeid
Also consider inserting relevent transactions from employeehistory into a
temporary tables and joining with that.
You can use the Display Estimated Execution Plan feature of Query Analyzer
to determine what lookups and indexes are used and compare different
versions of a SQL statement:
http://msdn.microsoft.com/library/d... />
1_5pde.asp
"amy" <amy@.nospam.com> wrote in message
news:uyBhOeCUGHA.4452@.TK2MSFTNGP12.phx.gbl...
> Hi, I have an SP that was working fine (took about 10 mins to process).
> I've added a section of the SP below for reference. Here is an overview
> of my table structure (simplified).
> EMPLOYEE TABLE
> employeeid
> name
> IsActive
> EMPLOYEE HISTORY TABLE
> employeeid
> name
> IsActive
> FromDate -- this signifies the start date the current employee record info
> was valid.
> ToDate -- this signifies the end date the current employee record info was
> valid.
> PROCESS TABLE
> processid
> name
> startdate
> enddate
> Basically, this is what the SP is meant to do...... when the sp runs, it
> will only carry out processes where the current date is between startdate
> and enddate of the process record. And it should only process employees
> who were active between these dates. I can find out backdated info about
> an employee in the history table. However, if there have been no changes
> to an employee (ie: new joiner), then there will be no records in the
> history table so I have to read the current status from the master
> employee table.
> Hope all this makes sense.
>
> OLD CODE:
> -- this works fairly quickly, but is not correct as
> it's not looking at historic employee data
> (employees.IsActive = 1)
>
> NEW CODE:
> (
> --if a history record exists for
> current employee, the read the info
> (
> (select top
> 1 IsActive
> from
> employeehistory eh
> where
> eh.todate > process.enddate
> and
> employeeid = employee.employeeid
> order by
> eh.enddate) = 1
> )
> OR
>
> --if no history record exists
> for current employee, then read current employee
> --data in master employee table
> (
> not exists
> (select *
> from
> employeehistory
> where
> employeeid = employee.employeeid
> )
> AND
> (employees.IsActive = 1)
> )
> )
>
> The changed section has increased the process time massively.
> Furthermore, it is doing a lot of work on tempdb and hence using up all
> the harddisk space (and eventually fails). The SP is processing approx
> 500,000 records. Does anyone know why this would happen, and what I can do
> to improve the performace. What is it writing to tempdb? I've tried
> adding indexes to the date/employee columns in the history table, but it
> doesn't help.
>
>