Part of my work involves a SQL Server 2000 data warehouse,
updated nightly with extracts from an ERP system. I knew that the DTSs
tended to run long, and I wanted to get a better idea of exactly how much data
we were moving around.
The first thing I did was add a table with the following
schema:
CREATE TABLE [tblDtsLog] (
    [fldEntry] [int] IDENTITY (1, 1) NOT NULL ,
    [fldStep] [int] NOT NULL ,
    [fldStepType] [varchar] (5) COLLATE SQL_Latin1_General_CP1_CI_AS NOT NULL ,
    [fldStepName] [varchar] (50) COLLATE SQL_Latin1_General_CP1_CI_AS NOT NULL ,
    [fldTimeStamp] [datetime] NOT NULL ,
    [fldNotes] [varchar] (255) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
    [fldPackage] [varchar] (50) COLLATE SQL_Latin1_General_CP1_CI_AS NULL
) ON [PRIMARY]
This is a pretty general structure, and it allows me to
add different logging and debugging steps to a DTS. fldEntry is simply
an incrementing value to help sort what happened when. fldStep refers
to the step number within the DTS package, and fldStepType is a short description of the DTS
step. I use fldStep because someone always asks me what that step is doing
when we're looking at the numbers, and it's easier to log it than to open the DTS
and find it. fldStepName is there because I inherited almost all of these
DTSs, and the names don't always describe what's going on. fldTimeStamp is
the execution time of the step, fldNotes is any information I want to include
about the step (usually just a rowcount), and fldPackage is the name of the DTS
package.
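As a quick illustration, pulling one package's history back out in execution order is just a matter of sorting on fldEntry (the package name here is only a placeholder):

```sql
-- List everything logged for one package, in the order it happened
select fldStep, fldStepType, fldStepName, fldTimeStamp, fldNotes
from tblDtsLog
where fldPackage = 'Some DTS'
order by fldEntry
```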
One of the early steps for most DTSs is the extract
from the ERP system to an intermediate table. Depending on the package,
the data in the intermediate table is used for comparisons or modified before being
migrated to the production tables. By adding a SQL task immediately
after the extract, I could get an exact count of how many rows we had just
moved:
-- the select populates @@ROWCOUNT with the number of rows extracted
select * from IntermediateTable

insert into tblDtsLog (fldStep, fldStepType, fldStepName, fldTimeStamp, fldNotes, fldPackage)
values (1, 'Begin', 'Extract Step 1', getdate(), @@rowcount, 'Some DTS')
This simply adds one entry to the table above with a
count of how many rows exist in the intermediate table. For other steps, I use the Begin and End flags in conjunction with
getdate() to see how much time elapsed during the processing of a particular
matching or modification step. By sprinkling these liberally through my
DTSs (some packages have 15-20 of these logging sinks in them), I have excellent
insight into what is running slow, and then we can examine each step and try to
optimize it.
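To make the Begin/End timing concrete, here's a sketch of how a processing step might be bracketed, and how the elapsed time can be pulled back out afterward with datediff(). The step number, step names, and package name are just placeholders:

```sql
-- Log the start of the step
insert into tblDtsLog (fldStep, fldStepType, fldStepName, fldTimeStamp, fldPackage)
values (2, 'Begin', 'Match Step 2', getdate(), 'Some DTS')

-- ... the actual matching or modification work runs here ...

-- Log the end of the step
insert into tblDtsLog (fldStep, fldStepType, fldStepName, fldTimeStamp, fldPackage)
values (2, 'End', 'Match Step 2', getdate(), 'Some DTS')

-- Later, report elapsed seconds per step by pairing Begin and End rows
select b.fldStep, b.fldStepName,
       datediff(ss, b.fldTimeStamp, e.fldTimeStamp) as ElapsedSeconds
from tblDtsLog b
join tblDtsLog e
  on b.fldStep = e.fldStep
 and b.fldPackage = e.fldPackage
where b.fldStepType = 'Begin'
  and e.fldStepType = 'End'
  and b.fldPackage = 'Some DTS'
```

Note that the reporting join as written pairs every Begin with every End for a given step, so if the log holds multiple nights of runs you'd want to restrict it to one night's range of fldEntry or fldTimeStamp values first.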
Recently, we used the results of this monitoring to
make a small tweak to one step of one DTS, which cut two hours of processing
time from the overall nightly update!