SQL QUERY NIGHTMARE
Last week, we faced a basic and yet the only major problem that sql developers/admin had to deal with and yes your guess is right, It was with SQL QUERY RUNNING SLOW. I will be explaining the problem below, however I want everyone reading this to help me find answers because I still haven't found one!
One more amazing aspect of the issue was that it worked fairly quickly in Development but was running like a tortoise in Production even though the number of records were practically the same.
I'll start by throwing some light on to the execution plan, so that we narrow down to the code which was the culprit here. The code is enclosed into a stored proc. On contacting my dba to enquire about the issue, I got the following replay:-
I must hand it to our dba to provide such a detailed response.
So my job was now to look at the code and modify it. What I did was to shrink the calender table into a temp table consisting of few rows that would speed up the operation. Below you would see the culprit code and a rectified version of it.
Culprit Code :-
One more amazing aspect of the issue was that it worked fairly quickly in Development but was running like a tortoise in Production even though the number of records were practically the same.
I'll start by throwing some light on to the execution plan, so that we narrow down to the code which was the culprit here. The code is enclosed into a stored proc. On contacting my dba to enquire about the issue, I got the following replay:-
I must hand it to our dba to provide such a detailed response.
So my job was now to look at the code and modify it. What I did was to shrink the calender table into a temp table consisting of few rows that would speed up the operation. Below you would see the culprit code and a rectified version of it.
Culprit Code :-
-- Customer Table and latest advertiser
with complete hierarichy details
INSERT INTO
reporting.Obtain_CustAM_Hier
SELECT cod.AdvertiserId
,
cod.SalesConsultantId AS
end_employee_sid ,
COALESCE(st10.end_org_team_key,
stmax.END_ORG_TEAM_KEY)
end_org_team_key ,
COALESCE(st10.end_org_area_key,
stmax.END_ORG_AREA_KEY)
end_org_area_key ,
COALESCE(st10.end_org_region_key,
stmax.END_ORG_REGION_KEY) end_org_region_key ,
COALESCE(st10.end_org_channel_key,
stmax.END_ORG_CHANNEL_KEY) end_org_channel_key ,
COALESCE(st10.Cycle_Qtr,
latest_cod.FIN_QTR_KEY)
Cycle_Qtr ,
cod.FromDateKey ,
cod.ToDateKey
FROM #ADOWNER cod with (nolock)
INNER JOIN ( SELECT DISTINCT
DT.FIN_QTR_KEY ,
AdvertiserId ,
MAX(fromdatekey) FROMDATEKEY ,
MAX(ToDateKey) TODATEKEY
FROM (
SELECT ADV.* ,
DT.FIN_QTR_KEY END_QTR
FROM ( SELECT
AdvertiserId ,
DT.FIN_QTR_KEY STRT_QTR ,
MAX(FromDateKey) AS fromdatekey ,
MAX(ToDateKey) AS ToDateKey
FROM
#ADOWNER AO
INNER JOIN
CustomerMeasurementTool.workflow.Calendar DT ON DT.DATE_KEY = AO.FromDateKey
AND FIN_QTR_KEY >=
20131
GROUP BY AdvertiserId
,
DT.FIN_QTR_KEY
) ADV
LEFT JOIN
CustomerMeasurementTool.workflow.Calendar DT ON DT.DATE_KEY = ADV.ToDateKey
) AO
LEFT JOIN
CustomerMeasurementTool.workflow.Calendar DT ON DT.DATE_KEY BETWEEN
fromdatekey AND ToDateKey
WHERE CALENDAR_DATE <= CONVERT(DATETIME, CONVERT(VARCHAR, GETDATE()
- 1, 101))
GROUP BY
FIN_QTR_KEY ,
AdvertiserId
) latest_cod ON
cod.AdvertiserId =
latest_cod.AdvertiserId
AND cod.FromDateKey
= latest_cod.FROMDATEKEY
LEFT JOIN
#ALL_CONSULATANTS_Prev st10 ON cod.SalesConsultantId =
st10.end_employee_sid
AND latest_cod.FIN_QTR_KEY
= st10.Cycle_Qtr
LEFT JOIN ( SELECT EHD.EMPLOYEE_SID AS
END_EMPLOYEE_SID ,
EHD.ORG_TEAM_KEY AS
END_ORG_TEAM_KEY ,
EHD.ORG_AREA_KEY AS
END_ORG_AREA_KEY ,
EHD.ORG_REGION_KEY AS
END_ORG_REGION_KEY ,
EHD.ORG_CHANNEL_KEY AS
END_ORG_CHANNEL_KEY
FROM [BIStage].[dbo].[EMPLOYEE_HIST_DIM_V]
EHD with (nolock)
INNER JOIN ( SELECT
EMPLOYEE_SID ,
MAX(BEGIN_DATE_KEY) AS BEGIN_DATE_KEY
FROM [BIStage].[dbo].[EMPLOYEE_HIST_DIM_V] with
(nolock)
WHERE
POSITION_KEY > 0
GROUP BY EMPLOYEE_SID
) EMP ON EHD.EMPLOYEE_SID = EMP.EMPLOYEE_SID
AND EHD.BEGIN_DATE_KEY
= EMP.BEGIN_DATE_KEY
) stmax ON cod.SalesConsultantId =
stmax.END_EMPLOYEE_SID;
CHANGED CODE :-
I have changed the yellowed part of the code and insert new
code higlighted in green as :-
--
Temp Tables to build a mini version of the Calender Table.
select * into
#Mini_Cal from CustomerMeasurementTool.workflow.Calendar where FIN_QTR_KEY >=
20131
SELECT DISTINCT
DT.FIN_QTR_KEY ,
AdvertiserId ,
MAX(fromdatekey) FROMDATEKEY ,
MAX(ToDateKey) TODATEKEY into
#latest_cod
FROM (
SELECT ADV.* ,
DT.FIN_QTR_KEY END_QTR
FROM ( SELECT
AdvertiserId ,
DT.FIN_QTR_KEY STRT_QTR ,
MAX(FromDateKey) AS fromdatekey ,
MAX(ToDateKey) AS ToDateKey
FROM
#ADOWNER AO
INNER JOIN
#Mini_Cal DT ON DT.DATE_KEY
= AO.FromDateKey
GROUP BY AdvertiserId
,
DT.FIN_QTR_KEY
) ADV
LEFT JOIN #Mini_Cal
DT ON DT.DATE_KEY
= ADV.ToDateKey
) AO
LEFT JOIN
#Mini_Cal DT ON DT.DATE_KEY
BETWEEN fromdatekey AND
ToDateKey
WHERE CALENDAR_DATE <= CONVERT(DATETIME, CONVERT(VARCHAR, GETDATE()
- 1, 101))
GROUP BY
FIN_QTR_KEY ,
AdvertiserId
-- Customer Table and latest advertiser
with complete hierarichy details
INSERT INTO
reporting.Obtain_CustAM_Hier
SELECT cod.AdvertiserId
,
cod.SalesConsultantId AS
end_employee_sid ,
COALESCE(st10.end_org_team_key,
stmax.END_ORG_TEAM_KEY)
end_org_team_key ,
COALESCE(st10.end_org_area_key,
stmax.END_ORG_AREA_KEY)
end_org_area_key ,
COALESCE(st10.end_org_region_key,
stmax.END_ORG_REGION_KEY) end_org_region_key ,
COALESCE(st10.end_org_channel_key,
stmax.END_ORG_CHANNEL_KEY) end_org_channel_key ,
COALESCE(st10.Cycle_Qtr,
latest_cod.FIN_QTR_KEY)
Cycle_Qtr ,
cod.FromDateKey ,
cod.ToDateKey
FROM #ADOWNER cod with (nolock)
INNER JOIN
#latest_cod latest_cod ON cod.AdvertiserId = latest_cod.AdvertiserId
AND cod.FromDateKey
= latest_cod.FROMDATEKEY
LEFT JOIN
#ALL_CONSULATANTS_Prev st10 ON cod.SalesConsultantId =
st10.end_employee_sid
AND latest_cod.FIN_QTR_KEY
= st10.Cycle_Qtr
LEFT JOIN ( SELECT EHD.EMPLOYEE_SID AS
END_EMPLOYEE_SID ,
EHD.ORG_TEAM_KEY AS
END_ORG_TEAM_KEY ,
EHD.ORG_AREA_KEY AS
END_ORG_AREA_KEY ,
EHD.ORG_REGION_KEY AS
END_ORG_REGION_KEY ,
EHD.ORG_CHANNEL_KEY AS
END_ORG_CHANNEL_KEY
FROM [BIStage].[dbo].[EMPLOYEE_HIST_DIM_V]
EHD with (nolock)
INNER JOIN ( SELECT
EMPLOYEE_SID ,
MAX(BEGIN_DATE_KEY) AS BEGIN_DATE_KEY
FROM [BIStage].[dbo].[EMPLOYEE_HIST_DIM_V] with
(nolock)
WHERE
POSITION_KEY > 0
GROUP BY
EMPLOYEE_SID
) EMP ON EHD.EMPLOYEE_SID = EMP.EMPLOYEE_SID
AND EHD.BEGIN_DATE_KEY
= EMP.BEGIN_DATE_KEY
) stmax ON cod.SalesConsultantId =
stmax.END_EMPLOYEE_SID;
Apologies for such a messy code, however if you are able to follow or understand the code (by going through highlighted rows) you will find that I have done following things:-
1. Limited the Calendar table.
2. Taken the complicated code out and put it into a temp table.
The above changes did the trick and the stored proc ran within the SLA of 6 minutes (was taking 3 hours on that fatal day).
However the query which remained unanswered is why the stored proc ran much faster in our DEV environment than compared to PROD environment.
I know I may not have given the complete info, please ask questions to solve this conundrum!
Hi, First in comparing test to production, are the databases similar in size?
ReplyDeleteUsually, when identical databases perform so drastically different for the same query, I would suspect that indexing and statistics related to those indexes are probably in need of an update. With a rebuild of all indexes and fresh statistics, let us know how your database query is performing.
Good luck.
Jeff Bennett
Missouri, US
Hi Benett....thanks for your comment. I was able to fine tune the query as you can see in the description. My main reason for writing this blog was to find out what was the issue. Coming back to your suggestion, development database has the stats refresh every week whereas in prod its on daily basis.
DeleteDidnt try rebuilding the indexes though.
Is parameter sniffing a possible culprit?
ReplyDeleteHi Jon. There were no input parameters so that ruled it out.
DeleteIs this a new query in Prod or has it run fine before?
ReplyDeleteHave you compared the query plans for Prod and Development, what about the io statistics for the query?
Have you checked that the prod and development databases are in the same compatibility mode / using the same cardinality estimates.
You say that the databases are about the same size but is the data similar?
1. Old query
Delete2. They were identically the same. In prod cpu io went to 100% which caused the stored proc to run very slow and you eventually have to terminate the execution. Whereas in dev io too reached 100 % but momentarily.
3. Didnt check that would do tommorow, whats the impact for this please?
4. its a transactional data. prod has a bit more data than dev, however not essentially the same
I counted a lot scans:
ReplyDelete> Outer Joins.
> A function call in a WHERE clause.
> Searches on greater than, less than. Note that all inequalities will always cause scans.
I don't have the time to dissect your query but you can assume that a big query with a lot of scans will take some time to execute and consume much if not most of the RAM on your server.
I think you can further improve the performance of your revised query.
I recommend:
> Break the query into smaller pieces.
> Use the sub-queries to populate Temp tables before you execute the main query. In other words, take the sub-queries our of the main query. Join on the temp tables in the main query. Sometimes, you can use Table variables for this. There are rules that matter with Table variables; I don't have the time to cover them here. Used improperly, Table variables cause scans.
> Try to eliminate outer joins wherever possible.
> Try to eliminate any data comparison that is not an equality.
> Eliminate all function calls from WHERE, ON, and HAVING statements.
> Think about every component of your query and how many locks it must hold to complete. Do everything you can to reduce the number of locks held and their duration.
> Get rid of the DISTINCT statement if you can. It causes additional work and execution time.
One last thing. When databases are small and servers are not under load, every query tends to execute quickly. My guess is this might describe your test server.
The production server is likely a very different environment. The database(s) are likely much bigger. It's a contentious environment where queries are competing for server resources and access to the same data. Of course, it's going to take longer to execute than it did on a test server.
In the production environment, you want to minimize the locking and blocking times by dividing your query up into smaller segments that "nibble" at the database rather than take one large bite. Then, at the end, you can assemble the desired result by using one large query that JOINS on your temp tables. They're your temp tables so your query will not block other queries, nor can it be blocked.
Hope this helps.
Thanks for the detailed explanation. As I explained in the description the data in the dev and prod databases is almost the same.
DeleteAll I want to see why a particular stored proc runs for 5 minutes in dev however takes ages to run in prod even though nothing has changed in the prod and the same code was running without any problems since so long.
Another difference that could have in production environment and developer, is the traceflag's.
ReplyDeleteIn case of difference on traceflag 4199 I suggest you to see if the version of SQL Server was the same in both and all hotfixes was installed. This leads you to different plans.
Another shot could be sp_configure: e.g. cost threshold for parallelism
And the last, connection options: Ansi_nulls, Ansi_Warnings..
Thanks Marccelo, I'll look into this!
DeleteAre the server specs the same?
ReplyDeleteHow about the max parallelism?
Any other settings on SQL different between servers?
Any resource governor in place on live?
I would ask the dba to check on that. Thanks
DeleteHave you compared execution plans between 2 environments? Any differences?
ReplyDeleteSomething as simple as disk fragmentation could cause the differences, or how many concurrent processes are running on both environments.
Plans were identical.
DeleteThanks
I appreciate your work on Sql Server It's such a wonderful read on MSBI course. Keep sharing stuffs like this. I am also educating people on similar MSBI training so if you are interested to know more you can watch this MSBI tutorial:-
ReplyDeletehttps://www.youtube.com/watch?v=tFG-VkaSvhI
and regarding the Certification Process and step by step instructions please check on this LInk :-
https://www.youtube.com/watch?v=yf__UkGxQ8c
and if you are a beginner in MSBI then its important to get started from the scratch so check this link for better understanding:-
https://www.youtube.com/watch?v=OzmdY0zCw4g
https://www.youtube.com/watch?v=EdF9tZliIok