About myself and everyone else.
Biggest bang for buck.
You finally find the time to do some proactive work: where do you start?
The IT environments we work in are very seldom simple. Performance never depends on just one component, especially when it comes to your database. The database is in many cases the final stop, or a stop-over, for data.
sp_rename / ALTER. You don't want to modify production code; don't put yourself in that position. Not only is it dangerous because of DBA slip-ups, it's also prone to CLM (Career Limiting Move). Be thorough: leave the production proc alone and create your own copy, which you can modify in isolation.
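A minimal sketch of the idea (the proc names here are made up):

```sql
-- Hypothetical example: dbo.usp_GetOrders stands in for your real proc.
-- Script out the production proc, then create a private copy to tune in isolation.
CREATE PROCEDURE dbo.usp_GetOrders_test
AS
BEGIN
    -- Paste the body of dbo.usp_GetOrders here and make your changes.
    SELECT 1;
END;
GO
-- Compare dbo.usp_GetOrders and dbo.usp_GetOrders_test side by side;
-- only ALTER the original once you are happy with the results.
```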
Rollback \ begin tran. Never underestimate the use of this. Although a good DBA should always be able to recover from the log without anyone noticing, I still say prevention is better than the cure.
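The pattern is simple (table and column names below are made up for illustration):

```sql
BEGIN TRAN;

-- A change you are not 100% sure about
UPDATE dbo.Employee
SET    StatusID = 2
WHERE  EmployeeID = 42;

-- Inspect the result while the transaction is still open
SELECT StatusID FROM dbo.Employee WHERE EmployeeID = 42;

-- Happy? COMMIT TRAN. Not happy? Undo everything:
ROLLBACK TRAN;
```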
Apples with Apples. This is a very good point which I added to the post because I once forgot to take the relevant precautions to ensure I was comparing Apples with Apples. In a general sense it's fine to get a rough idea of a performance difference, but in some cases the difference is so small that you need to be 100% sure of the experiment. If you are comparing Apples with Pears, it won't hold up in court!
SET STATISTICS IO \ TIME ON. In many cases I have found this handy for comparative checks between procedures. A great way to do many iterations is running something like this in QA (Query Analyzer):
EXEC proc1 @Setting1
EXEC proc1 @Setting2
EXEC proc1 @Setting3
EXEC proc1 @Setting4
You will get a line-by-line difference in execution times depending on the statement or variable.
You could also alternate the lines with DBCC FREEPROCCACHE / DBCC DROPCLEANBUFFERS etc. Not recommended for production systems.
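Put together, a comparison run might look like this (proc name and parameters are placeholders; the cache flushes are for test systems only):

```sql
SET STATISTICS IO ON;
SET STATISTICS TIME ON;

-- NOT for production: flush plan and buffer caches so each run starts cold.
DBCC FREEPROCCACHE;
DBCC DROPCLEANBUFFERS;
EXEC proc1 @Setting1;

DBCC FREEPROCCACHE;
DBCC DROPCLEANBUFFERS;
EXEC proc1 @Setting2;

-- Compare logical reads and CPU/elapsed times in the Messages tab.
SET STATISTICS IO OFF;
SET STATISTICS TIME OFF;
```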
Caching. Before you publish any results, be aware of caching: this might be the SQL buffer cache, hardware cache, SAN cache and more. Do many iterations with cache flushes in between to make sure it wasn't a fluke. You could capture query plans and compare the "fluke" plan, but this can be a lot of work, and the output is tricky to read if you have big procs.
1-Change-at-a-time. This is extremely important: even if you know both changes can improve performance, make them one at a time. This will give you an idea of the ratio of improvement between changes. Ratios are fun things and most of the time follow the 80/20 principle closely, give or take 10%.
The knee. I think this is a statistical term, but I first understood it while reading the 443 SQL book. The knee refers to the point where the wheels come off and things become exponentially worse because of a load factor. Testing might be going great and then suddenly drop off the cliff.
Patience. You can make thoroughly tested changes that don't have immediate groundbreaking results when implemented in production. Give it some time. When new changes are implemented you should have confidence in your testing (this is why you should be pedantic about testing). A production system might need a couple of days to show visible results; again, this could be due to SAN caching, SQL procedure caching, or even bad statistics.
Think big, outside the box. Think creatively and try to understand the data flow end to end. Pick out spots where there could be a bottleneck. This creative thinking usually delivers a lot of dead ends, but on the flip side it can also lead to great discoveries. I won't discredit this line of thinking, but you have to be in a mindset that it could be all or nothing.
Mission or explore. It is good to have a goal and an idea of what to fix or where to improve. It is also good to take your time and explore: look through logs on systems not related to SQL; you might find a default IIS site with W3C logging and a poorly designed web page munching away at your disk. You may find all kinds of anomalies on a server which could affect your performance. If you are not familiar with the story of the server that got unplugged when the office cleaners came through, search for it, it's a good laugh.
Notes. Make notes as you go along. If you find nuggets or little gems online, write them down. If I had followed through on that, I would have had a book by now. I have a folder with printouts of good articles, such as the one describing the overhead of the SQL_Latin vs Windows collations. This might be a tiny overhead, but on a highly transactional system it could be major. You also won't pick this up with SQL testing unless you test two separate instances on the same box.
Don’t be afraid of the SAN MAN. As a DBA you will most likely need storage at some stage. Unless you are in the cloud, right? Many people seem to think that once you stick the word cloud in your project, storage issues vanish in a puff of smoke. Right. Find out what the SAN guy needs: maybe he needs to move things around to get more space, maybe he needs to reclaim a slot in order to do maintenance on a unit, who knows. You will need the SAN MAN to get info on disks, and especially when you want to start testing with SQLIO.
SQLIOSim vs SQLIO. SQLIOSim is for stress and reliability testing of your storage, not for performance testing; use SQLIO for that. There is plenty of info on the web on this subject.
Query plans. Do you understand a query plan? Here is a way to test it: the hash join. If you can write a query that produces a hash join first time round, then you understand query plans. If you can't, this is a good way to practise until you understand what is meant by each operator. Start with the easy stuff: a table scan, index scan, index seek, bookmark lookups etc. Then move on to the different types of joins. Honestly, I have no idea how to write a query that performs a spool, but I know it's bad!
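A minimal sketch of the exercise (the table names are made up): joining two heaps on columns with no supporting index will typically produce a Hash Match join, because neither input arrives sorted and there is nothing to seek on.

```sql
CREATE TABLE dbo.BigA (SomeKey INT, Payload CHAR(100));
CREATE TABLE dbo.BigB (SomeKey INT, Payload CHAR(100));
-- (load a few thousand rows into each)

SELECT a.SomeKey
FROM   dbo.BigA AS a
JOIN   dbo.BigB AS b
       ON b.SomeKey = a.SomeKey;
-- Check the actual plan: with no indexes and decent row counts you
-- should see a Hash Match. Then add a clustered index on SomeKey to
-- both tables and watch the join type change.
```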
Easy wins. When you look at query plans it's good to know how certain operations work, such as hash joins, sorts etc. Some of these will always be more resource intensive. Unless you have a very good reason to order data, don't. Doing an ORDER BY ... DESC to get a TOP 1 is not always the best way to solve the problem.
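For example (hypothetical table), both of these return the latest order date, but without a supporting index the TOP 1 form needs a sort operator, while MAX can often be answered with a simple scan plus aggregate:

```sql
-- Sort-based approach
SELECT TOP 1 OrderDate
FROM   dbo.Orders
ORDER  BY OrderDate DESC;

-- Aggregate-based approach: same answer, no ORDER BY needed
SELECT MAX(OrderDate) AS OrderDate
FROM   dbo.Orders;
```

Compare the two plans yourself; with an index on OrderDate the difference largely disappears, which is exactly the kind of Apples-with-Apples detail worth checking.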
NOLOCK. The biggest room for improvement here is using the hint in update operations that join to lookup tables.
Say you are updating an Employee table with a status which is found in a lookup table (your UPDATE statement joins the two). When the query runs it will lock the lookup table until the update completes, because the lookup table is participating in the update and SQL needs to be sure the data won't change. This lookup table is a great candidate for a NOLOCK hint. I came across this when I was exploring the DMV sys.dm_db_index_operational_stats: I found a lookup table with 12 rows consuming more than 80% of the waiting time on my system! I couldn't understand it; the values in the lookup don't change, so why the waiting? BTW, it's a very cool DMV, I might write something on it.
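A sketch of the pattern with made-up table and column names. Without the hint, locks on the lookup table are held for the duration of the update; with NOLOCK, the lookup side is read without taking them. Only do this when the lookup data really is static.

```sql
UPDATE e
SET    e.StatusID = s.StatusID
FROM   dbo.Employee     AS e
JOIN   dbo.StatusLookup AS s WITH (NOLOCK)
       ON s.StatusCode = e.StatusCode;
```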
Post-IT. We don’t all have performance data collection systems, so how do we track changes? A good way is Post-it notes. Just note it down when you make any change on any system and stick it somewhere (not on your screen). It also looks cool, and by looking cool and busy you are on your way to management; with good technical know-how you will make a great manager. Remember to train your successor.
It’s also good to read up on the difference between performance, load, stress and scalability testing.