Wednesday, November 9, 2016

Why I Avoid Using Hints in my Database Queries

For those who have watched my Pluralsight courses on either Oracle or SQL Server performance tuning, you will notice that I don't talk about database hints in either course.  This is intentional on my part.  I tend not to use hints in any of my production SQL statements and overall I only use hints in very limited cases.

I made the decision not to cover hints in my courses because too often times, I have seen hints used incorrectly, often by someone who didn't understand what the hint was doing.  I've seen cases where someone read about hints "on the Internet" and thought by dropping a hint into their SQL statement, it would act as some sort of magical performance booster for their statement.  Unfortunately, there is no magic going on, and like any technology or technique, the result can actually be a worse outcome if you don't understand what is going on.

So lets explain what a little bit about hints.

The Query Optimization Process
When you submit a SQL statement to any database, a piece of software called the Query Optimizer parses the SQL statement and determines the fastest way to process that statement.  It is the query optimizer that determines if an index can be used or a table scan should be performed.  It also determines how to perform joins between tables and in what order the tables should be joined if you have multiple join conditions.

To determine the most efficient way to process your statement, the optimizer looks at the statistics of the tables involved in the statement, including the total number of rows and the distribution of those rows in the table.  It also looks at the indexes on the table and the number of unique keys in the index and matches all of this data up with the where clauses and join criteria you have specified in your statement.  Using these statistics, the optimizer can estimate the cost of all of the different ways to perform the SQL statement, and it will pick the lowest cost combination of these operations.

All of this usually happens in 100 milliseconds or less.  And optimizers today are really, really good at picking the right execution plan that will result in the fastest way to execute your statement.

What Does a Hint Do
When you supply a database hint, you are taking control of how the statement will execute and taking this control away from the query optimizer.  The problem is that the best way to execute a statement may change over time.  Maybe a table gets more data or the distribution of the data in a table changes.  Under normal circumstances, the query optimizer can adjust to these changes and come up with a new plan which is most efficient for the current data set.

When you provide a hint, you are taking this flexibility away from the optimizer.  So now as things change in your database, the optimizer cannot adjust.  So now, you have a sub-optimal plan, and many times that plan is much less efficient than the plan the query optimizer would have come up with on its own.  The problem is that while you can figure out the right hint for the way the data looks today, you have no way of knowing what the data will look like tomorrow.  Yet by providing a hint, you are really committing to a specific execution plan, even if that plan is wildly inefficient for tomorrows problem.

Is There Ever a Time to Use Hints
Sometimes I am surprised when I get a certain execution plan.  For example, I may have expected the optimizer to use a different index than what it did.  So in my SQL editor, I'll use a hint to look at that different version of the plan.  In almost every case, the plan produced by the optimizer is less expensive than the plan with the hint.  Its not that I don't trust the optimizer, but sometimes being able to contrast the two execution plans helps me understand why the optimizer is making the decisions it is.

The other time to use a hint would be if you were instructed to do so by Oracle or Microsoft technical support.  I've never had to do this, and I'm guessing these instances are very few and very far between.

Summary
I really feel like it is best to avoid using hints in any production SQL statements your application might run.  At first, they may seem magical.  But the optimizers included in today's database products are really, really good.  When you start including hints in your statement, you are taking control away from the optimizer and its ability to find the best plan for the current state of the table.  This almost always results in a less efficient plan and therefore slower running SQL statement.  So I encourage you to avoid using hints in any production SQL you might run.

Monday, November 7, 2016

MKE DOT NET 2016

On Saturday, October 29th, I attended MKE DOT NET in Pewaukee, WI. This is a one day conference focused around .NET organized by Centare. I was fortunate enough to be accepted to speak at the event, which gave me a chance to share my knowledge as well as attend and learn from others.
One of the big draws of this event is the speaker lineup. In the two years the event has existed, Centare has used their connections in the .NET community to get some high profile speakers for the event. In 2015 (which I did not attend) they had uncle Bob Martin give the keynote. This year, the keynote was delivered by Scott Hanselman of Microsoft. In addition, Stephen Cleary and Ben Day were also speaking at this years event. However, there was also a strong contingent of speakers from the Milwaukee and Madison area, so there was a nice balance of well known names with local practitioners. This was one of the first things that I really liked about the event was this balance of different perspectives, and I hope this continues in future years. Below, I'll give a brief review of the talks I attended, my impressions and my takeaways.

 Keynote - Scott Hanselman
The morning started with a an hour long talk by Scott Hanselman discussing how and why Microsoft is evolving .NET to the .NET Core platform.  We read a lot about .NET Core these days, but there is very little written about the reasons why the change is being made.  Mainly, Microsoft wants a much smaller package that someone can get started with coding .NET without having to download a 2 GB package containing Visual Studio and all the other tooling that is necessary to code in .NET today.  He also talked about how while the .NET Core runtime has been released, a lot of the tooling is still in beta.  This is useful to know where the state of everything is as you decide what version to use for new projects.

The one piece of advice I really liked is when he said "know what level of abstraction you are working at and make sure to know how things work one level deeper".  This is great advice, and something I have always believed in.  I think the is a great way to put this and sums things up nicely.  We always want to know what are tools and frameworks are doing behind the scenes, and this statement encapsulates that advice nicely.

Session #2 - Ben Day - Claims Based Security
I was speaking in the first slot, so the first session I attended was Ben Day's session on Claim based security.  I think he did a really good job of summarizing why role based authentication is really too simplistic of a model in today's world.  He described claims based security, and how most systems will fist check if a user is in a role and then check individual claims.  I've started to see more system s move to more of a claims based model, and this talk provided a good overview of how this is implemented in .NET using the identity provider.


Session #3 - Josh Schultz - Working With Humans
This was a talk about how to work with people, and I found a lot of the advice very practical and useful.  I loved the suggestion of having lunch with a coworker you may not get along with very well and talking about anything outside of work.  A lot of his advice was very practical, like considering other perspectives and not making assumptions about what others are thinking.  All things we know, but that we should constantly be reminded of.  I also liked that he stressed that there is a really important aspect to development which is understanding the problem you are trying to solve, and asking lots of questions to make sure you do.  This was a really good talk.  While Josh claimed he didn't have all the answers, he had a lot of good advice in his presentation.

Session #4 - Andrew Bloechl - Building SaaS Products in the Cloud
This talk was really more like a retrospective of an effort to move one of their applications to the cloud.  He went through the decisions that they made, why they made those decisions and some of the decisions that they later reversed.  I really liked this talk because it wasn't the typical overview of a cloud service.  It was real world experience of the ups and downs.  

For example, they encountered problems where the system was very slow for the first user because Azure services would all shut down after 20 minutes of non-use, meaning the whole stack had to be fired up again when the next user hit the site.  There is now a feature where you can keep everything in an 'always up' state, but that isn't something you typically think about from the start.  They also had issues where people created virtual machines, forgot about them and left them running and got a big bill at the end of the month.  Overall, they did really like the cloud solution and the flexibility it brought, but it was good to hear about some of the challenges they had to overcome.  So often these solutions are presented in such a blue sky manner, its great to see talks like this that remind us there is a learning curve and can help the rest of us get up to speed on that curve.

Session #5 - Samrat Saha - Azure Functions
This was an interesting talk on Azure Functions, which allow you to write a function to be implemented in Azure and not worry at all about how the function is hosted or set up from an infrastructure perspective.  The example the speaker used was a series of functions that could respond to text messages.  He actually talked about how you could chain these functions together by using message queues so that you could keep each function small and concise, which is the point.  He also talked about some of the challenges they had in sharing code between different functions and how they solved that by coming up with a set of shared libraries.  Azure functions is something I want to look at more, so this provided a good introduction to the service.


Other Details
The event was extremely well organized.  When checking in, everyone received a nicely printed program listing all of the talks and what room they were in.  In addition, all of this information was posted on the outside of each room.  Each room had plenty of seating and there was a water station set up in the back

When I went to my room to setup my presentation, they had multiple video adapters up at the podium so you could connect your laptop to the projector.  I carry my own adapters, but this is really thoughtful just in case someone forgot to bring their adapter.  There was a staff member in each room to help with any setup issues as well.  Finally, each room was equipped with a wireless mic.  From a speakers perspective, the room setup was one of the best that I have seen.

Both breakfast and lunch was provided at the event.  In addition, they had a social event afterwards where everyone received two drink tickets plus a nice commemorative glass from the event.  Way better than a t-shirt, which all of us already have zillions of.  Now I have a nice glass that I can think back to the good memories any time I use it.

The cost to attend was $99.  While this might be a detriment to some, I think the event was well worth it.  Some first rate speakers, a nice facility, meals and first rate organization.  I don't have any connection to Centare other than speaking at this conference, but I think they are really to be commended for how well the event was put together.  If you are in the upper Midwest next fall, be sure to watch for when the event is announced, because I really think this event is worth attending.