Author Archives: RainerUnwin

Attitude

In order to illustrate a point I thought I would share a story which is loosely based on some events in my life although the scenario and people have been changed.

Once upon a time there was a group of builders who were working in the middle of a blazing hot summer. The scale of the building project made it seem like no progress was being made, and everyone became increasingly disheartened. Everyone? Why no, there was one jolly fellow, named Jack, who whistled all day and always had a joke to tell. No matter how hot the day, how strong the sun beating down on him or how thirsty he became while working, his attitude never wavered. “How is this possible?” thought his co-workers, and yet no-one had bothered to ask.

One day, even hotter than all the others, Jack was whistling away and cracking jokes like always. Suddenly one colleague, David, asked Jack how he could be so merry in such conditions. Everyone stopped to listen to Jack’s answer. Jack thought about the question for what seemed like an eternity to everyone else. Then, with a burst of laughter and an innocent grin, Jack just chuckled “I just decided I would be happy!”

The fact is that we can choose the job we work to at least some degree. We can change jobs, or stick around. We can choose how we do the job we are employed to do, as long as our employer keeps us in the job. We can even choose how we respond to what we are asked to do. Beyond that we cannot choose the exact work that we will do because we are being paid to do a specific job. Projects come and go and the project work must be done. It may be boring, it may be repetitive, it may be frustrating. Yet only you choose your attitude. You have the ability to bring a good attitude or a bad attitude. You have the power to have a laugh at work or be bored. You have a choice to play around a bit, have a coffee and a good time. You get to choose how you interact with other people and with your work.

What does this have to do with agile? It’s simple. Someone else may give you a process to follow or a project to work on. Only you decide how you will approach the work. If anyone says that they are: –

  • Bored
  • Cannot think of anything to improve in a retrospective
  • It’s impossible to release more quickly
  • The project sucks
  • The software stack is terrible
  • The existing codebase looks like horse vomit covering dog poop, and smells worse
  • etc.

You are still the people who choose how you respond. If you bring a positive attitude and try and find a way, or choose to just make the most of the time with your colleagues, you can still have a good (Or at least better) time at work.

So what attitude will you bring to work? How will you respond to what you see? Will you roll your eyes and make a snarky remark, or will you crack a joke and make your work life better? Most of us have resigned and moved jobs often enough to know that the next job will likely be very similar to the previous one. It’s not the job that matters, it’s what attitude you decide to bring to work. So, what will it be? Sad or happy?

Writing software and analogies to manufacturing

There is something I heard at work the other day that I wanted to discuss. In my many years in various development departments I have often heard about “Feature factories”, or how your process is like building a bridge or something similar.

It seems as though we like to make comparisons to manufacturing for some reason. Maybe this is because that is where many managers came from in the past. Another possible explanation is that software engineers need to explain things to managers who have never written software. Whatever the reason is, there seems to be a general misunderstanding around writing software, which is in fact very different to manufacturing. The downside is that, in my opinion, this has consequences for developers.

When we explain everything in a manner related to manufacturing we make it sound as though writing software is something that can be done once and then repeated easily. This is often not the case. When we are given a problem to solve in software, there are often differences to what we have done in the past. These subtle differences mean that we need to treat the problem as being new and different. Sure some components can be re-used and we can use various frameworks. Still it is quite different to manufacturing.

So what is a better analogy for writing software? I think that something like writing a book or doing a scientific experiment is closer to the mark. I think that experimentation is a better analogy than manufacturing. Writing a book is not a repeatable thing, each one is a bit different. The process might involve steps like brainstorming, creating a rough draft and continuing to refine it, etc. In the same way science is about trying things until you find what works. Software is like that, a little different each and every time.

Ok so it’s about experimentation, but why is that different to manufacturing? It is different because it is like the design phase of making something new each time. You wouldn’t want people to experiment with how to build a road each time they have to build a new road, you wouldn’t want every possible design of bridge built one after another in the same place to find the best fit (Well you’d have to knock the previous iteration down between each build as well so that’s kinda bad and wasteful too). Building something, even something complex like a car, is something we understand very well. It is the design that is a little different each time. Writing software is like that design phase, not the manufacturing phase.

When we think of car production lines we aren’t thinking of the design phase but the production phase. That is analogous to making a new copy of the same software. If you remember the glorious times of software on CDs and DVDs, then the production line is the act of producing the copies of the software to be sold by writing the CDs/DVDs. The act of writing software is like designing a new model of car.

In summary

When we talk to management and they make the mistake of using a poor analogy, we should stress the difference and that we cannot have a team of developers working like a “Feature factory” similar to how a car assembly line works. We are more like car designers or scientists trying to find the right solution to a new problem. The target is always in motion and we will make “Mistakes” and learn along the way.

The principles behind the (agile) manifesto

We’ve looked at the manifesto for agile software development. There are also 12 principles behind the manifesto. As we are about to embark on a journey of exploration of some of the “agile” methodologies, it might be instructive to start by reviewing these principles and to try and group them a little bit.

  1. Our highest priority is to satisfy the customer through early and continuous delivery of valuable software.
  2. Welcome changing requirements, even late in development. Agile processes harness change for the customer’s competitive advantage.
  3. Deliver working software frequently, from a couple of weeks to a couple of months, with a preference to the shorter timescale.
  4. Business people and developers must work together daily throughout the project.
  5. Build projects around motivated individuals. Give them the environment and support they need, and trust them to get the job done.
  6. The most efficient and effective method of conveying information to and within a development team is face-to-face conversation.
  7. Working software is the primary measure of progress.
  8. Agile processes promote sustainable development. The sponsors, developers, and users should be able to maintain a constant pace indefinitely.
  9. Continuous attention to technical excellence and good design enhances agility.
  10. Simplicity–the art of maximising the amount of work not done–is essential.
  11. The best architectures, requirements, and designs emerge from self-organizing teams.
  12. At regular intervals, the team reflects on how to become more effective, then tunes and adjusts its behavior accordingly.

When we look at the principles only a few major categories jump out at me. What do you see? The categories that I see are:

  • People focused
  • Software focused
  • Process focused

Naturally we would expect the principles behind the “Manifesto for agile software development” to be principally focused on software and processes wouldn’t we? What is interesting is that I wouldn’t categorise them that way. What might seem reasonable on the surface (Process and software focus) doesn’t seem to be the case. Allow me to split them up a bit by rearranging them the way I see them (For brevity I will only write out the part that stands out to me):

People focused

  1. Principle 1 – Customer satisfaction.
  2. Principle 2 – Changing requirements for customer advantage.
  3. Principle 4 – Developers and business people working together daily.
  4. Principle 5 – Teams of motivated individuals.
  5. Principle 6 – Face-to-face communication.
  6. Principle 8 – People able to maintain a constant pace.
  7. Principle 11 – Self-organising teams.

Software focused

  1. Principle 3 – Frequent delivery of software.
  2. Principle 7 – Working software as progress.
  3. Principle 9 – Focus on technical excellence.

Process focused

  1. Principle 10 – Reduce work done.
  2. Principle 12 – Inspect and adapt.

Whilst it would be easy to put a few of the principles in more than one category, I put them in the category where I think their focus lies. When we do this we see that 7 of 12 principles focus mostly on people, only 3 have a mainly software focus and just 2 are primarily process focused. Surprising? I don’t think so. People do the work and generally for other people. A focus on people is important. We also see a similar focus on people in the values of the manifesto.
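For anyone who wants to check the tally, here is a minimal sketch in Python. The mapping simply restates my grouping from the lists above; the names GROUPING and tally are made up for this illustration, and someone who groups the principles differently will of course get different numbers.

```python
from collections import Counter

# My grouping of the 12 principles (principle number -> category).
# This restates the lists above; it is an opinion, not part of the manifesto.
GROUPING = {
    1: "people", 2: "people", 4: "people", 5: "people",
    6: "people", 8: "people", 11: "people",
    3: "software", 7: "software", 9: "software",
    10: "process", 12: "process",
}


def tally(grouping):
    """Count how many principles fall into each category."""
    return Counter(grouping.values())


counts = tally(GROUPING)
for category, count in counts.most_common():
    print(f"{category}: {count} of {len(GROUPING)}")
```

Moving a principle to another category only means changing one entry in the mapping, which makes it easy to compare your own grouping with mine.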

This also throws up an interesting question. If you are struggling to adopt agile ways of working, maybe you could ask yourself where your focus is. Are you focused on people? I’ve seen too many companies and teams try and become more agile by focusing on process and methodologies. It is almost like many people think that you can just copy another team or just do scrum and magically become agile.

In conclusion
If you focus on process and ignore the people then you will likely struggle to become more agile. It has been my observation, at the various companies that I have worked at, that most “Agile transformations” focus on process over people. Companies that think that introducing some frameworks will magically make them agile often make little headway. The companies that focus on the people are much more likely to succeed.

My Agile – chosen methods

There are many methods and methodologies that fall under the umbrella of “Agile”. Some of them have become very closely associated with agility, and even seem to be treated synonymously with “Agile”. Yet as we have seen, the manifesto for agile software development consists of a few short statements centred around 4 values statements. There are also 12 principles that we will look at over time on this blog. Whilst those principles are important, they are less relevant to the point I want to make here. Given what the manifesto for agile software development says, how do we get statements like “Agile says…”, “Agile user stories” or “Agile meetings”? In my opinion the answer to that lies in what I have chosen to name “My Agile”. That is, people who have, probably by choice, focused on a single method or methodology.

There seem to be more and more people who conflate their chosen, or favourite, method(ology), with the only way to “Do Agile”. This makes no sense because you cannot “Do Agile”. Agility, based on the manifesto, is a few values and some statements that we treat as facts (The principles). You cannot “Do” values and principles, they are true or they are not true. Values and principles guide (Or allow us to evaluate) our behaviours and mindset. They guide how we should do things rather than being what we do. Consider the following “Do value individuals and interactions over processes and tools”. That sentence makes no sense. However, if we ask people “How do we know that you’ve chosen a way of working that values individuals and interactions over processes and tools?” then that makes sense, and we can answer it. A good test for whether “Agile says…” makes sense is to replace the word “Agile” with how we want to work. For example we might want to work in a manner as to be flexible or adaptive, so let us try that in a sentence instead by considering the following in the context of a daily meeting (As an example): –

  • agile says that we should have a daily.
  • flexible says that we should have a daily.
  • adaptive says that we should have a daily.
  • Scrum says that we should have a daily.
  • Kanban says that we should have a daily.

Only the last two of those sentences make sense. We can be agile, flexible or adaptive and yet they cannot tell us what to do. In the same way Scrum or Kanban can guide us what to do and yet we cannot be Scrum or Kanban. We can however be an agile team, a Scrum team, an XP team or a Kanban team because the words do make sense in that context, albeit with different meanings. In the case of an XP team it is actually a team that uses eXtreme Programming values, principles and practices. The same would hold for a Kanban or Scrum team. Whereas in the context of an agile team it would be a team that exhibits the traits of agility, such as being able to adjust, move and understand quickly and easily.
By the way I deliberately wrote agile, flexible and adaptive without capitals in the bulleted list because they are not nouns and I wanted to underline that point.

We can be agile or not be agile, but we cannot do agile or not do agile. We can gauge our agility and become more agile over time. Imagine that agility is like a sliding scale from 0 to 100* and we are somewhere on that scale. The various methodologies can help us move up that scale. Some of the methods and methodologies we choose provide more support in this regard than others. Yet they can all guide us when we use them properly.

* Just to note: that sliding scale of agility keeps moving the bar as to what 100 means. What might have been deemed as in the range 90-100 agility 10 to 15 years ago might be much lower today.

I suggest that we be open and honest and not try to shoehorn people into a way of working that muddies the waters of communication. If we want a team to be agile let us give them enough freedom to be ever more agile as they improve. If we want a team to be a Scrum team then let us help them use Scrum and to do so in the context of realising that Scrum is a process management methodology (The Scrum guide calls it a framework but it comes from empirical process control) and use Scrum to manage the process by which the team works and improve that process. The same holds for Kanban, XP or any other way of working.

In summary
Let us try and understand the different aspects of developing agility and the various methods and methodologies that can help us to get there. Then let’s find a way to work together and make the great aspirations around agility a reality rather than conflate terms and lose sight of the aspirations we really want to achieve.

What is this “Agile” thing all about?

What will becoming more agile really give us and what does it need? The manifesto for agile software development gives us the values and principles that we can use to guide us towards increasing our agility. However, we should not blindly follow someone else’s ideas without knowing where we are going. Therefore I want to discuss my thoughts on agility, what it can give us, as well as some ideas where we might look to improve along the way.

The first thing to realise is that increasing agility is a journey towards a moving target. What was considered the pinnacle of agility 10 years ago is no longer the pinnacle today. Realising this simple truth we realise that anyone selling an “Agile transformation” may want to consider their phrasing of what we are doing. A transformation implies a start and an end. The problem is that the journey towards greater agility knows no end and we can always improve. With that said we can review some ideas around what being agile is all about and discuss those points. We can then look at some outcomes to focus on so we might find a guide to inform us about the direction we are going. Those outcomes can form the basis of teams and companies deciding, for themselves, what they want and how to change course to gain greater agility. It is also worth noting that there is no exhaustive guide for how to become more agile because each situation is different. The points below are ones that I think are pretty universal.

A focus on people
The manifesto for agile software development has, as we have already seen, put a focus on people. At the end of the day we are working to sell our products or services to an end customer, who is normally a person somewhere in the world. We also have people working to create the product or service and we need to look after them to make sure that they can do their best work.

In terms of employees we need to:

  • Give them the freedom and support they need to make the best decisions and do their best work.
  • Provide the psychological safety to address problems and speak up so as to improve.
  • Make sure that they have all the skills they need and that they can learn and progress.

In this regard I always liked the following way of thinking “Give people all the support and training so they can find their ideal job and then create the environment that they already have that job with you”. The idea is simple and makes sure that you have the best employees you can find and that you will find it easy to hire more people when needed because you will have excellent word of mouth about you as an employer.

When we consider our customers we have two major factors in play. We want to

  1. Earn as much money from them as possible
  2. Have them come back and sell our products and services through word of mouth.

There are many good examples of this in the world and we often see how most successful companies put a strong focus on quality, customer engagement and making the customer happy. This leads to greater sales and the cycle repeats. The most simple way to deliver customer satisfaction is to be sure what your customers want. The best way to achieve that is by talking to them and asking. We then get into another core part of agility which is to create a finished product increment, put it in the hands of potential, or existing, customers and find out what the next most important bit of value that customers want is. Then we build that and the cycle repeats.

The flexibility to react
One of the key points around agility is the ability to remain flexible and change direction. We might think that we know what our customers really want and yet we will only be sure when we put some working software in their hands and see their reaction. There is a huge problem with developing new things, and that is that people don’t know what they want until they see what they don’t want because they are using it. We also have another problem that many customers don’t know what is possible and therefore cannot know what they want. As such we need to consider how we can rapidly win valuable customer feedback. We need to design measurable experiments into our product releases to learn so that we can fail fast and adjust course.

This philosophy changes how we want to write software. If you need to always remain flexible then we come back to good design principles around modularity, high cohesion and low coupling, simplicity, etc. When we design our work we need to consider not only how brilliant our solution is but also how easy it will be to change in future. As such clean and simple code will almost always win out over some really clever solution. Documenting the problem we are trying to solve and the ideas of our solution in comments is valuable and makes future refactoring easier.
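To make the point about low coupling a little more concrete, here is a small sketch in Python. Everything in it (Notifier, EmailNotifier, OrderService) is a made-up example rather than code from a real project. The idea it tries to show is that OrderService only knows about a small interface, so the way we notify customers can change later without touching the order logic.

```python
from typing import Protocol


class Notifier(Protocol):
    """Anything that can deliver a message to a customer (hypothetical interface)."""

    def send(self, recipient: str, message: str) -> None: ...


class EmailNotifier:
    """Stand-in for a real email sender; it only records what it was asked to send."""

    def __init__(self) -> None:
        self.sent: list[tuple[str, str]] = []

    def send(self, recipient: str, message: str) -> None:
        # A real implementation would talk to a mail server here.
        self.sent.append((recipient, message))


class OrderService:
    """Depends only on the Notifier interface, not on a concrete class."""

    def __init__(self, notifier: Notifier) -> None:
        self._notifier = notifier

    def confirm(self, customer: str, order_id: int) -> str:
        # Problem being solved: tell the customer their order went through.
        message = f"Order {order_id} confirmed"
        self._notifier.send(customer, message)
        return message


notifier = EmailNotifier()
service = OrderService(notifier)
service.confirm("jack@example.com", 42)
```

If customer feedback later tells us that people would rather get a text message, we could write another class with the same send method and pass that in instead. OrderService would not need to change, which is the kind of flexibility I am talking about.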

Iteratively gaining feedback on a working increment of your software
Two concepts around agility are iterations and increments of software. An iteration is a step on the journey to the final product, a bit like the never ending journey of taking steps to increase your agility. Where the real difference comes in is with increments. You would ideally deliver working increments of your software on a regular basis. The more rapidly you can do so the more agile you have the potential to be. The absolute ideal state is to be in a permanently shippable state. This is the current gold standard of continuous delivery. Being able to deliver working software regularly allows you to put your software in the hands of customers, or potential customers, and see their reactions. You can gauge your progress and customer satisfaction. You can observe what you need to change to improve your product and ask customers for feedback. You can then change direction, as needed, based on that feedback.

Being outcomes focused
Objectives bring people together and are multilayered. You might have a product vision and objectives for the year, quarter and month or week. This helps to guide where we are going and ensure that we are all speaking the same language and pulling in the same direction. Just putting a lot of work into a project and delivering output will not be nearly as successful as making sure that the entire team is on the same page and working together.

I mentioned longer term planning like yearly or quarterly plans. It is important to consider that the longer term the plan, the less detailed it should be. We should spend the most time on what we are about to work on and use longer term plans purely to guide our direction in the knowledge that they will most likely change. It is not the plan itself that is important but the act of planning. Planning lets us discuss options and possible hurdles in our path before we ever get there. Planning gives us an overall direction. Plans themselves rarely, if ever, survive first contact with reality. As General Dwight Eisenhower said “In preparing for battle I have always found that plans are useless, but planning is indispensable”. The same is true of planning an agile software project. We cannot know what reality, or changing customer demands, will throw at us, but the act of planning allows us to be more ready and able to adapt when the unexpected happens.

Some outcomes that might help to guide us
If we consider our journey towards increasing agility to be outcomes focused then it might help if I provide a list of a few outcomes that I think are helpful. I will then briefly describe why I think they are important.

  • Smile.
  • Sooner (Delivery).
  • Safer (Delivery and interactions).
  • Quality.
  • Value.

As we have already discussed agility is about people and we want to put a smile on the faces of our customers and also our employees because happy customers sell our products through word of mouth and happy employees are more motivated and deliver better work. We want to deliver our products sooner to win valuable feedback and want to do so more safely with fewer errors and rollbacks or fixes. We also want to ensure that we have psychological safety so that people will address actual problems rather than ignore them because they are resigned to the consequences of speaking up. We also want to focus on quality because technical excellence enhances our ability to remain flexible and make changes going forward. Also the DORA state of DevOps report has repeatedly shown that increasing quality also increases long term development speed. The last one goes hand in hand with several other objectives here and that is to deliver value to our customers which makes them happy and keeps us focused.

In conclusion
To me agility is about a focus on people and delivering high quality and valuable software to customers quicker and more safely. If we can get there we should be in a good way. If you want to focus on other objectives that is of course totally fine. Just make sure that your objectives are well thought out and align with the manifesto for agile software development, or your amendments to it if you want to try and improve it.

The Agile Manifesto

In my first post on the subject of “Agile” I want to focus on “The Agile Manifesto”, what it is, what it means and that it is often misrepresented by many people.

The agile manifesto was created in 2001 by 17 people who wanted to find a better way to develop software. They met in Snowbird, Utah and over a few days came up with what we now usually refer to as “The agile manifesto”. This is probably also the first source of misunderstanding and possible contention. Later on the same group of people came up with the 12 principles of agile software development, which is likely to be the subject of a future blog post. You can find more about the history of the manifesto here and it is a very interesting read if you are in a typical “Agile shop” today.

Maybe the first thing to realise about a manifesto is what it is. There are many dictionaries online where we can look up the definition of “Manifesto” and the first that came up for me was the Cambridge dictionary so here is their definition “A written statement of the beliefs, aims, and policies of an organisation, especially a political party”. Therefore we can conclude that the manifesto was a statement of aims, beliefs and values at the time that the manifesto was written. The same group of people have not come together to update it since that time, and it stands as the best that any group of people have done to that point or since in terms of widespread recognition.

Let us review “The agile manifesto” by looking at it one section at a time and seeing what we can learn from it.

The title
“Manifesto for Agile Software Development”

What does this tell us? Well it starts by telling us that this is about software development. If you are using “The agile manifesto” to sell agility anywhere else then you’re most likely doing it wrong. This is not a generic “Agile manifesto” for a whole business. It may transfer to other business areas or it may not, and in either case it was not the intention that it would. This manifesto is about software development so please bear that in mind when discussing agility, especially in a wider business context.

The introduction
“We are uncovering better ways of developing software by doing it and helping others do it. Through this work we have come to value:”

Here we can see that the agile alliance (As the group called themselves) were uncovering better ways of working at the time (2001). As in it is a never ending process to find better ways of developing software. We can also see that we learn better ways of developing software by developing software. Therefore we can conclude that we should develop software in order to get better at developing software. As such we might have learned more in time and come up with a better manifesto in future (After 2001).

By the way I know it seems obvious to say that we should learn better ways of developing software by developing software. The problem is that all too often “Agile” is pushed by management to gain something other than better ways of developing software such as “Twice the work in half the time” (If they read that book). I don’t mean that agility cannot make you more effective or efficient at developing software, just that people should make honest claims and use it for the right reasons. This blog will get into my take on that in future.

The values
“Individuals and interactions over processes and tools
Working software over comprehensive documentation
Customer collaboration over contract negotiation
Responding to change over following a plan”

This is very interesting when you read it and look at it more closely. We value individuals and interactions over processes and tools, but how much more? Just a little more? A lot more? We don’t know. We may even find instances where it isn’t a zero sum decision and we can enhance both at the same time. The same holds for every one of the value statements in the manifesto. If we want to be agile, according to the agile alliance in 2001, we should value the items on the left over the items on the right and we have to find the right balance for our environment.

So what do I mean by environment? That is a great question because it is multi-layered. We have to balance these values at the right level, which might include some or all of the following: –

  • Organisational.
  • Program.
  • Project/product.
  • Team.

An example might help to clear that up. Say we want to respond to changing conditions over following a plan. At the team or individual level this is very easy. At the project/product level this requires more effort and the effort is even greater at the organisational level. However, this also means that we should probably work in a way that permits us to change direction as easily as possible and commit to a course of action as late as possible. So when we are writing code we should do so with the mindset that we might change direction in future. It is also a great idea to work towards being in a permanently shippable state so that we can always test if we need to change direction or not. These points change a lot of things about how we code and the processes by which we want to work.

The explanation
“That is, while there is value in the items on the right, we value the items on the left more.”

Lastly a point about the value statements is made. We value everything mentioned, we just value the items on the left more. Which means that while we value responding to change over following a plan, we still value plans (In my opinion it is more that we value planning, as in the activity of creating plans. I am sure I will have a post about that in future as well). In the same way we value documentation, just not as much as working software and so on. This is very important because some people seem to think that we value software and don’t document anything anymore or that we respond to change so we never plan more than a week or a month ahead. That is far from true and was never a part of the manifesto.

In conclusion
So what can we learn from the manifesto? Well there are some points that seem to be misunderstood, and for me the following stands out: –

  • It is a manifesto for software development (Not generic agility).
  • The people who created the manifesto were learning, are learning and will continue to learn. We all continue to learn, so maybe we can do better, maybe not. We will learn that in time.
  • It has 4 values statements and we value both sides.
  • There is no defined method, methodology or process by which we become more agile. We have to find the right way to work for ourselves.

The manifesto for agile software development is a very simple statement of values with the aim of helping us find better ways to develop software, no more and no less. Something else that might not be totally obvious, because it isn’t mentioned, is that the manifesto does have a focus on individuals, trust and respect that runs through those values and is in support of organisational models based on people.

The “Agile manifesto” is about software and not generic business agility. If you want to use “The agile manifesto” to promote agility in other areas of a company please do so honestly and openly. State what you have learned from the manifesto for agile software development, including that a people centric approach works extremely well, at least in a learning environment. Make reference to the manifesto all you need, just be honest that it isn’t a generic manifesto for all business areas.

New direction

This blog has been inactive for a while. The reason is that I changed employers and wasn’t doing as much that I considered to be interesting and worth blogging about. I have also started going in a different direction in my career, focusing more and more on agility. In my opinion agility is often misunderstood and that is what I want to focus this blog on for now.

I want to focus on what I consider to be agile software development, the downsides of how it is implemented and some advice around that. Then I will see how things develop in future and let it organically grow from there. I also want there to be a bit of a story through the topics so I plan to cover them in a bit of an ordered sequence for now. Some topics I plan to cover include: –

  • The agile manifesto and principles.
  • Some of the major methodologies (Scrum, Kanban and XP).
  • Meetings, why many are not effective and how to improve them.
  • Thoughts around estimating, planning, etc.

Before anyone comments that Scrum is a lightweight framework not a methodology, please realise that the term “lightweight framework” doesn’t really mean anything and doesn’t tell you anything either. Scrum came from empirical process control and can more fully, and correctly, be described as a process management methodology. Also yes, I might challenge a few commonly held beliefs in the hope of making anyone who reads this blog think about things in a different way. I may be right, I may be wrong with what I write. What is important is that we think about things and have a constructive discussion to help us all improve.

Statistics IO, what does it show?

As another post mentioned I was recently at SQL Saturday in Manchester. One of the presentations raised a very interesting question about what statistics io shows when you have a forwarded record. I thought it would make an interesting blog post to investigate this question and find an answer as best I can. Fair warning, this is a long post; you can download the scripts and read the summary at the end. Most of the explanation is in the scripts. This is because for some reason at the moment code tags don’t seem to work on this theme and a nice syntax highlighter I thought might work breaks my site (I’ll get there and update previous posts though).

Let’s start with the setup. First we create a view to make these scripts shorter (Trust me they’re long enough). We then create a simple table with two columns and show that it has one IAM page (I will post about IAM pages in more depth in future but Paul Randal has an excellent article on them) and one data page. You will need to run the setup for each script and each script starts with part 2. The below, standard blurb, applies to all of these scripts: –

These scripts were written by Rainer Unwin (c) 2015

You may alter this code for your own non-commercial purposes (i.e. not in a
for-sale commercial tool).
You may republish altered code as long as you include this copyright and
give due credit, however you must obtain my prior permission before blogging
this code.

This code and information are provided “as is” without warranty of
any kind, either expressed or implied, including but not limited
to the implied warranties of merchantability and/or fitness for a
particular purpose. You use the script at your own risk and should
always test in a pre-production or test environment before running
in production.

The view: –

[code language="sql"]
use tempdb;
go

if object_id( N'dbo.reads_pages', 'V' ) is not null drop view dbo.reads_pages;
go
create view dbo.reads_pages
as
select is_iam_page
, allocated_page_file_id
, allocated_page_page_id
from sys.dm_db_database_page_allocations(
2 -- database_id of tempdb
, object_id( 'dbo.reads', 'U' )
, 0 -- index_id 0 = heap
, 1 -- partition 1
, 'detailed'
)
where is_allocated = 1;
go
[/code]

Let’s get an understanding of what stats io normally shows us

Script 1: –

[code language="sql"]
-- :: PART 1 - setup
set nocount on;
go

use tempdb;
go

-- create an object to play with
if object_id( N'dbo.reads', 'U' ) is not null drop table dbo.reads;
go

create table dbo.reads( i int, data char( 5000 ) );
go

-- :: PART 2 - check with a single page

-- insert a single record and check the structure of the table
-- then confirm that we get a single read (The data page only)
insert dbo.reads( i, data ) values( 1, 'a' );
go

select *
from dbo.reads_pages;
go

set statistics io on;
go

select *
from dbo.reads;
go
-- just 1 read (The data page). The IAM page is not read.
-- let's try again with a second page of data.

-- :: PART 3 - Now try with 2 data pages

insert dbo.reads( i, data ) values( 2, 'b' );
go

select *
from dbo.reads_pages;
go

select *
from dbo.reads;
go
-- again we see the number of reads of data pages only.
-- we need to prove this is the case when we exceed the
-- extent boundary as well so let's add another 10 pages

-- :: PART 4 - let's use 12 pages which exceeds the extent boundary and conclusion

insert into dbo.reads( i, data )
values( 3, 'c' )
, ( 4, 'd' )
, ( 5, 'e' )
, ( 6, 'f' )
, ( 7, 'g' )
, ( 8, 'h' )
, ( 9, 'i' )
, ( 10, 'j' )
, ( 11, 'k' )
, ( 12, 'l' );
go

select *
from dbo.reads_pages;
go

select *
from dbo.reads;
go
-- If you have multiple files, as I do, you may also
-- find that you now have multiple IAM pages. In my
-- case I had 12 data pages and 2 IAM pages. Reads
-- from the select were 12, matching the number of
-- data pages
[/code]

Script 1 summary

So what this script shows is the behaviour of statistics io with no forwarded records. As you can clearly see it never shows that the IAM page is read. Logically, since we do an allocation order scan, the IAM page should be read. The takeaway is that only data pages are shown by statistics io.
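
If you would rather compare counts than read through the page list, a small aggregate over the view does the job. This is only a sketch: it assumes the dbo.reads_pages view and the dbo.reads table from the scripts above already exist in tempdb, and the page_count alias is just an illustrative name.

[code language="sql"]
use tempdb;
go

-- Count allocated pages by type using the view created earlier.
-- is_iam_page = 1 rows are IAM pages, is_iam_page = 0 rows are data pages.
select is_iam_page
     , count(*) as page_count
from dbo.reads_pages
group by is_iam_page;
go
[/code]

The count for is_iam_page = 0 is the number to compare with the logical reads that statistics io reports for the select.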

Script 2 – What is shown for forwarded records where each forwarded record occupies its own page

[code language="sql"]
-- :: PART 1 - setup, use a varchar column to allow us to forward records
set nocount on;
go

use tempdb;
go

-- create an object to play with
if object_id( N'dbo.reads', 'U' ) is not null drop table dbo.reads;
go

create table dbo.reads( i int, data varchar( 8000 ) );
go

-- :: PART 2 - check with a single page

-- insert 7 records and check the structure of the table
-- then confirm that we get a single read (The data page only)
insert dbo.reads( i, data )
values( 1, replicate( 'a', 7677 ) ) -- fill the page
, ( 2, replicate( 'b', 50 ) )
, ( 3, replicate( 'c', 50 ) )
, ( 4, replicate( 'd', 50 ) )
, ( 5, replicate( 'e', 50 ) )
, ( 6, replicate( 'f', 50 ) )
, ( 7, replicate( 'g', 50 ) );
go

select *
from dbo.reads_pages;
go

set statistics io on;
go

select *
from dbo.reads;
go
-- just 1 read (The data page). The IAM page is not read.

-- :: PART 3 - Now generate 1 forwarded record
update dbo.reads
set data = replicate( 'b', 5000 )
where i = 2;
go

-- 3 pages: 1 IAM, 1 data & 1 for the forwarded record
select *
from dbo.reads_pages;
go

select *
from dbo.reads;
go
-- now we see 3 page reads. This could be because we
-- have read the IAM page to see where we should read
-- the forwarded page or it could be that we jumped
-- to the forwarded page and then back again

-- :: PART 4 - let's forward another record off the main data page
update dbo.reads
set data = replicate( 'd', 5000 )
where i = 4;
go

-- 4 pages: 1 IAM, 1 data, 2 for forwarded records
select *
from dbo.reads_pages;
go

select *
from dbo.reads;
go
-- 5 reads but how did we get there? If we read the IAM
-- page then we would read just 4 pages: the IAM, the
-- data page and the 2 forwarded pages. No, we must have
-- read the data page, followed the pointer, then gone
-- back to the data page and repeated for the other
-- forward pointer. Let's confirm with 1 more.

-- :: PART 5 - let's forward another record off the main data page
update dbo.reads
set data = replicate( 'f', 5000 )
where i = 3;
go

-- 5 pages: 1 IAM, 1 data, 3 for forwarded records
select *
from dbo.reads_pages;
go

select *
from dbo.reads;
go
-- yes the pattern holds. Each new forwarded record adds
-- two reads. One to go to the forwarded page and one
-- to go back to the data page. In the next section you
-- will see another interesting effect
[/code]

Script 2 summary

What we have shown here is that every visit to a page counts as a read: the main data page is counted each time we return to it, and each page holding a forwarded record is counted when we follow the pointer to it. So if you have 7 records, and records 2 and 4 are forwarded, then we read the pages as follows (And count 5 reads): –

1: Read the main data page for i = 1 and 2 (Forwarding pointer).
2: Follow the forwarding pointer and read that data page.
3: Read the first data page again to read i = 3 and 4 (Forwarding pointer).
4: Follow the forwarding pointer and read that data page.
5: Read the first data page a third time to read i = 5 through 7.
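
To check how many forwarded records a heap really has, rather than inferring it from the reads, you can ask sys.dm_db_index_physical_stats. The query below is a minimal sketch that assumes the dbo.reads table from script 2 exists in tempdb; the pages_plus_forwarded column is just an illustrative name. In the runs described in script 2 the reads reported (3 and then 5) match page_count plus forwarded_record_count, but treat that as an observation from these tests rather than a documented rule.

[code language="sql"]
use tempdb;
go

-- forwarded_record_count is only populated for heaps (index_id = 0)
-- and only in SAMPLED or DETAILED mode, so ask for DETAILED here.
select page_count
     , forwarded_record_count
     , page_count + forwarded_record_count as pages_plus_forwarded
from sys.dm_db_index_physical_stats(
       db_id( N'tempdb' )
     , object_id( N'dbo.reads', 'U' )
     , 0      -- index_id 0 = heap
     , null   -- all partitions
     , 'detailed'
     )
where alloc_unit_type_desc = 'IN_ROW_DATA';
go
[/code]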

Script 3 – what if several records are forwarded to the same page

[code language="sql"]
-- :: PART 1 - setup, use a varchar column to allow us to forward records
set nocount on;
go

use tempdb;
go

-- create an object to play with
if object_id( N'dbo.reads', 'U' ) is not null drop table dbo.reads;
go

create table dbo.reads( i int, data varchar( 8000 ) );
go

-- :: PART 2 - check with a single page

-- insert 7 records and check the structure of the table
-- then confirm that we get a single read (The data page only)
insert dbo.reads( i, data )
values( 1, replicate( 'a', 7677 ) ) -- fill the page
, ( 2, replicate( 'b', 50 ) )
, ( 3, replicate( 'c', 50 ) )
, ( 4, replicate( 'd', 50 ) )
, ( 5, replicate( 'e', 50 ) )
, ( 6, replicate( 'f', 50 ) )
, ( 7, replicate( 'g', 50 ) );
go

select *
from dbo.reads_pages;
go

set statistics io on;
go

select *
from dbo.reads;
go
-- just 1 read (The data page). The IAM page is not read.

-- :: PART 3 - Now generate a forwarded record
update dbo.reads
set data = replicate( 'b', 500 )
where i = 2;
go

-- 3 pages: 1 IAM, 1 data & 1 for the forwarded record
select *
from dbo.reads_pages;
go

select *
from dbo.reads;
go
-- 3 reads as expected.

-- :: PART 4 - let's forward another record off the page but onto the same forwarding page
update dbo.reads
set data = replicate( 'c', 500 )
where i = 4;
go

-- 3 pages: 1 IAM, 1 data & 1 for both forwarded records
select *
from dbo.reads_pages;
go

select *
from dbo.reads;
go
-- 4 reads? But last time we got 5 reads. So it looks
-- like when we use the same forwarding page then we
-- get fewer reads. Let's try again

-- :: PART 5 - let's forward another record off the main data page
update dbo.reads
set data = replicate( 'f', 500 )
where i = 6;
go

-- 3 pages: 1 IAM, 1 data & 1 for all three forwarded records
select *
from dbo.reads_pages;
go

select *
from dbo.reads;
go
-- ok that is weird, now we get 5 reads from the same
-- 3 pages as before. What is going on here? We don't
-- count the forwarded page each time when it's the
-- same page. So whereas before every forwarded
-- record had its own page, now it doesn't. We still
-- count the main data page whenever we read it, but
-- we don't count the forwarding page if it is the
-- same. Wow, so misleading!
[/code]

Script 3 summary

So what happens here is slightly different than before. We can show that if all the forwarded records are on the same page then we only count the page that contains the forwarded records once. This might not be what you would expect and that is because it is a lie! It is a big fat lie in fact. If you repeat the same test as in script 3 up to the end of part 4 and then do the below you can show that the reads are a lie: –

1: Open a new query window (I’ll call it query bail) and update dbo.reads where i = 3 but don’t commit.
2: Then run the read from script 3; this will block at i = 3.
3: Open another new query window (I’ll call it update) and set data = replicate( ‘z’, 500 ) for i = 4; let this commit.
4: Now roll back query bail.

You will see that in fact we did re-visit the forwarded page to see the ‘z’s on there.

So there you have it, statistics io… It’s great, it’s really helpful and yet it’s dishonest to some degree as well. At least you now know what’s going on though 🙂

Scripts: –

Stats IO reads part 0 – create view
Stats IO reads part 1
Stats IO reads part 2
Stats IO reads part 3

Interesting issue with heaps and page free space

First of all I would like to thank Uwe Ricken who alerted me to this point at the recent SQL Saturday in Manchester. I would also like to thank the organisers and sponsors of SQL Saturday, especially our local user group leader Chris Testa-O’Neill.

I recently blogged, to a small extent, about allocation pages. One of the allocation pages I blogged about was the Page Free Space (Or PFS for short) page. This tracks more than just the free space on a page. In fact it only tracks the free space in heap pages. It also tracks whether or not a page is an IAM page (Also briefly mentioned in my blog post about allocation pages) and a few other things. Please see this blog post by Paul Randal of SQL Skills for a fuller description of PFS pages.

Now the main point I want to call out is that the PFS page isn’t very granular in its ability to track page fullness. It tracks which of the following ranges of percent fullness a page is in: –

  • empty
  • 1 to 50% full
  • 51 to 80% full
  • 81 to 95% full
  • 96 to 100% full

It might be clear from these values that large, wide, records in a heap could cause problems. However if not please allow me to demonstrate with a small demo followed by an explanation. Try the below script in SSMS: –

[code language="sql"]
-- Use tempdb for simplicity
USE [tempdb];
GO

-- Drop test object we intend to use and then create a test table to use
IF OBJECT_ID( N'dbo.demo', 'U' ) IS NOT NULL DROP TABLE dbo.demo;
GO

CREATE TABLE dbo.demo(
data CHAR( 2000 )
);
GO

-- We insert 4 values in one transaction
INSERT INTO dbo.demo( data )
VALUES( 'a' )
, ( 'b' )
, ( 'c' )
, ( 'd' );
GO

-- Let's check the number of data pages (Should be 1 as these 4 rows fit)
SELECT allocated_page_file_id
, allocated_page_page_id
FROM sys.dm_db_database_page_allocations(
DB_ID( N'tempdb' )
, OBJECT_ID( N'dbo.demo', 'U' )
, 0
, 1
, 'DETAILED'
)
WHERE is_allocated = 1
AND page_type = 1; -- data page
GO

-- Now let's try that again with 4 separate transactions
TRUNCATE TABLE dbo.demo;
GO

INSERT INTO dbo.demo( data ) VALUES( 'a' );
INSERT INTO dbo.demo( data ) VALUES( 'b' );
INSERT INTO dbo.demo( data ) VALUES( 'c' );
INSERT INTO dbo.demo( data ) VALUES( 'd' );
GO

-- Well isn't that odd, we now have 2 data pages
SELECT allocated_page_file_id
, allocated_page_page_id
FROM sys.dm_db_database_page_allocations(
DB_ID( N'tempdb' )
, OBJECT_ID( N'dbo.demo', 'U' )
, 0
, 1
, 'DETAILED'
)
WHERE is_allocated = 1
AND page_type = 1; -- data page
GO
[/code]

Explanation of the issue: –

In a heap SQL Server uses the PFS page to track the amount of free space in a page. It does not read each individual page in the heap (Which would make large heaps unusable). When you add records in a single transaction (Like the first insert in the script) SQL Server will add the records until they no longer fit on the page and will then allocate another page and keep adding records there. However with individual transactions (Like the second set of inserts) the PFS page is checked each time we attempt to insert a record. We know the 4 records fit on a single page because the first insert put them there (Well near enough filling it – there’s actually 60 bytes left on the page with the records that I created). However, because the PFS page is checked at the start of each insert in the second part of the demo, the inserts go something like this: –

Insert 1: No data pages allocated so allocate a data page and insert 1 record.
Insert 2: Check the PFS page. Our page is about 25% full so it shows as 50% full (Due to the lack of granularity in how we track free space). We therefore insert the record on the same page.
Insert 3: Check the PFS page. Our page is about 50% full and shows as 50% full. We therefore insert the record on the same page.
Insert 4: Check the PFS page. Our page is about 75% full and shows as 80% full, so the PFS page cannot guarantee enough space for a row that is 25% of the size of the page. We therefore allocate a new page and insert the record there.
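
The walk-through above can be modelled with a little arithmetic. The query below is purely illustrative: the percentages are the rounded figures from the explanation, the column names are made up for the example, and the rule "assume the page is as full as the top of its PFS range" is a simplification of the behaviour described here, not a statement of how the engine is implemented.

[code language="sql"]
-- Hypothetical model of inserts 2 to 4 (not engine internals).
-- pct_full_before = how full the page really is before the insert
-- row_pct         = rough size of one CHAR( 2000 ) row as a % of the page
SELECT v.insert_no
     , v.pct_full_before
     , b.pfs_range_top_pct
     , 100 - b.pfs_range_top_pct AS assumed_free_pct
     , CASE WHEN 100 - b.pfs_range_top_pct >= v.row_pct
            THEN 'same page'
            ELSE 'new page'
       END AS modelled_outcome
FROM ( VALUES ( 2, 25, 25 )
            , ( 3, 50, 25 )
            , ( 4, 75, 25 )
     ) AS v( insert_no, pct_full_before, row_pct )
CROSS APPLY (
       -- Round the real fullness up to the top of its PFS range
       SELECT CASE WHEN v.pct_full_before = 0   THEN 0
                   WHEN v.pct_full_before <= 50 THEN 50
                   WHEN v.pct_full_before <= 80 THEN 80
                   WHEN v.pct_full_before <= 95 THEN 95
                   ELSE 100
              END AS pfs_range_top_pct
     ) AS b;
GO
[/code]

With these inputs, inserts 2 and 3 come out as 'same page' and insert 4 as 'new page', which matches the two data pages seen in the demo.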

The consequence of this optimisation is that many heap pages for tables with wide rows are likely to have quite a bit of free space. Please note that this is not the case with a clustered index (Or a non clustered index but I’m just looking at table data to draw your attention to this) because in an index the amount of free space on a page is obtained from the actual data pages and not the PFS pages.

To wrap up I’d like to point out that this issue can be resolved with the below command. Please be aware that this, similarly to rebuilding a clustered index, will force an update of all non-clustered indexes on the table. This command has worked since at least 2008 R2: –

[code language=”sql”]
ALTER TABLE dbo.demo REBUILD;
GO
[/code]
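
To see the effect, you can compare the page count and average page fullness of the heap before and after the rebuild. A minimal sketch, assuming the dbo.demo table from the script above still exists in tempdb:

[code language="sql"]
USE [tempdb];
GO

-- Run once before and once after ALTER TABLE dbo.demo REBUILD.
-- avg_page_space_used_in_percent is NULL in LIMITED mode, so use DETAILED.
SELECT page_count
     , avg_page_space_used_in_percent
FROM sys.dm_db_index_physical_stats(
       DB_ID( N'tempdb' )
     , OBJECT_ID( N'dbo.demo', 'U' )
     , 0      -- index_id 0 = heap
     , NULL   -- all partitions
     , 'DETAILED'
     )
WHERE alloc_unit_type_desc = 'IN_ROW_DATA';
GO
[/code]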

Allocation Pages Overview

It can often help to understand how a database actually stores data. While the way in which SQL Server stores data and the allocation units it uses are well documented this is my version for my web site. To gain a deeper understanding I would recommend you visit Paul Randal’s blog. Relevant sections to this post can be found here near the bottom. As Paul worked on the storage engine team at Microsoft, and managed the team for a period as well, his knowledge is beyond question.

Types of allocation page in SQL Server

SQL Server has two different types of allocation page, bytemap and bitmap. This is just a reference to how SQL Server represents the storage unit that the allocation pages relate to. The bitmap pages represent 64KB extents (A collection of 8 * 8KB pages) with a single bit. There are 5 bitmap pages as below: –

  • Global Allocation Map (GAM)
  • Shared Global Allocation Map (SGAM)
  • Bulk Changed Map (BCM) – Also known as an ML page or Minimally Logged change map page
  • Differential Changed Map (DCM)
  • Index Allocation Map (IAM) – these are special cases and not covered further in this post

There is also a bytemap page which represents individual 8KB pages with a single byte. This page type is as below: –

  • Page Free Space (PFS) page

Bytemap pages

The PFS page tracks the below information about database pages: –

  • Amount of free space on a heap page
  • Presence of ghost records on index pages (Including clustered index leaf pages)
  • Is this an IAM page
  • Is this a mixed page
  • Is this page allocated

This information is represented by an entire byte per page and each PFS page can track 8,088 database pages. This equates to just under 64MB of data pages. Therefore every 8,088 pages (64MB) in the data files of your database you will find a PFS page. You can confirm this with the below commands: –

[code language=”sql”]
/* Alter the database name to be a database where you have more than 64MB of data in the primary file group */
DBCC PAGE (DatabaseName, 1, 1, 0) WITH TABLERESULTS;
DBCC PAGE (DatabaseName, 1, 8088, 0) WITH TABLERESULTS;
GO
[/code]

On line 14 you should see a row for m_type and a value of 11 which is a PFS page.
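
The arithmetic behind those numbers is easy to check. The snippet below is a small sketch; @page_id is an arbitrary example value, and it relies on the first PFS page being page 1 (as in the DBCC PAGE commands above) with another one every 8,088 pages after that.

[code language="sql"]
DECLARE @page_id int = 20000; -- example page id, pick any

SELECT @page_id AS page_id
     -- pages before 8,088 are covered by the first PFS page (page 1)
     , CASE WHEN @page_id < 8088 THEN 1
            ELSE ( @page_id / 8088 ) * 8088   -- integer division
       END AS covering_pfs_page_id
     -- 8,088 pages * 8KB, expressed in MB
     , 8088 * 8 / 1024.0 AS pfs_interval_mb;
GO
[/code]

For page 20,000 this gives PFS page 16,176, and the interval works out at roughly 63.19MB, hence “just under 64MB”.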

Bitmap pages (except IAMs)

All the bitmap pages track data at the extent level. An extent is a collection of 8 * 8KB pages, so 64KB of data. Extents come in two types: –

  • Mixed extents which have data or index pages from more than one entity
  • Uniform extents which have pages relating to a single entity.

I use the term entity to mean, in simple terms, an index of a table. In truth it goes down to the partition and allocation unit type (in row data, row overflow data, lob data) but we’ll skip that for this explanation. Since we can track 64KB with a single bit we can track just under 4GB of data with a single bitmap page. These pages recur every 511,232 pages. This can be confirmed with the below statements if you have any data files that are greater than 4GB and I’ll put the statements together as though the data is in data file 1: –

[code language=”sql”]
/* Alter the database name to be the database where you have enough data, we’ll look at 2 GAM pages */
DBCC PAGE (DatabaseName, 1, 2, 0) WITH TABLERESULTS;
DBCC PAGE (DatabaseName, 1, 511232, 0) WITH TABLERESULTS;
GO
[/code]

On line 14 you will be able to confirm that the page type (m_type) is the same for these pages. In the first interval the pages of interest are located as below, and they recur in each subsequent 511,232 page interval (I have included the page types so you know what to look for on line 14 of a DBCC PAGE print): –

  • Page 2 – GAM – type 8
  • Page 3 – SGAM – type 9
  • Page 6 – DCM – type 16
  • Page 7 – BCM/ML Map – type 17
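
The same kind of arithmetic locates the bitmap pages for any page in a file. This is a sketch under an assumption that is not spelled out above: that each later interval starts with its GAM and SGAM pages (511,232 and 511,233 for the second interval) while the DCM and BCM sit 6 and 7 pages into the interval. @page_id is an arbitrary example value, so check the results with DBCC PAGE before relying on them.

[code language="sql"]
DECLARE @page_id bigint = 600000; -- example page id, pick any
DECLARE @interval_start bigint = ( @page_id / 511232 ) * 511232; -- integer division

SELECT @page_id AS page_id
     -- in the first interval pages 0 and 1 are taken, so GAM/SGAM sit at 2 and 3
     , CASE WHEN @interval_start = 0 THEN 2 ELSE @interval_start END AS gam_page_id
     , CASE WHEN @interval_start = 0 THEN 3 ELSE @interval_start + 1 END AS sgam_page_id
     , @interval_start + 6 AS dcm_page_id
     , @interval_start + 7 AS bcm_page_id
     -- 511,232 pages * 8KB, expressed in GB
     , 511232 * 8 / 1024.0 / 1024.0 AS gam_interval_gb;
GO
[/code]

The interval comes out at about 3.9GB, which is the “just under 4GB” mentioned above.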

This has been an overview of the types of allocation pages that SQL Server uses. There will be other future posts to go into more depth on each of these topics. However for now I will keep this short because a full post on allocation units and pages would be far too big 🙂

If you find the way that the storage engine tracks pages and objects interesting I fully recommend the posts of Paul Randal as previously mentioned or feel free to wait for future posts in this area.