If you follow sports, you are likely familiar with the concept of small sample sizes. Every hitter in baseball will occasionally have an 0 for 5 day with three strikeouts. If you looked only at that one game, you might come away with the wrong impression of the hitter. Sometimes a great hitter will have a whole month where these games happen consistently. If you focus on that month, you will miss the fact that he usually follows it with a month well above average that makes up for it.
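To put rough numbers on the baseball example, here is a minimal sketch in Python. It assumes a hypothetical .300 hitter and treats every at-bat as independent with the same chance of a hit, which is a simplification rather than how real games work. The function names are made up for illustration.

```python
import math


def prob_hitless(batting_average: float, at_bats: int) -> float:
    """Chance of zero hits in a run of at-bats.

    Assumption: each at-bat is independent and has the same hit probability.
    """
    return (1 - batting_average) ** at_bats


def standard_error(batting_average: float, at_bats: int) -> float:
    """Standard error of an observed batting average over `at_bats` tries."""
    return math.sqrt(batting_average * (1 - batting_average) / at_bats)


if __name__ == "__main__":
    avg = 0.300  # hypothetical "true" skill level of the hitter

    # How often does a good hitter go 0 for 5 purely by chance?
    print(f"Chance of an 0 for 5 game: {prob_hitless(avg, 5):.1%}")

    # The uncertainty in the observed average shrinks as at-bats pile up.
    for n in (5, 100, 500):
        print(f"{n:>3} at-bats: standard error {standard_error(avg, n):.3f}")
```

Under these assumptions a .300 hitter goes hitless in about one of every six five-at-bat games, so a single bad game says very little. The standard error only shrinks with the square root of the number of at-bats, which is why even a full month can still mislead.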
In real estate, small sample sizes are everywhere. Do you really think that comps (comparable sales) are a completely accurate representation of a market? Depending on the size of the market and how the comps were selected, they may actually reflect a trend that runs counter to other market forces.
What defines whether a dataset is “small”? This is a tough question, because even a dataset of millions of records could be small depending on what it is trying to reflect (Amazon sales trends would be an example). Similarly, a dataset of 10 points could be large for a fairly rare occurrence.
Unless you are dealing directly with statistics, the issue of sample size is rarely addressed. I have seen people throw out “facts” based on a self-selected dataset more often than I care to remember. People most often buy into small samples because 1) it is difficult to get more or different data and 2) the small sample reflects the reality they expect to see.
A great example of this is in the sales world. Salespeople are in front of a certain type of buyer and bring a certain approach to each sales conversation. They may hear the same thing from multiple clients over a month and suddenly think the market has shifted on them. The more likely reality is that the way their pitch is structured draws a similar response each time. The salesperson hears a disconnect and assumes it comes from the market when in fact it is a reaction to their own words.
Decisions are hard. It takes a lot of self-reflection to truly understand the process you use to make them. Are you really making decisions using good data, or are you simply using data that reflects the result you were hoping for at the start?