I ran across an article, titled When to use CHAR, VARCHAR, or VARCHAR(MAX), which struck me as something I never do. I mean, I do use varchar (and nvarchar), but I can’t remember the last time I actually created a char column. The article is worth a read, and it sets the stage for you to think about your database design process and the choices you make.
It’s Friday, and if you take a few minutes and think about the last few times you’ve added columns to a table have you used CHAR as a data type? Or do you default to varchar of any size as a general rule?
I tend to do a lot of demo work, and I help customers with different situations. In many cases, we are storing text data, often not strongly typed data. As a result, I find most customers using varchar (or nvarchar), and I’ve built the habit of using the variable structures in proofs-of-concept and demos. I find it especially handy when someone asks me to enter some data they use and then show how it would be handled.
Early in my career, I’d often tag a zipcode as a 5 character field, or a state as a 2 character abbreviation. However, these days a postal code can be a 5+4, which is 9 characters or even 10 with the plus. Many companies work overseas and may want to account for longer postal codes. States (or regions), can often be 3 characters, but sometimes more. Often we just leave 10 characters for region abbreviations (or longer) as the data might be spelled out or need to accommodate something unknown.
For many business applications, it seems that there might be a definition for what the data should be, but since exceptions can abound, often using a variable-length data type just prevents issues in the future. Add that to the fact that often we are dealing with cheap storage, and it doesn’t seem worth the time to try and get the exact size correct. Even when knowing an invoice uses 10 characters, are you sure that you won’t exceed the ten-character width? What if you acquire a company that uses 12 character invoice numbers? Easier to set this to a variable 20 and move on.
If you work in data warehousing, then you might know what your data sources contain and be more likely to choose fixed types, but is the space savings worth the work in the event that source systems change? I don’t know. I tend to plan for sources to change and allow a little padding in my schema. You might feel differently, but are the space savings worth the potential hassles in the future? I’d be curious what you think today.