DB data type — Varchar vs Text dilemma

Sairam Krish
2 min readMay 14, 2021

--

On a quick look, it feels like varchar(n) is better, because it feels like we are saving disk space by specifying exactly what we are going to use. Is it really worth it ? Why can’t we keep it simple and call it text data type ?

Postgres Character types

From postgres doc — https://www.postgresql.org/docs/13/datatype-character.html for educational purpose

More details info — postgres docs

Mysql Character types

From https://dev.mysql.com/doc/refman/8.0/en/storage-requirements.html for educational purpose

More details info — mysql docs

character(n) is slow

Let’s rule out character(n) unless the problem we are trying to solve some rare case scenario. Based on the postgres document,character(n) has no performance advantages in PostgreSQL In fact character(n) is usually the slowest of the three because of its additional storage costs.

varchar(n) is slow

varchar(n) are always problematic when dealing with resizing at a later stage. varchar(n) also has an overhead while writing data — each time, database checks whether data is within limits.

Does varchar(n) takes less storage than text

Mysql docs clearly calls this out.,

L represents the actual length in bytes of a given string value.

  • Text -> L + 2 bytes, where L < 2^16
  • varchar -> L + 1 bytes if column values require 0 − 255 bytes, L + 2 bytes if values may require more than 255 bytes

When to use varchar

  • If the field needs a database level validation on the string length., let’s say social security number

OLAP databases

Clickhouse, an OLAP database, for instance, ignores VARCHAR kind of data type declaration and behind the scenes handles them as String.

The String type replaces the types VARCHAR, BLOB, CLOB, and others from other DBMSs.

When creating tables, numeric parameters for string fields can be set (e.g. VARCHAR(255)), but ClickHouse ignores them.

Conclusion

Using text by default and using varchar in places that really need them simplifies the implementation. Overthinking each field how long it should be and when it hits the limit increasing the size feels over engineering if the advantage of doing such a limit doesn’t have huge page off.

Do you have a different perspective ? Let me know in the comments.

--

--

Sairam Krish
Sairam Krish

Written by Sairam Krish

Software Architect ★ Data Architect

No responses yet