BLOBs (Binary Large Objects): introduction
The term BLOB is found primarily in connection with databases and open source projects. It refers to stored binary data, i.e. data which can contain non-printable characters as well as arbitrary bit patterns. Typical examples include images, audio files, compressed files and spreadsheet data. If you plan to use Binary Large Objects in a project, there are a few particularities to keep in mind. So, what exactly is a BLOB? What characterizes it? What are the advantages and disadvantages of using it?
What is a BLOB (Binary Large Object)?
BLOB stands for Binary Large Object. BLOBs differ from CLOBs (Character Large Objects) and TEXT types: those also hold large amounts of data, but they consist of character strings rather than arbitrary binary data.
As previously mentioned, large binary data types include images, audio files, archive files and spreadsheet data. Videos are also classified as BLOBs, which is why binary files can easily reach a size of several gigabytes or more. According to Jim Starkey, the inventor of the BLOB, the name BLOB existed before it was given a meaning as an acronym. In 1997, Starkey explained that BLOB was only later defined as an acronym for Binary Large Object, because the bare name was considered too unprofessional for marketing purposes.
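To make the distinction concrete, the following minimal Python sketch contrasts arbitrary bytes with character data. The names `binary_payload`, `text_payload` and `is_valid_text` are purely illustrative and are not part of any database API.

```python
# Arbitrary binary data (BLOB-like) versus character data (CLOB/TEXT-like).
# All names here are illustrative assumptions.

# A few raw bytes, including non-printable values such as 0x00 and 0xFF.
binary_payload = bytes([0x89, 0x50, 0x4E, 0x47, 0x00, 0xFF])

# Character data: a string that has a well-defined text encoding.
text_payload = "Hello, BLOB"


def is_valid_text(data: bytes, encoding: str = "utf-8") -> bool:
    """Return True if the bytes can be decoded as text in the given encoding."""
    try:
        data.decode(encoding)
        return True
    except UnicodeDecodeError:
        return False


print(is_valid_text(text_payload.encode("utf-8")))  # character data decodes cleanly
print(is_valid_text(binary_payload))                # arbitrary bytes generally do not
```

Character types can rely on the content being valid text in some encoding, whereas a binary type has to accept any bit pattern unchanged.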
BLOBs in databases
BLOBs in databases need special handling and dedicated data types. A database cannot read or interpret the unstructured data inside a BLOB; it can only store it as a whole. What the database can work with is metadata such as the file name, type and size of the BLOB. It is therefore not possible to use database functions such as sorting, filtering or searching on the content of a BLOB.
The difference between structured and unstructured data is that the former has a clear structure: every piece of information in structured data can be mapped to a corresponding database field. With unstructured data, on the other hand, the database cannot see the content. Only the data type is known.
Binary Large Objects are stored by different database systems in different ways. Since the structure of databases is often not suitable for storing BLOBs directly, they are frequently stored externally. In that case, the database itself only contains a reference to where the external file is actually stored. The terms used for large binary files also vary depending on the database system. Some systems, such as MySQL, even use different type names depending on the maximum size. The following table lists some of the most popular systems along with their respective terms for Binary Large Objects.
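A minimal sketch of this idea, using Python's built-in `sqlite3` module with an in-memory database: the BLOB column is stored and returned as opaque bytes, while filtering and sorting happen on the metadata columns. The table name `files`, its columns and the helper `store_blob` are assumptions made for illustration.

```python
import sqlite3


def store_blob(conn: sqlite3.Connection, name: str, mime_type: str, payload: bytes) -> int:
    """Insert a binary payload together with its metadata; return the new row id."""
    cur = conn.execute(
        "INSERT INTO files (name, mime_type, size_bytes, content) VALUES (?, ?, ?, ?)",
        (name, mime_type, len(payload), payload),
    )
    return cur.lastrowid


conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE files (
        id         INTEGER PRIMARY KEY,
        name       TEXT    NOT NULL,
        mime_type  TEXT    NOT NULL,
        size_bytes INTEGER NOT NULL,
        content    BLOB    NOT NULL  -- opaque to the database
    )
    """
)

# Stand-in for a real file: 1024 bytes covering every possible byte value.
payload = bytes(range(256)) * 4
file_id = store_blob(conn, "logo.png", "image/png", payload)

# Filtering and sorting work on the metadata columns, not on the BLOB content.
rows = conn.execute(
    "SELECT name, size_bytes FROM files WHERE mime_type = ? ORDER BY size_bytes",
    ("image/png",),
).fetchall()

# The BLOB itself comes back byte for byte as it was stored.
restored = conn.execute(
    "SELECT content FROM files WHERE id = ?", (file_id,)
).fetchone()[0]
```

Keeping name, type and size in separate columns is one common way to make BLOB rows searchable; the content column is only ever read or written as a whole.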
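|Database system||Terms for Binary Large Objects|
|MySQL||Up to 255 bytes: TINYBLOB; up to 64 KB: BLOB; up to 16 MB: MEDIUMBLOB; up to 4 GB: LONGBLOB|
|PostgreSQL||BYTEA and large objects referenced by an Object Identifier (OID)|
|Microsoft SQL Server||binary, varbinary, image (text and ntext are the corresponding types for large character data)|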
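As an illustration of MySQL's size tiers, the following sketch picks the smallest MySQL BLOB type that can hold a payload of a given size. The limits are the documented maximum lengths (2^8 - 1, 2^16 - 1, 2^24 - 1 and 2^32 - 1 bytes); the function name `smallest_mysql_blob_type` and the constant `MYSQL_BLOB_LIMITS` are hypothetical helpers, not part of MySQL or any driver.

```python
# Maximum payload length in bytes for each MySQL BLOB type, smallest first.
MYSQL_BLOB_LIMITS = [
    ("TINYBLOB", 2**8 - 1),     # 255 bytes
    ("BLOB", 2**16 - 1),        # about 64 KB
    ("MEDIUMBLOB", 2**24 - 1),  # about 16 MB
    ("LONGBLOB", 2**32 - 1),    # about 4 GB
]


def smallest_mysql_blob_type(size_bytes: int) -> str:
    """Return the smallest MySQL BLOB type able to hold size_bytes bytes."""
    if size_bytes < 0:
        raise ValueError("size_bytes must not be negative")
    for type_name, max_bytes in MYSQL_BLOB_LIMITS:
        if size_bytes <= max_bytes:
            return type_name
    raise ValueError("payload exceeds the LONGBLOB limit")


print(smallest_mysql_blob_type(200))              # fits in the smallest tier
print(smallest_mysql_blob_type(5 * 1024 * 1024))  # a 5 MB image needs a larger tier
```

In practice, the usable size may be lower than the type's limit, since server settings such as `max_allowed_packet` also cap how much data can be sent in one statement.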
Where are BLOBs used?
BLOBs are primarily used in the big data industry. Raw data is collected en masse from website visitors, bundled into data collections and stored in databases around the world. In its raw form, this data is unstructured, so storing it as BLOBs is the easiest way for database systems to take it in. BLOBs can also be used to store films or television shows in databases in a quasi-encrypted form.
Binary Large Objects also play a role in the open source arena. By definition, every part of an open source project should be buildable from publicly accessible source code, but this is not always the case. Some projects include proprietary elements (mainly device drivers) that are only available in binary form. The term “BLOBs” is also used for these components, and their inclusion in open source projects is quite controversial.
Advantages and disadvantages of Binary Large Objects
Whether to use BLOBs in a project needs to be considered on a case-by-case basis, as they have both important advantages and disadvantages.
|Advantages||Disadvantages|
|BLOBs are a good option for adding large binary data files to a database and can be easily referenced||Not all databases permit the use of BLOBs|
|It is easy to set access rights using rights management||BLOBs are inefficient due to the amount of disk space required and the access time|
|Database backups or dumps contain all the data||Creating backups is highly time-consuming due to the file size of BLOBs|