What are CLOBs (Character Large Objects)?

Contents

Normally, databases save data as follows: Individual table columns are compiled as rows, which are then stacked as data blocks. In each data block, a certain piece of information is then assigned to a row. Saving data blocks in this way requires less storage space in the database.

But particularly large data objects are treated differently by databases. These Large Objects (LOBs) are much larger than conventional database entries, as well as unstructured, and are therefore usually stored in a separate storage location. The database then simply creates a reference to the storage location of the object at the corresponding position.

Two types of LOBs exist: BLOBs and CLOBs. A BLOB is a data type that saves large binary objects, while the CLOB file format stores long character strings. The abbreviation stands for Character Large Objects and the term was coined by the developers of the Oracle database. In other database systems, there are also other designations for large objects, such as TEXT in MySQL and PostgreSQL.

How Do CLOBs Work and What Are Their Applications?

Character Large Objects encompass all database objects with strings – i.e. all objects that contain files consisting of characters. Character strings are sequences of letters, digits, special characters, and control characters; CLOB entries therefore typically reference text documents (particularly XML). What’s special about storing these large data types is that they are not contained directly in the database – with few exceptions – but in a separate place. The database itself only contains a reference mechanism that points to the respective CLOB. For instance, the entire character code of a website is not stored in a database cell, but instead a reference number that links to the actual storage location of the underlying HTML document.

Pros and Cons of CLOB Storage

One of the biggest advantages of the CLOB data type is the ability to read and edit individual files. For example, database administrators can use application programs to compare, edit or merge the content of input Character Large Objects. A brief overview of the editing options is shown below:

SUBSTR allows users to extract strings from a CLOB
INSTR inserts cut strings into another position or other CLOBs
COMPARE compares the values of two CLOBs
APPEND connects two CLOBs with each other

Since CLOBs store the documents with text or characters in full, it’s possible to examine their content in detail. Administrators can determine exactly whether content is duplicated or text parts overlap. If this is the case, the datasets can be combined to free up unnecessarily occupied storage space.

In addition, storing letters, digits, special characters, and control characters as CLOBs is absolutely loss-free: When saving files, it often depends on whether all information within the document actually has to be saved – that isn’t an issue with this data format. For instance, you can also save the meta information of a document together with the actual content as a CLOB. Most databases therefore even support the presentation of Character Large Objects in a tree structure as well as navigation and searching for specific content.

The disadvantages of this data type for storing enormously large numbers of characters include the fact that some databases do not allow editing with familiar SQL functions. CLOBs store extremely large quantities of text, which means the execution of standard functions would take a very long time. However, at least the application commands mentioned above – SUBSTR, INSTR, COMPARE, and APPEND – are possible alternatives for replicating unavailable SQL operations. Another disadvantage of CLOBs is that relatively small data elements waste valuable storage space, since a defined, segmented storage space is reserved outside the database.

BLOBs (Binary Large Objects)

BLOB is short for Binary Large Object. The data types in a BLOB are typically classified as being unstructured data. A typical example of an unstructured data type is multimedia files, which are usually stored in databases as a BLOB. Since databases cannot read the unstructured…

Big Data

Timofey ManzyukShutterstock

ASCII: explanation and examples

Every day we work with text on PCs, smartphones, and tablets and expect letters, numbers, and characters to appear on the monitor before us. The fact is that there is a decades-old coding system behind this everyday function. Since the 1960s, ASCII has determined how computers…

Encyclopedia
Tutorials

ASAG StudioShutterstock

What is utf-8 and why is it important for global digital communication?

UTF-8 is a character encoding under Unicode that aims to encompass all modern languages for data processing. What exactly does “UTF-8” mean? And what is special about UTF-8 in the Unicode character set? Here you’ll learn about the structure of the coding and which bytes are…

Encyclopedia
Digitalization
Encryption

sdecoretShutterstock

An introduction to CouchDB

The creation and continued development of web applications require numerous data sets. How exactly these datasets are stored in databases is crucial for the user. The CouchDB database management system stores data in a way that is schema-free and document-oriented. Non-structured…

Database

FlashMovieShutterstock

Low-Code

What is low-code and when is it a good idea to use a low-code platform? Low-code development involves visual components instead of conventional programming techniques. Low-code is therefore particularly suited to technical amateurs who are unable to write complex code. But…

What are CLOBs (Character Large Objects)?

How Do CLOBs Work and What Are Their Ap­pli­ca­tions?

Pros and Cons of CLOB Storage

How Do CLOBs Work and What Are Their Applications?