Refactoring

As an application grows, poorly structured sections can accumulate in the source code and impair the usability and compatibility of the program. The remedy is either to rewrite the source code from scratch or to restructure it in small steps. Many programmers and companies increasingly opt for code refactoring to improve working software over the long term and to make it more legible and clearer for other programmers.

Refactoring raises the question of which problem in the code should be solved with which method. It is now considered one of the basics of learning to code and is becoming more and more important. Which methods are used, and what are their advantages and disadvantages?

What is refactoring?

Programming software is a lengthy process that can involve multiple developers. Written source code is often revised, changed, and expanded during this work. As a result of time pressure or outdated practices, inelegant sections can accumulate in the source code. These are known as code smells. These weak spots that accrue over time endanger the usability and compatibility of the program. To prevent this gradual erosion and deterioration of the software, refactoring is necessary.

In principle, refactoring is similar to editing a book. Editing does not produce a completely new book, but a more understandable text. Just as editing involves various approaches such as cutting, reformulating, deleting, and restructuring, code refactoring encompasses a number of methods such as encapsulation, reformatting, and extraction that improve code without changing its function.

This process is much more cost-effective than preparing an entirely new code structure. Especially in iterative and incremental software development, as well as agile software development, refactoring plays a major role, since programmers frequently need to alter software in these cyclical models. In this context, refactoring is a fixed step in the workflow.

When source code deteriorates: spaghetti code

First, it’s important to understand how code can age and mutate into spaghetti code. Whether due to time pressure, lack of experience, or unclear instructions, unnecessarily complicated commands make code harder to work with and can eventually cost it functionality. The faster an area of application moves and the more complex it is, the more quickly code deteriorates.

Spaghetti code refers to confusing, unreadable source code that programmers can interpret only with great difficulty. Simple examples of confusing code include superfluous jump commands (GOTO) that make the program skip back and forth in the source code, or unnecessary for/while loops and if statements.
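To make this concrete, here is a small sketch of what such tangled control flow can look like and how the same logic reads once it is stated directly. Java has no GOTO, so the example uses a labeled loop as the closest equivalent. The class and method names are hypothetical and chosen only for illustration.

```java
import java.util.List;

// Hypothetical example: both methods answer "does the list contain a negative number?"
class SpaghettiDemo {
    // Tangled: a flag, a labeled loop with jumps, and redundant branches obscure a simple question.
    static boolean hasNegativeTangled(List<Integer> values) {
        boolean found = false;
        int i = 0;
        scan:
        while (true) {
            if (i < values.size()) {
                if (values.get(i) >= 0) {
                    i++;
                    continue scan;
                } else {
                    found = true;
                    break scan;
                }
            } else {
                break scan;
            }
        }
        if (found == true) {
            return true;
        } else {
            return false;
        }
    }

    // Same behavior, stated directly.
    static boolean hasNegative(List<Integer> values) {
        for (int value : values) {
            if (value < 0) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        List<Integer> sample = List.of(3, 0, -2, 7);
        System.out.println(hasNegativeTangled(sample)); // true
        System.out.println(hasNegative(sample));        // true
    }
}
```

Both versions return the same result for every input, which is exactly the property refactoring has to preserve.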

Projects involving many software developers are particularly susceptible to unclear source code. When code passes through many hands and the original already contains weak points, a growing mess of workarounds can hardly be avoided and eventually necessitates a costly code review. In severe cases, spaghetti code can jeopardize the entire development of the software. If the problem gets that far, it may even be too late for code refactoring.

Code smells and code rot are not quite so disastrous. Over time, code can start to smell, metaphorically speaking, because of its inelegant sections. Difficult-to-understand parts become worse as other programmers intervene or add new code. If refactoring is not performed at the first signs of code smell, the source code will gradually lose functionality as a result of code rot.

The aim of refactoring

The intention behind refactoring is simply to achieve better code. Well-structured code allows new code elements to be integrated without introducing new errors. Programmers who can read the code effortlessly familiarize themselves with a developing application faster and can remove or avoid bugs more easily. Another goal of refactoring is to improve error analysis and the maintainability of software. This considerably simplifies the work of programmers reviewing the code.

What sources of errors does refactoring solve?

The techniques applied in refactoring are as varied as the errors they’re intended to remove. Essentially, each refactoring is defined by the problem it addresses and comprises the steps required to shorten or replace the existing solution. Sources of errors that can be resolved with refactoring methods include:

Confusing or excessive code: Statement sequences and blocks are so long that outside programmers are unable to understand the internal logic of the software.
Code duplications (redundancies): Unclear code often contains redundancies that have to be changed separately at each occurrence during maintenance, thereby wasting time and resources.
Excessive parameter lists: Objects are not passed to a method directly; instead, their attributes are passed individually in a long parameter list.
Classes with too many functions: Classes with too many functions defined as methods, also known as god objects, make adjusting the software almost impossible.
Classes with too few functions: Classes with so few functions defined as methods that they are unnecessary.
Overly general code with special cases: Functions that handle very specific special cases that hardly ever occur, if at all, and therefore make adding necessary extensions more difficult.
Middle man: A separate class acts as a “middle man” that merely forwards calls to other classes, instead of callers addressing those classes directly.
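As an illustration of one item from this list, the following sketch shows an excessive parameter list and one possible fix, passing the object that owns the attributes instead. The Address class, its fields, and the method names are assumptions made up for this example.

```java
// Hypothetical example of the "excessive parameter list" problem and one way to resolve it.
class ParameterObjectDemo {
    // Before: the attributes of one address travel as four loose parameters.
    static String labelBefore(String street, String houseNumber, String postalCode, String city) {
        return street + " " + houseNumber + ", " + postalCode + " " + city;
    }

    // The object that naturally owns those attributes (illustrative, not from the article).
    static final class Address {
        final String street;
        final String houseNumber;
        final String postalCode;
        final String city;

        Address(String street, String houseNumber, String postalCode, String city) {
            this.street = street;
            this.houseNumber = houseNumber;
            this.postalCode = postalCode;
            this.city = city;
        }
    }

    // After: the method receives the object itself.
    static String labelAfter(Address address) {
        return address.street + " " + address.houseNumber + ", "
                + address.postalCode + " " + address.city;
    }

    public static void main(String[] args) {
        Address office = new Address("Main Street", "1", "12345", "Springfield");
        System.out.println(labelBefore("Main Street", "1", "12345", "Springfield"));
        System.out.println(labelAfter(office)); // same text as the line above
    }
}
```

With the object as the parameter, adding a further attribute later changes the Address class but not every call site.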

What approach does refactoring involve?

Refactoring should always be performed before changing a program function. Ideally, it proceeds in very small steps, with each code change tested as part of practices like test-driven development (TDD) and continuous integration (CI). In a nutshell, TDD and CI involve continuously building, integrating, and testing small, new code sections, often with automated test runs.

As a rule, change the program internally only in small steps, without affecting its external behavior. After each change, run the automated tests if possible.
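A minimal sketch of this rule, without any test framework: a handful of known input/output pairs pins down the external behavior, and the check is rerun after every small internal change. The price calculation and all names here are hypothetical. In a real project a test framework and a CI server would typically take over this role.

```java
// Hypothetical example: a behavior check that is rerun after each small refactoring step.
class SmallStepsDemo {
    // Code under refactoring: gross price in cents for a net price and a tax rate in percent.
    static long grossCents(long netCents, int taxPercent) {
        return netCents + netCents * taxPercent / 100;
    }

    // Pins down the externally visible behavior with known input/output pairs.
    static void verifyBehavior() {
        expect(11900, grossCents(10000, 19));
        expect(0, grossCents(0, 19));
        expect(10000, grossCents(10000, 0));
    }

    // Tiny assertion helper so the example needs no external library.
    static void expect(long expected, long actual) {
        if (expected != actual) {
            throw new AssertionError("expected " + expected + " but got " + actual);
        }
    }

    public static void main(String[] args) {
        verifyBehavior(); // run after every internal change
        System.out.println("behavior unchanged");
    }
}
```

If a restructuring step accidentally alters a result, the check fails immediately and the small step is easy to undo.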

What techniques exist?

A range of refactoring techniques exists. A comprehensive overview can be found in Martin Fowler’s book Refactoring: Improving the Design of Existing Code, written with contributions from Kent Beck and others. Here’s a brief summary:

Red-green development

Red-green development is a test-driven method of agile software development. It is used when a new function is to be integrated into existing code. Red stands for the first test run, performed before the new function is implemented, which therefore fails. Green stands for the simplest possible code that makes the test pass. The extension is thus built up under constant test runs that filter out defective code. Red-green development provides a foundation for continuous refactoring in continuous software development.
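The cycle can be sketched in a few lines. The example assumes a hypothetical requirement, namely that a postal code is valid when it consists of exactly five digits, and uses plain Java instead of a test framework. The file shows the green state; the comments describe what the red state looked like.

```java
// Hypothetical red-green example; the five-digit rule is an assumption for illustration.
class RedGreenDemo {
    // Step 1 (red): this test is written first. While isValidPostalCode is still a stub
    // (for example "return false;"), running the test fails.
    static void testPostalCode() {
        check(isValidPostalCode("12345"), "five digits are valid");
        check(!isValidPostalCode("1234"), "four digits are too short");
        check(!isValidPostalCode("12a45"), "letters are not allowed");
    }

    // Step 2 (green): the simplest implementation that makes the test pass.
    static boolean isValidPostalCode(String code) {
        return code.matches("\\d{5}");
    }

    // Tiny assertion helper so the example needs no external library.
    static void check(boolean condition, String description) {
        if (!condition) {
            throw new AssertionError("failed: " + description);
        }
    }

    public static void main(String[] args) {
        testPostalCode();
        System.out.println("green");
    }
}
```

Once the test is green, the implementation can be restructured freely as long as the test keeps passing.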

Branching by abstraction

This refactoring method describes a gradual change to a system in which old code is converted into new sections step by step. Branching by abstraction is typically used for large applications that involve class hierarchies, inheritance, and extraction. By introducing an abstraction that initially delegates to the old implementation, you can point other methods and classes at the abstraction and then replace the old code behind it.

This often occurs via pull-up or push-down methods. They link the abstraction to a new, better function and transfer the links to it. In doing so, they either move members of a sub-class up into a higher class (pull-up) or move members of a higher class down into its sub-classes (push-down).

You can then delete the old functions without endangering the overall functionality. With these small changes, the system works unchanged while you gradually replace inelegant code with neat code, section by section.
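The following sketch outlines the idea under simplified assumptions: an interface serves as the abstraction, the old implementation stays in place behind it, and a new implementation is built alongside until callers can be switched over. The PriceFormatter interface and both implementations are hypothetical.

```java
import java.util.Locale;

// Hypothetical branch-by-abstraction example.
class BranchByAbstractionDemo {
    // Step 1: introduce an abstraction in front of the code that is to be replaced.
    interface PriceFormatter {
        String format(long cents);
    }

    // Step 2: the old implementation keeps working behind the abstraction.
    static class LegacyPriceFormatter implements PriceFormatter {
        public String format(long cents) {
            String s = "" + cents / 100 + ".";
            long rest = cents % 100;
            if (rest < 10) {
                s = s + "0";
            }
            return s + rest + " EUR";
        }
    }

    // Step 3: the new implementation is built alongside the old one.
    static class ModernPriceFormatter implements PriceFormatter {
        public String format(long cents) {
            return String.format(Locale.ROOT, "%d.%02d EUR", cents / 100, cents % 100);
        }
    }

    // Callers depend only on the abstraction, so the implementation can be swapped.
    static String receiptLine(String item, long cents, PriceFormatter formatter) {
        return item + ": " + formatter.format(cents);
    }

    public static void main(String[] args) {
        PriceFormatter active = new LegacyPriceFormatter();
        System.out.println(receiptLine("Coffee", 1205, active)); // Coffee: 12.05 EUR
        active = new ModernPriceFormatter(); // Step 4: switch over, then delete the old class
        System.out.println(receiptLine("Coffee", 1205, active)); // Coffee: 12.05 EUR
    }
}
```

Because both implementations sit behind the same interface, the switch is a one-line change and the system keeps running at every intermediate step.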

Composing methods

Refactoring is intended to make methods as legible as possible. Ideally, outside programmers should be able to grasp the internal logic of a method when reading the code. There are a number of different techniques for composing methods well. The aim of each change is to harmonize methods, remove redundancies, and split excessively long methods into separate sections, thereby opening them up to future changes.

Such techniques include:

Method extraction
Method inlining
Removing temporary variables
Replacing temporary variables with a query method
Introducing descriptive variables
Separating temporary variables
Removing assignments to parameter variables
Replacing a method with a method object
Replacing an algorithm
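Method extraction, the first technique in the list above, might look like this in practice. The invoice example and all names are hypothetical. The point is only that each step of the long method receives its own descriptively named method.

```java
import java.util.List;

// Hypothetical method-extraction example.
class ExtractMethodDemo {
    // Before: one method both computes the total and builds the output text.
    static String invoiceBefore(String customer, List<Integer> amounts) {
        int total = 0;
        for (int amount : amounts) {
            total += amount;
        }
        return "Invoice for " + customer + "\n" + "Total: " + total;
    }

    // After: the method reads like a summary; details live in extracted methods.
    static String invoiceAfter(String customer, List<Integer> amounts) {
        return header(customer) + "\n" + totalLine(sum(amounts));
    }

    static String header(String customer) {
        return "Invoice for " + customer;
    }

    static int sum(List<Integer> amounts) {
        int total = 0;
        for (int amount : amounts) {
            total += amount;
        }
        return total;
    }

    static String totalLine(int total) {
        return "Total: " + total;
    }

    public static void main(String[] args) {
        List<Integer> amounts = List.of(10, 20, 30);
        System.out.println(invoiceBefore("Ada", amounts));
        System.out.println(invoiceAfter("Ada", amounts)); // identical output
    }
}
```

The extracted methods can now be tested and reused individually, which the original single method did not allow.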

Moving attributes between classes

To improve code, you sometimes need to move attributes or methods between classes. The following techniques are used for this:

Move method
Move attribute
Extract class
Inline class
Hide delegate
Remove middle man
Introduce foreign method
Introduce local extension
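As one example from this list, hiding a delegate could be sketched as follows. The Employee and Department classes are hypothetical. Before the change, callers have to navigate through the department themselves; afterwards, the employee offers the information directly.

```java
// Hypothetical hide-delegate example.
class HideDelegateDemo {
    static class Department {
        private final String manager;

        Department(String manager) {
            this.manager = manager;
        }

        String getManager() {
            return manager;
        }
    }

    static class Employee {
        private final Department department;

        Employee(Department department) {
            this.department = department;
        }

        // Before the refactoring, callers needed this accessor to reach the manager.
        Department getDepartment() {
            return department;
        }

        // After: the delegate is hidden behind a method on Employee itself.
        String getManager() {
            return department.getManager();
        }
    }

    public static void main(String[] args) {
        Employee employee = new Employee(new Department("Grace"));
        System.out.println(employee.getDepartment().getManager()); // before: Grace
        System.out.println(employee.getManager());                 // after: Grace
    }
}
```

Callers that use employee.getManager() no longer depend on how departments are structured. If too many such forwarding methods accumulate, the opposite technique of removing the class in the middle applies.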

Data organization

These techniques aim to organize data into classes that are as neat and clear as possible. Remove unnecessary links between classes, because such coupling can break functionality when even minor changes are made, and regroup the data into coherent classes.

Examples of techniques include:

Encapsulating a class’s own attribute accesses (self-encapsulation)
Replacing an attribute value with an object
Replacing a value with a reference
Replacing a reference with a value
Duplicating observed data
Encapsulating attributes
Replacing a record with a data class
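Encapsulating attributes, for instance, can be sketched like this. The Office class and its validation rule are assumptions for illustration. The public field is made private and accessed through methods, which gives the class a single place to enforce its rules.

```java
// Hypothetical encapsulate-attribute example.
class EncapsulateFieldDemo {
    // Before: any code can read and overwrite the field, including with invalid values.
    static class OfficeBefore {
        public String postalCode;
    }

    // After: the field is private and every write passes through one checked method.
    static class Office {
        private String postalCode;

        Office(String postalCode) {
            setPostalCode(postalCode);
        }

        String getPostalCode() {
            return postalCode;
        }

        void setPostalCode(String postalCode) {
            // Illustrative rule: the postal code must be present.
            if (postalCode == null || postalCode.isEmpty()) {
                throw new IllegalArgumentException("postal code must not be empty");
            }
            this.postalCode = postalCode;
        }
    }

    public static void main(String[] args) {
        Office office = new Office("12345");
        office.setPostalCode("54321");
        System.out.println(office.getPostalCode()); // 54321
    }
}
```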

Simplifying conditional expressions

While refactoring, you should simplify conditional expressions as far as possible. The following techniques can be applied to this end:

Decomposing conditions
Merging conditional expressions
Merging repeated instructions in conditional expressions
Removing control flags
Replacing nested conditions with guard clauses
Replacing case distinctions with polymorphism
Introducing null objects
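Guard clauses are among the easiest of these techniques to demonstrate. In the following sketch, the pay rules and all names are hypothetical. The nested version and the guard-clause version return the same value for every input, but the second one handles each special case up front and then states the normal case plainly.

```java
// Hypothetical guard-clause example.
class GuardClauseDemo {
    // Before: the normal case is buried inside nested conditions.
    static int payBefore(boolean retired, boolean onLeave, int baseSalary) {
        int result;
        if (!retired) {
            if (!onLeave) {
                result = baseSalary;
            } else {
                result = baseSalary / 2;
            }
        } else {
            result = 0;
        }
        return result;
    }

    // After: each special case exits early through a guard clause.
    static int payAfter(boolean retired, boolean onLeave, int baseSalary) {
        if (retired) {
            return 0;
        }
        if (onLeave) {
            return baseSalary / 2;
        }
        return baseSalary;
    }

    public static void main(String[] args) {
        System.out.println(payBefore(false, true, 3000)); // 1500
        System.out.println(payAfter(false, true, 3000));  // 1500
    }
}
```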

Simplifying method calls

Method calls can be made simpler and easier to understand using the following techniques, for example:

Renaming methods
Adding parameters
Removing parameters
Replacing parameters with explicit methods
Replacing error codes with exceptions
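The last item in the list, replacing error codes with exceptions, could be sketched as follows. The withdrawal example and its names are hypothetical. In the first version, the caller has to know that -1 means failure; in the second, the failure cannot be mistaken for a valid result.

```java
// Hypothetical example of replacing an error code with an exception.
class ErrorCodeDemo {
    // Before: -1 is a magic return value that signals failure.
    static int withdrawBefore(int balance, int amount) {
        if (amount > balance) {
            return -1;
        }
        return balance - amount;
    }

    // After: failure is reported through an exception with a descriptive message.
    static int withdrawAfter(int balance, int amount) {
        if (amount > balance) {
            throw new IllegalArgumentException("amount exceeds balance");
        }
        return balance - amount;
    }

    public static void main(String[] args) {
        System.out.println(withdrawAfter(100, 30)); // 70
        try {
            withdrawAfter(100, 130);
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage()); // rejected: amount exceeds balance
        }
    }
}
```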

Refactoring example: renaming methods

In the following example, the method name in the original code does not make its purpose clear. The method is intended to output the postal (ZIP) code for an office address, but its name doesn’t say so. To make the code clearer, it’s a good idea to rename the method during refactoring.

Before:

String getPostalCode() {
	return (theOfficePostalCode + "/" + theOfficeNumber);
}
System.out.print(getPostalCode());

After:

String getOfficePostalCode() {
	return (theOfficePostalCode + "/" + theOfficeNumber);
}
System.out.print(getOfficePostalCode());

Refactoring: advantages and disadvantages

Advantages	Disadvantages
Better comprehensibility facilitates maintenance and the extensibility of the software	Careless refactoring could introduce new bugs and errors into the code
Restructuring the source code is possible without altering the functionality	There is no clear definition of “neat code”
Improved legibility makes the code easier for other programmers to understand	Improved code is often hard for the customer to recognize, since the functionality stays the same, i.e. the benefit is not self-evident
Removed redundancies and duplications improve the effectiveness of the code	In larger teams working on refactoring, the coordination effort required can be surprisingly high
Self-contained methods prevent local changes from having an effect on other parts of the code
Clean code with shorter, self-contained methods and classes is easier to test

In general, keep the two activities separate: when you add new functions, leave the structure of the existing source code unchanged, and when you alter the source code, i.e. carry out refactoring, do not add any new functions.
