Refactoring and Good Code

Our code is not always the best one which we can do, and we have to assume this fact. Sometimes our solutions, designs, algorithms are not good, because they have been implemented with pressure due to a deadline. So it would be a good option in a close future to make a refactor in our code as part of our development. The main reasons are the following:
 
  • Improve code readability
  • Reduce complexity
  • Improve the maintainability of the source code
  • Improve extensibility
¿What is maintainability?
It is how difficult to fix a defect or add new improvement over one functionality which it has been created.
 
 
¿What is extensibility?
It is a system design principle where the implementation takes future growth into consideration. It is a systemic measure of the ability to extend a system and the level of effort required to implement the extension
 
 
¿How can I know if my code bad smells?
If your source code have some of the following symptoms or features, likely you code bad smells:
  • Code within classes
    • Long methods
      • The ideal length of a method has different opinions.
        In the Clean Code: A Handbook of Agile Software Craftsmanship, Robert Martin claims: “The first rule of functions is that they should be small. The second rule of functions is that they should be smaller than that. Functions should not be 100 lines long. Functions should hardly ever be 20 lines long“.

        Also Kent Beck claims that “Every function in his program was just two, or three, or four lines long. Each was transparently obvious. Each told a story. And each led you to the next in a compelling order. That’s how short your functions should be!“, so both have the same opinion.
    • Comments
      • On the one hand,  comments should not be necessary because the code should read as natural language and easy to understand.
        On the other hand, this does not always happen and comments would be necessary in order to clarify some aspect of the code, however you should not use these ones in every line since it would distract of the most important the code.
      • There is a good quote abouts comments of Robert Martin: “A long descriptive name is better than a short enigmatic name. A long descriptive name is better than a long descriptive comment.”
    • Excessive number of arguments
      • The number of numbers of a function or procedure should be as few as possible. If your function have a lot of parameters, maybe you are doing something wrong and you have to define your method in another way or perhaps you have to split it in more methods.
        Regarding the number of arguments Robert Martin has claimed in the clean code book: “The ideal number of arguments for a function is zero (niladic). Next comes one (monadic), followed closely by two (dyadic). Three arguments (triadic) should be avoided where possible. More than three (polyadic) requires very special justification — and then shouldn’t be used anyway.
    • Dead code
      • It means unnecessary, inoperative code that can be removed. 
    • Duplicate code
      • It is when the same code is used more than once. This problem is very important in the software development and we have to try to avoid it in order to increase the quality of the source code. DRY (Don’t repeat yourself) principle can be a good solution in order to avoid the duplicate code, if you want to know more information about this principle you can take a look this conversation among the Pragmatic Programmers Andy Hunt, Dave Thomas and Bill Venners about maintenance programming.
        Also you can check the following article DRY Principle
    • Conditional huge complexity
      • Conditional statements tend to be very complex, so that we have to try to simplify them. To simplify them, you can decompose the complex condition in some methods with proper names or you can use logic rules to reduce the complexity of the condition.
    • Wrong names
      • When a developer is writing source code, he has to choose the proper names for methods, variables and constants.  Usually, we want to save some time and we do not spend enough time to choose the proper names, because the names have to meet the following criteria:
        • Names have to reveal your intentions
        • Avoid disinformation
        • Good names length
        • Use one word per concept
        • Always code in one notation

      • Regarding the names of methods,  variables, classes and other elements Robert Martin in the Clean Code Book says the following:

        The proper use of comments is to compensate for our failure to express our self in code

        Also he claims other statements about the names:

        Functions should do something, or answer something, but not both.

        Methods should have verb or phrase names.”

        One way to know that a function is doing more than ‘one thing’ is if you can extract another function from it with a name that is not merely a restatement of its implementation.

        Goods examples of name for methods would be the followings: “handleXYZ”, “getXYZ”, “calculateXYZ”

    • Large class
      • A class should be small, as small as it would be possible. Regarding the Refactoring in Large Software ProjectsMartin Lippert and Stephen Roock talk about the Rule of 30:
        If an element consists of more than 30 sub-elements, it is highly probable that there is a serious problem:a) Methods should not have more than an average of 30 code lines (not counting line spaces and comments).b) A class should contain an average of less than 30 methods, resulting in up to 900 lines of code.c) A package shouldn’t contain more than 30 classes, thus comprising up to 27,000 code lines.d) Subsystems with more than 30 packages should be avoided. Such a subsystem would count up to 900 classes with up to 810,000 lines of code.e) A system with 30 subsystems would thus possess 27,000 classes and 24.3 million code lines.Maybe the Rule of 30 is a good guide in order to create clean classes.
    • Speculative generality
      • You should write code for the problem which you have found and you do not worry about tomorrow’s problems when these ones definitely happen. And you should not lose between “what if…”  and “in a future if …”.
        Within the Extremeprogramming practices, it exists a principle or good practice whose name is YagNi (YouArentGonnaNeedIt) and it talks about fixing the current problem and you do not think in unlikely flows.

  • Between Classes
    • Message chains
      • It happens when there are a lot of method calls in order to get some data or information of a object. These chains of methods call do the code depends on the relationship between unrelated object.
    • Middle man
      • Sometimes a feature of model can be used in bad way. For instance, in Object Model you can delegate the behaviour of a class in different classes, however it might be possible that one of these delegated classes do not have any responsibility, so it was wrong because it can not exist a class without responsibility. Be aware with the classes which are simply wrappers.
    • Shotgun surgery
      • It happens when you have to change a lot of pieces of your source code in different places to add a new functionality in your application.
    • Lazy Class
      • Every classes have a cost of money and time to understand and maintain them, so that if there are classes that they can not pay for your attention. Theses classes should be removed,  because they increase the complexity and maintainability of the project.
    • Alternative classes with different interfaces
      • It happens when it exists two different classes with different interfaces, but they have similar behaviour so you would make a refactor  using an adapter pattern to unify both interfaces.
    • Primitive obsession
      • The main cause for this problem is: “Use primitive data types to represent domain ideas”. It used to happen in a weakness moment, when the programmer decides to take the simple option instead of thinking another kind of solution more difficult than the previous one (For example: create a new class to represent the new field).

        Primitives are often used to “simulate” types. So instead of a separate data type, you have a set of numbers or strings that form the list of allowable values for some entity. Easy-to-understand names are then given to these specific numbers and strings via constants, which is why they are spread wide and far.
    • Data class
      • When it exists classes with only attributes and getters and setters methods, these classes are merely stored data for other classes, and these ones do not contain any additional functionality to work for they own.

        It’s a normal thing when a newly created class contains only a few public fields (and maybe even a handful of getters/setters). But the true power of objects is that they can contain behaviour types or operations on their data.
    • Data clumps
      • Sometime in different parts of the code there is the same group of variables close to the same object, so it is likely that this group of variables belong to the class of that object. Therefore you should to extract that variables to that class.
        The main reason of this problem is “the copy-pasta programming” and an poor data structure.
    • Refused Bequest
      • It smells when you have a class which inherits of another class but you do not use all the functionality provided by the bequest. So the hierarchy is off-kilter and the unneeded methods might go unused.
    • Indecent Exposure
      • Encapsulate is one of the most important features of the Software Design, in addition this one is not used properly and as a programmer when we design and develop a class, we publicity attributes or functionalities that they do not be shown. This fact contributes to increase the complexity of the design.
    • Feature Envy
      • This error helps to increase the coupling and duplicity code of the application which we are developing. It happens if a class makes huge calls to other class or classes so as to obtain data or functionality. In that case, it may move the functionality on data to this class as well.
    • Divergent changes
      • It occurs when somebody has made changes to class and the developer has included new behaviours that are very different that the previous ones. Therefore it contains too much unrelated functionality, so the developer should isolate that changes in other class.
    • Parallel Inheritance Hierarchies
      • While you are developing, you making a subclass of one class. Instead of making a subclass of another. So when you have defining the inheritance of a application, this structure has to be thought pretty well to avoid errors like that.
    • Solution sprawl
      • Sometimes, we have implemented a solution in order to solve a problem or perform a responsibility, which it has become sprawled across several classes. This fact is the result of implementing a quick solution or feature without spending enough time simplifying and consolidating the design.
 
 
¿What can I do in order to improve our legacy code?
Refactoring would be a good answer, in fact there are some techniques which it can be used to get better our source code. The most important are the following:
  • Composing methods
  • Moving features between objects
  • Organizing data
  • Simplifying conditional expressions
  • Simplifying method calls
Also there is some practices such us S.O.L.I.D principles, DRY principle, Boy Scout rule and others, which you can use in order to improve your quality code and may avoid doing a refactor.
 
Finally, in following articles we will delve into the different symptoms of the smell code to study why this happens, how we can fix it and how we try to avoid it.
 
References:

blogSignature1

Bookmark the permalink.

Leave a Reply

Your email address will not be published. Required fields are marked *