Code Duplication is Bad, Right?!

Code duplication is often considered a bad practice in programming. The DRY (Don't Repeat Yourself) principle advocates against it, and developers usually put in a lot of effort to identify and eliminate duplicated code when refactoring. However, is it always true that code duplication is harmful? I believe that there is more to this story.

Most new and semi-refactored code contains duplication in some form. Then it gets refactored, hopefully reducing it to only the essentials. This is the natural life cycle of code. Aiming to write the "perfect code" on the first try is like going for a run without a proper warmup. Anyone who runs without warming up can expect either an injury or poor performance. Writing far-from-perfect code full of duplication is the warmup needed to create the so-called "perfect code."

Making the "perfect code" is a process. Attempting to strictly adhere to a principle like DRY from the start can hinder that process and introduce complexity before it is needed, much like premature optimization.

Example

Here is an example (sorry, but it's React):

    <form>
      {header && <h2>{header}</h2>}
      <label>
        Field 1:
        <input type="text" name="field1" value={formData.field1} onChange={handleChange} />
      </label>
      <label>
        Field 2:
        <input type="text" name="field2" value={formData.field2} onChange={handleChange} />
      </label>
      <label>
        Field 3:
        <input type="text" name="field3" value={formData.field3} onChange={handleChange} />
      </label>
      <div>
        <button onClick={handleSave}>Save</button>
        {onDelete && <button onClick={handleDelete}>Delete</button>}
      </div>
    </form>

This code has a lot of repetition, but it's simple enough to maintain, and it works well as a first draft. An improvement could be:

    <form>
      {header && <h2>{header}</h2>}
      <FormField label="Field 1" name="field1" value={formData.field1} onChange={handleChange} />
      <FormField label="Field 2" name="field2" value={formData.field2} onChange={handleChange} />
      <FormField label="Field 3" name="field3" value={formData.field3} onChange={handleChange} />
      <div>
        <button onClick={handleSave}>Save</button>
        {onDelete && <button onClick={handleDelete}>Delete</button>}
      </div>
    </form>

This code still has some duplication, but it improves maintainability by grouping each form field into a component. That reduces the number of places to touch when the time to change comes, and it doesn't introduce much complexity.

In contrast, applying DRY strictly from the start, we could very easily end up with more complex code such as:

    <form>
      {header && <h2>{header}</h2>}
      {formFields.map((field, index) => (
        <FormField
          key={field.name}
          label={field.label}
          name={field.name}
          value={field.value}
          onChange={(e) => handleChange(index, e.target.value)}
        />
      ))}
      <div>
        <button onClick={handleSave}>Save</button>
        {onDelete && <button onClick={handleDelete}>Delete</button>}
      </div>
    </form>

This type of code usually requires a lot of context to know where to make a change. Several code hops are needed to understand what it does: the reader has to find where `formFields` is defined and notice that `handleChange` now takes an index and a value instead of an event. That reduces readability, and reduced readability hurts maintainability, even though maintainability improved slightly from a purely code perspective, since adding one more field is just a matter of editing an array. It's worth noting that this type of code can improve readability in some cases, but in most cases it decreases it.
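To make the extra context concrete, here is a minimal sketch of the change-handling logic that the earlier versions and the mapped version seem to imply. The article never shows `handleChange`, so the helper names (`applyNamedChange`, `applyIndexedChange`) and the shape of `formFields` are assumptions, written as plain functions so they can run outside React.

```javascript
// Assumed logic behind the first two versions: state is a flat object keyed
// by the input's `name`, so the event alone says which value to update.
function applyNamedChange(formData, event) {
  const { name, value } = event.target;
  return { ...formData, [name]: value };
}

// Assumed logic behind the mapped version: state is an array of field
// descriptors, so the reader also needs to know the array's shape and order.
function applyIndexedChange(formFields, index, value) {
  return formFields.map((field, i) =>
    i === index ? { ...field, value } : field
  );
}

// Flat state: the update is readable from the call itself.
const formData = { field1: "", field2: "", field3: "" };
const nextData = applyNamedChange(formData, {
  target: { name: "field2", value: "hello" },
});

// Array state: index 1 only means "Field 2" if you know the array's contents.
const formFields = [
  { label: "Field 1", name: "field1", value: "" },
  { label: "Field 2", name: "field2", value: "" },
  { label: "Field 3", name: "field3", value: "" },
];
const nextFields = applyIndexedChange(formFields, 1, "hello");
```

Neither helper is complicated on its own. The point is that the second one cannot be understood without also reading the definition of `formFields`, which is one of the code hops described above.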

A possible better way

It can be helpful to view code duplication as a sign that refactoring is needed. Duplications are essentially sections of code that share the same structure but differ in their input data, so they point directly at where the code can be improved and at which parts should become parameters. Refactored code typically has a better structure than code written to be generic from the start, because it was shaped by actual markers of duplication rather than hypothetical ones.
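As a small illustration of duplication acting as a marker, consider the sketch below. The validation helpers are hypothetical and not part of the form example; they only show how repeated structure with different input data indicates which parts should become parameters.

```javascript
// First draft: two hypothetical helpers that share the same structure and
// differ only in the field name and the label used in the message.
function validateField1(formData) {
  return formData.field1.trim() === "" ? "Field 1 is required" : null;
}

function validateField2(formData) {
  return formData.field2.trim() === "" ? "Field 2 is required" : null;
}

// Refactor guided by the duplication: the parts that varied (name, label)
// become parameters, and nothing more generic than that is introduced.
function validateRequired(formData, name, label) {
  return formData[name].trim() === "" ? `${label} is required` : null;
}

const sample = { field1: "", field2: "filled" };
console.log(validateRequired(sample, "field1", "Field 1")); // Field 1 is required
console.log(validateRequired(sample, "field2", "Field 2")); // null
```

The refactored helper only generalizes what the two drafts showed to be different, which is the sense in which duplication marks the refactor rather than a guess about future needs.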

Conclusion

In conclusion, code duplication is often detrimental. Principles like DRY aim to reduce it, but applying them prematurely can increase code complexity, hurting readability and maintainability. A better way to see code duplication is as a marker for where to refactor.