5 October 2024

(Small) Size Matters Not

How small is too small for a repository, module, package, file, struct, method or function?

Do you prefer to work on a system with many small parts, or a few very large ones? As a follow-up, consider this: despite the inherent overhead involved, are any of the following too small or simple to justify their definition?

  1. a single-line function/method
  2. a single-field struct/class
  3. a single-method interface
  4. a single-file Go package
  5. a single-package Go module
  6. a single-file git repository

My opinion: Certainly not!

It was probably the following pronouncement, from a book I read when I was first really learning to program, that helped to form my thoughts on this matter (emphasis added):

Initially, you may think that it's silly to create a class with a bunch of one- or two-line methods. Actually, it's quite common for a well-modularized program to have lots of trivial methods. The point of design is to break a problem down into simpler pieces. If those pieces are so simple that they are obvious, that gives us confidence that we must have gotten it right. --Python Programming: An Introduction to Computer Science, First Edition, Chapter 12, p. 395, John Zelle

A Go Code Example

Consider the following Go interface:

type Printer interface {
    Printf(format string, args ...any)
}

A sensible default implementation of the above interface might be as simple as this:

type NopPrinter struct{}

func (NopPrinter) Printf(string, ...any) {}

The Printer interface has only a single method. The NopPrinter struct defines no fields, and its Printf method contains zero statements. Notwithstanding these facts, it would be silly to conclude that the overhead of defining these simple elements outweighs their value.

The (separate) elements of this blog

It may surprise you to learn that this blog is powered not by one, but by three totally separate git repositories:

  1. Content
    • A private repository containing markdown files and HTML template files.
  2. HTML
    • A private repository containing the content in repository 1, but as fully-rendered HTML pages and listings. This repository is integrated with a Netlify deployment such that any git push operation results in a deployment of the newly pushed content.
  3. github.com/mdwhatcott/huguinho
    • The code that, when compiled and executed, converts the content in repository 1 into the fully rendered HTML found in repository 2.

Of course, all three of these repositories could be combined into one, and some would argue that this would be a simpler arrangement for long-term maintenance. But that decision would be, as all engineering decisions are, a trade-off. One implication would be to prevent the code in repository 3 from being open-sourced (because some of the content is unlisted and private). Another would be to force the intermixing of three kinds of commits: those that add rendered content (requiring a deployment), those that update content still being drafted, and those that provide patches and new features for the blog-generating code.

The contents of all three repositories change for (very) different reasons and on different timetables. Repository 3 is relatively stable at this point, rarely needing attention. Repository 1 is what I work with on a regular basis as I write content. Repository 2 is managed almost entirely by scripts and stays outside my awareness most of the time. Even if the contents of any of these repositories amounted to just a single file, their existence would be no less justified.

In summary: regardless of the (small) size of each element in a system, we should separate, version, and deploy elements that change for different reasons and on different timetables.