2021-01-25 Pietro Martinelli

solid

LSP: an opinionated discussion

The Liskov Substitution Principle (LSP to its friends) is one of the five SOLID principles, and maybe the most misunderstood.

According to Wikipedia, it states that:
"Let P(x) be a property provable about objects x of type T. Then P(y) should be true for objects y of type S where S is a subtype of T."

More informally, the idea behind this principle is that we should not violate the contract published by the T supertype when we use or extend it.

I think it’s worth analyzing this idea deeply, in order to explain both classical and less trivial ways to violate the principle.

Generally speaking, we can try to classify LSP violations into three main classes:

Bad Client: the principle is violated due to the usage of the supertype
Bad Child: the principle is violated due to a crooked subtype implementation
Poor Modelling: the principle is violated due to the usage of a (general) type to model (less general) domain concepts

So, let’s look at some examples of violations belonging to each of the three classes.

Bad Client

The first example of an LSP violation I would like to talk about is a classic one: a bad client of a type hierarchy can break LSP by downcasting a reference to a specific, hardcoded subtype:

public <T> T lastElementOf(Collection<T> input) {
  var theList = (List<T>)input;
  return theList.isEmpty() ? null : theList.get(theList.size() - 1);
}

Callers of the method lastElementOf believe they can invoke it with any instance of any concrete implementation of the Collection interface, but every call that passes something other than an implementation of the List subinterface fails with a ClassCastException. lastElementOf is a bad client of the Collection type hierarchy, because not all of Collection’s subtypes are fully substitutable for the supertype when it comes to invoking the method.
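One possible way out is sketched below, under a couple of assumptions: the holder class LastElements and the method name lastElementOfList are hypothetical, and "last" is taken to mean "last in iteration order". The first method relies only on what the Collection contract promises, so any subtype can be passed; the second keeps index-based access but states the real requirement (a List) in its signature.

```java
import java.util.Collection;
import java.util.List;

// Hypothetical holder class, used here only to keep the sketch self-contained.
final class LastElements {
  private LastElements() {}

  // Uses only iteration, which every Collection supports: no downcast needed.
  // For unordered collections "last" simply means last in iteration order.
  static <T> T lastElementOf(Collection<T> input) {
    T last = null;
    for (T element : input) {
      last = element;
    }
    return last; // null for an empty collection, as in the original method
  }

  // Alternative: if index-based access is really needed, ask for a List.
  static <T> T lastElementOfList(List<T> input) {
    return input.isEmpty() ? null : input.get(input.size() - 1);
  }
}
```

With either variant the compiler, rather than a runtime ClassCastException, tells callers what they are allowed to pass.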

A subtle variation of this violation of LSP, which I have already written about, involves two unrelated interfaces. Here the cast assumes that the actual parameter type implements both interfaces, although the signature of BadInterfaceDowncastingClient promises to accept any FrontEndContext: the method below is therefore a bad client of the FrontEndContext interface.

public interface FrontEndContext {}

public interface BackEndContext {}

public class MyContext implements FrontEndContext, BackEndContext {}

public class ABoundaryService {
  public void BadInterfaceDowncastingClient(FrontEndContext ctx) {
    var context = (BackEndContext)ctx;
    doSomethingWith(context);
  }
}
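A minimal sketch of one way to remove the cast, assuming the method genuinely needs both views of the context: a generic method with an intersection bound makes that requirement part of the signature. The names honestClient and FrontEndOnlyContext, and the String returned by doSomethingWith, are hypothetical and only serve the illustration.

```java
interface FrontEndContext {}

interface BackEndContext {}

final class MyContext implements FrontEndContext, BackEndContext {}

// Hypothetical type that implements only one of the two interfaces.
final class FrontEndOnlyContext implements FrontEndContext {}

final class ABoundaryService {
  // The intersection bound states the real requirement: the argument
  // must implement both interfaces, so no downcast is needed.
  <C extends FrontEndContext & BackEndContext> String honestClient(C ctx) {
    return doSomethingWith(ctx);
  }

  // Hypothetical stand-in for the article's doSomethingWith.
  private String doSomethingWith(BackEndContext context) {
    return "handled";
  }
}
```

With this signature, new ABoundaryService().honestClient(new FrontEndOnlyContext()) is rejected at compile time instead of failing at run time.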

It must be said that LSP violations belonging to the bad client class are not very common in code written by experienced developers (although I did once find something very similar to the last example in code written by a self-styled software architect).

Bad Child

The second class of LSP violations worth mentioning is the one I like to call bad children: the violation consists of a subtype that implements the contract stated by the supertype badly.
The typical example you can find for this class of violations is that of a Square class extending Rectangle in a way that violates some supertype invariant (e.g. the idea that width and height can be changed independently), leading to surprising behaviour.
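The textbook case can be sketched as follows. This is a hedged illustration rather than code from a real project: the class layout, the setter names and the widenTo helper are all assumptions made for the example.

```java
class Rectangle {
  protected int width;
  protected int height;

  Rectangle(int width, int height) {
    this.width = width;
    this.height = height;
  }

  // Supertype contract: each dimension can be changed independently.
  void setWidth(int width) { this.width = width; }
  void setHeight(int height) { this.height = height; }

  int area() { return width * height; }
}

class Square extends Rectangle {
  Square(int side) { super(side, side); }

  // Keeps the square "square", but silently changes the other dimension too.
  @Override void setWidth(int width) { this.width = width; this.height = width; }
  @Override void setHeight(int height) { this.width = height; this.height = height; }
}

final class RectangleClient {
  // Written against Rectangle: assumes the height is untouched by setWidth.
  static int widenTo(Rectangle r, int newWidth) {
    r.setWidth(newWidth);
    return r.area();
  }
}
```

A client widening a 3x3 shape to width 5 expects an area of 15; with a Square it gets 25, because the overridden setter also changed the height.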

A less didactic and more realistic example is the following, where the InMemoryBin<T> implementation of the Bin<T> interface subtly breaks the contract of the addForever(T item) method:

public interface Bin<T> {
  void addForever(T item);
}

public class InMemoryBin<T> implements Bin<T>  {
  private static final int MAX_SIZE = 50;
  private int currentIndex = -1;
  @SuppressWarnings("unchecked") // Java cannot create generic arrays directly
  private final T[] items = (T[]) new Object[MAX_SIZE];

  public void addForever(T item) {
    currentIndex = (currentIndex + 1) % MAX_SIZE;
    items[currentIndex] = item;
  }
}

The method declared by the interface clearly requires added elements to be kept forever, but the implementation uses a capped data structure to store references to the added items. So, when a client adds the (MAX_SIZE+1)th item to the InMemoryBin, the first item added disappears from the collection: InMemoryBin.addForever is not really forever, and the class acts as a bad child of the Bin supertype, hence it is not fully substitutable for it.
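The difference can be made observable with a small sketch. Two things here are assumptions that do not appear in the original interface: the contains method, added only so the behaviour can be checked from outside, and the GrowableBin class, a hypothetical implementation that never evicts anything (as far as memory allows).

```java
import java.util.ArrayList;
import java.util.List;

interface Bin<T> {
  void addForever(T item);

  // Hypothetical addition, only to make the behaviour observable.
  boolean contains(T item);
}

// Capped ring buffer, as in the article: old items are silently overwritten.
final class InMemoryBin<T> implements Bin<T> {
  private static final int MAX_SIZE = 50;
  private int currentIndex = -1;
  private final Object[] items = new Object[MAX_SIZE];

  @Override public void addForever(T item) {
    currentIndex = (currentIndex + 1) % MAX_SIZE;
    items[currentIndex] = item;
  }

  @Override public boolean contains(T item) {
    for (Object stored : items) {
      if (item.equals(stored)) return true;
    }
    return false;
  }
}

// Growable storage: nothing is evicted, so the contract is honoured.
final class GrowableBin<T> implements Bin<T> {
  private final List<T> items = new ArrayList<>();

  @Override public void addForever(T item) { items.add(item); }

  @Override public boolean contains(T item) { return items.contains(item); }
}
```

After adding 51 items to each bin, the first item is still present in the GrowableBin but has been overwritten in the InMemoryBin, which is exactly the kind of surprise a client written against Bin cannot anticipate.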

A third way to violate LSP when writing a subtype of an interface or a superclass is to implement a method in a way that misrepresents its intended purpose. The classic example is a class implementing the toString() method (better: overriding the Object.toString() base method) in order to build not just a textual representation of an object, but one that is meaningful from a business perspective.
The toString() method is generally intended as a way to describe an object for logging and debugging purposes, but it’s not uncommon to find code like the following, which overrides and uses it to implement some functional requirement:

public class SqlQuery {
  public SqlQuery(String tableName) { ... }
  public void addStringFilter(String fieldName, String operator, String value) { ... }
  public void addIntFilter(String fieldName, String operator, int value) { ... }
  ...
  public String toString() { // Maybe the method should be named 'buildSql()' or 'toSql()'?
    return "select * from " + tableName + " where " + buildWhereClause();
  }
}

I wrote that the toString() method is generally intended as a way to describe an object for logging and debugging purposes, and sure, you can object that this is a very opinionated statement. No doubt in part it is, but… what about the name of the method? It is toString, not toSql, toHtml or toUiMessage: the method is meant to produce a String representation of an object, and String is a very unstructured, general-purpose concept. (On the idea of using custom types to represent Strings that have a specific structure, please read the next section; the same reasoning applies to the choice of method names.) In one sentence: if the method name asks for an implementation that returns a String, you should return a real String, with all its invariants… and a SQL query definitely isn’t one.
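A hedged sketch of the separation suggested in the code comment above: the functional requirement gets an explicit toSql() method, while toString() goes back to being a diagnostic description. The filter handling below is a deliberately simplified stand-in for the elided methods of the original class (no escaping, no validation), and the format of the toString() output is an arbitrary choice.

```java
import java.util.ArrayList;
import java.util.List;

final class SqlQuery {
  private final String tableName;
  private final List<String> filters = new ArrayList<>();

  SqlQuery(String tableName) {
    this.tableName = tableName;
  }

  // Simplified stand-in for the article's addIntFilter.
  void addIntFilter(String fieldName, String operator, int value) {
    filters.add(fieldName + " " + operator + " " + value);
  }

  // The functional requirement lives behind an explicit name.
  String toSql() {
    String where = filters.isEmpty() ? "" : " where " + String.join(" and ", filters);
    return "select * from " + tableName + where;
  }

  // Diagnostic description only: safe to change without breaking any query.
  @Override public String toString() {
    return "SqlQuery{table=" + tableName + ", filters=" + filters.size() + "}";
  }
}
```

Callers that need the query text ask for it by name, and nobody has to wonder whether tweaking a log-friendly toString() will break the persistence layer.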

Sadly, this flavour of bad child violation is a very common one, even in code written by experienced developers.

Poor Modelling

So far, so good.
The last class of LSP violations which I think is interesting to talk about is a bit different from bad client and bad child, because it does not involve any subclassing: the violation lies in the misuse, from a modelling point of view, of an existing (and usually very general-purpose) type. Let me call it poor modelling.

This may seem like a provocation, and it certainly is in part, but I think that whenever you use a general-purpose type (typically: String) to represent data like email addresses or credit card numbers all around your code… you’re violating the Liskov Substitution Principle - if not in its formal definition, at least in its general meaning.

Representing an email address as a String, without defining a dedicated EmailAddress type that enforces the invariants such a value should satisfy, is not only a naive modelling error (from the point of view of domain-driven design you should not have any doubt about this); it’s not only very uncomfortable and error-prone (what about mistakenly swapping two String values, the first one representing an email address and the second one holding a credit card number?); it also violates the contract of the String class, because the very general-purpose String is intended to exhibit behaviours (invariants) that are simply not valid (on the contrary, they are wrong!) for an email address (or a credit card number).
If you are not completely convinced: what about concatenating two Strings? Is the resulting value still a valid String? Of course it is! Can the same be said about concatenating two email addresses? What about keeping only the first ten characters of an existing String? The result is a valid String, of course, but the same is in general not true for a fragment of an email address…
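To make the argument concrete, here is a minimal sketch of a dedicated type. The factory method of, the accessor asString and above all the validation rule (exactly one '@' with something on both sides) are assumptions chosen for brevity; real email validation is considerably more involved.

```java
final class EmailAddress {
  private final String value;

  private EmailAddress(String value) {
    this.value = value;
  }

  // Deliberately naive check: exactly one '@', with text before and after it.
  static EmailAddress of(String raw) {
    int at = raw.indexOf('@');
    if (at <= 0 || at != raw.lastIndexOf('@') || at == raw.length() - 1) {
      throw new IllegalArgumentException("Not an email address: " + raw);
    }
    return new EmailAddress(raw);
  }

  String asString() {
    return value;
  }

  @Override public boolean equals(Object other) {
    return other instanceof EmailAddress && ((EmailAddress) other).value.equals(value);
  }

  @Override public int hashCode() {
    return value.hashCode();
  }

  @Override public String toString() {
    return "EmailAddress{" + value + "}";
  }
}
```

The type simply does not offer concatenation or truncation, and trying to build an instance from the result of those String operations (two addresses glued together, or the first ten characters of firstname.lastname@example.com) is rejected by the factory.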

So… you should model email addresses and credit card numbers (and user IDs and VAT codes and SQL queries and… well, you get the point) not only to be a good DDDer, and not only to let the compiler help you statically avoid errors when using those values. You should also avoid using unwrapped general-purpose types to represent your domain’s concepts in order to respect the spirit of the LSP: not only subtypes, but also values should be fully substitutable for the super (or general-purpose) type. If your values are subject to restrictions (in value domain or in behaviour/invariants) compared with the chosen general-purpose type, you are, in my humble opinion, violating LSP due to poor modelling.