LSP: an opinionated discussion

Liskov’s Substitution Principle (LSP for friends) is one of the five SOLID principles - and maybe the most misunderstood one.

According to Wikipedia, it states that:
Let P(x) be a property provable about objects x of type T. Then P(y) should be true for objects y of type S where S is a subtype of T.

More informally, the idea behind this principle is that we should not violate the contract published by the supertype T when we use or extend it.

I think it’s worth analyzing this idea deeply, in order to explain both classical and less trivial ways to violate the principle.

Generally speaking, we can try to classify LSP violations into three main classes:

  • Bad Client: the principle is violated due to the usage of the supertype
  • Bad Child: the principle is violated due to a crooked subtype implementation
  • Poor Modelling: the principle is violated due to the usage of a (general) type to model (less general) domain concepts

So, let’s look at several examples of violations belonging to the three classes.

Bad Client

The first example of LSP violation I would like to talk about is a classical one: a bad client of a type hierarchy can break LSP by downcasting a reference to a specific, hardcoded subtype:

public <T> T lastElementOf(Collection<T> input) {
    var theList = (List<T>)input;
    return theList.isEmpty() ? null : theList.get(theList.size() - 1);
}

Callers of the method lastElementOf believe they can invoke it passing any instance of any concrete implementation of the Collection interface, but calls passing anything other than instances of types implementing the List subinterface will fail systematically: lastElementOf is a bad client of the Collection type hierarchy, because not all of Collection‘s subtypes are fully substitutable for the supertype when it comes to invoking the method.
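
A contract-respecting sketch relies only on what Collection actually guarantees; iterating makes it linear time, but safe for every subtype:

public <T> T lastElementOf(Collection<T> input) {
    T last = null;
    for (T element : input) { // only the Iterable contract is used, no downcast
        last = element;
    }
    return last; // null when the collection is empty, as in the original
}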

A subtle variation of this LSP violation, which I have already written about here, involves two unrelated interfaces: here the cast assumes that the actual parameter type implements both interfaces, breaking BadInterfaceDowncastingClient‘s contract - the method below is therefore a bad client of the FrontEndContext interface.

public interface FrontEndContext {}

public interface BackEndContext {}

public class MyContext : FrontEndContext, BackEndContext {}

public class ABoundaryService {
    public void BadInterfaceDowncastingClient(FrontEndContext ctx) {
        var context = (BackEndContext)ctx;
        doSomethingWith(context);
    }
}

It must be said that LSP violations belonging to the bad client class are not very common in code written by experienced developers (though I once found something very similar to the last example in code written by a self-styled software architect).

Bad Child

The second class of LSP violations worth mentioning is the one I like to call bad children: the violation consists in a subtype badly implementing the contract stated by the supertype.
The typical example you can find of this class of violations is that of a Square class extending Rectangle in a way that violates some supertype invariant (e.g. the idea that width and height can be changed independently), leading to surprising behaviour.
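
A minimal sketch of that classic violation (hypothetical Rectangle and Square classes):

public class Rectangle {
    protected int width;
    protected int height;

    public void setWidth(int width) { this.width = width; }
    public void setHeight(int height) { this.height = height; }
    public int area() { return width * height; }
}

public class Square extends Rectangle {
    // preserving the square invariant silently breaks the supertype's
    // contract: clients expect width and height to vary independently
    @Override
    public void setWidth(int width) { this.width = this.height = width; }
    @Override
    public void setHeight(int height) { this.width = this.height = height; }
}

// A client written against Rectangle is surprised by Square:
// Rectangle r = new Square();
// r.setWidth(2);
// r.setHeight(5);
// assert r.area() == 10; // fails: area() is 25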

A less didactic and more realistic example can be the following, where the InMemoryBin<T> implementation of the Bin<T> interface subtly breaks the contract of addForever(T item):

public interface Bin<T> {
    void addForever(T item);
}

public class InMemoryBin<T> implements Bin<T> {
    private static final int MAX_SIZE = 50;
    private int currentIndex = -1;

    @SuppressWarnings("unchecked") // generic arrays can't be created directly in Java
    private T[] items = (T[]) new Object[MAX_SIZE];

    @Override
    public void addForever(T item) {
        currentIndex = (currentIndex + 1) % MAX_SIZE;
        items[currentIndex] = item;
    }
}

The method required by the interface clearly requires added elements to be kept forever, but the implementation uses a capped data structure to store references to added items. So, when a client adds the (MAX_SIZE+1)-th item to the InMemoryBin, the first item added disappears from the collection: InMemoryBin.addForever is not really forever, and the described class acts as a bad child of the Bin supertype, hence not fully substitutable for it.
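
A contract-honouring sketch (a hypothetical UnboundedInMemoryBin) simply gives up the cap, backing the bin with an unbounded list:

import java.util.ArrayList;
import java.util.List;

public class UnboundedInMemoryBin<T> implements Bin<T> {
    private final List<T> items = new ArrayList<>();

    @Override
    public void addForever(T item) {
        items.add(item); // never evicts: the supertype's "forever" promise holds
    }
}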

A third way to violate LSP when writing a subtype of an interface or a superclass is to implement a method misrepresenting its intended purpose: the classic example is that of a class implementing the toString() method (better: overriding the Object.toString() base method) in order to construct not only a textual representation of an object, but also a meaningful one from a business perspective.
The toString() method is generally intended as a way to describe an object for logging and debugging purposes, but it’s not uncommon to find code like the following, which overrides and uses it to implement some functional requirement:

public class SqlQuery {
    private String tableName;

    public SqlQuery(String tableName) { ... }
    public void addStringFilter(String fieldName, String operator, String value) { ... }
    public void addIntFilter(String fieldName, String operator, int value) { ... }
    ...
    @Override
    public String toString() { // Maybe should the method be named 'buildSql()' or 'toSql()'?
        return "select * from " + tableName + " where " + buildWhereClause();
    }
}

I wrote that the toString() method is generally intended as a way to describe an object for logging and debugging purposes. Sure, you can object that this is a very opinionated sentence; no doubt in part it is, but… what about the name of the method? It is toString, not toSql, nor something like toHtml or toUiMessage: this method is intended to generate a String representation of an object, and String is a very unstructured, general-purpose concept. About the idea of representing Strings with specific structure by defining custom types, please read the next section - the same reasoning applies to the choice of method names. In one sentence: if the method name asks for a String-returning implementation, you should return a real String, with all its invariants… and a SQL query definitely isn’t one.
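
A sketch of the alternative hinted at in the comment above: give the business behaviour an explicit name (a hypothetical toSql), and let toString() remain a debugging aid.

public class SqlQuery {
    private final String tableName;

    public SqlQuery(String tableName) { this.tableName = tableName; }

    // the functional requirement lives in an explicitly named method...
    public String toSql() {
        return "select * from " + tableName + " where " + buildWhereClause();
    }

    // ...while toString() keeps its usual debugging-oriented purpose
    @Override
    public String toString() {
        return "SqlQuery(table=" + tableName + ")";
    }

    private String buildWhereClause() {
        return "1 = 1"; // placeholder: filter handling is omitted in this sketch
    }
}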

Sadly, this nuance of the bad child violation is a very common one, even in code written by experienced developers.

Poor Modelling

So far, so good.
The last class of LSP violation I think is interesting to talk about is a bit different from bad client and bad child, because it does not involve any subclassing: the violation resides in the misuse of an existing (usually very general-purpose) type from a modelling point of view. Let me call it poor modelling.

This may seem like a provocation, and it certainly is in part, but I think that whenever you are using a general-purpose type (typically: String) to represent data like email addresses or credit card numbers throughout your code… you’re violating the Liskov Substitution Principle - if not in its formal definition, at least in its general meaning.

Representing an email address as a String, without defining a dedicated EmailAddress type that ensures the invariants that should hold for such a value, is not only a naive modelling error (from a domain driven design point of view you should have no doubt about this); it’s not only very uncomfortable and error prone (what about mistakenly swapping two String values, the first one representing an email address and the second one holding a credit card number?); it violates the contract of the String class, too, because the very general-purpose String is intended to exhibit behaviours (invariants) that are simply not valid (they are outright wrong!) for an email address (or a credit card number).
If you are not completely convinced: what about concatenating two Strings? Is the resulting value still a valid String? Of course it is!! Can the same be said about concatenating two email addresses? What about keeping only the first ten characters of an existing String? It results in a valid String, of course, but the same is in general not true for a part of an email address.

So… you should model email addresses and credit card numbers (and user IDs and VAT codes and SQL queries and… well, you get the point) not only to be a good DDDer, nor only to let the compiler statically help you avoid errors when using those values. You should avoid unwrapped general-purpose types for your domain’s concepts in order to respect the LSP’s spirit, too: not only subtypes, but also values should be fully substitutable for the super (or general-purpose) type; if your values are subject to restrictions (in value domain or in behaviour/invariants) with respect to the chosen general-purpose type, you are in my humble opinion violating LSP due to poor modelling.
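
A minimal sketch of such a dedicated type (a hypothetical EmailAddress, with a deliberately naive validation rule, just to illustrate the idea):

public final class EmailAddress {
    private final String value;

    public EmailAddress(String value) {
        // the type's invariant is checked once, at construction time
        // (deliberately naive rule: real validation would be stricter)
        if (value == null || !value.contains("@")) {
            throw new IllegalArgumentException("Not a valid email address: " + value);
        }
        this.value = value;
    }

    // no concat(), no substring(): operations that are valid for a generic
    // String but meaningless for an email address are simply not exposed
    public String asString() {
        return value;
    }
}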

"Refactoring" a constant-time method into linear time

Recently, I had the (dis-)pleasure to stumble upon a coding horror created by a colleague of mine. When I told Pietro about it, he graciously asked me to write a post about it. So, here we go!

If you know Java and haven’t lived with your head under a rock for the past 7-odd years, you surely know about streams. We all know and love streams, right? Well, what I love even more than streams is applying my judgement and thinking whether it is or isn’t a good idea to use one.

Take this simple and innocent-looking piece of code, for example:

int lastElement(int[] array) {
    if (array.length == 0) {
        throw new RuntimeException("Array is empty");
    }
    return array[array.length - 1];
}

It doesn’t get any simpler than that.

But if you just want to use streams everywhere, you might be tempted to convert it as follows:

int lastElement(int[] array) {
    return Arrays.stream(array)
        .reduce((first, second) -> second)
        .orElseThrow(() -> new RuntimeException("Array is empty"));
}

Spot the difference? You just converted a constant-time array access into a linear-time scan!

This is not necessarily an issue with streams (the same coding horror can be achieved with a good old-fashioned for loop, of course), but it just serves to prove that:

  • applying your judgement is better than blindly using the shiny new API
  • it is important to always consider the complexity (in both time and space) of your code.

Needless to say, the pull request that contained this change was NOT approved!

Functional shell: a minimal toolbox

I already wrote a post about adopting a functional programming style in Bash scripts. Here I want to explore how to build a minimal, reusable functional toolbox for my bash scripts, avoiding redefining the base functional bricks every time I need them.

So, in short: I wish I could write scripts (say use-functional-bricks.sh) like the following

#!/bin/bash
double () {
    expr $1 '*' 2
}

square () {
    expr $1 '*' $1
}

input=$(seq 1 6)
square_after_double_output=$(map "square" $(map "double" $input))
echo "square_after_double_output $square_after_double_output"

sum() {
    expr $1 '+' $2
}

sum=$(reduce 0 "sum" $input)
echo "The sum is $sum"

referring to “globally” available functions map and reduce (and maybe others, too), without rewriting them everywhere they are needed and without being bound to external script invocation.

The way I think we can solve the problem relies on three interesting features available in bash:

  • export functions from scripts (through export -f)
  • execute scripts in the current shell’s environment, through the source command
  • execute scripts when bash starts

So I wrote the following script (say functional-bricks.sh):

#!/bin/bash
map () {
    f=$1
    shift
    for x
    do
        $f $x
    done
}
export -f map

reduce () {
    acc=$1
    f=$2
    shift
    shift
    for curr
    do
        acc=$($f $acc $curr)
    done
    echo $acc
}
export -f reduce

and added the following line at the end of my user’s ~/.bashrc file:

. ~/common/functional-bricks.sh

and… voilà! Now map and reduce, implemented in functional-bricks.sh, are available in all my bash sessions - so I can use them in all my scripts!
And because seeing is believing… if I launch the script use-functional-bricks.sh defined above, I get the following output:

square_after_double_output 4
16
36
64
100
144
The sum is 21

Functional way of thinking: higher order functions and polymorphism

I think higher order functions are the functional way to polymorphism: the same way you can write a generic algorithm in an OO language referring to an interface, through which you can plug specific behaviour into the generic algorithm, you can follow the same “plug something specific into something generic” advice by writing a higher order function referring to a function signature.

Put it another way, function signatures are the functional counterpart for OO interfaces.
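
To make the parallel concrete, here is a minimal Java sketch (names are mine, purely illustrative): the same generic repetition logic can be parameterized either through an interface or through a function signature (java.util.function.IntConsumer).

import java.util.function.IntConsumer;

public class Polymorphisms {
    // OO flavour: the generic algorithm depends on an interface...
    interface Action { void run(int i); }

    static void repeatOO(int times, Action action) {
        for (int i = 0; i < times; i++) action.run(i);
    }

    // ...functional flavour: it depends on a function signature instead
    static void repeatFP(int times, IntConsumer action) {
        for (int i = 0; i < times; i++) action.accept(i);
    }

    public static void main(String[] args) {
        repeatOO(3, i -> System.out.println("OO " + i));
        repeatFP(3, i -> System.out.println("FP " + i));
    }
}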

This is a very simple concept with big implications for how you can design and organize your code. So, I think the best way to metabolize this concept is to get your hands dirty with higher order functions, in order to become familiar with thinking in terms of functions that consume and return (other) functions.

For example, you can try to reimplement simple higher order functions from a library like Lodash, Ramda or similar. What about implementing an after function that receives an integer n and another function f and returns a new function that invokes f when it is invoked for the n-th time?

function after(n, f) {
    return function() {
        n--
        if(n === 0) {
            f()
        }
    }
}

You can use it like this:

const counter = after(5, () => console.log('5!'))
counter()
counter()
counter()
counter()
counter() // Writes '5!' to the console

So you have a simple tool to count events, reacting to the n-th occurrence (and you honoured the Single Responsibility Principle, too, separating the counting responsibility from the business behaviour implemented by f). Each invocation of after creates a scope (more technically: a closure) for subsequent executions of the returned function: the value of n, or of variables defined in the lexical scope of after‘s invocation, is nothing different from the instance fields you can use in your class implementing an interface.
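
For comparison, here is the OO counterpart as a minimal Java sketch (a hypothetical After class): the closed-over n becomes an instance field of a class implementing an interface.

public class After implements Runnable {
    private int n;            // the counterpart of the closed-over variable n
    private final Runnable f;

    public After(int n, Runnable f) {
        this.n = n;
        this.f = f;
    }

    @Override
    public void run() {
        if (--n == 0) {
            f.run(); // react only on the n-th invocation
        }
    }
}

// Runnable counter = new After(5, () -> System.out.println("5!"));
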
Generalizing this approach, you can implement subtle variations of the after function: for example, you can write an every function that returns a function that calls the f parameter of the every invocation every n times:

function every(n, f) {
    let m = n
    return function() {
        m--
        if(m === 0) {
            m = n
            f()
        }
    }
}

This is my way to see functional composition through higher order functions: another way to plug my specific, business-related behaviour into a generic - higher order - piece of code, without reimplementing the generic algorithm the latter provides.

Bonus track: what is the higher order behaviour implemented by the following function?

function canYouGuessMyName (items, f) {
    return items.reduce((acc, curr) => ({ ...acc, [f(curr)]: (acc[f(curr)] || []).concat([curr]) }), {})
}


Functions as first-class citizens: the shell-ish version

The idea of composing multiple functions together, passing one or more of them to another as parameters - generally referred to as using higher order functions - is a pattern I’m very comfortable with, since I read, about ten years ago, the very enlightening book Functional Thinking: Paradigm Over Syntax by Neal Ford. The main idea behind this book is that you can adopt a functional mindset programming in any language, whether it supports functions as first-class citizens or not. The examples in that book are mostly written in Java (version 5 or 6), a language that supports (something similar to) functions as first-class citizens only from version 8. As I said, it’s more a matter of mindset than anything else.

So: a few days ago, during a lab of the Operating Systems course, waiting for the solutions written by the students, I was wondering if it is possible to take a functional approach, composing functions (or something similar…), in a (bash) shell script.

(More in detail: the problem that triggered my thinking about this topic was “how to reuse a (not so much) complicated piece of code involving searching for files and iterating over them in two different use cases, which differed only in the action applied to each file”.)

My answer was “Probably yes!”, so I tried to write some code and ended up with the solution below.

The main point is - imho - that just as in a language supporting functions as first-class citizens the bricks to be put together are functions, in (bash) scripts the minimal bricks are commands: generally speaking, a command can be a binary or a script - but functions defined in (bash) scripts can be used as commands, too. After making this mental switch, it’s not particularly difficult to find a (simple) solution:

action0.sh - An action to be applied to each element of a list

#!/bin/bash
echo "0 Processing $1"

action1.sh - A second action to be applied to each element of a list

#!/bin/bash
echo "1 Processing $1"

foreach.sh - Something similar to the List<T>.ForEach(Action<T>) method of the .NET standard library (it’s actually a higher order program)

#!/bin/bash
action=$1
shift
for x
do
    $action $x
done

main.sh - The main program, reusing foreach‘s logic in multiple cases by passing different actions to the higher order program

#!/bin/bash
./foreach.sh ./action0.sh $(seq 1 6)
./foreach.sh ./action1.sh $(seq 1 6)

./foreach.sh ./action0.sh {A,B,C,D,E}19
./foreach.sh ./action1.sh {A,B,C,D,E}19

Following this approach, you can apply different actions to a bunch of files, without duplicating the code that finds them… and you do so applying a functional mindset to bash scripting!

In the same way it is possible to implement something like the classic map higher order function using functions in a bash script:

double () {
    expr $1 '*' 2
}

square () {
    expr $1 '*' $1
}

map () {
    f=$1
    shift
    for x
    do
        echo $($f $x)
    done
}

input=$(seq 1 6)
double_output=$(map "double" $input)
echo "double_output --> $double_output"
square_output=$(map "square" $input)
echo "square_output --> $square_output"
square_after_double_output=$(map "square" $(map "double" $input))
echo "square_after_double_output --> $square_after_double_output"

square_after_double_output, as expected, contains values 4, 16, 36, 64, 100, 144.

In conclusion… no matter what language you are using: using it functionally, composing bricks and higher order bricks together, is just a matter of mindset!

Set Of Responsibility and IoC

The original post was published here.

I recently read this post by Pietro about a possible adaptation of the chain of responsibility pattern: the “set of responsibility”. It is very similar to its “father”, because each Handler handles the responsibility for a Request, but in this case it doesn’t propagate the responsibility check to other handlers. There is responsibility without a chain!

In this article I’d like to present the usage of this pattern with an IoC container, where the Handlers aren’t added to the HandlerSet list but are provided by the container. In this way you can add a new responsibility to the system by simply adding a new Handler to the container, without changing other parts of the implemented code (e.g. the HandlerSet), in full compliance with the open-closed principle.

For the code I’ll use the Spring Framework (Java), because it has a good IoC container and provides a set of classes to work with it. The inversion of control principle and dependency injection are first-class citizens in the Spring Framework.

Here is the UML class diagram, with 3 responsibilities X, Y, Z, and a brief description of the adopted solution.

Class diagram

@Component
public class XHandler implements Handler {

    @Override
    public Result handle(Request request) {
        return ((RequestX) request).doSomething();
    }

    @Override
    public boolean canHandle(Request request) {
        return request instanceof RequestX;
    }
}

The @Component annotation on XHandler tells Spring to instantiate an object of this type in the IoC container.

public interface HandlerManager {
    Result handle(Request request) throws NoHandlerException;
}

@Service
public class CtxHandlerManager implements HandlerManager {

    private ApplicationContext applicationContext;

    @Value("${base.package}")
    private String basePackage;

    @Autowired
    public CtxHandlerManager(ApplicationContext applicationContext) {
        this.applicationContext = applicationContext;
    }

    @Override
    public Result handle(Request request) throws NoHandlerException {
        Optional<Handler> handlerOpt = findHandler(request);
        if (!handlerOpt.isPresent()) {
            throw new NoHandlerException();
        }
        Handler handler = handlerOpt.get();
        return handler.handle(request);
    }

    private Optional<Handler> findHandler(Request request) {
        ClassPathScanningCandidateComponentProvider provider = createComponentScanner();

        for (BeanDefinition beanDef : provider.findCandidateComponents(basePackage)) {
            try {
                Class<?> clazz = Class.forName(beanDef.getBeanClassName());
                Handler handler = (Handler) this.applicationContext.getBean(clazz);
                // find the handler responsible for this request
                if (handler.canHandle(request)) {
                    return Optional.of(handler);
                }
            } catch (ClassNotFoundException e) {
                e.printStackTrace();
            }
        }
        return Optional.empty();
    }

    private ClassPathScanningCandidateComponentProvider createComponentScanner() {
        ClassPathScanningCandidateComponentProvider provider
            = new ClassPathScanningCandidateComponentProvider(false);
        provider.addIncludeFilter(new AssignableTypeFilter(Handler.class));
        return provider;
    }
}

CtxHandlerManager works like a Handler dispatcher: its handle method finds the appropriate Handler and calls its handle method, which in turn invokes the doSomething method of the Request.

In the findHandler method I use Spring’s ClassPathScanningCandidateComponentProvider with an AssignableTypeFilter on the Handler class. I call findCandidateComponents on a base package (whose value is set by the @Value Spring annotation) and, for each candidate, the canHandle method checks the responsibility. And that’s all!

In the Sender class, the HandlerManager implementation (CtxHandlerManager) is injected by the Spring IoC container via autowiring:

@Service
public class Sender {

    @Autowired
    private HandlerManager handlerProvider;

    public void callX() throws NoHandlerException {
        Request requestX = new RequestX();
        Result result = handlerProvider.handle(requestX);
        ...
    }
}

This solution lets you add a new responsibility simply by creating a new Request implementation and a new Handler implementation to manage it. By applying the @Component annotation to the Handler you allow Spring to autodetect the class for dependency injection when annotation-based configuration and classpath scanning are used. At the next application boot, the class is provided by the IoC container and comes into play simply by invoking the HandlerManager.
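
For example, handling the RequestY of the diagram above requires nothing more than a new component, with no changes to existing classes (a sketch mirroring XHandler):

@Component
public class YHandler implements Handler {

    @Override
    public Result handle(Request request) {
        return ((RequestY) request).doSomething();
    }

    @Override
    public boolean canHandle(Request request) {
        return request instanceof RequestY;
    }
}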

In the next post I’d like to present a possible implementation of a component factory using the set of responsibility pattern in conjunction with another interesting pattern, the builder pattern.

Happy coding!

Set of responsibility

The original post was published here.

So, three and a half years later… I’m back.

According to Wikipedia, the chain-of-responsibility pattern is a design pattern consisting of a source of command objects and a series of processing objects. Each processing object contains logic that defines the types of command objects that it can handle; the rest are passed to the next processing object in the chain.

In some cases, I want to benefit from the flexibility allowed by this pattern without being tied to the chain-based structure, e.g. when there is an IoC container involved: the Handlers in the pattern all have the same interface, so it’s difficult to leave their instantiation to the IoC container.

In such a scenario I use a variation of the classic chain of responsibility: there are still responsibilities, of course, but there is no chain out there.

I like to call my variation set of responsibility (or list of responsibility - see below for a discussion about this - or selectable responsibility); the structure is the one that follows (C# code):

interface Handler {
    Result Handle(Request request);

    bool CanHandle(Request request);
}

class HandlerSet {
    IEnumerable<Handler> handlers;

    HandlerSet(IEnumerable<Handler> handlers) {
        this.handlers = handlers;
    }

    Result Handle(Request request) {
        return this.handlers.Single(h => h.CanHandle(request)).Handle(request);
    }
}

class Sender {
    HandlerSet handler;

    Sender(HandlerSet handler) {
        this.handler = handler;
    }

    void FooBar() {
        Request request = ...;
        var result = this.handler.Handle(request);
    }
}

One interesting scenario in which I’ve applied this pattern is the case where the Handlers‘ input type Request hides a hierarchy of different subclasses and each Handler implementation is able to deal with a specific Request subclass: when using polymorphism is not a viable way, e.g. because those classes come from an external library and are not under our control, or because they aren’t the best place to implement the processing logic in, we can use set of responsibility to clean up the horrible code that follows:

class RequestX : Request {}

class RequestY : Request {}

class RequestZ : Request {}

class Sender {
    Result result;

    void FooBar() {
        Request request = ...;

        if (request is RequestX) {
            result = HandleX((RequestX)request);
        } else if (request is RequestY) {
            result = HandleY((RequestY)request);
        } else if (request is RequestZ) {
            result = HandleZ((RequestZ)request);
        }
    }
}

We can’t avoid using the is and cast operators, but we can hide them behind a polymorphic interface, adopting a design that conforms to the open-closed principle:

class Sender {
    HandlerSet handler;

    Sender(HandlerSet handler) {
        this.handler = handler;
    }

    void FooBar() {
        Request request = ...;
        var result = this.handler.Handle(request);
    }
}

class HandlerX : Handler {
    bool CanHandle(Request request) => request is RequestX;

    Result Handle(Request request) {
        return HandleX((RequestX)request);
    }
}

class HandlerY : Handler {
    bool CanHandle(Request request) => request is RequestY;

    Result Handle(Request request) {
        return HandleY((RequestY)request);
    }
}

class HandlerZ : Handler {
    bool CanHandle(Request request) => request is RequestZ;

    Result Handle(Request request) {
        return HandleZ((RequestZ)request);
    }
}

Adding a new Request subclass case is now only a matter of adding a new HandlerAA implementation of the Handler interface, without the need to touch existing code.

In cases like this one I use the name set of responsibility to stress the idea that exactly one handler of the set can handle a single, specific request (that’s why I use the _handlers.Single(...) method in the HandlerSet implementation, too).

When the order in which the handlers are tested matters, we can adopt a different strategy than _handlers.Single(...): in this case I like to call the pattern variation list of responsibility.

When more than one handler can handle a specific request, we can think of variations of this pattern that select all applicable handlers (i.e. those handlers whose CanHandle method returns true for the current request) and apply them to the incoming request, as in the sketch below.
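
A minimal sketch of those two variants, back in Java for brevity (a hypothetical HandlerList, reusing the Java Handler interface of the companion IoC post above): findFirst implements the ordered list of responsibility, while filter + forEach applies every applicable handler.

import java.util.List;
import java.util.Optional;

public class HandlerList {
    private final List<Handler> handlers; // order matters here

    public HandlerList(List<Handler> handlers) {
        this.handlers = handlers;
    }

    // "list of responsibility": the first applicable handler wins
    public Optional<Result> handleFirst(Request request) {
        return handlers.stream()
                .filter(h -> h.canHandle(request))
                .findFirst()
                .map(h -> h.handle(request));
    }

    // broadcast variation: every applicable handler processes the request
    public void handleAll(Request request) {
        handlers.stream()
                .filter(h -> h.canHandle(request))
                .forEach(h -> h.handle(request));
    }
}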

So, we have decoupled the set/list/chain-processing logic from the concrete Request processing logic, leaving them to vary independently, according to the Single Responsibility Principle - an advantage we would not have had adopting the original chain of responsibility pattern…

Embrace change

So it is: three months ago I started a new job, switching from a big company to a little, agile one, after more than eight years of distinguished service :-).
Furthermore, I switched from a Java and JEE - centric technological environment to a richer and more varied one - yet .NET and C# oriented.
So, my Java Peanuts may in the future become C# Peanuts (or Node.js Peanuts, who knows…) or, more generally, Programming Peanuts. For the moment I’m planning a little post series about my way from Java to .NET, so… if you are interested… stay tuned!

Seven things I really hate in database design

  1. Common prefix in all table names
    eg: TXXX, TYYY, TZZZ, VAAA, VBBB - T stands for Table, V stands for View
    eg: APPXXX, APPYYY, APPZZZ - APP is an application name
  2. Common prefix in all field names in every table
    eg: APPXXX.XXX_FIELD_A, APPXXX.XXX_FIELD_B, APPXXX.XXX_FIELD_C
  3. Fields with the same meaning and different names (in different tables)
    eg: TABLE_A.BANK_ID, TABLE_B.BK_CODE
  4. Fields with the same logical type and different physical types
    eg: TABLE_A.MONEY_AMOUNT NUMBER(20,2)
    TABLE_B.MONEY_AMOUNT NUMBER(20,0) – value * 100
    TABLE_C.MONEY_AMOUNT VARCHAR(20) – value * 100 as char
  5. No foreign keys nor integrity constraints at all - by design
  6. Dates (or generally structured data types) represented with generic rather than specific types
    eg: TABLE_A.START_DATE NUMBER(8,0) – yyyymmdd as int
    eg: TABLE_B.START_DATE VARCHAR(8) – yyyymmdd as char
  7. (possible only in presence of 6.) Special values for semantic corner cases which are syntactically invalid
    eg: EXPIRY_DATE = 99999999 – represents the “never expires” case,
    but… IT’S NOT A VALID DATE!!! - why not 99991231??

Mocking static methods and the Gateway pattern

This post was originally posted here.

A year ago I started to use mocking libraries (e.g., Mockito, EasyMock, …), both for learning something new and for testing purposes in hopeless cases.
Briefly: such a library makes it possible to dynamically redefine the behaviour (return values, thrown exceptions) of the methods of the class under test’s collaborators, in order to run tests in a controlled environment. It even makes it possible to check behavioural expectations on mock objects, in order to test the class under test’s interactions with its collaborators.
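
For instance, here is a minimal Mockito sketch (the List mock follows the style of Mockito’s own documentation; names are illustrative):

import static org.mockito.Mockito.*;

import java.util.List;

public class MockingSketch {
    public void sketch() {
        // create a mock: a controlled stand-in for a real collaborator
        List mockedList = mock(List.class);

        // stub a method: redefine its return value...
        when(mockedList.get(0)).thenReturn("first");

        // ...use it in a controlled environment...
        Object value = mockedList.get(0); // "first"

        // ...and check behavioural expectations on the mock
        verify(mockedList).get(0);
    }
}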
A few weeks ago a colleague asked me: “[How] can I mock a static method, maybe using a mock library?”.
In detail, he was looking for a way to test a class whose code was using a static CustomerLoginFacade.login(String username, String password) method provided by an external API (a custom authentication API from a customer enterprise).
His code looked as follows:

public class ClassUnderTest {
    ...
    public void methodUnderTest(...) {
        ...
        // check authentication
        if (CustomerLoginFacade.login(...)) {
            ...
        } else {
            ...
        }
    }
}

but the customer’s authentication provider was not accessible from the test environment: hence the main (but not the only: test isolation, performance, …) reason to mock the static login method.

A quick search in the magical world of mocking libraries revealed that:

  • EasyMock supports static method mocking through extensions (e.g., Class Extension, PowerMock)

  • JMock doesn’t support static method mocking

  • Mockito (my preferred [Java] mocking library at the moment) doesn’t support static method mocking, because Mockito prefers object orientation and dependency injection over static, procedural code that is hard to understand and change (see the official FAQ). The same position appears in a JMock-related discussion, too. PowerMock provides a Mockito extension that supports static method mocking.

So, thanks to my colleague, I will analyze the more general question: “How can I handle an external / legacy API (e.g., static methods acting as a service facade) for testing purposes?”. I can identify three different approaches:

  • mocking by library: we can use a mocking library supporting external / legacy API mocking (e.g. class mocking, static method mocking), as discussed earlier

  • mocking by language: we can refer to the features of a dynamically typed programming language to dynamically change external / legacy API implementation / behaviour. E.g., the login problem discussed earlier can be solved in Groovy style, using the features of a language fully integrated with the Java runtime:

    CustomerLoginFacade.metaClass.'static'.login = {
        return true;
    };

    Such an approach can be successfully used when CustomerLoginFacade.login‘s client code is Groovy code, but not for old Java client code.

  • Architectural approach: mocking by design. This approach relies on a general principle: hide every external (concrete) API behind an interface (i.e.: code against interfaces, not concrete implementations). This principle is commonly known as the dependency inversion principle.
    So, we can solve my colleague’s problem this way: first, we define a login interface:

    public interface MyLoginService {
        boolean login(final String username, final String password);
    }

    Then, we refactor the original methodUnderTest code to use the interface:

public class ClassUnderTest {
    private MyLoginService loginService;

    // Collaborator provided by constructor injection (see here for
    // a discussion about injection styles)
    public ClassUnderTest(final MyLoginService loginService) {
        this.loginService = loginService;
    }
    ...
    public void methodUnderTest(...) {
        ...
        // check authentication
        if (loginService.login(...)) {
            ...
        } else {
            ...
        }
    }
}

So, for testing purposes, we can simply inject a fake implementation of the MyLoginService interface:

public void myTest() {
    final ClassUnderTest cut = new ClassUnderTest(new FakeLoginService());
    cut.methodUnderTest(..., ...);
    ...
}

where FakeLoginService is simply

public class FakeLoginService implements MyLoginService {
    public boolean login(final String username, final String password) {
        return true;
    }
}

and the real, production implementation of the interface simply looks like this:

public class RealLoginService implements MyLoginService {
    public boolean login(final String username, final String password) {
        return CustomerLoginFacade.login(username, password);
    }
}

Ultimately, the interface defines an abstract gateway to the external authentication API: by changing the gateway implementation, we can set up a testing environment fully decoupled from the real customer’s authentication provider.
IMHO, I prefer the last mocking approach: it’s more object oriented, and after all… my colleague once called me the most OO person he knows :-). I find this approach cleaner and more elegant: it’s built only upon common features of programming languages and doesn’t rely on external libraries nor on testing-oriented dynamic language features.
In terms of design, too, I think it’s a more readable and more reusable solution to the problem, which allows a clearer identification of the responsibilities of the various pieces of code: MyLoginService defines an interface, and every implementation represents a way to implement it (a real-life (i.e.: production) implementation versus the fake one).

However, method mocking (by library or by language, it doesn’t matter) is a very useful technique in certain specific situations, too, especially when the code suffering from static dependencies (ClassUnderTest in our example) is legacy code, designed with no testing in mind, and ultimately out of the developer’s control.
[Incidentally: the solution adopted by my colleague was exactly the one I proposed (i.e., mocking by design).]

Credits: thanks to Samuele for giving me cause to analyze such a problem (and for our frequent and always interesting design-related discussions).
Thanks to my wife for her valuable support in writing in my pseudo-English.