Class Methods Resist Refactoring?

April 28, 2015 » 19 minutes (3572 words)

Recently, a coworker shared the blog post "Why Ruby Class Methods Resist Refactoring" with me. I thought it was very interesting, mainly because at Hireology we use an open source project named Light Service that helps us abstract the business logic of our main Rails application from the Rails framework. Under the hood, Light Service provides a DSL that makes use of a modified version of the Command Pattern. Where one normally expects an object instance, Light Service instead uses only class methods defined at runtime.

Considering the success the team at Hireology has had using Light Service, I was intrigued to find out more about why this author felt that class methods resist refactoring. I have done days' worth of refactoring in this code, and have found that the use of class methods has made the refactoring easy and simple. I had never given it much thought in the past, finding the choice between class methods and instance methods to be a fairly straightforward decision that most object oriented software developers would agree on. Yet, I was very, very wrong.

Haters Gonna Hate

After an admittedly not very exhaustive Google search, though, I was only able to find writing chastising the use of class methods. Curious why, I sampled a number of posts and distilled a list of arguments for why using class methods should be avoided:

  1. Class methods cannot be used to declare an interface (applicable to Java) [1]
  2. Class methods prevent polymorphism [1]
  3. Class methods promote use of global mutable state [2]
  4. Class methods promote procedural code [3][4]

Context Matters

Points 1 and 2 are both valid, and I completely agree that class methods impose limitations in an object oriented context that can make them less than appealing. If you find yourself in a situation where a strictly defined interface or polymorphism helps create the "best" [5] solution, then don't use class methods. However, rather than write off class methods altogether, I see this as an exercise in judgement. The fact that there are certain situations in which class methods are not useful, or would even be problematic, does not inherently lead to the conclusion that they are evil. There also exists the attitude, though, that class methods lead to the use of global mutable state and to writing code in a procedural way.

While the use of global mutable state and procedural code in an object oriented context is unfortunate, there seems to be an incorrect assumption that the use of class methods leads to these problems. I have written plenty of class methods that neither define global mutable state nor implement behavior in a procedural way. The assumption that class methods lead to these problems is faulty logic. It is akin to saying, "Because class methods can be used to define global mutable state, all class methods lead to global mutable state." It is not the inherent nature of class methods that creates global mutable state, but the choices of a software developer who considers global mutable state an acceptable solution.

To Procedure Or Not Procedure?

Something I find to be great fun about the software development world is that it tends to be full of opinions. Well, I'd like to share one with you. I think procedural code is oftentimes easier to read, to understand, and to test. However, I am probably thinking of a different flavor of procedural code than what critics of class methods have in mind. I am in favor of pushing data and state as far out of the context in which computation occurs as possible, such that a procedure (or method) can be thought of as a pure function. This means isolating side-effect generating code into special methods, typically denoted with a `!`. If I want to talk to the database, or mutate state, or do something that would otherwise not provide referential transparency, I want to isolate that behavior from the behavior that is pure and referentially transparent. This requires a different orientation in thinking from the traditional object oriented paradigm, which will appeal to some and not others.

However, functional programming arguments aside, let us not fool ourselves into thinking that object oriented code is somehow inherently non-procedural. OOP is a special form of proceduralism that combines data and procedures (methods) together in the same context (a class). In traditional OOP, the fact that data is stored in instance variables, and can be accessed as properties within the context of the receiver (object) of a method call, is nothing but indirection. The data stored as properties of an object via its instance variables is still used in the same manner in which even the most procedural C code's functions use data. Does this mean we should abandon classes and objects and return to typedefs and structs? I don't think so.

But does the traditional perspective of OOP preclude our code from being described as "procedural"? To answer "yes" would be deluding ourselves into thinking that somehow the combination of data and behavior in one context is computationally different than traditional "procedural" C code. Where OOP is different from "procedural" C is in the base, common abstractions we use to construct our programs. In OOP we are given classes and objects with which to define our programs, to model systems, and to model behavior. This means the bulk of activity in an OOP program is in the passing of method calls from caller to receiver. Yet, the behavior invoked from these method calls is still a procedural computation, otherwise how else would the instructions be executed in the correct sequence?

Refactoring A Refactoring

In answer to the question, "Do class methods resist refactoring?", based on my experience, I confidently say "no". Rather than only defend class methods, I'd like to provide a counter to the author's refactoring conclusions, and claim that using class methods within the right context helps the object oriented software developer write code that is easier to reason about, has a greater clarity of purpose, is simpler, more modular, and easier to test.

To begin, let's take a look at the code context used in the original blog post:

class SyncToAnalyticsService
  ConnectionFailure = Class.new(StandardError)

  def self.perform(data)
    data             = data.symbolize_keys
    account          = Account.find(data[:account_id])
    analytics_client =[:analytics_api_key])

    account_attributes = {
      account_id:,
      account_name:,
      account_user_count: account.users.count
    }

    account.users.each do |user|
      analytics_client.create_or_update({
        account_admin: account.administered_by?(user)
      }.merge(account_attributes))
    end
  rescue SocketError => ex
    raise ConnectionFailure, ex.message
  end
end

As the author describes, the `perform` method is complex, and appears to violate the Single Responsibility Principle. The author also claims that the `perform` method's operations are not all at the same level of abstraction, indicating that the mixture of abstraction prevents this from being a composable method, and limits reusability. I agree completely. This context contains configuration knowledge to correctly construct an instance of `Analytics::Client`, knows about an `Account` model, contains some additional data model knowledge required by the `Analytics::Client` to create an account representation presumably in a different database, and rescues a `SocketError` exception. If I had to guess, this code was produced by a single feature story, and the developer(s) that wrote this code had the goal to accomplish the task handed to them, and moved on after it appeared to work.

I want to stress that no judgement is intended here. If only we could `git commit` a representation of the pressures and stresses felt in the moments of developing a feature; then we would probably think twice about measuring code quality in 'WTFs' per minute! `git blame` only reveals a name, but cannot share the time constraints and the numerous stresses, pressures, and other factors that influence our decisions as developers when writing a feature. Perhaps I am lucky, but one thing I've found to be nearly unanimously true is that developers want to write the best code possible. It's quite common, though, to find that developers have not been exposed to the ideas, approaches, and lessons that would help them write better code. Whatever the context was for the developer(s) who wrote this code, let's leave all that aside, and instead thank them for an opportunity to climb a rung or two on the ladder of experience and learning.

The author proposes refactoring this class by using the Extract Method refactoring pattern. This results in the following:

class SyncToAnalyticsService
  ConnectionFailure = Class.new(StandardError)

  def self.perform(data)
    data             = data.symbolize_keys
    account          = Account.find(data[:account_id])
    analytics_client =[:analytics_api_key])

    sync_users(analytics_client, account)
  end

  def self.sync_users(analytics_client, account)
    account_attributes = account_attributes(account)

    account.users.each do |user|
      sync_user(analytics_client, account, account_attributes, user)
    end
  end

  def self.sync_user(analytics_client, account, account_attributes, user)
    create_or_update_user(analytics_client, account, account_attributes, user)
  rescue SocketError => ex
    raise ConnectionFailure, ex.message
  end

  def self.create_or_update_user(analytics_client, account, account_attributes, user)
    attributes = user_attributes(user, account).merge(account_attributes)
    analytics_client.create_or_update(attributes)
  end

  def self.user_attributes(user, account)
    {
      account_admin: account.administered_by?(user)
    }
  end

  def self.account_attributes(account)
    {
      account_id:,
      account_name:,
      account_user_count: account.users.count
    }
  end
end

The author makes three claims about this refactored code:

  1. It is not object oriented (it is a hybrid of procedural and functional code)
  2. You cannot declare private class methods
  3. This demotivates refactoring because this result is unsatisfying

The first claim that this code is not object oriented, but a hybrid of procedural and functional seems compelling on the first take, but I argue the Extract Method refactoring makes this code more object oriented. There is clearly an increased number of messages being sent to `self` in this new formulation, and there is still the same object orientation between this class and the `Analytics::Client`. Given this refactoring, behavior is now better encapsulated into callable methods, one of the defining hallmarks of object oriented programming. Ironically, I would argue that the original example was more procedural. In the original example, there is only a passing resemblance to object oriented code due to the behavior residing in a class, but otherwise, the one and only method reads as a procedural list of instructions to be executed in order from start to finish.

Because the author does not clearly define functional programming, it is difficult to determine what is meant by calling this code more functional. However, the side-effect generating method `create_or_update_user` is impure, and due to the same mixed abstractions present in this refactored version as were found in the original example, this class is unfortunately not very composable or reusable from a functional perspective either.

The second claim that one cannot declare private class methods in Ruby is untrue. The method `private_class_method` has been available in Ruby since 1.8.6. There is no requirement to open up the Singleton Class of this class in order to define private class methods.
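A minimal sketch of `private_class_method` in action (the names here are hypothetical, purely for illustration):

```ruby
class Greeter
  def self.greet(name)
    "#{salutation} #{name}"
  end

  def self.salutation
    "Hello,"
  end

  # No need to reopen the singleton class; this hides the class method.
  private_class_method :salutation
end

Greeter.greet("Ada")  # => "Hello, Ada"
# Greeter.salutation  # => NoMethodError (private method called)
```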

The third claim is opinion. Personally, I find this code easier to understand and reason about, because encapsulating the behavior as methods allows my brain to follow ideas rather than a list of instructions. I respectfully consider the distinction between behavior defined as class methods and behavior defined as instance methods insignificant here. The real problems with this class, described previously, are still present. The main takeaway is that the Extract Method refactoring only shuffled behavior around, which I would claim makes this code more object oriented, but did nothing to address the mixed abstractions and the Single Responsibility Principle violations.

Next the author encourages us to reconsider the approach taken so far, by introducing the idea of instantiating this class via the `perform` class method:

class SyncToAnalyticsService
  ConnectionFailure = Class.new(StandardError)

  def self.perform(data)
    new(data).perform
  end

  ...

Given this new approach focused on instantiation, the author applies the Extract Method refactoring again, resulting in the following formulation:

class SyncToAnalyticsService
  ConnectionFailure = Class.new(StandardError)

  def self.perform(data)
    new(data).perform
  end

  def initialize(data)
    @data = data.symbolize_keys
  end

  def perform
    account.users.each do |user|
      create_or_update_user(user)
    end
  rescue SocketError => ex
    raise ConnectionFailure, ex.message
  end

  private

  def create_or_update_user(user)
    attributes = user_attributes(user).merge(account_attributes)
    analytics_client.create_or_update(attributes)
  end

  def user_attributes(user)
    {
      account_admin: account.administered_by?(user)
    }
  end

  def account_attributes
    @account_attributes ||= {
      account_id:,
      account_name:,
      account_user_count: account.users.count
    }
  end

  def analytics_client
    @analytics_client ||=[:analytics_api_key])
  end

  def account
    @account ||= Account.find(@data[:account_id])
  end
end

In response to this approach with the focus on using instances and instance methods, the author makes a number of new claims:

  1. Introducing state via instance variables and memoization eliminates the need for intermediate variables from method call to method call
  2. It is easier to test because you can separate the instantiation of an object from the invocation of its behavior
  3. Instantiation promotes greater composability
  4. Class method refactoring produces ugly code, and demotivates refactoring
  5. Instantiation promotes greatest flexibility in formulation

I do not disagree with, or raise any counterarguments to, the first point. The return values of methods have supplanted the need for intermediate variables, and the use of instance variables with memoization helps prevent repeatedly hitting the database for an account. This introduces state, as the author points out, and given the previous point about eliminating intermediate variables, it is clearly helpful. However, the need to introduce state is a reflection of the mixed-abstractions problem described previously, and is again a symptom of a deeper, unaddressed issue in this class.

The claim about easier testing due to the separation of object creation from object behavior is an interesting assertion, and I have some problems with it. A class with only class methods cannot be instantiated, so there is no need to test its creation. More to the point, in my experience there is no clear link between the ease of testing a class and one's ability to instantiate it. The factors that make behavior easy or complicated to test have no relationship to whether the subject of a test is an instance or a class. Those factors are a separate topic, and I conclude this claim is somewhat exaggerated and not relevant.

The claim that instantiation allows for greater composability when enhancing or adding behavior is demonstrable. Using a decorator or a collaborator object, it would be possible to extend the behavior of instances of this class without violating the Open/Closed Principle. However, I would argue such an approach is unlikely for this class, because the state it encapsulates is an `Account` object. Perhaps there would be some need to reuse the data hash of the `Account` object's users merged with the `account_attributes`; if that were the case, I would argue for a cleaner separation of that data from the behavior it enables.

As for the claim that class method refactoring produces ugly code: that is fundamentally a matter of personal preference.

I think the final claim, that instantiation promotes the greatest flexibility, is true from a certain perspective that emphasizes a traditional object oriented approach. An approach informed by the ideas that classes contain behavior, and that objects contain data and behavior via the use of state, will fundamentally solve problems with instance objects. However, while not every problem is suitable for classes with only class methods, I find that such classes provide excellent contexts in which specific behavior, collaboration, or transformations can occur. I would make the stronger claim that it is only by enforcing proper boundaries between classes with class methods that maximum flexibility is achieved in object oriented programming, with the greatest reusability and most modular code as outcomes of such an approach.

Below is the refactoring I would choose to do for the original class:

# from the calling context...
account = Account.find(data[:account_id])
analytics_client =[:analytics_api_key])
SyncAccountUsersToAnalyticsService.perform(account, analytics_client)

class SyncAccountUsersToAnalyticsService
  ConnectionFailure = Class.new(StandardError)

  def self.perform(account, analytics_client)
    account.users.each do |user|
      begin
        analytics_client.create_or_update(attributes_from(user, account))
      rescue SocketError => e
        raise ConnectionFailure, e.message
      end
    end
  end

  def self.attributes_from(user, account)
    {
      account_admin:      account.administered_by?(user),
      account_id:,
      account_name:,
      account_user_count: account.users.count
    }
  end

  private_class_method :attributes_from
end

The first step of this refactoring was to eliminate all object creation from within this class. Removing object creation from a class containing nothing but class methods is, I find, a good general rule of thumb. An exception is a factory class of class methods, whose return value is an instantiated object. By requiring the two dependencies of this class, the account and the analytics client, to be passed in, not only is this class much easier to test, but it also helps us identify the real responsibility of this class: sync an account's users with the analytics client.

Since our class no longer queries the `Account` model to find an account instance from some arbitrary data hash, and no longer knows anything about the configuration of the analytics client, we've removed nearly all of the mixed abstractions that were present in the original class. This also clarifies the general character of this class: a collaborator context that enables communication between three separate entities, namely accounts, users, and the analytics client. However, this class still violates the Single Responsibility Principle, in that it knows both how to sync the appropriate information about users and accounts with the analytics client, and how to construct the data model required by the analytics client. Let's look at another refactoring that isolates the construction of account-user entities from the act of syncing those entities with the analytics client:

# from the calling context...
account = Account.find(data[:account_id])
analytics_client =[:analytics_api_key])
account_user_analytics_entities = AccountUserAnalyticsEntity.perform(account)
SyncAccountUsersToAnalyticsService.perform(account_user_analytics_entities, analytics_client)

class AccountUserAnalyticsEntity
  def self.perform(account)
 do |user|
      attributes_from(user, account)
    end
  end

  def self.attributes_from(user, account)
    {
      account_admin:      account.administered_by?(user),
      account_id:,
      account_name:,
      account_user_count: account.users.count
    }
  end

  private_class_method :attributes_from
end

class SyncAccountUsersToAnalyticsService
  ConnectionFailure = Class.new(StandardError)

  def self.perform(account_user_analytics_entities, analytics_client)
    account_user_analytics_entities.each do |entity|
      begin
        analytics_client.create_or_update(entity)
      rescue SocketError => e
        raise ConnectionFailure, e.message
      end
    end
  end
end

Now the two classes are clearly and cleanly separated, each with one reason to change and a single responsibility. Given this formulation, I'm not sure of a simpler way to declare this behavior expressively, nor does it read as the procedural processing of a sequence of instructions. Each class contains a single public method, with a `private_class_method` defined on the `AccountUserAnalyticsEntity` class.

I would argue that formulating this behavior as classes with class methods is far easier to understand and read at a glance than the previously shown instance-method, stateful approach. Not only do these classes require no state, but we also impose an interesting constraint on them: no object instantiation. Why is that important? It requires the caller to deal with object instantiation. You might be thinking, "Well, we have to deal with the object instantiation of the account and analytics client somewhere," and you are right. In fact, by pushing the creation of our account and analytics client objects out to the caller, we see how these two classes respect the Dependency Inversion Principle.

Both of these classes represent abstractions. The idea of an account and its users is an abstraction, as is the analytics client. The Dependency Inversion Principle says, "Abstractions should not depend on details. Details should depend on abstractions." [6] Knowing how to locate an account within the `AccountUserAnalyticsEntity` class would violate the abstraction it represents, because it would require that abstraction to depend on a low level detail (how to query for an account). This tells us there is a better place, outside this class, to query for the account object while still honoring the Dependency Inversion Principle. In a traditional web application, this would likely occur at the controller level, where the low level abstraction of a controller depends on the higher level abstraction of a model, which in turn depends on the higher level abstraction of a class like `AccountUserAnalyticsEntity` to perform behavior without needing to worry about lower level details.

Additionally, as a result of honoring the Dependency Inversion Principle and the Single Responsibility Principle, we can more easily test these two classes using real objects, fakes, mocks, or dummies. In some cases we might want to test object creation, but more practically we are probably only concerned with verifying behavior. Considering the flexibility that passing in the dependencies grants us in terms of testing, I would argue this approach is more flexible in its formulation than the instance object, stateful approach.

  1. Static Methods vs Instance Methods
  2. DCI and Rails
  3. Are Class Methods Evil?
  4. Class vs Instance Methods
  5. The definition of "best code" is not the subject of this post, so I will leave that definition up to the reader's judgement based on their current experience.
  6. Dependency Inversion Principle