
· 8 min read

Motivation

I first encountered the repository pattern in a Go backend codebase, where files and packages named "repo" were used to supply information from data sources. I was puzzled by this usage because, until then, I had only known "repository" as a term related to Git and GitHub. After further research online, I realized that the repository pattern is a popular abstraction layer between the business logic and the data sources.

A succinct description of the repository pattern by Edward Hieatt and Rob Mee (P of EAA):

Mediates between the domain and data mapping layers using a collection-like interface for accessing domain objects.

UML illustration, source: martinfowler.com

This pattern is often discussed when talking about Domain Driven Design, which is

an approach to software development that centers the development on programming a domain model that has a rich understanding of the processes and rules of a domain. - Martin Fowler DomainDrivenDesign

In this article, I hope to consolidate some of the excellent resources that discuss the repository pattern in depth, and to provide my own examples and reflections on this pattern.

Uncovering the Repository Pattern

The idea of the repository pattern is to have a layer between a business domain and a data store. This can be done via an interface that defines data store handling logic, which the domain service logic can depend on.

Let's discuss a simplified example in Go, in the context of a URL-shortener application.

1. Create a repository interface in the service layer

The example application provides a URL-shortening service, which essentially takes in any URL and returns a shortened version that redirects visitors to the original address.

Let's assume that the URL-shortener service needs

  • a way to create a mapping of the original URL and the shortened URL
  • a way to query for the original URL for redirection
  • anything else (for simplicity we will only focus on the two above, CR of CRUD)

We mentioned that a repository interface needs to be created, but where?

The short answer is that we can implement it alongside the service layer. This is because the service knows what needs to be done with the data store (it may not need to know how). The repository interface, therefore, specifies the operations required by the service (without underlying details). One possible arrangement in Go is to have the service domain struct contain a reference to the repository interface, which is passed in from the constructor.

For example, we can have the following in a service/urlshortener.go file

package service

import (
	"context"
	"fmt"
)

// The interface to be implemented by the actual datastore
type URLShortenerRepository interface {
	Create(ctx context.Context, url string) error
	Find(ctx context.Context, url string) (string, error)
}

// Domain struct
type URLShortener struct {
	repo URLShortenerRepository
}

func NewURLShortener(repo URLShortenerRepository) *URLShortener {
	return &URLShortener{repo: repo}
}

// Illustrations of the wrapping methods on the domain struct
func (u *URLShortener) Create(ctx context.Context, url string) error {
	err := u.repo.Create(ctx, url)
	if err != nil {
		fmt.Println(err)
		return err
	}
	return nil
}

func (u *URLShortener) Find(ctx context.Context, url string) (string, error) {
	result, err := u.repo.Find(ctx, url)
	if err != nil {
		fmt.Println(err)
		return "", err
	}
	return result, nil
}

2. Implement the repository interface in the data store layer

So far we have the service layer interacting with the repository interface, and we can now focus on implementing the actual handling logic in the data store layer. This typically involves a persistent database, either relational or NoSQL. We will use MongoDB, a NoSQL database, in this example.

Now, let's implement the handling logic in a mongoDB/mongo.go file

package mongoDB

import (
	"context"
	"errors"
)

// Note that in Go, interfaces are implemented implicitly
type MongoDBRepository struct {
	connectionString string
}

func NewMongoDBRepository(connectionString string) *MongoDBRepository {
	return &MongoDBRepository{connectionString: connectionString}
}

func (m *MongoDBRepository) Create(ctx context.Context, url string) error {
	// Insert a URL pair into the datastore via some MongoDB specific query
	return errors.New("not implemented") // placeholder so the sketch compiles
}

func (m *MongoDBRepository) Find(ctx context.Context, url string) (string, error) {
	// Find from the datastore via some MongoDB specific query
	return "", errors.New("not implemented") // placeholder so the sketch compiles
}
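Because interfaces are satisfied implicitly, any type with matching Create and Find methods can stand in for the MongoDB one. Here is a minimal sketch of a second implementation backed by an in-memory map. Everything beyond the interface itself is my assumption for illustration: the InMemoryRepository name, the ErrNotFound error, the shortKey helper (an FNV-1a hash used as the short key), and the reading that Create stores the long URL under its short key while Find looks it up by that key.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"hash/fnv"
	"sync"
)

// URLShortenerRepository mirrors the interface defined in the service layer.
type URLShortenerRepository interface {
	Create(ctx context.Context, url string) error
	Find(ctx context.Context, url string) (string, error)
}

// ErrNotFound is a hypothetical sentinel error for a missing mapping.
var ErrNotFound = errors.New("url not found")

// InMemoryRepository is a hypothetical map-backed data store.
type InMemoryRepository struct {
	mu   sync.RWMutex
	urls map[string]string // short key -> original URL
}

func NewInMemoryRepository() *InMemoryRepository {
	return &InMemoryRepository{urls: make(map[string]string)}
}

// shortKey derives a short key from a URL (assumed scheme: FNV-1a hash in hex).
func shortKey(url string) string {
	h := fnv.New32a()
	h.Write([]byte(url)) // Write on a hash.Hash never returns an error
	return fmt.Sprintf("%08x", h.Sum32())
}

// Create stores the original URL under its derived short key.
func (r *InMemoryRepository) Create(ctx context.Context, url string) error {
	if url == "" {
		return errors.New("empty url")
	}
	r.mu.Lock()
	defer r.mu.Unlock()
	r.urls[shortKey(url)] = url
	return nil
}

// Find returns the original URL stored under the given short key.
func (r *InMemoryRepository) Find(ctx context.Context, key string) (string, error) {
	r.mu.RLock()
	defer r.mu.RUnlock()
	original, ok := r.urls[key]
	if !ok {
		return "", ErrNotFound
	}
	return original, nil
}

// Compile-time check that the type satisfies the interface.
var _ URLShortenerRepository = (*InMemoryRepository)(nil)

// roundTrip creates a mapping and reads it back through the interface only.
func roundTrip(repo URLShortenerRepository, longURL string) (string, error) {
	ctx := context.Background()
	if err := repo.Create(ctx, longURL); err != nil {
		return "", err
	}
	return repo.Find(ctx, shortKey(longURL))
}

func main() {
	original, err := roundTrip(NewInMemoryRepository(), "https://example.com/some/long/path")
	if err != nil {
		panic(err)
	}
	fmt.Println(original)
}
```

Note that roundTrip only knows about the interface, so the same function would work unchanged with a MongoDB-backed implementation that follows the same key scheme.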

3. Connecting the repository interface with the implementation

The last step in the process is to utilize what we have implemented so far.

We can imagine a central place where the service is initialized along with the data store, perhaps in a main.go file

repo := mongoDB.NewMongoDBRepository("db connection url here")
URLShortenerService := service.NewURLShortener(repo)

// example usage
err := URLShortenerService.Create(context.Background(), "some long url here")
if err != nil {
	panic(err)
}

Diagram: Summary of the repository pattern

Analyzing the Repository Pattern

In the above section, we discussed a possible repository pattern implementation. In this part, we will highlight some of the benefits achieved.

Abstraction

The repository interface separates the contract from the implementation. This reduces the complexity at the service layer, as it only cares about the behaviors supported by the underlying data store and not how they are actually implemented. It also reduces code duplication, as all other services can share a consistent way to access data via the repository interface.

In the article on why you should use the repository pattern by Joe Petrakovich, he uses the analogy of a universal adapter to describe how the repository pattern sits between services and the data, so that access or even modifications are less likely to impact the business logic code.

Encapsulation

Closely related to abstraction, encapsulation here means that the repository interface controls access in a standardized way. Regardless of the underlying data store, the interface exposes only the essential and expected ways to interact with it. As a result, consistent error handling or logging can be performed at this layer, and changes to the underlying data store are unlikely to affect the service layer code.
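One way to keep logging and error handling consistent at this layer is to wrap any repository in another type that satisfies the same interface. The sketch below is only an illustration of that idea: LoggingRepository, failingRepository and errUnavailable are hypothetical names that do not appear in the example application, and the wrapper falls back to the standard library's default logger when none is supplied.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"log"
)

// URLShortenerRepository mirrors the interface defined in the service layer.
type URLShortenerRepository interface {
	Create(ctx context.Context, url string) error
	Find(ctx context.Context, url string) (string, error)
}

// LoggingRepository (hypothetical) wraps any repository and handles
// logging and error wrapping in one place.
type LoggingRepository struct {
	next   URLShortenerRepository
	logger *log.Logger
}

func NewLoggingRepository(next URLShortenerRepository, logger *log.Logger) *LoggingRepository {
	if logger == nil {
		logger = log.Default()
	}
	return &LoggingRepository{next: next, logger: logger}
}

func (l *LoggingRepository) Create(ctx context.Context, url string) error {
	if err := l.next.Create(ctx, url); err != nil {
		l.logger.Printf("repository Create(%q) failed: %v", url, err)
		return fmt.Errorf("create mapping: %w", err)
	}
	return nil
}

func (l *LoggingRepository) Find(ctx context.Context, url string) (string, error) {
	original, err := l.next.Find(ctx, url)
	if err != nil {
		l.logger.Printf("repository Find(%q) failed: %v", url, err)
		return "", fmt.Errorf("find mapping: %w", err)
	}
	return original, nil
}

// errUnavailable and failingRepository simulate a broken data store.
var errUnavailable = errors.New("data store unavailable")

type failingRepository struct{}

func (failingRepository) Create(ctx context.Context, url string) error {
	return errUnavailable
}

func (failingRepository) Find(ctx context.Context, url string) (string, error) {
	return "", errUnavailable
}

func main() {
	repo := NewLoggingRepository(failingRepository{}, nil)
	err := repo.Create(context.Background(), "https://example.com")
	// The caller can still detect the original cause through the wrapped error.
	fmt.Println(errors.Is(err, errUnavailable))
}
```

Since the wrapper is itself a URLShortenerRepository, it can be passed to the service constructor in place of the raw data store, and the service code does not need to change.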

Separation of concerns

The separation created by the repository layer reduces coupling, as the service layer code does not depend on the data store directly. Likewise, the data store can change independently of the business requirements.

Facilitate unit testing via dependency injection

A crucial benefit of the repository pattern is that it allows for easy mocking and quicker unit tests. As we can see in our example's main.go file, a mock repository can be implemented and passed into the constructor instead. During testing, a mock repository can remove the need to establish a database connection or query a database, hence isolating the service layer logic.

Diagram: Testing with the repository pattern

For example:

// Note that in Go, interfaces are implemented implicitly
type MockRepository struct{}

func NewMockRepository() *MockRepository {
	return &MockRepository{}
}

func (m *MockRepository) Create(ctx context.Context, url string) error {
	// Simulate insertion
	return nil
}

func (m *MockRepository) Find(ctx context.Context, url string) (string, error) {
	// Simulate read
	return "https://short.com/url", nil
}

// In the test, pass the mock into the constructor instead of the real repository
repo := NewMockRepository()
URLShortenerService := service.NewURLShortener(repo)

// example usage
err := URLShortenerService.Create(context.Background(), "some long url here")
if err != nil {
	panic(err)
}

To understand dependency injection better, read more here
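The mock above always succeeds, so it only exercises the happy path. As a rough sketch of how the failure path could be covered too, the test double below lets each test decide what the repository returns. StubRepository, its fields, and errStub are hypothetical names, and the service is reduced to a single method to keep the example short.

```go
package main

import (
	"context"
	"errors"
	"fmt"
)

// URLShortenerRepository mirrors the interface defined in the service layer.
type URLShortenerRepository interface {
	Create(ctx context.Context, url string) error
	Find(ctx context.Context, url string) (string, error)
}

// URLShortener is a trimmed-down copy of the domain struct from the article.
type URLShortener struct {
	repo URLShortenerRepository
}

func NewURLShortener(repo URLShortenerRepository) *URLShortener {
	return &URLShortener{repo: repo}
}

func (u *URLShortener) Find(ctx context.Context, url string) (string, error) {
	return u.repo.Find(ctx, url)
}

// StubRepository (hypothetical) returns whatever the test configures
// and counts how often it was called.
type StubRepository struct {
	FindResult string
	Err        error
	Calls      int
}

func (s *StubRepository) Create(ctx context.Context, url string) error {
	s.Calls++
	return s.Err
}

func (s *StubRepository) Find(ctx context.Context, url string) (string, error) {
	s.Calls++
	if s.Err != nil {
		return "", s.Err
	}
	return s.FindResult, nil
}

// errStub simulates a data store failure.
var errStub = errors.New("connection refused")

func main() {
	ctx := context.Background()

	// Happy path: the service returns what the repository found.
	ok := &StubRepository{FindResult: "https://example.com/original"}
	result, err := NewURLShortener(ok).Find(ctx, "abc123")
	fmt.Println(result, err, ok.Calls)

	// Failure path: the service propagates the repository error.
	broken := &StubRepository{Err: errStub}
	_, err = NewURLShortener(broken).Find(ctx, "abc123")
	fmt.Println(errors.Is(err, errStub))
}
```

In a real project these checks would more likely live in a _test.go file and use the standard testing package rather than a main function.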

Drawbacks and Considerations

As with all patterns, there are drawbacks, and even critics who are loudly against the use of the repository pattern. Here are some of my observations and thoughts on the matter.

Is it cost-effective?

Implementing a software design pattern typically adds boilerplate code to "set it up". The repository pattern is no exception: it means adding more structural code for the sake of "writing more code now so as to not repeat ourselves down the line". If, however, the project is small-scale and unlikely to see further development (say, a demo or playground application), the investment in the repository pattern may never pay off.

Is another layer of indirection really necessary?

A fairly famous quote in computer science states:

Any problem in computer science can be solved with another layer of indirection. But that usually will create another problem

I am very cautious whenever I need to build a new layer of abstraction because, more often than not, abstractions turn out to be "leaky" or "hasty". Such layers of abstraction don't deliver on their promises of simplicity and, in extreme cases, make the code harder to understand for ourselves and even more so for future maintainers.

Better or worse testing?

Together with dependency injection, the repository pattern can help speed up unit testing by abstracting away the database. However, it does not remove the need to conduct integration tests because with a mock repository, the responses from the data store layer may not be realistic. To gain confidence in the system, integration tests are still necessary.

Conclusion

Design patterns such as the repository pattern are useful to understand because even if we choose not to use them, we are likely to come across them in existing codebases. As with all design patterns, the key is to plan well and find the right context before moving headlong into implementation. That's all, and I hope you enjoyed reading this article!


· 4 min read

Thoughts

This is a retrospective of the second semester of my second year at NUS. It was a relatively relaxed semester, with me taking 5 modules that resulted in a very manageable workload. In some sense, I would also say that perhaps I got used to what needed to be done.

Module Review

CS3281 Thematic Systems Project I (aka Open Source Mod)

This was my highlight-of-the-semester module, and I truly enjoyed it 😄 (Kudos to Prof Damith for keeping the module alive by volunteering his time to deliver this module!) It taught me a lot about open-source development. Even though the project that I worked on is by no means a large-scale, well-established one, in some way that provided autonomy and a whole range of tasks to tackle. As someone who has taken CS3216 (aka Go build software projects mod), I would say the learning outcome is different, but this module is equally worth doing. Summarizing some of my thoughts on the module:

  • You get to work on an open-source project!
  • You get to participate in the routine tasks of an open-source project, such as raising (and triaging) issues, fixing bugs, reviewing PRs, improving documentation, proposing new features, and discussing implementation details etc.
  • The projects are generally well documented; or have rich context from the git history and public discussion in issues and PRs.
  • The project mentors will be very helpful and you will get to learn from them through PR reviews and discussions.
  • When you spot the not-so-good parts of the projects, you have the chance to improve them.
  • Working on school-based projects also lends you the opportunity to work with other students, as well as external contributors and even on external projects (especially upstream dependencies).

I spent a fairly consistent amount of time working on MarkBind, as you can see in the contribution graph: [graph]

You can find out about what I have done (my progress and knowledge-learned log) here.

One thing I learned about OSS: if you want to be a contributor, first become a user. That leads to so many opportunities to contribute, and new perspectives to look at the project.

CS3230 Design and Analysis of Algorithms

This module is what you would expect in an advanced data structure and algorithms class. While the concepts may be difficult, they turned out to be pretty interesting to know. I enjoyed learning and analyzing the algorithms, which were all quite fundamental. There's some stress from the weekly graded assignments, but in general, it was manageable.

CS3240 Interaction Design

This module provides a good introduction to the field of interaction design. It covers topics such as user-centered design, usability, and accessibility. It's a good survey of the field, and it was a more design-oriented module than the other CS modules I have taken. Workload-wise, if you don't like working on wireframes and prototypes in tools like Figma, it can be a bit of a drag. I personally had many occasions where I opened Figma and just couldn't get myself to work on the assignment. But I did enjoy the module and my output, which you can find in the write-up here.

ES2660 Communicating In The Information Age

This required module focuses on the theories, techniques, and skills related to effective communication in the context of Information Technology. It covers topics such as critical thinking, public speaking, and writing, and provides opportunities to practice these skills through tutorial activities and assignments. The workload is manageable, and the classroom atmosphere is relaxed.

One thing that I remember most about this module: the challenge of speaking impromptu on a given topic (Not that easy if you want to do it well).

(bonus: here's the guideline I used for impromptu speaking)

  • Essence of the prompt (Context, audience, purpose)
  • Stand (Agree, Disagree)
  • Key terms
  • Reasons for my stand
  • Evidence/Examples/Implications/applications/ramifications
  • Delve deeper (Consider alternatives, consequences)
  • Conclusion

LSM1303 Animal Behaviour

Pretty chill and fun module with an awesome prof (The Otterman!). It was a great gateway to learning more about animals, and we even got to observe them out in the wild.


· 5 min read

Motivation

This article is inspired by a question I received in a programming methodology class. In this class, in which we write Java code to solve programming exercises, we have the constraint that every attribute of a class should be private and final. It means there is no access to the field outside of the class, and no modification is allowed once this field is initialized. This strict requirement is put in place to enforce immutability when constructing a class object in Java.

Sooner or later, when the exercises get more complex, we tend to move on to an OOP solution whereby multiple classes are constructed and organized with the help of inheritance. The problem then arises when there is a need to access this private final field in the parent class from a subclass. What should we do then?

To give a concrete example, let's say we have the following classes:

class Parent {
    private final int value;

    Parent(int value) {
        this.value = value;
    }
}

class Child extends Parent {
    Child(int value) {
        super(value);
    }

    int add(int another) {
        return super.value + another; // UNABLE TO ACCESS!
    }
}

What should we do if the child class wants to access value from the parent?

Solutions

Change modifier

The simplest way to deal with that is to change the access modifier from private to something else - perhaps public or protected. This solution can be legitimate depending on the context. In some cases, perhaps it is perfectly normal to expose this value to other classes.

Add a getter method

From Oracle's Java tutorial on inheritance:

A subclass does not inherit the private members of its parent class. However, if the superclass has public or protected methods for accessing its private fields, these can also be used by the subclass.

So, another possible solution is to have a getter method in the parent class and make that method public. This way child classes (and technically other classes) will have access via the getter. So a quick example will be:

class Parent {
    private final int value;

    Parent(int value) {
        this.value = value;
    }

    public int getValue() {
        return this.value;
    }
}

class Child extends Parent {
    Child(int value) {
        super(value);
    }

    int add(int another) {
        return super.getValue() + another; // CAN ACCESS!
    }
}

Having a getter method can be beneficial in the sense that even though now a "private" field is exposed, you still have one layer of abstraction over it. The users of the getter method do not need to know how that value is generated, which can be manipulated (if needed) by some complex preprocessing steps in the getter method. Also, the underlying private field could change drastically and yet the users of the getter method are unaware.

Rethink code design

Lastly, this problem may be a signal to rethink if there is a legitimate need to access a private final field. Given a parent-child relationship, sometimes it's difficult to be clear about which field/method should reside in which classes.

  • Would it be better to have the field in the child class instead?
  • Can we shift what the child class wanted to do with value into the parent class as a general method that the child class can inherit and possibly override?

A better code design might suggest that the private final field can stay as is, maintaining an abstraction barrier between the parent and the child class. One example solution is then:

class Parent {
    private final int value;

    Parent(int value) {
        this.value = value;
    }

    int add(int another) {
        return this.value + another;
    }
}

class Child extends Parent {
    Child(int value) {
        super(value);
    }

    // will work if this method is omitted as well, as it will be inherited
    int add(int another) {
        return super.add(another);
    }
}

Anti pattern

A problematic workaround that some might come up with is to redeclare the same field in the child class.

class Parent {
    private final int value;

    Parent(int value) {
        this.value = value;
    }
}

class Child extends Parent {
    private final int value;

    Child(int value) {
        super(value);
        this.value = value;
    }

    int add(int another) {
        return this.value + another; // will work but not recommended
    }
}

This works but is arguably a bad design because it does not make use of inheritance to reduce duplication between shared properties. It could also result in the values (which are meant to represent the same thing) going out of sync, especially if these fields were not declared as final.

Conclusion

When I was asked the motivating question, my immediate response was: "make a public getter method". To which I was then asked a follow-up question:

  • Why do we resort to using a public getter method, when we want to keep the field private?

Which got me thinking:

  • Why can't private fields be inherited?

This article is a reminder for me to ask the "why" questions more often, and explore the reasons for the answers.