The Purpose of Objects

There’s no shortage of people saying object-orientation doesn’t work, but it’s not always easy to understand why. Complaints are often reduced to broad statements like “it produces ugly code” and “the performance is bad”. If these statements are true, why bother with object-orientation at all? In this post, I try to answer this question by drawing a line between good uses and bad uses of object-orientation.

Things Objects are For

Applying object-oriented design has many advantages for the software development process. Here are a few that come to mind.

Objects are for Communication

Humans understand the world in terms of objects. Whether you’re an engineer, a marketing specialist, or a QA tester, the relationships between entities are easy to express in terms of objects. This is especially the case in simulations, like most video games. For example, saying that an “Actor” has a “Weapon” that is a “Bazooka” that shoots “Bullets” that are “RPG missiles” is easy for anybody to understand, whether or not they have a technical background.

Objects are for Specification

Since it’s easy to convey concepts in an object-oriented way, it’s also easy to formally specify the expected behavior of systems this way. Don’t worry about modeling every last detail of your design; you just need to create a high-level specification that engineers, managers, your customers, and your QA team can understand.

The actual implementation doesn’t need to match your specification in terms of code; it just has to have the same behavior at the end of the day. Somebody should be able to look at the specification and the product, then confirm whether a given feature exists. This allows your team and customer to be aligned on the requirements.

Engineers should know roughly what code corresponds to which part of the specification, even if the code itself isn’t programmed in an object-oriented way.

Objects are for Classification

By splitting objects into classes and packages, we’re able to create meaningful subdivisions of the work involved in a software engineering project.

As a project manager, you can organize the engineering tasks required to finish the product in groups of related classes, and you can assign them to different people so they can work in parallel. Organizing tasks means you can ship faster.

You can also measure which classifications of objects have the highest rates of defects, which allows you to make intelligent business decisions about where to allocate expensive QA efforts to maximally improve product quality. This means you ship a better product for less.

Objects are for Relations

By understanding the relationships between objects, we can understand dependencies within the system. This makes it possible to organize the order in which tasks need to be done, and the order in which code can be tested.

This aspect can be taken to its logical conclusion by organizing your data with a relational database schema, written in a high-level language like SQL’s data definition language (DDL). With well-specified relationships, it becomes possible to maintain them automatically using a database management system. I see this as an industry-proven formalization of data-oriented programming.

Objects are for Isolation

Object-oriented design is fundamentally about software components that communicate with each other through messages. By isolating different components, it becomes possible to create high-level systems without having to worry about the details. Humans can only focus on a limited amount of information at a time, so properly isolating components makes it easier to be productive. Of course, it’s possible to over-isolate things, which leads to complications of its own.

Isolation might be for political reasons, in the sense that interfaces separate the work of different teams and companies. Isolation might also be a result of real-world constraints, like when distributed computer systems are physically separated and communicate through network messages.

Things Objects are Not For

Given all the good things about object-oriented design, it’s tempting to go overboard and apply it even to problems where it’s clearly not useful. Here are some examples of when applying object-orientation will lower the quality of your product.

Objects are Not For Implementation

After creating a design using object-orientation, you should rarely write code that corresponds one-to-one to your design using object-oriented language features. As a software engineer, you need to translate human requirements into computer requirements, since things that make sense for humans don’t necessarily make sense for computers.

You don’t have to hand-code everything in assembly or avoid the Java language entirely. Focus on writing code that solves the computing requirements of your customers. For example, what are the inputs, the outputs, the transformations that need to be done? What data needs to be moved where?

Objects are Not For Indirection

Object-oriented languages usually allow you to use inheritance in order to create specialized functionality for different objects. At the end of the day, this boils down to syntactic sugar for the creation and use of virtual function pointer tables, which are a way of implementing indirection on functions.

Implementing virtual function pointer tables by hand (like you have to in C) is actually pretty tedious, so automation by the language is convenient. However, you have to consider whether a virtual function pointer table is really the appropriate way to implement the desired indirection. Many design patterns are workarounds for Java’s historical lack of function pointers, so don’t mistake these workarounds for universal solutions.
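To make the “by hand” part concrete, here is a minimal sketch of a hand-written virtual function table in C. The names (`Weapon`, `Bazooka`, `WeaponVTable`, and the functions around them) are hypothetical and only reuse the vocabulary of the earlier example. This is one common way to lay such a table out, not the only one.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical interface: one function pointer per "virtual method". */
typedef struct WeaponVTable {
    int (*damage)(const void *self);
    const char *(*name)(const void *self);
} WeaponVTable;

/* Every concrete weapon starts with a pointer to its table. */
typedef struct Weapon {
    const WeaponVTable *vtable;
} Weapon;

typedef struct Bazooka {
    Weapon base;   /* first member, so a pointer to it is also a pointer to the Bazooka */
    int missiles;
} Bazooka;

static int bazooka_damage(const void *self) {
    const Bazooka *b = self;
    return b->missiles > 0 ? 100 : 0;
}

static const char *bazooka_name(const void *self) {
    (void)self;
    return "Bazooka";
}

/* The table is written out once per concrete type, by hand. */
static const WeaponVTable BAZOOKA_VTABLE = { bazooka_damage, bazooka_name };

Bazooka bazooka_make(int missiles) {
    Bazooka b = { { &BAZOOKA_VTABLE }, missiles };
    return b;
}

/* A "virtual call": look up the function in the table, then call it. */
int weapon_damage(const Weapon *w) {
    assert(w != NULL && w->vtable != NULL);
    return w->vtable->damage(w);
}

const char *weapon_name(const Weapon *w) {
    assert(w != NULL && w->vtable != NULL);
    return w->vtable->name(w);
}
```

A caller would write something like `Bazooka b = bazooka_make(2); weapon_damage(&b.base);`. Every new weapon type needs its own table, constructor, and casts, which is the tedium a language with built-in inheritance takes off your hands.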

You can often get away with implementing indirection in a much simpler way. For example, you could store the type of an object in an enum and switch behavior based on it. Such alternatives are not always as flexible, but consider the technical cost of flexibility: the more flexible your classes become, the harder it is to understand what the code really does. Overly flexible systems tend to become so abstract that nothing really “does” anything. By focusing on writing code that solves your customers’ computing requirements, you can avoid these over-generalizations.
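As a rough illustration of the enum-and-switch approach, the sketch below stores a type tag in each object and branches on it. The `WeaponKind` values and the damage numbers are made up for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical type tag: the set of weapon kinds is closed and known up front. */
typedef enum WeaponKind {
    WEAPON_PISTOL,
    WEAPON_BAZOOKA
} WeaponKind;

typedef struct Weapon {
    WeaponKind kind;
    int ammo;
} Weapon;

/* All behavior for this operation lives in one function; no function pointers. */
int weapon_damage(const Weapon *w) {
    assert(w != NULL);
    if (w->ammo <= 0) {
        return 0;
    }
    switch (w->kind) {
    case WEAPON_PISTOL:  return 10;
    case WEAPON_BAZOOKA: return 100;
    }
    return 0; /* unknown kind: treat as harmless */
}
```

The trade-off is the one described above: adding a new kind means touching every switch, but the whole behavior of `weapon_damage` can be read in one place.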

Even if you are using an object-oriented programming language, try to see through the abstraction its object system provides in order to write simpler and more efficient code.

Objects are Not For Allocation

By Java’s conventions, each object in your system is dynamically allocated and garbage collected. This likely doesn’t represent the allocation pattern your software system really has, since you will likely operate on collections of objects that benefit from having their allocations grouped together to simplify memory management and improve cache locality.

By better understanding the allocation patterns of your software system, you’ll probably find key points where allocations are made, like system initialization or user session startup. With that knowledge, you’ll be able to write code with simpler resource ownership relationships and make better choices for data structures and memory layouts, and consequently write code with better performance.

Even in a garbage collected language, grouping objects in large allocations means less work for the garbage collector, so applying this principle is likely a win no matter what kind of memory management system your language uses.
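Here is a minimal sketch of grouped allocation in C, assuming a hypothetical `Particle` type and a fixed capacity chosen at a known point such as session startup. The pool makes one allocation up front, hands out slots from it, and frees everything at once.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical object type that exists in large numbers. */
typedef struct Particle {
    float x, y, vx, vy;
} Particle;

typedef struct ParticlePool {
    Particle *items;   /* one contiguous block for every particle */
    size_t capacity;
    size_t count;
} ParticlePool;

/* One allocation at a known point. Returns 1 on success, 0 on failure. */
int pool_init(ParticlePool *pool, size_t capacity) {
    assert(pool != NULL);
    pool->items = malloc(capacity * sizeof *pool->items);
    pool->capacity = pool->items ? capacity : 0;
    pool->count = 0;
    return pool->items != NULL;
}

/* Hands out the next free slot; no per-object malloc. NULL when the pool is full. */
Particle *pool_acquire(ParticlePool *pool) {
    assert(pool != NULL);
    if (pool->count == pool->capacity) {
        return NULL;
    }
    return &pool->items[pool->count++];
}

/* Ownership is trivial: the pool owns everything and releases it in one call. */
void pool_destroy(ParticlePool *pool) {
    assert(pool != NULL);
    free(pool->items);
    pool->items = NULL;
    pool->capacity = 0;
    pool->count = 0;
}
```

This sketch never returns individual slots to the pool, which is a deliberate simplification. It suits data whose lifetime matches the pool’s, such as everything belonging to one user session.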

Objects are Not For Parallelization

You rarely have “just one” of something, so working at the granularity of collections of objects rather than single objects can have many advantages in terms of parallel processing.

For example, it’s possible to leverage SIMD instruction sets to perform many fine-grained computations in parallel, but this requires that your objects are arranged contiguously in memory, and it usually also requires a struct-of-arrays memory layout rather than an array-of-structs. If you want to take advantage of SIMD, you need to stop thinking about objects and start thinking about organizing data. Clever compilers may automatically use SIMD optimizations in your loops, but they don’t have the power to dramatically change the layout of your data, so the compiler needs your help in this respect.
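The difference between the two layouts can be sketched in C as follows. The particle fields, `MAX_PARTICLES`, and `particles_integrate` are hypothetical. Whether the loops actually get vectorized depends on your compiler and flags, so treat this as an illustration of the layout rather than a guarantee of SIMD code.

```c
#include <assert.h>
#include <stddef.h>

enum { MAX_PARTICLES = 1024 };

/* Array-of-structs, shown for contrast: the fields of one particle are
 * adjacent, so consecutive x values sit 16 bytes apart in memory. */
typedef struct ParticleAoS {
    float x, y, vx, vy;
} ParticleAoS;

/* Struct-of-arrays: all x values are contiguous, then all y values, and so on. */
typedef struct ParticlesSoA {
    float x[MAX_PARTICLES];
    float y[MAX_PARTICLES];
    float vx[MAX_PARTICLES];
    float vy[MAX_PARTICLES];
    size_t count;
} ParticlesSoA;

/* Each loop reads and writes unit-stride float arrays, which is the access
 * pattern that vectorizing compilers and hand-written SIMD code prefer. */
void particles_integrate(ParticlesSoA *p, float dt) {
    assert(p != NULL && p->count <= MAX_PARTICLES);
    for (size_t i = 0; i < p->count; ++i) {
        p->x[i] += p->vx[i] * dt;
    }
    for (size_t i = 0; i < p->count; ++i) {
        p->y[i] += p->vy[i] * dt;
    }
}
```

Note that there is no longer a single “particle object” to point at: a particle is just an index `i` into the arrays, which is the shift from thinking about objects to thinking about data.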

When doing large batches of parallel work, memory transfers become an important bottleneck in terms of the overall performance of your code. In this respect, it’s useful to consider the cost of streaming your data in and out of memory as you traverse the dataset and dispatch parallel computations on it. If your bottleneck lies in memory transfers but you have extra computation power to spare, you can further improve the performance of your algorithm by building acceleration structures or compressing your stream of data. Either way, the hard problems to solve here are related to data and algorithms, not related to organizing objects.

In Conclusion

Object-orientation is useful for design work, but that doesn’t mean your implementation also needs to be object-oriented. This could be summarized as the difference between object-oriented design and object-oriented programming.

Object-orientation is a natural method of communication between humans. However, when it comes to programming, focus on writing code that solves your requirements in terms of a computer system rather than an abstract object system.

2 comments

  1. nmusatti

    While the “for”s are clear, you don’t really explain the “not for”s. Objects are (obviously?) one way to implement stuff that does provide indirection, tend to have a certain allocation pattern and may be used to implement parallel processing. As every other programming construct and paradigm object oriented programming has characteristics that make it better suited for solving certain problems, but not others.
    Virtual tables help avoid having to explicitly code the same switch in multiple places and are way less error prone than explicit function pointer manipulation.
    Java programmers do have a tendency to let themselves be carried away and make everything abstract and indirect, thus making the purpose of their code less evident; still there are situations where this kind of flexibility is indeed desirable. In the opposite direction inheritance of implementation does cause a degree of coupling that is often hard to dismantle; delegation is often a better solution.
    Java and dynamic OO languages do have a tendency of fragmenting memory and to cause a rather high memory management overhead; C++ provides ways to avoid it at the cost of explicit memory management. Still, if performance requirements are so strict that you have to take cache line size into account object orientation may not be the best approach; at the very least you need to design your objects very carefully.
    As for parallelism object orientation is better suited for a MIMD model than a SIMD one: at least conceptually the actor model is a form of OO where each object is an independent processing unit.
    When all’s said and done the pet vs. sheep dichotomy may be considered as a guideline: object oriented programming makes more sense when the entities in your domain are relatively few and relatively complex.

    • nlguillemot

      I agree with you on all these points. You’re right that the meaning of my “not for”s is confusing. My intention is to say that determining the best solution to these problems should be done at the level of “plain C”, to write code that has good performance and is inherently simpler computationally. If the object-oriented syntactic sugar of your language makes that easier (like generating virtual function pointer tables), then by all means go for it.
