COM/DCOM & COM+

A Primer on the Evolution of a Microsoft Development Environment
The Component Object Model

The Component Object Model (COM) has its roots in OLE version 1, which was created in 1991 and was a proprietary document integration and management framework for the Microsoft Office suite. Microsoft later realized that document integration is just a special case of component integration. OLE version 2, released in 1993, was a major enhancement over its predecessor. The foundation of OLE version 2, now called COM, provided a general-purpose mechanism for component integration on Windows platforms. Additions such as DCOM have been made since then, but applications that worked then still work now.

COM, the Component Object Model, refers to both a specification and an implementation developed by Microsoft Corporation that provides a framework for integrating components. This framework supports interoperability and reusability of distributed objects by allowing developers to build systems by assembling reusable components from different vendors that communicate via COM. By applying COM to build systems of preexisting components, developers hope to reap benefits of maintainability and adaptability.

Objects created using the COM specification support the fundamental notions of encapsulation, polymorphism, and reusability. COM defines a language-independent notion of what an object is: how to create objects, how to invoke methods, and so on. This allows development of components that programmers can use (and reuse) in a consistent way, regardless of which languages they use to write the component and its client.

COM is an architecture for the integration and deployment of software components, rather than a body of techniques for problem analysis. In contrast, most Object Oriented Design (OOD) methodologies were created for more monolithic object-oriented applications. Therefore, the assumptions that you can make with OOD don't necessarily apply to COM, and important additional considerations arise, as summarized in Table 1.

Table 1: Differences in design considerations between OOD and COM
Object-Oriented Design Assumptions / Added COM Considerations
Objects typically packaged in the same application (module) as client code / Objects and clients typically in separate modules, both .EXEs and .DLLs
Objects and clients run in a single process / Objects and clients may run in different processes and on different machines
Class (implementation) inheritance / Interface inheritance (no implementation inheritance)
Single interface per object (the object's class definition) / Multiple interfaces per object
Single client per object / Multiple simultaneous clients per object
1:1 relationships between clients and objects are typical / Many-to-many relationships between clients and objects are common

Designing a component-based system in COM is not just a matter of applying an Object Oriented Design methodology; COM introduces new considerations of packaging, components per package, objects per component, interfaces per object, and simultaneous clients per object.

COM is about choice. It provides the choice of the highest-volume languages and tools available, as well as the largest base of applications. COM also provides choice in the area of security, as it provides a common interface, the Security Support Provider Interface (SSPI), where various security providers can be plugged in. COM also provides a choice of network transport.

COM Principles

COM makes the Windows operating system treat application components as objects. The OS takes responsibility for creating objects when they are required, deleting them when they are not, and handling communication between them, whether they live in the same process, in different processes, or on different machines. The OS maintains a central registry for the objects. One major advantage of this mechanism is versioning: if a COM object changes to a new version, the applications that use that object need not be recompiled, provided the interfaces they rely on remain unchanged.

All COM objects are registered with a component database. When a client wishes to create and use a COM object:

  1. It invokes the COM API to instantiate a new COM object.
  2. COM locates the object implementation and initiates a server process for the object.
  3. The server process creates the object and returns an interface pointer to the object.

The client can then interact with the newly instantiated COM object through the interface pointer.
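
The three steps above can be sketched in portable C++. This is a simplified simulation rather than the real COM API: ComponentRegistry, IGreeter, EnglishGreeter, and the identifier "Demo.EnglishGreeter" are hypothetical names invented for illustration, and no server process is actually started. The point is only that the client asks for an object by identifier and receives an interface pointer in return.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <memory>
#include <string>
#include <utility>

// Hypothetical interface: the only thing the client knows about the object.
struct IGreeter {
    virtual ~IGreeter() = default;
    virtual std::string Greet(const std::string& name) const = 0;
};

// Hypothetical implementation, hidden behind the interface.
class EnglishGreeter : public IGreeter {
public:
    std::string Greet(const std::string& name) const override {
        return "Hello, " + name;
    }
};

// Stand-in for the component database: maps a class identifier to a factory.
class ComponentRegistry {
public:
    using Factory = std::function<std::unique_ptr<IGreeter>()>;

    void Register(const std::string& clsid, Factory factory) {
        factories_[clsid] = std::move(factory);
    }

    // Locate the implementation, create the object, and hand back an
    // interface pointer (nullptr if nothing is registered under clsid).
    std::unique_ptr<IGreeter> CreateInstance(const std::string& clsid) const {
        auto it = factories_.find(clsid);
        if (it == factories_.end()) return nullptr;
        return it->second();
    }

private:
    std::map<std::string, Factory> factories_;
};

// Client code: it never names EnglishGreeter, only the identifier.
std::string DemoCreate() {
    ComponentRegistry registry;
    registry.Register("Demo.EnglishGreeter",
                      [] { return std::make_unique<EnglishGreeter>(); });
    std::unique_ptr<IGreeter> greeter =
        registry.CreateInstance("Demo.EnglishGreeter");
    return greeter ? greeter->Greet("COM") : "not registered";
}
```

Calling DemoCreate() exercises the whole path from lookup to method call. In real COM the lookup key is a globally unique class identifier and the returned pointer is reference counted instead of being owned by a smart pointer.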

COM defines a binary structure for the interface between the client and the object. This binary structure provides the basis for interoperability between software components written in arbitrary languages. A fully compliant COM object can be written in any language that can produce binary-compatible code. As long as a compiler can reduce language structures down to this binary representation, the implementation language for clients and COM objects does not matter: the point of contact is the run-time binary representation.
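
To make the idea of a binary interface structure concrete, the following sketch builds an object by hand as a pointer to a table of function pointers, without using any C++ virtual functions. The names ICounter, ICounterVtbl, and CounterImpl are hypothetical, and the layout is a simplified illustration of the technique, not a copy of any real COM interface. Any language that can produce this layout and call through function pointers could act as either the client or the object.

```cpp
#include <cassert>

// The binary contract: an interface pointer points at a structure whose
// first member is a pointer to a table of function pointers.
struct ICounterVtbl;
struct ICounter {
    const ICounterVtbl* vtbl;
};

struct ICounterVtbl {
    int (*Increment)(ICounter* self);
    int (*Get)(ICounter* self);
};

// One possible implementation. The interface part comes first so that a
// pointer to it is also a pointer to the whole object.
struct CounterImpl {
    ICounter iface;
    int value;
};

static int CounterIncrement(ICounter* self) {
    return ++reinterpret_cast<CounterImpl*>(self)->value;
}

static int CounterGet(ICounter* self) {
    return reinterpret_cast<CounterImpl*>(self)->value;
}

static const ICounterVtbl kCounterVtbl = { CounterIncrement, CounterGet };

// Client code: it sees only ICounter* and calls through the table.
int DemoVtable() {
    CounterImpl impl = { { &kCounterVtbl }, 0 };
    ICounter* counter = &impl.iface;
    counter->vtbl->Increment(counter);
    counter->vtbl->Increment(counter);
    return counter->vtbl->Get(counter);
}
```

The client in DemoVtable never touches CounterImpl::value directly; everything goes through the function table, which is why the implementation behind the table can be replaced without recompiling the client.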

COM defines an application programming interface (API) to allow for the creation of components for use in integrating custom applications or to allow diverse components to interact. COM components are never linked to any particular application. The only thing that an application may know about a COM object is what functions it may or may not support. In fact, the object model is so flexible that applications can query the COM object at run-time as to what functionality it provides.

As shown in Figure 1, services implemented by COM objects are exposed through a set of interfaces that represent the only point of contact between clients and the object.

Figure 1: Client Using COM Object Through an Interface Pointer

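Automatic lifetime management is another major advantage of using COM. COM objects are reference counted rather than garbage collected in the usual sense: each client increments the count when it acquires an interface pointer and decrements it when it is finished. When there are no outstanding references (a.k.a. pointers) to an object, the COM object destroys itself.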
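
A minimal sketch of this self-destruction, assuming a hypothetical IRefCounted interface that mirrors the AddRef and Release methods of IUnknown. The Widget class and the destroyed flag exist only so the behavior can be observed; the counter here is not thread-safe, whereas production COM objects normally use atomic increments.

```cpp
#include <cassert>

// Hypothetical stand-in for the lifetime half of IUnknown.
struct IRefCounted {
    virtual unsigned long AddRef() = 0;
    virtual unsigned long Release() = 0;
protected:
    ~IRefCounted() = default;  // clients cannot delete through the interface
};

class Widget final : public IRefCounted {
public:
    // A new object starts with one reference, owned by its creator.
    explicit Widget(bool* destroyed) : refs_(1), destroyed_(destroyed) {}

    unsigned long AddRef() override { return ++refs_; }

    unsigned long Release() override {
        unsigned long remaining = --refs_;
        if (remaining == 0) delete this;  // the object destroys itself
        return remaining;
    }

private:
    ~Widget() { *destroyed_ = true; }  // records destruction for the demo

    unsigned long refs_;
    bool* destroyed_;
};

// Two clients share one object; it survives until both have released it.
bool DemoRefCount() {
    bool destroyed = false;
    IRefCounted* first = new Widget(&destroyed);
    IRefCounted* second = first;
    second->AddRef();                      // second client takes a reference
    first->Release();                      // first client is done
    bool alive_with_one_ref = !destroyed;  // still referenced by second
    second->Release();                     // last reference gone
    return alive_with_one_ref && destroyed;
}
```

Making the destructor private means the only way to end the object's life is to release the last reference, which is the discipline COM imposes on its clients.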

COM Runtime Architecture

COM/DCOM is a truly distributed object-oriented architecture. Components developed using Microsoft’s COM provide a way for two objects in different object spaces or on different networks to talk to each other by calling each other’s methods.

COM services are provided in a standard way, whether those services are required within a single running process, within two different processes on the same machine, or on two different processes across a network using DCOM.

As a result, COM and DCOM provide location transparency: client code is written the same way regardless of where the object actually runs.

COM servers (objects) are accessed within the same process, within two different processes on the same machine, or across the network using RPC:

  1. In-process server: The client can link directly to a library containing the server. The client and server execute in the same process. Communication is accomplished through function calls.
  2. Local Object Proxy: The client can access a server running in a different process but on the same machine through an inter-process communication mechanism. This mechanism is actually a lightweight Remote Procedure Call (RPC).
  3. Remote Object Proxy: The client can access a remote server running on another machine. The network communication between client and server is accomplished through DCE RPC. The mechanism supporting access to remote servers is called DCOM.

Figure 2: Three Methods for Accessing COM Objects

COM and DCOM give designers three choices for packaging component code into some executable module: in-process (same process as client), local (separate process from client on the same machine), and remote (separate processes on separate machines).

Table 2: Pros and Cons of COM packaging choices
Package Type / Pros / Cons / Preferred Uses
In-Process / High speed (no remoting overhead), no remoting limitations / No security, no process protection (a crash in the component crashes the process that loaded it), UI synchronization (sharing a message pump) is tricky / Add-on types of components that provide simple services (like function libraries or child-window UI elements) to clients
Local / Process security (separation), process ownership (including threading, memory management, etc.), control over UI synchronization / Slower than in-process (remoting overhead), remoting limitations on interfaces, no access security / Heavier components that are too expensive to load in-process, have UI beyond simple child windows, or wish to manage their own files (such as databases).
Remote / Process and access security, process and possible machine ownership (e.g., managing a shared component resource), cross-platform / Slower than local with additional remoting limitations / Components that need to run in close proximity to a particular resource

If the client and server are in the same process, the sharing of data between the two is simple. However, when the server process is separate from the client process, as in a local server or remote server, COM must format and bundle the data in order to share it. This process of preparing the data is called marshalling. Distributed computing purists describe marshalling as the process of packaging and transmitting data between different address spaces, automatically resolving pointer problems, while preserving the data’s original form and integrity. Marshalling is accomplished through a "proxy" object and a "stub" object that together handle the cross-process communication details for any particular interface.

Even when COM objects reside in separate processes, separate address spaces, or even on different machines, the operating system takes care of marshalling each call and invoking the object in the other application (or address space). The actual internal implementation of marshalling and un-marshalling differs depending on whether the client and server operate on the same machine (COM) or on different machines (DCOM). Given an IDL file, the Microsoft IDL compiler can create default proxy and stub code that performs all necessary marshalling and un-marshalling.
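
As a rough illustration of what marshalling has to do, the sketch below flattens a structure containing a string into a byte buffer and rebuilds it on the other side. The Order type and the byte layout (little-endian 32-bit integers followed by length-prefixed text) are assumptions made for this example; the code that MIDL generates follows the DCE RPC Network Data Representation rules, which are considerably more involved.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical call argument. The string holds a pointer that would be
// meaningless in another address space, so its contents must be copied.
struct Order {
    std::int32_t quantity;
    std::string item;
};

using Buffer = std::vector<std::uint8_t>;

// Write a 32-bit integer in a fixed (little-endian) byte order.
void WriteInt32(Buffer& out, std::int32_t value) {
    auto bits = static_cast<std::uint32_t>(value);
    for (int i = 0; i < 4; ++i)
        out.push_back(static_cast<std::uint8_t>(bits >> (8 * i)));
}

// Read a 32-bit integer back; at() throws if the buffer is too short.
std::int32_t ReadInt32(const Buffer& in, std::size_t& pos) {
    std::uint32_t bits = 0;
    for (int i = 0; i < 4; ++i)
        bits |= static_cast<std::uint32_t>(in.at(pos++)) << (8 * i);
    return static_cast<std::int32_t>(bits);
}

// Marshal: replace the pointer with the data it points to.
Buffer Marshal(const Order& order) {
    Buffer out;
    WriteInt32(out, order.quantity);
    WriteInt32(out, static_cast<std::int32_t>(order.item.size()));
    out.insert(out.end(), order.item.begin(), order.item.end());
    return out;
}

// Unmarshal: rebuild an equivalent value in the receiver's address space.
Order Unmarshal(const Buffer& in) {
    std::size_t pos = 0;
    Order order;
    order.quantity = ReadInt32(in, pos);
    std::int32_t length = ReadInt32(in, pos);
    if (length < 0 || pos + static_cast<std::size_t>(length) > in.size())
        throw std::out_of_range("truncated buffer");
    order.item.assign(in.begin() + pos, in.begin() + pos + length);
    return order;
}
```

Round-tripping a value through Marshal and Unmarshal yields an equal Order whose string storage is entirely separate from the original, which is the pointer-resolution property described above.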

Figure 3: Cross-process communication in COM

The fact that COM can access services within the same process is a huge differentiator between COM and CORBA. Supporting the in-process model allows for the development of components such as ActiveX Controls or JavaBeans. CORBA at present cannot handle in-process components and therefore cannot participate in the component marketplace.

The IDL

Whenever a client needs some service from a remote distributed object, it invokes a method implemented by the remote object. The service that the remote distributed object (the server) provides is encapsulated as an object, and the remote object's interface is described in an Interface Definition Language (IDL). The interfaces specified in the IDL file serve as a contract between a remote object server and its clients. Clients can thus interact with these remote object servers by invoking methods defined in the IDL.

COM objects and interfaces are specified using the Microsoft Interface Definition Language (MIDL), an extension of the DCE Interface Definition Language standard. To avoid name collisions, each object and interface must have a unique identifier. Interfaces are considered logically immutable: once an interface is defined, it should not be changed (new methods should not be added and existing methods should not be modified).

When developing a COM-based system, it's important to get the interface down in IDL code. In modern COM, IDL best describes COM interfaces. After describing an interface in IDL, run the IDL through the MIDL compiler, which produces C and C++ header files, a type library, and the source code necessary for building a proxy-stub DLL. Interfaces must be well defined in IDL, because the proxy and the stub need to understand exactly how to move data between the client and the object. This is important because the client and the object might be on different machines, and moving data from the client to the object probably involves moving actual bits back and forth.

To invoke a remote method, the client makes a call to the client proxy. The client-side proxy packs the call parameters into a request message and invokes a wire protocol such as IIOP (in CORBA), ORPC (in DCOM), or JRMP (in Java/RMI) to ship the message to the server. At the server side, the wire protocol delivers the message to the server-side stub. The server-side stub then unpacks the message and calls the actual method on the object. In both CORBA and Java/RMI, the client stub is called the stub or proxy and the server stub is called the skeleton. In DCOM, the client stub is referred to as the proxy and the server stub is referred to as the stub.
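
The call path just described can be sketched with a proxy and a stub that exchange messages through a function object standing in for the wire protocol. ICalculator, CalculatorProxy, CalculatorStub, and the message layout (a method number followed by the arguments) are all hypothetical; the transport here is an in-process function call, not ORPC.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <stdexcept>
#include <utility>
#include <vector>

using Message = std::vector<std::int32_t>;                  // simplified wire format
using Transport = std::function<Message(const Message&)>;   // stand-in for the wire protocol

struct ICalculator {
    virtual ~ICalculator() = default;
    virtual std::int32_t Add(std::int32_t a, std::int32_t b) = 0;
};

// The real object, living on the server side.
class Calculator : public ICalculator {
public:
    std::int32_t Add(std::int32_t a, std::int32_t b) override { return a + b; }
};

constexpr std::int32_t kMethodAdd = 1;  // hypothetical method number

// Server side: unpacks the request, calls the real object, packs the reply.
class CalculatorStub {
public:
    explicit CalculatorStub(ICalculator& target) : target_(target) {}

    Message Dispatch(const Message& request) {
        if (request.size() != 3 || request[0] != kMethodAdd)
            throw std::invalid_argument("unknown request");
        return { target_.Add(request[1], request[2]) };
    }

private:
    ICalculator& target_;
};

// Client side: exposes the same interface but forwards every call.
class CalculatorProxy : public ICalculator {
public:
    explicit CalculatorProxy(Transport send) : send_(std::move(send)) {}

    std::int32_t Add(std::int32_t a, std::int32_t b) override {
        Message reply = send_({ kMethodAdd, a, b });
        return reply.at(0);
    }

private:
    Transport send_;
};

// The client holds an ICalculator and cannot tell that it is a proxy.
std::int32_t DemoRemoteAdd(std::int32_t a, std::int32_t b) {
    Calculator server_object;
    CalculatorStub stub(server_object);
    CalculatorProxy proxy(
        [&stub](const Message& request) { return stub.Dispatch(request); });
    ICalculator& client_view = proxy;
    return client_view.Add(a, b);
}
```

Because the proxy implements the same interface as the real object, swapping the in-process transport for a network one would not change the client code, which is the location transparency that COM and DCOM aim for.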

The final consideration for using COM is that a single object instance may play different roles for different simultaneous clients, where each client is using a different set of interfaces. A COM object can support any number of interfaces. An interface provides a grouped collection of related methods. In addition, a single piece of client code may be using many different objects polymorphically (through the same interface). COM gives up multiple inheritance in order to provide a binary standard for object implementations. Instead of supporting multiple inheritance, COM uses the notion of an object having multiple interfaces to achieve the same purpose. This also allows for some flexible forms of programming.
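
Both halves of this idea can be shown in a short sketch: one object playing two roles for two clients, and one client treating different objects uniformly through a shared interface. IReadable, IWritable, Document, and Banner are hypothetical names. The C++ inheritance used here is of pure abstract interfaces only, so no implementation is inherited.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Two hypothetical interfaces, each a small group of related methods.
struct IReadable {
    virtual ~IReadable() = default;
    virtual std::string Read() const = 0;
};

struct IWritable {
    virtual ~IWritable() = default;
    virtual void Write(const std::string& text) = 0;
};

// One object, two roles.
class Document final : public IReadable, public IWritable {
public:
    std::string Read() const override { return text_; }
    void Write(const std::string& text) override { text_ += text; }

private:
    std::string text_;
};

// A second, unrelated object that shares only the IReadable interface.
class Banner final : public IReadable {
public:
    std::string Read() const override { return "COM"; }
};

// Client A sees only the writer role of whatever it is given.
void Editor(IWritable& target) { target.Write("draft"); }

// Client B uses many different objects polymorphically through one interface.
std::size_t TotalLength(const std::vector<const IReadable*>& sources) {
    std::size_t total = 0;
    for (const IReadable* source : sources) total += source->Read().size();
    return total;
}

std::size_t DemoRoles() {
    Document doc;
    Banner banner;
    Editor(doc);                             // doc acting as IWritable
    return TotalLength({ &doc, &banner });   // doc acting as IReadable
}
```

Editor cannot read the document and TotalLength cannot modify it, even though both operate on the same Document instance; each client is limited to the interface it was handed.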

DCOM

Distributed COM is an extension to COM that allows network-based component interaction. While COM processes can run on the same machine but in different address spaces, the DCOM extension allows processes to be spread across a network. With DCOM, components operating on a variety of platforms can interact, as long as DCOM is available within the environment.

It is best to consider COM and DCOM as a single technology that provides a range of services for component interaction, from services promoting component integration on a single platform, to component interaction across heterogeneous networks. In fact, COM and its DCOM extensions are merged into a single runtime. This single runtime provides both local and remote access.

DCOM, which is often called "COM on the wire", supports remoting objects by running on a protocol called Object Remote Procedure Call (ORPC). This ORPC layer is built on top of DCE RPC and interacts with COM's run-time services. A DCOM server is a body of code that is capable of serving up objects of a particular type at runtime. Each DCOM server object can support multiple interfaces, each representing a different behavior of the object. A DCOM client calls into the exposed methods of a DCOM server by acquiring a pointer to one of the server object's interfaces. The client object then starts calling the server object's exposed methods through the acquired interface pointer as if the server object resided in the client's address space.

As specified by COM, a server object's memory layout conforms to the C++ vtable layout. Since the COM specification is at the binary level, it allows DCOM server components to be written in diverse programming languages such as C++, Java, Object Pascal (Delphi), Visual Basic, and even COBOL. As long as a platform supports COM services, DCOM can be used on that platform. DCOM is now heavily used on the Windows platform.

ActiveX

In October of 1996 Microsoft turned over COM/DCOM, parts of OLE, and ActiveX to the Open Group (a merger of Open Software Foundation and X/Open). The Open Group has formed the Active Group to oversee the transformation of the technology into an open standard. The aim of the Active Group is to promote the technology's compatibility across systems (Windows, UNIX, and MacOS) and to oversee future extension by creating working groups dedicated to specific functions. However, it is unclear how much control Microsoft will relinquish over the direction of the technology. Certainly, as the inventor and primary advocate of COM and DCOM, Microsoft is expected to have strong influence on the overall direction of the technology and underlying APIs.

"ActiveX control" is really just another term for "OLE Object" or, more specifically, "Component Object Model (COM) Object." In other words, a control, at the very least, is some COM object that supports the IUnknown interface and is also self-registering. It usually supports many more interfaces in order to offer functionality, but all additional interfaces can be viewed as optional and, as such, a container should not rely on any additional interfaces being supported. This allows a control to implement as little functionality as it needs to, instead of supporting a large number of interfaces that actually don't do anything. Through IUnknown a container can manage the lifetime of the control (with AddRef and Release), as well as dynamically discover the full extent of a control's functionality based on the available interfaces (with QueryInterface).

In short, this minimal requirement for nothing more than IUnknown allows any control to be as lightweight as it can. Other than IUnknown and self-registration, there are no other requirements for a control. There are, however, conventions that should be followed about what the support of an interface means in terms of functionality provided to the container by the control. It should never be assumed that an interface is available, and standard return-checking conventions should always be followed. It is important for a control or container to degrade gracefully and offer alternative functionality if a required interface is not available.
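
The convention of never assuming an interface is available can be sketched as follows. This is a simplified simulation: the real QueryInterface takes an interface identifier (IID) and returns a status code with the pointer as an output parameter, whereas the hypothetical version here takes a string and returns the pointer directly. IControl, IPrintable, and both control classes are invented for illustration.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for the discovery half of IUnknown.
struct IControl {
    virtual ~IControl() = default;
    // Returns a pointer to the requested interface, or nullptr if the
    // control does not support it.
    virtual void* QueryInterface(const std::string& iid) = 0;
};

// An optional interface that a control may or may not implement.
struct IPrintable {
    virtual ~IPrintable() = default;
    virtual std::string Print() = 0;
};

const char kIID_Printable[] = "IPrintable";  // hypothetical identifier

// A control that supports nothing beyond the minimum.
class MinimalControl final : public IControl {
public:
    void* QueryInterface(const std::string&) override { return nullptr; }
};

// A control that additionally offers printing.
class PrintableControl final : public IControl, public IPrintable {
public:
    void* QueryInterface(const std::string& iid) override {
        if (iid == kIID_Printable) return static_cast<IPrintable*>(this);
        return nullptr;
    }
    std::string Print() override { return "printed by control"; }
};

// Container code: ask first, check the result, and degrade gracefully.
std::string ContainerPrint(IControl& control) {
    auto* printable =
        static_cast<IPrintable*>(control.QueryInterface(kIID_Printable));
    if (printable == nullptr) return "printing unavailable";
    return printable->Print();
}
```

ContainerPrint works with either control: it uses the optional interface when the control provides it and falls back to an alternative result when it does not.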