Mojo -- More of a Protocol

Mojo Overview

No one knows Mojo better than its designers, so the best introduction is Intro to Mojo & Services. Here I provide a brief overview of Mojo.

Mojo is an Inter-Process Communication (IPC) framework. Before Mojo came out, Chromium had an IPC framework implemented with C++ macros, called legacy IPC. To reduce redundant code and decompose services, Chromium refactored the underlying implementation of legacy IPC and created Mojo. Currently, most IPC in Chromium is implemented with Mojo, and the underlying implementation of legacy IPC has been replaced with Mojo. Below is a figure from the Mojo documentation showing the different layers of Mojo.

Mojo Core is responsible for the core implementation of IPC. It focuses on building the IPC network and routing IPC messages to the right place. The Mojo System API is a thin wrapper around Mojo Core, while the real platform-specific logic, such as Windows named pipes and Unix domain sockets, is implemented in the C++ System layer in the figure above. The C++ bindings layer is what Chromium usually relies on directly, and it provides most of the high-level abstractions.

How Mojo works

The way Mojo works is quite similar to protobuf. First, the user defines interfaces in a mojom file, each of which declares methods that can be called by a remote endpoint. During compilation, the mojom files are compiled into bindings for different languages. For C++, for example, an example.mojom file generates an example.mojom.h file which other source files can include.
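
Here is a rough sketch of how such generated bindings are typically consumed. The example.mojom content and the Logger interface are hypothetical; mojo::Remote, mojo::Receiver, and BindNewPipeAndPassReceiver are the real C++ bindings primitives, and the snippet assumes Mojo and a task runner are already initialized.

// Hypothetical: assume example.mojom defines
//   interface Logger { Log(string message); };
// so that the generated example.mojom.h provides example::mojom::Logger.
#include "base/logging.h"
#include "example.mojom.h"
#include "mojo/public/cpp/bindings/receiver.h"
#include "mojo/public/cpp/bindings/remote.h"

class LoggerImpl : public example::mojom::Logger {
 public:
  explicit LoggerImpl(mojo::PendingReceiver<example::mojom::Logger> receiver)
      : receiver_(this, std::move(receiver)) {}

  // Invoked when the remote endpoint calls Log().
  void Log(const std::string& message) override {
    LOG(INFO) << "[Logger] " << message;
  }

 private:
  mojo::Receiver<example::mojom::Logger> receiver_;
};

void UseLogger() {
  // The Remote and the Receiver may live in different processes; here they
  // share one for brevity.
  mojo::Remote<example::mojom::Logger> logger;
  LoggerImpl impl(logger.BindNewPipeAndPassReceiver());
  logger->Log("hello from the other endpoint");
}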

The implementation of Mojo is really simple and straightforward. Enter the mojo/public/tools/bindings/generators/cpp_templates directory and all secrets are revealed.

Yes, Mojo maintains a set of source templates for the different language bindings. After parsing a mojom file, it simply renders these templates with data retrieved from the AST. Thus, a good way to add instrumentation is to modify the *.tmpl files.

Mojo network

A Mojo network is made up of many nodes, each of which is a process running Mojo. It works much like a TCP/IP network, but one key difference is that there is exactly one broker process in a Mojo network. The broker process (the browser process in Chromium) is usually a fully-privileged process and is responsible for assisting communication between nodes.

Next, I will take Chromium as an example to introduce some details of Mojo.

Join a network

The very first thing a new Mojo node does is join the Mojo network. In Chromium's case, every child process is spawned from the browser process (the broker in Mojo terms) and a bootstrap channel is created in advance. To establish the connection with the new child process, the browser process has to pass the handle of the bootstrap channel to it. The method varies across platforms.

// child_thread_impl.cc:193
mojo::IncomingInvitation InitializeMojoIPCChannel() {
  TRACE_EVENT0("startup", "InitializeMojoIPCChannel");
  mojo::PlatformChannelEndpoint endpoint;
#if defined(OS_WIN)
  if (base::CommandLine::ForCurrentProcess()->HasSwitch(
          mojo::PlatformChannel::kHandleSwitch)) {
    endpoint = mojo::PlatformChannel::RecoverPassedEndpointFromCommandLine(
        *base::CommandLine::ForCurrentProcess());
  } else {
    // If this process is elevated, it will have a pipe path passed on the
    // command line.
    endpoint = mojo::NamedPlatformChannel::ConnectToServer(
        *base::CommandLine::ForCurrentProcess());
  }
#elif defined(OS_FUCHSIA)
  endpoint = mojo::PlatformChannel::RecoverPassedEndpointFromCommandLine(
      *base::CommandLine::ForCurrentProcess());
#elif defined(OS_MACOSX)
  auto* client = base::MachPortRendezvousClient::GetInstance();
  if (!client) {
    LOG(ERROR) << "Mach rendezvous failed, terminating process (parent died?)";
    base::Process::TerminateCurrentProcessImmediately(0);
    return {};
  }
  auto receive = client->TakeReceiveRight('mojo');
  if (!receive.is_valid()) {
    LOG(ERROR) << "Invalid PlatformChannel receive right";
    return {};
  }
  endpoint =
      mojo::PlatformChannelEndpoint(mojo::PlatformHandle(std::move(receive)));
#elif defined(OS_POSIX)
  endpoint = mojo::PlatformChannelEndpoint(mojo::PlatformHandle(
      base::ScopedFD(base::GlobalDescriptors::GetInstance()->Get(
          service_manager::kMojoIPCChannel))));
#endif

  return mojo::IncomingInvitation::Accept(
      std::move(endpoint), MOJO_ACCEPT_INVITATION_FLAG_LEAK_TRANSPORT_ENDPOINT);
}

From the code snippet above, it is clear that on Windows and Fuchsia the initial channel handle is recovered from command-line arguments. On macOS, it is passed through a Mach port rendezvous. On Linux, since the child process is forked from its parent process, the handle is simply retrieved from the global descriptor table.

Once the child process has been launched, the browser process sends an invitation to the child process.

// child_process_launcher_helper.cc:168
// Set up Mojo IPC to the new process.
{
  DCHECK(mojo_channel_);
  DCHECK(mojo_channel_->local_endpoint().is_valid());
  mojo::OutgoingInvitation::Send(
      std::move(invitation), process.process.Handle(),
      mojo_channel_->TakeLocalEndpoint(), process_error_callback_);
}

After the child process accepts the invitation, the platform channel is established. The browser process then sends an AcceptBrokerClient message to inform the child process of the broker process's name. Note that in other setups the inviter process is not necessarily the broker process. After receiving the message, the child process records the broker name and adds the inviter as a peer, reusing the earlier bootstrap channel.

void NodeController::OnAcceptBrokerClient(const ports::NodeName& from_node,
                                          const ports::NodeName& broker_name,
                                          PlatformHandle broker_channel) {
  // ...

  // This node should already have an inviter in bootstrap mode.
  ports::NodeName inviter_name;
  scoped_refptr<NodeChannel> inviter;
  {
    base::AutoLock lock(inviter_lock_);
    inviter_name = inviter_name_;
    inviter = bootstrap_inviter_channel_;
    bootstrap_inviter_channel_ = nullptr;
  }

  // ...

  // It's now possible to add both the broker and the inviter as peers.
  // Note that the broker and inviter may be the same node.
  scoped_refptr<NodeChannel> broker;
  if (broker_name == inviter_name) {
    DCHECK(!broker_channel.is_valid());
    broker = inviter;
  } else {
    DCHECK(broker_channel.is_valid());
    broker = NodeChannel::Create(
        this,
        ConnectionParams(PlatformChannelEndpoint(std::move(broker_channel))),
        Channel::HandlePolicy::kAcceptHandles, io_task_runner_,
        ProcessErrorCallback());
    AddPeer(broker_name, broker, true /* start_channel */);
  }

  AddPeer(inviter_name, inviter, false /* start_channel */);
  // ...
}

At this point, the child process has successfully joined the Mojo network.
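
Outside Chromium's content layer, the same handshake is exposed through Mojo's public invitation API. Below is a minimal sketch of both sides using mojo/public/cpp/system/invitation.h and mojo/public/cpp/platform; the function names are mine, and process launching is elided.

#include "base/process/process_handle.h"
#include "mojo/public/cpp/platform/platform_channel.h"
#include "mojo/public/cpp/system/invitation.h"

// Inviter side: attach a message pipe to the invitation, then send the
// invitation over the channel's local endpoint; the remote endpoint is
// passed to the child at launch time (cf. the per-platform code above).
mojo::ScopedMessagePipeHandle SendInvitationToChild(
    base::ProcessHandle child, mojo::PlatformChannel& channel) {
  mojo::OutgoingInvitation invitation;
  // "bootstrap" is an arbitrary pipe name both sides agree on.
  mojo::ScopedMessagePipeHandle pipe =
      invitation.AttachMessagePipe("bootstrap");
  mojo::OutgoingInvitation::Send(std::move(invitation), child,
                                 channel.TakeLocalEndpoint());
  return pipe;
}

// Invitee side: accept the invitation on the endpoint recovered at startup
// (cf. InitializeMojoIPCChannel above) and extract the same pipe by name.
mojo::ScopedMessagePipeHandle AcceptInvitationFromParent(
    mojo::PlatformChannelEndpoint endpoint) {
  mojo::IncomingInvitation invitation =
      mojo::IncomingInvitation::Accept(std::move(endpoint));
  return invitation.ExtractMessagePipe("bootstrap");
}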

Mojo Protocol

As the title suggests, Mojo is more of a protocol. IPC is nothing new, but what makes Mojo innovative is that it defines an IPC protocol capable of exchanging both information and resources across process boundaries.

Message header

The first layer of a Mojo IPC message is Message. Below is the definition of its header.

// channel.h:61
struct MOJO_SYSTEM_IMPL_EXPORT Message {
  enum class MessageType : uint16_t {
    // An old format normal message, that uses the LegacyHeader.
    // Only used on Android and ChromeOS.
    // TODO(https://crbug.com/695645): remove legacy support when Arc++ has
    // updated to Mojo with normal versioned messages.
    NORMAL_LEGACY = 0,
#if defined(OS_IOS)
    // A control message containing handles to echo back.
    HANDLES_SENT,
    // A control message containing handles that can now be closed.
    HANDLES_SENT_ACK,
#endif
    // A normal message that uses Header and can contain extra header values.
    NORMAL,
  };

#pragma pack(push, 1)
  // Old message wire format for ChromeOS and Android, used by NORMAL_LEGACY
  // messages.
  struct LegacyHeader {
    // Message size in bytes, including the header.
    uint32_t num_bytes;

    // Number of attached handles.
    uint16_t num_handles;

    MessageType message_type;
  };

  // Header used by NORMAL messages.
  // To preserve backward compatibility with LegacyHeader, the num_bytes and
  // message_type field must be at the same offset as in LegacyHeader.
  struct Header {
    // Message size in bytes, including the header.
    uint32_t num_bytes;

    // Total size of header, including extra header data (i.e. HANDLEs on
    // windows).
    uint16_t num_header_bytes;

    MessageType message_type;

    // Number of attached handles. May be less than the reserved handle
    // storage size in this message on platforms that serialise handles as
    // data (i.e. HANDLEs on Windows, Mach ports on OSX).
    uint16_t num_handles;

    char padding[6];
  };

  // ...
};

To illustrate this clearly, below is an ASCII figure.

|<-----16bits------>|<------16bits----->| -
|               num_bytes               | | header
| num_header_bytes  |   message_type    | |
|    num_handles    |      padding      | _
|        extra_header (handles)         |
|                  ...                  |
|                payload                |
|                  ...                  |

Note that handles are serialized into the extra header data only on Windows and macOS.
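
To make the layout concrete, here is a minimal parsing sketch. It is not Chromium code: WireHeader and GetPayload are hypothetical names, and only basic bounds checks are performed.

#include <cstddef>
#include <cstdint>
#include <cstring>

#pragma pack(push, 1)
struct WireHeader {
  uint32_t num_bytes;         // total message size, including all headers
  uint16_t num_header_bytes;  // header size, including extra header data
  uint16_t message_type;      // NORMAL == 1 when OS_IOS is not defined
  uint16_t num_handles;       // number of attached handles
  char padding[6];
};
#pragma pack(pop)
static_assert(sizeof(WireHeader) == 16, "header is 16 bytes on the wire");

// Returns a pointer to the payload and stores its size, or nullptr if the
// buffer is too small or internally inconsistent.
const uint8_t* GetPayload(const uint8_t* buf, size_t len, size_t* out_size) {
  if (len < sizeof(WireHeader))
    return nullptr;
  WireHeader h;
  std::memcpy(&h, buf, sizeof(h));
  if (h.num_bytes > len || h.num_header_bytes < sizeof(WireHeader) ||
      h.num_header_bytes > h.num_bytes)
    return nullptr;
  // Extra header data (e.g. serialized HANDLEs on Windows) occupies the
  // region between sizeof(WireHeader) and num_header_bytes; the payload
  // starts right after it.
  *out_size = h.num_bytes - h.num_header_bytes;
  return buf + h.num_header_bytes;
}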

ChannelMessage header

The second layer is ChannelMessage, which is encoded in the payload field of Message.

// node_channel.cc:27
enum class MessageType : uint32_t {
  ACCEPT_INVITEE,
  ACCEPT_INVITATION,
  ADD_BROKER_CLIENT,
  BROKER_CLIENT_ADDED,
  ACCEPT_BROKER_CLIENT,
  EVENT_MESSAGE,
  REQUEST_PORT_MERGE,
  REQUEST_INTRODUCTION,
  INTRODUCE,
#if defined(OS_WIN)
  RELAY_EVENT_MESSAGE,
#endif
  BROADCAST_EVENT,
#if defined(OS_WIN)
  EVENT_MESSAGE_FROM_RELAY,
#endif
  ACCEPT_PEER,
  BIND_BROKER_HOST,
};

struct Header {
  MessageType type;
  uint32_t padding;
};

struct AcceptInviteeData {
  ports::NodeName inviter_name;
  ports::NodeName token;
};

struct AcceptInvitationData {
  ports::NodeName token;
  ports::NodeName invitee_name;
};

// ...

And the corresponding ASCII figure.

|<-----16bits------>|<------16bits----->|
|             message_type              | -
|                padding                | | Message->payload
|                 data                  | |
|                  ...                  | _

Note that the data field of a ChannelMessage is not fixed-length; its layout depends on the type of the ChannelMessage.
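
As a small illustration (not Chromium code; NodeName stands in for ports::NodeName and the builder function is hypothetical), an ACCEPT_INVITEE message would be assembled like this and then carried in the payload field of the outer Message:

#include <cstdint>
#include <cstring>
#include <vector>

#pragma pack(push, 1)
struct NodeName { uint64_t v1, v2; };  // 128-bit name, cf. ports::NodeName
struct ChannelHeader {
  uint32_t type;     // MessageType; ACCEPT_INVITEE == 0 in the enum above
  uint32_t padding;
};
struct AcceptInviteeData {
  NodeName inviter_name;
  NodeName token;
};
#pragma pack(pop)

std::vector<uint8_t> BuildAcceptInvitee(const AcceptInviteeData& data) {
  std::vector<uint8_t> bytes(sizeof(ChannelHeader) + sizeof(data));
  ChannelHeader header{0u /* ACCEPT_INVITEE */, 0u};
  std::memcpy(bytes.data(), &header, sizeof(header));
  std::memcpy(bytes.data() + sizeof(header), &data, sizeof(data));
  return bytes;  // becomes Message->payload
}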

EventMessage header

One of the ChannelMessage types is EVENT_MESSAGE. In fact, EventMessage is itself a large class of messages. The definition of its header is as follows.

// event.h:30
enum Type : uint32_t {
  // A user message event contains arbitrary user-specified payload data
  // which may include any number of ports and/or system handles (e.g. FDs).
  kUserMessage,

  // When a Node receives a user message with one or more ports attached, it
  // sends back an instance of this event for every attached port to indicate
  // that the port has been accepted by its destination node.
  kPortAccepted,

  // This event begins circulation any time a port enters a proxying state. It
  // may be re-circulated in certain edge cases, but the ultimate purpose of
  // the event is to ensure that every port along a route is (if necessary)
  // aware that the proxying port is indeed proxying (and to where) so that it
  // can begin to be bypassed along the route.
  kObserveProxy,

  // An event used to acknowledge to a proxy that all concerned nodes and
  // ports are aware of its proxying state and that no more user messages will
  // be routed to it beyond a given final sequence number.
  kObserveProxyAck,

  // Indicates that a port has been closed. This event fully circulates a
  // route to ensure that all ports are aware of closure.
  kObserveClosure,

  // Used to request the merging of two routes via two sacrificial receiving
  // ports, one from each route.
  kMergePort,

  // Used to request that the conjugate port acknowledges read messages by
  // sending back a UserMessageReadAck.
  kUserMessageReadAckRequest,

  // Used to acknowledge read messages to the conjugate.
  kUserMessageReadAck,
};
// event.cc:23
struct SerializedHeader {
  Event::Type type;
  uint32_t padding;
  PortName port_name;
};

struct UserMessageEventData {
  uint64_t sequence_num;
  uint32_t num_ports;
  uint32_t padding;
};

And the corresponding ASCII figure.

|<-----16bits------>|<------16bits----->|                         -
|             message_type              |                         |
|                padding                |                         |
|                                       | -                       |
|                                       | | port_name (128bits)   | ChannelMessage->data
|               port_name               | |                       |
|                                       | -                       |
|                 data                  |                         |
|                  ...                  |                         -

Specifically, for a UserMessageEvent:

|<-----16bits------>|<------16bits----->|                         -
|             message_type              |                         |
|                padding                |                         |
|                                       | -                       |
|                                       | | port_name (128bits)   |
|               port_name               | |                       |
|                                       | -                       | ChannelMessage->data
|         sequence_num (64bits)         |                         |
|                                       |                         |
|               num_ports               | -                       |
|                padding                | | EventMessage->data    |
|                 ports                 | |                       |
|                  ...                  | -                       -
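
The following sketch (not Chromium code; the struct and function names are hypothetical) decodes that kUserMessage layout: a SerializedHeader, then UserMessageEventData, then num_ports 128-bit port names.

#include <cstdint>
#include <cstring>
#include <vector>

#pragma pack(push, 1)
struct PortName { uint64_t v1, v2; };
struct EventHeader {
  uint32_t type;       // Event::Type; kUserMessage == 0 in the enum above
  uint32_t padding;
  PortName port_name;  // destination port on the receiving node
};
struct UserMessageEventData {
  uint64_t sequence_num;
  uint32_t num_ports;
  uint32_t padding;
};
#pragma pack(pop)

// Extracts the port names attached to a kUserMessage event. `data` points
// at ChannelMessage->data and is assumed to be size-validated already.
std::vector<PortName> GetAttachedPorts(const uint8_t* data) {
  UserMessageEventData ev;
  std::memcpy(&ev, data + sizeof(EventHeader), sizeof(ev));
  std::vector<PortName> ports(ev.num_ports);
  std::memcpy(ports.data(), data + sizeof(EventHeader) + sizeof(ev),
              ports.size() * sizeof(PortName));
  return ports;
}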

UserMessage header

As a type of EventMessage, a UserMessage also has its own header. Contrary to the previous layers' design, the UserMessage is attached after the whole EventMessage rather than nested inside it.

#pragma pack(push, 1)
// Header attached to every message.
struct MessageHeader {
  // The number of serialized dispatchers included in this header.
  uint32_t num_dispatchers;

  // Total size of the header, including serialized dispatcher data.
  uint32_t header_size;
};

// Header for each dispatcher in a message, immediately following the message
// header.
struct DispatcherHeader {
  // The type of the dispatcher, corresponding to the Dispatcher::Type enum.
  int32_t type;

  // The size of the serialized dispatcher, not including this header.
  uint32_t num_bytes;

  // The number of ports needed to deserialize this dispatcher.
  uint32_t num_ports;

  // The number of platform handles needed to deserialize this dispatcher.
  uint32_t num_platform_handles;
};
#pragma pack(pop)

And the corresponding ASCII figure.

|<-----16bits------>|<------16bits----->|
|            num_dispatchers            |
|              header_size              |
|            DispatcherHeader           |
|                  ...                  |
|             DispatcherData            |
|                  ...                  |

For every dispatcher, a DispatcherHeader and its DispatcherData are appended after the MessageHeader.
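
A minimal sketch (not Chromium code; WalkDispatchers is a hypothetical name) of walking that layout, assuming a buffer already validated to hold header_size bytes:

#include <cstdint>
#include <cstring>

#pragma pack(push, 1)
struct MessageHeader {
  uint32_t num_dispatchers;
  uint32_t header_size;
};
struct DispatcherHeader {
  int32_t type;
  uint32_t num_bytes;
  uint32_t num_ports;
  uint32_t num_platform_handles;
};
#pragma pack(pop)

void WalkDispatchers(const uint8_t* buf) {
  MessageHeader mh;
  std::memcpy(&mh, buf, sizeof(mh));
  // The DispatcherHeaders immediately follow the MessageHeader...
  const uint8_t* headers = buf + sizeof(MessageHeader);
  // ...and the serialized dispatcher data follows the whole header table.
  const uint8_t* data =
      headers + mh.num_dispatchers * sizeof(DispatcherHeader);
  for (uint32_t i = 0; i < mh.num_dispatchers; ++i) {
    DispatcherHeader dh;
    std::memcpy(&dh, headers + i * sizeof(DispatcherHeader), sizeof(dh));
    // dh.type selects the Dispatcher subclass; its serialized state is the
    // next dh.num_bytes bytes of dispatcher data.
    data += dh.num_bytes;
  }
}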

Summary

In my view, Mojo is more like a protocol than a framework. Its well-layered message design cleanly separates the different functions and extends the flexibility of Mojo itself.