On Fri, Jan 17, 2020 at 5:30 PM Stefano Stabellini via System-dt system-dt@lists.openampproject.org wrote:
Hi all,
I would like to follow-up on system device tree and specifically on one of the action items from the last call.
Rob raised the interesting question of what is the interaction between the new system device tree concepts and the top level nodes (memory, reserved-memory, cpus, chosen).
I am going to write here my observations.
Some questions inline, but they're really rhetorical questions for my response at the end.
As a short summary, the system device tree concepts are:
- Multiple top level "cpus,cluster" nodes to describe heterogeneous CPU clusters.
- A new "indirect-bus" which is a type of bus that does not automatically map to the parent address space.
- An address-map property to express the different address mappings of the different cpus clusters and can be used to map indirect-bus nodes.
These new nodes and properties allow us to describe multiple heterogeneous CPU clusters with potentially different address mappings, which can be expressed using indirect-bus and address-map.
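To make the three concepts concrete, a fragment along these lines could describe an extra cluster and a bus that only it can reach. This is only an illustrative sketch: the labels (cpus_r5, tcm_bus, tcm) and all addresses are invented, and the exact address-map cell layout is an assumption based on the current proposal.

```dts
/* Illustrative sketch only: labels and addresses are invented.
 * cpus_r5 is a second, heterogeneous cluster; tcm_bus is an
 * indirect-bus, so it does not map into the parent address space,
 * but the cluster's address-map makes it reachable from cpus_r5. */
cpus_r5: cpus-cluster@0 {
	compatible = "cpus,cluster";
	#address-cells = <1>;
	#size-cells = <0>;
	/* <cluster-address  bus-phandle  bus-address  size> */
	address-map = <0xffe00000 &tcm_bus 0xffe00000 0x10000>;

	cpu@0 {
		compatible = "arm,cortex-r5";
		device_type = "cpu";
		reg = <0>;
	};
};

tcm_bus: indirect-bus {
	compatible = "indirect-bus";
	#address-cells = <1>;
	#size-cells = <1>;

	tcm: tcm@ffe00000 {
		reg = <0xffe00000 0x10000>;
	};
};
```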
We also have new concepts for software domain configurations:
- Multiple "openamp,domain" nodes (currently proposed under /chosen) to specify software configurations and MPU configurations.
- A new "access" property under each "openamp,domain" node with links to nodes accessible from the cpus cluster.
"openamp,domain" nodes allow us to define the cpus cluster and the set of hardware resources that together form a software domain. The access property defines the list of resources available to one particular cluster and maps well onto MPU configurations (sometimes called "firewall configurations" during the calls).
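Putting the two together, a domain node could look roughly like this. It is a sketch only; cpus_r5, tcm and ethernet0 are placeholder labels, and the cell encodings mirror the full example attached to this mail.

```dts
chosen {
	/* Sketch: one software domain running on the R5 cluster.
	 * Labels (cpus_r5, tcm, ethernet0) are placeholders. */
	openamp_r5 {
		compatible = "openamp,domain-v1";
		/* cluster phandle, cpu mask, execution mode */
		cpus = <&cpus_r5 0x2 0x80000000>;
		/* resources visible to this domain: phandle, flags */
		access = <&tcm 0x1>, <&ethernet0 0x0>;
	};
};
```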
See the attached full example.
I am going to go through the major top level nodes and expand on how the new concepts affect them.
/cpus
/cpus is the top level node that contains the description of the cpus in the system. With system device tree, it is not the only cpus cluster; additional cpus clusters can be described by other top level nodes compatible with "cpus,cluster". However, /cpus remains the default cluster. An OS reading device tree should assume that it is running on /cpus. From a compatibility perspective, if an OS doesn't understand or recognize the other "cpus,cluster" nodes, it can ignore them and just process /cpus.
Buses compatible with "indirect-bus" do not map automatically to the parent address space, which means that /cpus won't be able to access them unless an address-map property is specified under /cpus to express the mapping. This is the only new limitation introduced for /cpus. From a compatibility perspective, an OS that doesn't understand the address-map property would just ignore both it and the bus, so this is opt-in functionality.
So far in my examples "openamp,domain" nodes refer to "cpus,cluster" nodes only, not to /cpus. There is a question on whether we want to allow "openamp,domain" nodes to define a software domain running on /cpus. We could go either way, but for simplicity I think we can avoid it.
"openamp,domain" nodes express accessibility restrictions while /cpus is meant to be able to access everything by default. If we want to specify hard accessibility settings for all clusters, it is possible to write a pure system device tree without /cpus, where all cpus clusters are described by "cpus,cluster" nodes and there is no expectation that an OS will be able to use it without going through some transformations by lopper (or other tools.)
/chosen
The /chosen node is used for software configurations, such as bootargs (the Linux command line). When multiple "openamp,domain" nodes are present, the configurations directly under /chosen continue to refer to the software running on /cpus, while domain-specific configurations need to go under each domain node.
As an example:
- /chosen/bootargs refers to the software running on /cpus
- /chosen/openamp_r5/bootargs refers to the openamp_r5 domain
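In device tree terms the two bootargs would sit side by side, along these lines (the command line values are invented for illustration):

```dts
chosen {
	/* configuration for the default OS on /cpus */
	bootargs = "console=ttyPS0 root=/dev/ram0";

	openamp_r5 {
		compatible = "openamp,domain-v1";
		/* configuration for the openamp_r5 domain */
		bootargs = "console=ttyAMA0";
	};
};
```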
/memory
The /memory node describes the main memory in the system. As with any other device node, all cpus clusters can address it.
Not really true. You could have memory regions not accessible by some cpus.
indirect-bus and address-map can be used to express addressing differences.
It might be required to carve out special memory reservations for each domain. These configurations are expressed under /reserved-memory as we do today for any other reserved regions.
What about a symmetric case where, say, you have 4 domains and want to divide main memory into 4 regions?
/reserved-memory
/reserved-memory is used to describe particular reserved memory regions for special use by software. With system device tree /reserved-memory becomes useful to describe domain specific memory reservations too. Memory ranges for special use by "openamp,domain" nodes are expressed under /reserved-memory following the usual set of rules. Each "openamp,domain" node links to any relevant reserved-memory regions using the access property. The rest is to be used by /cpus.
For instance:
- /reserved-memory/memory_r5 is linked and used by /chosen/openamp_r5
- other regions under /reserved-memory, not linked by any "openamp,domain" nodes, go to the default /cpus
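A sketch of the intended split (region names, addresses and sizes are invented for illustration):

```dts
reserved-memory {
	#address-cells = <2>;
	#size-cells = <2>;
	ranges;

	/* linked from a domain's access property, so it belongs
	 * to that domain rather than to the default /cpus */
	memory_r5: memory@3ed00000 {
		compatible = "openamp,domain-memory-v1";
		reg = <0x0 0x3ed00000 0x0 0x40000>;
	};

	/* not referenced by any "openamp,domain" node:
	 * a normal reservation for the default /cpus */
	resv@48000000 {
		reg = <0x0 0x48000000 0x0 0x100000>;
	};
};
```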
So the code that parses /reserved-memory has to look up something elsewhere to determine if each child node applies? That's fairly invasive to the existing handling of /reserved-memory.
Also, a reserved region could have different addresses for different CPUs. Basically, /reserved-memory doesn't have an address, but inherits the root addressing. That makes it a bit of an oddball. We need to handle both shared and non-shared reserved regions. Shared-memory for IPC is commonly described here for example.
We should use a specific compatible string to identify reserved memory regions meant for openamp,domain nodes, so that a legacy OS will safely ignore them. I added
compatible = "openamp,domain-memory-v1";
That doesn't really scale. If we don't care about legacy OS support, then every node will have this?
I don't really like the asymmetric structure of all this. While having a default view for existing OS seems worthwhile, as soon as there's a more symmetric use case it becomes much more invasive and OS parsing for all the above has to be adapted. We need to design for 100 domains.
To flip all this around, what if domains become the top-level structure:
domain@0 { chosen {}; cpus {}; memory@0 {}; reserved-memory {}; };
domain@1 { chosen {}; cpus {}; memory@800000 {}; reserved-memory {}; };
The content of all the current top-level nodes doesn't need to change. OSes would be modified to treat a domain node as the root node, which shouldn't be very invasive. Then everything else just works as is.
This could still have other nodes at the (real) root or links from one domain to another. I haven't thought through that part, but I think this structure can only help because it removes the notion that the root has a specific cpu view.
Rob
On Tue, 21 Jan 2020, Rob Herring wrote:
[...]
I don't really like the asymmetric structure of all this. While having a default view for existing OS seems worthwhile, as soon as there's a more symmetric use case it becomes much more invasive and OS parsing for all the above has to be adapted. We need to design for 100 domains.
Actually I agree with you here. In fact, in my original proposal I had "memory" as an attribute under the "openamp,domain" node. More on this below. The usage of reserved-memory was done as a follow-up to one of Grant's comments during the calls.
To flip all this around, what if domains become the top-level structure:
domain@0 { chosen {}; cpus {}; memory@0 {}; reserved-memory {}; };
domain@1 { chosen {}; cpus {}; memory@800000 {}; reserved-memory {}; };
The content of all the current top-level nodes doesn't need to change. OSes would be modified to treat a domain node as the root node, which shouldn't be very invasive. Then everything else just works as is.
I see what you are trying to do with this suggestion and I think it is a worthy goal. In particular, I agree with you that the suggested usage of reserved-memory for domain memory has limitations and looks asymmetric.
However, I think it would be worthwhile to keep the separation between domains under /chosen (or equivalent, for instance a new top-level node called "domains") and hardware configuration so that they can be easily done in separate stages by different people/personas.
It is a good idea to keep the domains configuration separate because it is a configuration, not a hardware description, and typically comes in later in the build pipeline. It would be best if it could be added at a later time without dramatic changes to the device tree structure.
Thus, my suggestion would be to avoid using reserved-memory for domain memory reservations completely. Instead we could just use the "memory" attribute under the openamp,domain-v1 node:
	openamp_r5 {
		compatible = "openamp,domain-v1";
		#address-cells = <0x2>;
		#size-cells = <0x2>;
		memory = <0x0 0x0 0x0 0x8000000>;
		cpus = <&cpus_r5 0x2 0x80000000>;
		access = <&tcm 0x1>, <&ethernet0 0x0>;
	};
In fact, reserved-memory alone wouldn't be sufficient to describe virtual machine memory anyway (something I have started to think about), and that is one of the reasons why I kept a "memory" attribute under the openamp_r5 node even in the latest example. The information under reserved-memory is actually already redundant.
The memory attribute under "openamp,domain-v1" would specify domain specific memory reservations. The top-level memory node would describe all the memory physically available, no matter the software domains in the system.
The top-level reserved-memory node will only apply to the default /cpus node and could be used for compatibility with legacy OSes to hide away other domains' memory ranges. reserved-memory information for a domain would go under the "openamp,domain-v1", similarly to what you suggested. See the example below.
This could still have other nodes at the (real) root or links from one domain to another. I haven't thought through that part, but I think this structure can only help because it removes the notion that the root has a specific cpu view.
We can do that by making /cpus optional or removing it completely. If a cpus top-level node is not specified, there is no default view. Taking away /cpus, my example would look very similar to yours, except that all the domains would be described in a single place under a new top-level node called "domains", keeping software configurations and hardware description separate.
No /cpus means no /chosen and no /reserved-memory. There would still be a /memory node to describe all the memory physically available in the system. This is the example:
	memory@0 {};

	amba {
		dev1: dev@1000000 { };
		dev2: dev@2000000 { };
	};

	cpus_r5: cpus-cluster@0 {};
	cpus_a53: cpus-cluster@1 {};

	domains {
		domain@0 {
			cpus { &cpus_r5 };
			memory@0 {};
			access { &dev1 };
			chosen {};
			reserved-memory {};
		};

		domain@1 {
			cpus { &cpus_a53 };
			memory@800000 {};
			access { &dev2 };
			chosen {};
			reserved-memory {};
		};
	};