On Fri, Jan 17, 2020 at 5:30 PM Stefano Stabellini via System-dt system-dt@lists.openampproject.org wrote:
Hi all,
I would like to follow-up on system device tree and specifically on one of the action items from the last call.
Rob raised the interesting question of what is the interaction between the new system device tree concepts and the top level nodes (memory, reserved-memory, cpus, chosen).
I am going to write here my observations.
Some questions inline, but they're really rhetorical questions for my response at the end.
As a short summary, the system device tree concepts are:
- Multiple top level "cpus,cluster" nodes to describe heterogeneous CPU clusters.
- A new "indirect-bus" which is a type of bus that does not automatically map to the parent address space.
- An address-map property to express the different address mappings of the different cpus clusters and can be used to map indirect-bus nodes.
These new nodes and properties allow us to describe multiple heterogeneous CPU clusters with potentially different address mappings, which can be expressed using indirect-bus and address-map.
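To make the three concepts concrete, a fragment along these lines could describe an extra cluster and a bus that only it can reach. This is only an illustrative sketch: the labels (cpus_r5, tcm_bus, tcm) and all addresses are invented, and the exact address-map cell layout is an assumption based on the current proposal.

```dts
/* Illustrative sketch only: labels and addresses are invented.
 * cpus_r5 is a second, heterogeneous cluster; tcm_bus is an
 * indirect-bus, so it does not map into the parent address space,
 * but the cluster's address-map makes it reachable from cpus_r5. */
cpus_r5: cpus-cluster@0 {
	compatible = "cpus,cluster";
	#address-cells = <1>;
	#size-cells = <0>;
	/* <cluster-address  bus-phandle  bus-address  size> */
	address-map = <0xffe00000 &tcm_bus 0xffe00000 0x10000>;

	cpu@0 {
		compatible = "arm,cortex-r5";
		device_type = "cpu";
		reg = <0>;
	};
};

tcm_bus: indirect-bus {
	compatible = "indirect-bus";
	#address-cells = <1>;
	#size-cells = <1>;

	tcm: tcm@ffe00000 {
		reg = <0xffe00000 0x10000>;
	};
};
```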
We also have new concepts for software domain configurations:
- Multiple "openamp,domain" nodes (currently proposed under /chosen) to specify software configurations and MPU configurations.
- A new "access" property under each "openamp,domain" node with links to nodes accessible from the cpus cluster.
"openamp,domain" nodes allow us to define the cpus cluster and the set of hardware resources that together form a software domain. The access property defines the list of resources available to one particular cluster and maps well onto MPU configurations (sometimes called "firewall configurations" during the calls).
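Putting the two together, a domain node could look roughly like this. It is a sketch only; cpus_r5, tcm and ethernet0 are placeholder labels, and the cell encodings mirror the full example attached to this mail.

```dts
chosen {
	/* Sketch: one software domain running on the R5 cluster.
	 * Labels (cpus_r5, tcm, ethernet0) are placeholders. */
	openamp_r5 {
		compatible = "openamp,domain-v1";
		/* cluster phandle, cpu mask, execution mode */
		cpus = <&cpus_r5 0x2 0x80000000>;
		/* resources visible to this domain: phandle, flags */
		access = <&tcm 0x1>, <&ethernet0 0x0>;
	};
};
```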
See the attached full example.
I am going to go through the major top level nodes and expand on how the new concepts affect them.
/cpus
/cpus is the top level node that contains the description of the cpus in the system. With system device tree, it is not the only cpus cluster; additional cpus clusters can be described by other top level nodes compatible with "cpus,cluster". However, /cpus remains the default cluster. An OS reading device tree should assume that it is running on /cpus. From a compatibility perspective, if an OS doesn't understand or recognize the other "cpus,cluster" nodes, it can ignore them and just process /cpus.
Buses compatible with "indirect-bus" do not map automatically to the parent address space, which means that /cpus won't be able to access them unless an address-map property is specified under /cpus to express the mapping. This is the only new limitation introduced for /cpus. From a compatibility perspective, an OS that doesn't understand the address-map property would just ignore both it and the bus, so this is opt-in functionality.
So far in my examples "openamp,domain" nodes refer to "cpus,cluster" nodes only, not to /cpus. There is a question on whether we want to allow "openamp,domain" nodes to define a software domain running on /cpus. We could go either way, but for simplicity I think we can avoid it.
"openamp,domain" nodes express accessibility restrictions while /cpus is meant to be able to access everything by default. If we want to specify hard accessibility settings for all clusters, it is possible to write a pure system device tree without /cpus, where all cpus clusters are described by "cpus,cluster" nodes and there is no expectation that an OS will be able to use it without going through some transformations by lopper (or other tools.)
/chosen
The /chosen node is used for software configurations, such as bootargs (the Linux command line). When multiple "openamp,domain" nodes are present, the configurations directly under /chosen continue to refer to the software running on /cpus, while domain-specific configurations need to go under each domain node.
As an example:
- /chosen/bootargs refers to the software running on /cpus
- /chosen/openamp_r5/bootargs refers to the openamp_r5 domain
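In device tree terms the two bootargs would sit side by side, along these lines (the command line values are invented for illustration):

```dts
chosen {
	/* configuration for the default OS on /cpus */
	bootargs = "console=ttyPS0 root=/dev/ram0";

	openamp_r5 {
		compatible = "openamp,domain-v1";
		/* configuration for the openamp_r5 domain */
		bootargs = "console=ttyAMA0";
	};
};
```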
/memory
The /memory node describes the main memory in the system. As with any other device node, all cpus clusters can address it.
Not really true. You could have memory regions not accessible by some cpus.
indirect-bus and address-map can be used to express addressing differences.
It might be required to carve out special memory reservations for each domain. These configurations are expressed under /reserved-memory as we do today for any other reserved regions.
What about a symmetric case where, say, you have 4 domains and want to divide main memory into 4 regions?
/reserved-memory
/reserved-memory is used to describe particular reserved memory regions for special use by software. With system device tree /reserved-memory becomes useful to describe domain specific memory reservations too. Memory ranges for special use by "openamp,domain" nodes are expressed under /reserved-memory following the usual set of rules. Each "openamp,domain" node links to any relevant reserved-memory regions using the access property. The rest is to be used by /cpus.
For instance:
- /reserved-memory/memory_r5 is linked and used by /chosen/openamp_r5
- other regions under /reserved-memory, not linked by any "openamp,domain" nodes, go to the default /cpus
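A sketch of the intended split (region names, addresses and sizes are invented for illustration):

```dts
reserved-memory {
	#address-cells = <2>;
	#size-cells = <2>;
	ranges;

	/* linked from a domain's access property, so it belongs
	 * to that domain rather than to the default /cpus */
	memory_r5: memory@3ed00000 {
		compatible = "openamp,domain-memory-v1";
		reg = <0x0 0x3ed00000 0x0 0x40000>;
	};

	/* not referenced by any "openamp,domain" node:
	 * a normal reservation for the default /cpus */
	resv@48000000 {
		reg = <0x0 0x48000000 0x0 0x100000>;
	};
};
```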
So the code that parses /reserved-memory has to look up something elsewhere to determine if each child node applies? That's fairly invasive to the existing handling of /reserved-memory.
Also, a reserved region could have different addresses for different CPUs. Basically, /reserved-memory doesn't have an address, but inherits the root addressing. That makes it a bit of an oddball. We need to handle both shared and non-shared reserved regions. Shared-memory for IPC is commonly described here for example.
We should use a specific compatible string to identify reserved memory regions meant for openamp,domain nodes, so that a legacy OS will safely ignore them. I added
compatible = "openamp,domain-memory-v1";
That doesn't really scale. If we don't care about legacy OS support, then every node will have this?
I don't really like the asymmetric structure of all this. While having a default view for existing OS seems worthwhile, as soon as there's a more symmetric use case it becomes much more invasive and OS parsing for all the above has to be adapted. We need to design for 100 domains.
To flip all this around, what if domains become the top-level structure:
domain@0 { chosen {}; cpus {}; memory@0 {}; reserved-memory {}; };
domain@1 { chosen {}; cpus {}; memory@800000 {}; reserved-memory {}; };
The content of all the current top-level nodes doesn't need to change. OSes would be modified to treat a domain node as the root node, which shouldn't be very invasive. Then everything else just works as is.
This could still have other nodes at the (real) root or links from one domain to another. I haven't thought through that part, but I think this structure can only help because it removes the notion that the root has a specific cpu view.
Rob
On Tue, 21 Jan 2020, Rob Herring wrote:
[...]
I don't really like the asymmetric structure of all this. While having a default view for existing OS seems worthwhile, as soon as there's a more symmetric use case it becomes much more invasive and OS parsing for all the above has to be adapted. We need to design for 100 domains.
Actually I agree with you here. In fact, in my original proposal I had "memory" as an attribute under the "openamp,domain" node. More on this below. The usage of reserved-memory was done as a follow-up to one of Grant's comments during the calls.
To flip all this around, what if domains become the top-level structure:
domain@0 { chosen {}; cpus {}; memory@0 {}; reserved-memory {}; };
domain@1 { chosen {}; cpus {}; memory@800000 {}; reserved-memory {}; };
The content of all the current top-level nodes doesn't need to change. OSes would be modified to treat a domain node as the root node, which shouldn't be very invasive. Then everything else just works as is.
I see what you are trying to do with this suggestion and I think it is a worthy goal. In particular, I agree with you that the suggested usage of reserved-memory for domain memory has limitations and looks asymmetric.
However, I think it would be worthwhile to keep the separation between domains under /chosen (or equivalent, for instance a new top-level node called "domains") and hardware configuration so that they can be easily done in separate stages by different people/personas.
It is a good idea to keep the domains configuration separate because it is a configuration, not a hardware description, and typically comes in later in the build pipeline. It would be best if it could be added at a later time without dramatic changes to the device tree structure.
Thus, my suggestion would be to avoid using reserved-memory for domain memory reservations completely. Instead we could just use the "memory" attribute under the openamp,domain-v1 node:
	openamp_r5 {
		compatible = "openamp,domain-v1";
		#address-cells = <0x2>;
		#size-cells = <0x2>;
		memory = <0x0 0x0 0x0 0x8000000>;
		cpus = <&cpus_r5 0x2 0x80000000>;
		access = <&tcm 0x1>, <&ethernet0 0x0>;
	};
In fact, reserved-memory alone wouldn't be sufficient to describe virtual machine memory anyway (something I have started to think about), and that is one of the reasons why I kept a "memory" attribute under the openamp_r5 node even in the latest example. The information under reserved-memory is actually already redundant.
The memory attribute under "openamp,domain-v1" would specify domain specific memory reservations. The top-level memory node would describe all the memory physically available, no matter the software domains in the system.
The top-level reserved-memory node will only apply to the default /cpus node and could be used for compatibility with legacy OSes to hide away other domains' memory ranges. reserved-memory information for a domain would go under the "openamp,domain-v1", similarly to what you suggested. See the example below.
This could still have other nodes at the (real) root or links from one domain to another. I haven't thought through that part, but I think this structure can only help because it removes the notion that the root has a specific cpu view.
We can do that by making /cpus optional or removing it completely. If a cpus top-level node is not specified, there is no default view. Taking away /cpus, my example would look very similar to yours, except that all the domains would be described in a single place under a new top-level node called "domains", keeping software configurations and hardware description separate.
No /cpus means no /chosen and no /reserved-memory. There would still be a /memory node to describe all the memory physically available in the system. This is the example:
	memory@0 {};

	amba {
		dev1: dev@1000000 { };
		dev2: dev@2000000 { };
	};

	cpus_r5: cpus-cluster@0 {};
	cpus_a53: cpus-cluster@1 {};

	domains {
		domain@0 {
			cpus { &cpus_r5 };
			memory@0 {};
			access { &dev1 };
			chosen {};
			reserved-memory {};
		};

		domain@1 {
			cpus { &cpus_a53 };
			memory@800000 {};
			access { &dev2 };
			chosen {};
			reserved-memory {};
		};
	};