Branch: refs/heads/main
Home: https://github.com/OpenAMP/openamp-system-reference
Commit: 5751358ac27d07ff44e5f233161a59ce8bb4afca
https://github.com/OpenAMP/openamp-system-reference/commit/5751358ac27d07ff…
Author: Iuliana Prodan <iuliana.prodan(a)nxp.com>
Date: 2025-08-25 (Mon, 25 Aug 2025)
Changed paths:
M west.yml
Log Message:
-----------
west: Update to Zephyr 4.2
Update to latest Zephyr release, 4.2.
Signed-off-by: Iuliana Prodan <iuliana.prodan(a)nxp.com>
Commit: daa61d6096e268ea6393532cd15494163ba0f927
https://github.com/OpenAMP/openamp-system-reference/commit/daa61d6096e268ea…
Author: Iuliana Prodan <iuliana.prodan(a)nxp.com>
Date: 2025-08-25 (Mon, 25 Aug 2025)
Changed paths:
A examples/zephyr/rpmsg_multi_services/boards/imx95_evk_mimx9596_m7.conf
A examples/zephyr/rpmsg_multi_services/boards/imx95_evk_mimx9596_m7.overlay
Log Message:
-----------
examples: add support for i.MX95 M7 core in rpmsg_multi_services
Add the dts and config overlay for imx95_evk_mimx9596_m7 board
in order to have the rpmsg_multi_services sample working on
ARM Cortex-M7 core from i.MX95.
Signed-off-by: Iuliana Prodan <iuliana.prodan(a)nxp.com>
Commit: 1922b2a4230f381cf94cc4e298fc53644db6d87d
https://github.com/OpenAMP/openamp-system-reference/commit/1922b2a4230f381c…
Author: Iuliana Prodan <iuliana.prodan(a)nxp.com>
Date: 2025-08-25 (Mon, 25 Aug 2025)
Changed paths:
M examples/zephyr/rpmsg_multi_services/README.rst
Log Message:
-----------
examples: Zephyr: update rpmsg-multi-service readme
Update README file to include NXP i.MX95 M7 board support.
Signed-off-by: Iuliana Prodan <iuliana.prodan(a)nxp.com>
Commit: 1a11089bf356e2aec4b657051b5c0d38fa437573
https://github.com/OpenAMP/openamp-system-reference/commit/1a11089bf356e2ae…
Author: Iuliana Prodan <iuliana.prodan(a)nxp.com>
Date: 2025-08-25 (Mon, 25 Aug 2025)
Changed paths:
M examples/zephyr/rpmsg_multi_services/src/main_remote.c
Log Message:
-----------
examples: zephyr: replace struct fw_resource_table with void
This fixes the following warning:
passing argument 1 of 'rsc_table_get' from incompatible pointer type.
Signed-off-by: Iuliana Prodan <iuliana.prodan(a)nxp.com>
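
The warning quoted in this commit message is the usual result of passing the address of a typed struct pointer where a `void **` out-parameter is expected. A minimal sketch, assuming the getter takes a `void **` (which is what the warning suggests); `demo_rsc_table_get` and the storage array are hypothetical stand-ins, not the sample's code:

```c
#include <stddef.h>

/* Hypothetical storage standing in for a resource table. */
static unsigned char demo_table_storage[16];

/* Hypothetical getter with a void ** out-parameter. */
static void demo_rsc_table_get(void **table_ptr, int *length)
{
    *table_ptr = demo_table_storage;
    *length = (int)sizeof(demo_table_storage);
}

/*
 * Caller side: declaring the local as void * matches the parameter
 * type exactly, so no cast is needed and no warning is emitted.
 */
static int demo_fetch_table_length(void)
{
    void *rsc_table = NULL;
    int rsc_size = 0;

    demo_rsc_table_get(&rsc_table, &rsc_size);
    return rsc_table != NULL ? rsc_size : -1;
}
```

Had `rsc_table` been declared as a pointer to a struct type, `&rsc_table` would be a pointer-to-struct-pointer, which C does not convert implicitly to `void **`.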
Compare: https://github.com/OpenAMP/openamp-system-reference/compare/cc353a29db02...…
To unsubscribe from these emails, change your notification settings at https://github.com/OpenAMP/openamp-system-reference/settings/notifications
Branch: refs/heads/main
Home: https://github.com/OpenAMP/openamp-system-reference
Commit: cc353a29db02a96376ed5e1b57a06f8c30bde988
https://github.com/OpenAMP/openamp-system-reference/commit/cc353a29db02a963…
Author: Ben Levinsky <ben.levinsky(a)amd.com>
Date: 2025-08-25 (Mon, 25 Aug 2025)
Changed paths:
M examples/legacy_apps/machine/zynqmp_r5/CMakeLists.txt
M examples/legacy_apps/machine/zynqmp_r5/generic/gic_init.c
M examples/legacy_apps/machine/zynqmp_r5/helper.c
M examples/legacy_apps/machine/zynqmp_r5/linker_large_text.ld
M examples/legacy_apps/machine/zynqmp_r5/linker_remote.ld
M examples/legacy_apps/machine/zynqmp_r5/platform_info.c
M examples/legacy_apps/machine/zynqmp_r5/platform_info.h
M examples/legacy_apps/machine/zynqmp_r5/rsc_table.c
M examples/legacy_apps/machine/zynqmp_r5/rsc_table.h
M examples/legacy_apps/machine/zynqmp_r5/zynqmp_r5_a53_rproc.c
M examples/legacy_apps/system/freertos/suspend.c
M examples/legacy_apps/system/generic/CMakeLists.txt
R examples/legacy_apps/system/generic/machine/CMakeLists.txt
R examples/legacy_apps/system/generic/machine/microblaze_generic/CMakeLists.txt
R examples/legacy_apps/system/generic/machine/microblaze_generic/helper.c
R examples/legacy_apps/system/generic/machine/microblaze_generic/linker_remote.ld
R examples/legacy_apps/system/generic/machine/zynqmp_r5/CMakeLists.txt
M examples/legacy_apps/system/generic/suspend.c
Log Message:
-----------
legacy_apps: zynqmp_r5: Fix compilation issues for baremetal and RPU target apps
Fix linker scripts, C files and CMake to ensure upstream works with the latest BSP.
Signed-off-by: Ben Levinsky <ben.levinsky(a)amd.com>
Signed-off-by: Tanmay Shah <tanmay.shah(a)amd.com>
Branch: refs/heads/main
Home: https://github.com/OpenAMP/open-amp
Commit: 98ba93cfabf0a5398e2671b230f94c7cb8b3bef7
https://github.com/OpenAMP/open-amp/commit/98ba93cfabf0a5398e2671b230f94c7c…
Author: Sipke Vriend <sipke(a)direktembedded.com>
Date: 2025-08-25 (Mon, 25 Aug 2025)
Changed paths:
R doc/data-structure.md
R doc/img-src/coprocessor-rpmsg-ns-dynamic.gv
R doc/img-src/coprocessor-rpmsg-ns.gv
R doc/img-src/coprocessor-rpmsg-static-ep.gv
R doc/img-src/gen-graph.py
R doc/img-src/rproc-lcm-state-machine.gv
R doc/img/coprocessor-rpmsg-ns-dynamic.png
R doc/img/coprocessor-rpmsg-ns.png
R doc/img/coprocessor-rpmsg-static-ep.png
R doc/img/rproc-lcm-state-machine.png
R doc/remoteproc-design.md
R doc/rpmsg-design.md
Log Message:
-----------
doc: copy design document images from open-amp repository
These design documents and images are being moved to openamp-docs.
It was decided with Bill and Arnaud to move all the design documentation from the
https://github.com/OpenAMP/open-amp repository doc folder to the
https://github.com/OpenAMP/openamp-docs repository docs folder.
The main reason is that the breathe and doxylink integration is already
in openamp-docs, so it can be used for the design docs.
Signed-off-by: Sipke Vriend <sipke(a)direktembedded.com>
Branch: refs/heads/main
Home: https://github.com/OpenAMP/libmetal
Commit: 16493b179fb480e6ff5d9378cf24b8b78859fd7f
https://github.com/OpenAMP/libmetal/commit/16493b179fb480e6ff5d9378cf24b8b7…
Author: Tanmay Shah <tanmay.shah(a)amd.com>
Date: 2025-07-29 (Tue, 29 Jul 2025)
Changed paths:
M lib/system/freertos/sys.h
R lib/system/freertos/template/CMakeLists.txt
R lib/system/freertos/template/sys.c
R lib/system/freertos/template/sys.h
R lib/system/freertos/xlnx/CMakeLists.txt
R lib/system/freertos/xlnx/irq.c
R lib/system/freertos/xlnx/sys.c
R lib/system/freertos/xlnx/sys.h
M lib/system/generic/sys.h
R lib/system/generic/template/CMakeLists.txt
R lib/system/generic/template/sys.c
R lib/system/generic/template/sys.h
R lib/system/generic/xlnx/CMakeLists.txt
R lib/system/generic/xlnx/irq.c
R lib/system/generic/xlnx/microblaze_generic/CMakeLists.txt
R lib/system/generic/xlnx/microblaze_generic/sys.c
R lib/system/generic/xlnx/sys.c
R lib/system/generic/xlnx/sys.h
R lib/system/generic/xlnx/sys_devicetree.h
Log Message:
-----------
lib: system: remove xlnx BSP specific code
libmetal is an abstraction layer, and vendor-specific code should not be
maintained as part of libmetal. Instead, the common interfaces provided by
libmetal should be implemented by the vendor BSP in downstream repos.
Signed-off-by: Tanmay Shah <tanmay.shah(a)amd.com>
Branch: refs/heads/main
Home: https://github.com/OpenAMP/openamp-system-reference
Commit: 75c158b008d6d758c39870814dbf8a92cc084867
https://github.com/OpenAMP/openamp-system-reference/commit/75c158b008d6d758…
Author: Tanmay Shah <tanmay.shah(a)amd.com>
Date: 2025-07-23 (Wed, 23 Jul 2025)
Changed paths:
M examples/legacy_apps/machine/zynqmp_r5/helper.c
M examples/legacy_apps/machine/zynqmp_r5/platform_info.c
M examples/legacy_apps/machine/zynqmp_r5/rsc_table.c
M examples/legacy_apps/machine/zynqmp_r5/rsc_table.h
Log Message:
-----------
legacy_apps: zynqmp_r5: restore initial resources
The resource table can get corrupted during an uncontrolled reboot of the host.
This can cause a crash when attach happens after reboot. Restore the initial
good version of the resource table before re-creating virtio devices.
As part of this, the server first takes the resource table snapshot
before any IPC events are in place, to prevent this issue during system init.
Second, helper.c calls get_resource_table so that the later
call to restore_initial_rsc_table has the resource set up.
Otherwise the snapshot can still fail if the resource is not set up.
Signed-off-by: Tanmay Shah <tanmay.shah(a)amd.com>
Signed-off-by: Ben Levinsky <ben.levinsky(a)amd.com>
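
The snapshot-and-restore idea described in this commit message can be sketched as follows. This is a simplified illustration, not the code from the commit: the table is modeled as a plain byte array, and all `demo_` names and the table size are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define DEMO_RSC_TABLE_SIZE 64u

/* Live table: in real firmware this is the copy the host can modify. */
static unsigned char demo_live_table[DEMO_RSC_TABLE_SIZE];
/* Private snapshot of the initial, known-good contents. */
static unsigned char demo_initial_table[DEMO_RSC_TABLE_SIZE];
static bool demo_snapshot_valid;

/* Take the snapshot once, before any IPC can touch the live table. */
static void demo_snapshot_rsc_table(void)
{
    memcpy(demo_initial_table, demo_live_table, sizeof(demo_live_table));
    demo_snapshot_valid = true;
}

/* Restore the known-good copy; returns -1 if no snapshot was taken. */
static int demo_restore_initial_rsc_table(void)
{
    if (!demo_snapshot_valid)
        return -1;
    memcpy(demo_live_table, demo_initial_table, sizeof(demo_live_table));
    return 0;
}
```

The explicit validity flag mirrors the ordering concern in the message: a restore attempted before the snapshot exists has to fail rather than copy garbage.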
Commit: 29232c244a8ef3136205031f891c361fdfb44472
https://github.com/OpenAMP/openamp-system-reference/commit/29232c244a8ef313…
Author: Tanmay Shah <tanmay.shah(a)amd.com>
Date: 2025-07-23 (Wed, 23 Jul 2025)
Changed paths:
M examples/legacy_apps/machine/zynqmp_r5/linker_large_text.ld
M examples/legacy_apps/machine/zynqmp_r5/linker_remote.ld
M examples/legacy_apps/machine/zynqmp_r5/rsc_table.c
M examples/legacy_apps/machine/zynqmp_r5/rsc_table.h
Log Message:
-----------
legacy_apps: zynqmp_r5: add rsc table metadata section
During attach/detach operation, the resource table needs to be retrieved by
the host. In that case the firmware is expected to provide metadata to the host.
This metadata is expected from the firmware at the start address of the shared
memory between host and remote.
Add padding to ensure no section overlap.
Signed-off-by: Tanmay Shah <tanmay.shah(a)amd.com>
Signed-off-by: Ben Levinsky <ben.levinsky(a)amd.com>
Compare: https://github.com/OpenAMP/openamp-system-reference/compare/49d74fec0b50...…
Branch: refs/heads/main
Home: https://github.com/OpenAMP/libmetal
Commit: a19cfb3ff3b23dd84043c660748f7892abe6029d
https://github.com/OpenAMP/libmetal/commit/a19cfb3ff3b23dd84043c660748f7892…
Author: Huichun Feng <foxhoundsk.tw(a)gmail.com>
Date: 2025-07-15 (Tue, 15 Jul 2025)
Changed paths:
M lib/system/linux/device.c
Log Message:
-----------
lib: system: linux: Remove redundant malloc'd addr release check
As the code path suggests, the pointer, ldev, at this point, can be
free'd directly w/o a check. Remove the redundant check.
Signed-off-by: Huichun Feng <foxhoundsk.tw(a)gmail.com>
Branch: refs/heads/main
Home: https://github.com/OpenAMP/libmetal
Commit: fdc85ec742a03dc401167df4bffa49cae6ec8db5
https://github.com/OpenAMP/libmetal/commit/fdc85ec742a03dc401167df4bffa49ca…
Author: Fox Feng <foxhoundsk.tw(a)gmail.com>
Date: 2025-07-15 (Tue, 15 Jul 2025)
Changed paths:
M lib/system/linux/device.c
Log Message:
-----------
lib: system: linux: Remove redundant memory validness check
In the function prologue, "ldev" has already been assigned a malloc'd
address, and a check has also been performed. Remove this redundant
pointer validness check during each iteration. The second check was
also inherently misplaced because it was not enclosed in the malloc
if-branch.
Signed-off-by: Fox Feng <foxhoundsk.tw(a)gmail.com>
Signed-off-by: Huichun Feng <foxhoundsk.tw(a)gmail.com>
On 7/3/25 2:49 AM, Arnaud POULIQUEN wrote:
>
>
> On 7/2/25 19:00, Tanmay Shah wrote:
>>
>>
>> On 7/2/25 10:47 AM, Arnaud POULIQUEN wrote:
>>>
>>>
>>> On 7/2/25 17:23, Tanmay Shah wrote:
>>>>
>>>>
>>>> On 7/2/25 2:18 AM, Arnaud POULIQUEN wrote:
>>>>>
>>>>>
>>>>> On 7/1/25 23:19, Tanmay Shah wrote:
>>>>>>
>>>>>>
>>>>>> On 7/1/25 1:06 PM, Tanmay Shah wrote:
>>>>>>>
>>>>>>>
>>>>>>> On 7/1/25 12:56 PM, Tanmay Shah wrote:
>>>>>>>>
>>>>>>>>
>>>>>>>> On 7/1/25 12:18 PM, Arnaud POULIQUEN wrote:
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On 7/1/25 17:16, Tanmay Shah wrote:
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On 7/1/25 3:07 AM, Arnaud POULIQUEN wrote:
>>>>>>>>>>> Hi Tanmay,
>>>>>>>>>>>
>>>>>>>>>>> On 6/27/25 23:29, Tanmay Shah wrote:
>>>>>>>>>>>> Hello all,
>>>>>>>>>>>>
>>>>>>>>>>>> I am implementing remoteproc recovery on attach-detach use case.
>>>>>>>>>>>> I have implemented the feature in the platform driver, and it works for
>>>>>>>>>>>> boot
>>>>>>>>>>>> recovery.
>>>>>>>>>>>
>>>>>>>>>>> Few questions to better understand your use case.
>>>>>>>>>>>
>>>>>>>>>>> 1) The Linux remoteproc firmware attaches to a remote processor, and you
>>>>>>>>>>> generate a crash of the remote processor, right?
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Yes correct.
>>>>>>>>>>
>>>>>>>>>>> 1) How does the remote processor reboot? On a remoteproc request, or is it
>>>>>>>>>>> an auto-reboot independent of the Linux core?
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> It is auto-reboot independent from the linux core.
>>>>>>>>>>
>>>>>>>>>>> 2) In case of auto-reboot, when does the remote processor send an event to
>>>>>>>>>>> the Linux remoteproc driver? Before or after the reset?
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Right now, when Remote reboots, it sends crash event to remoteproc driver
>>>>>>>>>> after
>>>>>>>>>> reboot.
>>>>>>>>>>
>>>>>>>>>>> 3) Do you expect to get core dump on crash?
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> No coredump expected as of now, but only recovery. Eventually will
>>>>>>>>>> implement
>>>>>>>>>> coredump functionality as well.
>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> However, I am stuck at the testing phase.
>>>>>>>>>>>>
>>>>>>>>>>>> When should firmware report the crash ? After reboot ? or during some
>>>>>>>>>>>> kind of
>>>>>>>>>>>> crash handler ?
>>>>>>>>>>>>
>>>>>>>>>>>> So far, I am reporting the crash after rebooting the remote processor, but it
>>>>>>>>>>>> doesn't seem to work, i.e. I don't see rpmsg devices created after recovery.
>>>>>>>>>>>>
>>>>>>>>>>>> What should be the correct process to test this feature? How are other
>>>>>>>>>>>> platforms testing this?
>>>>>>>>>>>
>>>>>>>>>>> I have never tested it on an ST board. As a first analysis, in case of
>>>>>>>>>>> auto-reboot of the remote processor, it looks like you should detach and
>>>>>>>>>>> reattach to recover.
>>>>>>>>>>
>>>>>>>>>> That is what's done from the remoteproc framework.
>>>>>>>>>>
>>>>>>>>>>> - On detach the rpmsg devices should be unbound
>>>>>>>>>>> - On attach the remote processor should request RPMsg channels using
>>>>>>>>>>> the NS announcement mechanism
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> The main issue is that the remote firmware needs to wait until all of the
>>>>>>>>>> above has happened, and only then initialize virtio devices. Currently we
>>>>>>>>>> don't have any way to notify recovery progress from Linux to the remote fw
>>>>>>>>>> in the remoteproc framework. So I might have to introduce some
>>>>>>>>>> platform-specific mechanism in the remote firmware to wait for
>>>>>>>>>> recovery to complete successfully.
>>>>>>>>>
>>>>>>>>> I guess the rproc->clean_table contains a copy of the resource table
>>>>>>>>> that is
>>>>>>>>> reapplied on attach, and the virtio devices should be re-probed, right?
>>>>>>>>>
>>>>>>>>> During the virtio device probe, the vdev status in the resource table is
>>>>>>>>> updated
>>>>>>>>> to 7 when virtio is ready to communicate. Virtio should then call
>>>>>>>>> rproc_virtio_notify() to inform the remote processor of the status update.
>>>>>>>>> At this stage, your remoteproc driver should be able to send a mailbox
>>>>>>>>> message
>>>>>>>>> to inform the remote side about the recovery completion.
>>>>>>>>>
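
For reference, the status value 7 mentioned above is the OR of the three lowest virtio device status bits: ACKNOWLEDGE (1), DRIVER (2) and DRIVER_OK (4). A minimal sketch of a readiness check on the remote side; the `DEMO_` macro names and the helper are illustrative only and are not the open-amp API:

```c
#include <stdbool.h>
#include <stdint.h>

/* Device status bit values as defined by the virtio specification. */
#define DEMO_VIRTIO_STATUS_ACKNOWLEDGE 0x01u
#define DEMO_VIRTIO_STATUS_DRIVER      0x02u
#define DEMO_VIRTIO_STATUS_DRIVER_OK   0x04u

/* Hypothetical helper: true once the host driver has set DRIVER_OK. */
static bool demo_host_driver_ready(uint8_t vdev_status)
{
    return (vdev_status & DEMO_VIRTIO_STATUS_DRIVER_OK) != 0;
}
```

Testing the single DRIVER_OK bit rather than comparing against 7 keeps the check valid if further status bits, such as FEATURES_OK (8), are also set.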
>>>>>>>>
>>>>>>>> I think I spot the problem now.
>>>>>>>>
>>>>>>>> Linux side: file: remoteproc_core.c
>>>>>>>> rproc_attach_recovery
>>>>>>>>     __rproc_detach
>>>>>>>>         cleans up the resource table and re-loads it
>>>>>>>>     __rproc_attach
>>>>>>>>         stops and re-starts subdevices
>>>>>>>>
>>>>>>>>
>>>>>>>> Remote side:
>>>>>>>>     Remote re-boots after crash
>>>>>>>>     Detects crash happened previously
>>>>>>>>     notify crash to Linux
>>>>>>>>     (Linux is executing above flow meanwhile)
>>>>>>>>     starts creating virtio devices
>>>>>>>>         **rproc_virtio_create_vdev - parse vring & create vdev device**
>>>>>>>>         **rproc_virtio_wait_remote_ready - wait for remote ready** [1]
>>>>>>>>
>>>>>>>> I think Remote should wait on DRIVER_OK bit, before creating virtio devices.
>>>>>>>> The temporary solution I implemented was to make sure vrings addresses are
>>>>>>>> not 0xffffffff like following:
>>>>>>>>
>>>>>>>> while (rsc->rpmsg_vring0.da == FW_RSC_U32_ADDR_ANY ||
>>>>>>>>        rsc->rpmsg_vring1.da == FW_RSC_U32_ADDR_ANY) {
>>>>>>>>     usleep(100);
>>>>>>>>     metal_cache_invalidate(rsc, rproc->rsc_len);
>>>>>>>> }
>>>>>>>>
>>>>>>>> Above works, but I think better solution is to change sequence where remote
>>>>>>>> waits before creating virtio devices.
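
The polling workaround quoted above loops forever if the host never fills in the vring addresses. A minimal sketch of a bounded variant follows. The struct, the hook type and the fake host are hypothetical stand-ins so the example builds without libmetal or open-amp; on real firmware the hook would sleep and invalidate the cache over the resource table, as the quoted loop does.

```c
#include <stdbool.h>
#include <stdint.h>

#define DEMO_RSC_ADDR_ANY 0xFFFFFFFFu /* "not assigned yet" marker */

/* Hypothetical, trimmed-down view of the two vring device addresses. */
struct demo_vrings {
    volatile uint32_t vring0_da;
    volatile uint32_t vring1_da;
};

/* Platform hook called between polls (sleep + cache invalidate). */
typedef void (*demo_poll_hook)(struct demo_vrings *rsc);

static bool demo_vrings_assigned(const struct demo_vrings *rsc)
{
    return rsc->vring0_da != DEMO_RSC_ADDR_ANY &&
           rsc->vring1_da != DEMO_RSC_ADDR_ANY;
}

/* Returns true once both addresses are assigned, false on timeout. */
static bool demo_wait_vrings(struct demo_vrings *rsc, demo_poll_hook hook,
                             unsigned int max_polls)
{
    for (unsigned int i = 0; i < max_polls; i++) {
        if (demo_vrings_assigned(rsc))
            return true;
        hook(rsc);
    }
    return demo_vrings_assigned(rsc);
}

/* Fake host for illustration: assigns addresses on the third poll. */
static unsigned int demo_polls;

static void demo_fake_host_hook(struct demo_vrings *rsc)
{
    if (++demo_polls == 3) {
        rsc->vring0_da = 0x3ED40000u; /* arbitrary example values */
        rsc->vring1_da = 0x3ED44000u;
    }
}
```

Returning a result instead of spinning lets the caller decide what a timeout means, for example reporting the failure or retrying the whole attach sequence.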
>>>>>>>
>>>>>>> I am sorry, I should have said, remote should wait before parsing and
>>>>>>> assigning vrings to virtio device.
>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> [1] https://github.com/OpenAMP/open-amp/blob/391671ba24840833d882c1a75c5d7307703b1cf1/lib/remoteproc/remoteproc.c#L994
>>>>>>>>
>>>>>>
>>>>>> Actually upon further checking, I think above code is okay. I see that
>>>>>> wait_remote_ready is called before vrings are setup on remote fw side.
>>>>>>
>>>>>> However, during recovery on the remote side, I somehow still have to
>>>>>> implement a platform-specific wait for the vrings to be set up correctly.
>>>>>>
>>>>>> From the Linux side, the DRIVER_OK bit is set before the vrings are set up
>>>>>> correctly. Because of that, the remote firmware sets up wrong vring
>>>>>> addresses, and then rpmsg channels are not created.
>>>>>>
>>>>>> I am investigating this further.
>>>>>
>>>>> Do you reset the vdev status as requested by the virtio spec?
>>>>> https://docs.oasis-open.org/virtio/virtio/v1.3/csd01/virtio-v1.3-csd01.html…
>>>>>
>>>>> Regards,
>>>>> Arnaud
>>>>>
>>>>
>>>> Yes I do. I am actually restoring the default resource table on the firmware side, which
>>>> will set rpmsg_vdev status to 0.
>>>>
>>>> However, when printing vrings right before wait_remote_ready, I see vrings are
>>>> not set correctly from linux side:
>>>>
>>>> `vring0 = 0xFFFFFFFF, vring1 = 0xFFFFFFFF`
>>>
>>> That makes sense if the values correspond to the initial values of the resource
>>> table.
>>> rproc->clean_table should contain a copy of these initial values.
>>>
>>>>
>>>> However, the rproc state was still moved to attach when checked from remoteproc
>>>> sysfs.
>>>
>>> Is rproc_handle_resources() called before going back to the attached state?
>>
>> You are right. I think __rproc_attach() isn't calling rproc_handle_resources().
>>
>> But recovery is supported by other platforms so I think recovery should work
>> without calling rproc_handle_resources().
>
> Right. Having taken a deeper look at the code, it seems that there is an issue.
> In rproc_reset_rsc_table_on_detach(), we clean the resource table without
> calling rproc_resource_cleanup().
>
> It seems to me that rproc_reset_rsc_table_on_detach() should not be called in
> __rproc_detach() but rather in rproc_detach() after calling
> rproc_resource_cleanup().
>
>
Arnaud, I can confirm the above is correct. After moving rsc_table_on_detach
out of __rproc_detach, the recovery works.
I will send a patch to fix that when I add recovery support to the AMD-Xilinx
platform driver.
Thank you so much for looking into this.
Tanmay
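
The ordering problem discussed in this thread can be illustrated with a toy model. This is not kernel code: every `demo_` function is a hypothetical stand-in, and the model keeps only one fact from the thread, namely that the recovery path runs the inner detach and inner attach without handling resources again.

```c
#include <stdbool.h>
#include <stdint.h>

#define DEMO_ADDR_ANY 0xFFFFFFFFu

struct demo_rproc {
    uint32_t vring_da; /* vring address as seen in the shared table */
};

/* Stand-in for resource handling: the host assigns a vring address. */
static void demo_handle_resources(struct demo_rproc *r)
{
    r->vring_da = 0x10000000u; /* arbitrary example value */
}

/* Stand-in for restoring the clean table: address reverts to "any". */
static void demo_reset_table(struct demo_rproc *r)
{
    r->vring_da = DEMO_ADDR_ANY;
}

/* Inner detach; reset_inside selects which ordering is modeled. */
static void demo_inner_detach(struct demo_rproc *r, bool reset_inside)
{
    if (reset_inside)
        demo_reset_table(r);
}

/* Inner attach: in this model it does not re-run resource handling. */
static void demo_inner_attach(struct demo_rproc *r)
{
    (void)r;
}

/* Recovery path; returns the vring address the remote would read. */
static uint32_t demo_attach_recovery(bool reset_inside)
{
    struct demo_rproc r;

    demo_handle_resources(&r);            /* initial attach */
    demo_inner_detach(&r, reset_inside);
    demo_inner_attach(&r);
    return r.vring_da;
}
```

With the reset inside the inner detach, the model returns `DEMO_ADDR_ANY`, matching the `0xFFFFFFFF` vring values reported in the thread; with the reset moved out of that path, the assigned address survives recovery.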
>>
>> Maybe restoring the resource table from the firmware side after reboot isn't a
>> good idea. I will try without it.
>>
>>>
>>>>
>>>> `cat /sys/class/remoteproc/remoteproc0/state`
>>>> attached
>>>>
>>>> Somehow the sync between remote fw and linux isn't right.
>>>>
>>>>>>
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Tanmay
>>>>>>>>> Regards
>>>>>>>>> Arnaud
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>> Regards,
>>>>>>>>>>> Arnaud
>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> Thanks,
>>>>>>>>>>>> Tanmay
>>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>>
>>
On 7/3/25 2:49 AM, Arnaud POULIQUEN wrote:
> Right. Having taken a deeper look at the code, it seems that there is an issue.
> In rproc_reset_rsc_table_on_detach(), we clean the resource table without
> calling rproc_resource_cleanup().
>
> It seems to me that rproc_reset_rsc_table_on_detach() should not be called in
> __rproc_detach() but rather in rproc_detach() after calling
> rproc_resource_cleanup().
>
>
Yes, that sounds correct. It's a long weekend here in the US, so I will try
this next week and update.
Thanks,
Tanmay