On 7/3/25 2:49 AM, Arnaud POULIQUEN wrote:
>
>
> On 7/2/25 19:00, Tanmay Shah wrote:
>>
>>
>> On 7/2/25 10:47 AM, Arnaud POULIQUEN wrote:
>>>
>>>
>>> On 7/2/25 17:23, Tanmay Shah wrote:
>>>>
>>>>
>>>> On 7/2/25 2:18 AM, Arnaud POULIQUEN wrote:
>>>>>
>>>>>
>>>>> On 7/1/25 23:19, Tanmay Shah wrote:
>>>>>>
>>>>>>
>>>>>> On 7/1/25 1:06 PM, Tanmay Shah wrote:
>>>>>>>
>>>>>>>
>>>>>>> On 7/1/25 12:56 PM, Tanmay Shah wrote:
>>>>>>>>
>>>>>>>>
>>>>>>>> On 7/1/25 12:18 PM, Arnaud POULIQUEN wrote:
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On 7/1/25 17:16, Tanmay Shah wrote:
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On 7/1/25 3:07 AM, Arnaud POULIQUEN wrote:
>>>>>>>>>>> Hi Tanmay,
>>>>>>>>>>>
>>>>>>>>>>> On 6/27/25 23:29, Tanmay Shah wrote:
>>>>>>>>>>>> Hello all,
>>>>>>>>>>>>
>>>>>>>>>>>> I am implementing remoteproc recovery on attach-detach use case.
>>>>>>>>>>>> I have implemented the feature in the platform driver, and it works for
>>>>>>>>>>>> boot
>>>>>>>>>>>> recovery.
>>>>>>>>>>>
>>>>>>>>>>> Few questions to better understand your use case.
>>>>>>>>>>>
>>>>>>>>>>> 1) The Linux remoteproc driver attaches to a remote processor, and you
>>>>>>>>>>> generate a crash of the remote processor, right?
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Yes correct.
>>>>>>>>>>
>>>>>>>>>>> 2) How does the remote processor reboot? On a remoteproc request, or is
>>>>>>>>>>> it an auto-reboot independent from the Linux core?
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> It is auto-reboot independent from the linux core.
>>>>>>>>>>
>>>>>>>>>>> 3) In case of auto-reboot, when does the remote processor send an event
>>>>>>>>>>> to the Linux remoteproc driver? Before or after the reset?
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Right now, when the remote reboots, it sends a crash event to the
>>>>>>>>>> remoteproc driver after reboot.
>>>>>>>>>>
>>>>>>>>>>> 4) Do you expect to get a core dump on crash?
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> No coredump is expected as of now, only recovery. I will eventually
>>>>>>>>>> implement coredump functionality as well.
>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> However, I am stuck at the testing phase.
>>>>>>>>>>>>
>>>>>>>>>>>> When should the firmware report the crash? After reboot, or during some
>>>>>>>>>>>> kind of crash handler?
>>>>>>>>>>>>
>>>>>>>>>>>> So far, I am reporting the crash after rebooting the remote processor,
>>>>>>>>>>>> but it doesn't seem to work, i.e. I don't see rpmsg devices created
>>>>>>>>>>>> after recovery.
>>>>>>>>>>>>
>>>>>>>>>>>> What should be the correct process to test this feature? How are other
>>>>>>>>>>>> platforms testing this?
>>>>>>>>>>>
>>>>>>>>>>> I have never tested it on an ST board. As a first analysis, in case of
>>>>>>>>>>> auto-reboot of the remote processor, it looks like you should detach and
>>>>>>>>>>> reattach to recover.
>>>>>>>>>>
>>>>>>>>>> That is what's done from the remoteproc framework.
>>>>>>>>>>
>>>>>>>>>>> - On detach, the rpmsg devices should be unbound
>>>>>>>>>>> - On attach, the remote processor should request rpmsg channels using
>>>>>>>>>>> the NS announcement mechanism
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> The main issue is that the remote firmware needs to wait until all of the
>>>>>>>>>> above happens, and only then initialize the virtio devices. Currently we
>>>>>>>>>> don't have any way to notify recovery progress from Linux to the remote
>>>>>>>>>> firmware in the remoteproc framework, so I might have to introduce some
>>>>>>>>>> platform-specific mechanism in the remote firmware to wait for recovery
>>>>>>>>>> to complete successfully.
>>>>>>>>>
>>>>>>>>> I guess the rproc->clean_table contains a copy of the resource table
>>>>>>>>> that is
>>>>>>>>> reapplied on attach, and the virtio devices should be re-probed, right?
>>>>>>>>>
>>>>>>>>> During the virtio device probe, the vdev status in the resource table is
>>>>>>>>> updated
>>>>>>>>> to 7 when virtio is ready to communicate. Virtio should then call
>>>>>>>>> rproc_virtio_notify() to inform the remote processor of the status update.
>>>>>>>>> At this stage, your remoteproc driver should be able to send a mailbox
>>>>>>>>> message
>>>>>>>>> to inform the remote side about the recovery completion.
>>>>>>>>>
>>>>>>>>
>>>>>>>> I think I spotted the problem now.
>>>>>>>>
>>>>>>>> Linux side (file: remoteproc_core.c):
>>>>>>>> rproc_attach_recovery
>>>>>>>>   __rproc_detach
>>>>>>>>     cleans up the resource table and re-loads it
>>>>>>>>   __rproc_attach
>>>>>>>>     stops and re-starts subdevices
>>>>>>>>
>>>>>>>>
>>>>>>>> Remote side:
>>>>>>>>   Remote re-boots after crash
>>>>>>>>   Detects that a crash happened previously
>>>>>>>>   Notifies the crash to Linux
>>>>>>>>   (Linux is executing the above flow meanwhile)
>>>>>>>>   Starts creating virtio devices
>>>>>>>>     **rproc_virtio_create_vdev - parse vrings & create the vdev device**
>>>>>>>>     **rproc_virtio_wait_remote_ready - wait for remote ready** [1]
>>>>>>>>
>>>>>>>> I think the remote should wait on the DRIVER_OK bit before creating virtio
>>>>>>>> devices. The temporary solution I implemented was to make sure the vring
>>>>>>>> addresses are not 0xFFFFFFFF, like the following:
>>>>>>>>
>>>>>>>> while (rsc->rpmsg_vring0.da == FW_RSC_U32_ADDR_ANY ||
>>>>>>>>        rsc->rpmsg_vring1.da == FW_RSC_U32_ADDR_ANY) {
>>>>>>>>     usleep(100);
>>>>>>>>     metal_cache_invalidate(rsc, rproc->rsc_len);
>>>>>>>> }
>>>>>>>>
>>>>>>>> The above works, but I think a better solution is to change the sequence so
>>>>>>>> that the remote waits before creating virtio devices.
>>>>>>>
>>>>>>> I am sorry, I should have said the remote should wait before parsing and
>>>>>>> assigning vrings to the virtio device.
>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> [1] https://github.com/OpenAMP/open-amp/blob/391671ba24840833d882c1a75c5d7307703b1cf1/lib/remoteproc/remoteproc.c#L994
>>>>>>>>
>>>>>>
>>>>>> Actually, upon further checking, I think the above code is okay. I see that
>>>>>> wait_remote_ready is called before the vrings are set up on the remote fw
>>>>>> side.
>>>>>>
>>>>>> However, during recovery on the remote side, somehow I still have to
>>>>>> implement a platform-specific wait for the vrings to be set up correctly.
>>>>>>
>>>>>> From the Linux side, the DRIVER_OK bit is set before the vrings are set up
>>>>>> correctly. Because of that, the remote firmware sets up wrong vring
>>>>>> addresses, and the rpmsg channels are not created.
>>>>>>
>>>>>> I am investigating this further.
>>>>>
>>>>> Do you reset the vdev status as requested by the virtio spec?
>>>>> https://docs.oasis-open.org/virtio/virtio/v1.3/csd01/virtio-v1.3-csd01.html…
>>>>>
>>>>> Regards,
>>>>> Arnaud
>>>>>
>>>>
>>>> Yes, I do. I am actually restoring the default resource table on the firmware
>>>> side, which will set the rpmsg_vdev status to 0.
>>>>
>>>> However, when printing the vrings right before wait_remote_ready, I see the
>>>> vrings are not set correctly from the Linux side:
>>>>
>>>> `vring0 = 0xFFFFFFFF, vring1 = 0xFFFFFFFF`
>>>
>>> That makes sense if the values correspond to the initial values of the resource
>>> table; rproc->clean_table should contain a copy of these initial values.
>>>
>>>>
>>>> However, the rproc state was still moved to attached when checked from the
>>>> remoteproc sysfs.
>>>
>>> Is rproc_handle_resources() called before going back to the attached state?
>>
>> You are right. I think __rproc_attach() isn't calling rproc_handle_resources().
>>
>> But recovery is supported by other platforms, so I think it should work
>> without calling rproc_handle_resources().
>
> Right. Having taken a deeper look at the code, it seems that there is an issue.
> In rproc_reset_rsc_table_on_detach(), we clean the resource table without
> calling rproc_resource_cleanup().
>
> It seems to me that rproc_reset_rsc_table_on_detach() should not be called in
> __rproc_detach() but rather in rproc_detach() after calling
> rproc_resource_cleanup().
>
>
Yes, that sounds correct. It's a long weekend here in the US, so I will try
this next week and update.
Thanks,
Tanmay
>>
>> Maybe restoring the resource table from the firmware side after reboot isn't a
>> good idea. I will try without it.
>>
>>>
>>>>
>>>> `cat /sys/class/remoteproc/remoteproc0/state`
>>>> attached
>>>>
>>>> Somehow the sync between the remote fw and Linux isn't right.
>>>>
>>>>>>
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Tanmay
>>>>>>>>> Regards
>>>>>>>>> Arnaud
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>> Regards,
>>>>>>>>>>> Arnaud
>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> Thanks,
>>>>>>>>>>>> Tanmay
>>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>>
>>
Branch: refs/heads/main
Home: https://github.com/OpenAMP/libmetal
Commit: 35c65e05b9c3f62713cf865c547287e5c8e3942f
https://github.com/OpenAMP/libmetal/commit/35c65e05b9c3f62713cf865c547287e5…
Author: Adel El-Rayyes <aelray(a)gmail.com>
Date: 2025-06-30 (Mon, 30 Jun 2025)
Changed paths:
M lib/CMakeLists.txt
Log Message:
-----------
cmake: add public header files for CMake shared and static lib targets
For consumers who don't want to install libmetal it's more convenient to use
the CMake targets metal-static or metal-shared. However, they currently don't
bring the necessary include directories (in contrast to the Zephyr version,
where the include directories are already exported via
`zephyr_include_directories(...)`).
Signed-off-by: Adel El-Rayyes <aelray(a)gmail.com>
To unsubscribe from these emails, change your notification settings at https://github.com/OpenAMP/libmetal/settings/notifications
Branch: refs/heads/main
Home: https://github.com/OpenAMP/openamp-system-reference
Commit: 4a5842466f46847438a441446d9dd75d665a810f
https://github.com/OpenAMP/openamp-system-reference/commit/4a5842466f468474…
Author: Ben Levinsky <ben.levinsky(a)amd.com>
Date: 2025-06-30 (Mon, 30 Jun 2025)
Changed paths:
M examples/legacy_apps/machine/zynqmp_r5/CMakeLists.txt
A examples/legacy_apps/machine/zynqmp_r5/linker_large_text.ld
A examples/legacy_apps/machine/zynqmp_r5/linker_remote.ld
M examples/legacy_apps/system/generic/machine/zynqmp_r5/CMakeLists.txt
R examples/legacy_apps/system/generic/machine/zynqmp_r5/linker_large_text.ld
R examples/legacy_apps/system/generic/machine/zynqmp_r5/linker_remote.ld
Log Message:
-----------
examples: legacy_apps: zynqmp_r5: Move linker file logic to machine/zynqmp_r5
As prep to support multiple OSes, move the logic to an area common to
all RPU-based OpenAMP applications.
Signed-off-by: Ben Levinsky <ben.levinsky(a)amd.com>
Commit: 9b5617141dc86b501f1a03223d3d35a84ccb8fd7
https://github.com/OpenAMP/openamp-system-reference/commit/9b5617141dc86b50…
Author: Ben Levinsky <ben.levinsky(a)amd.com>
Date: 2025-06-30 (Mon, 30 Jun 2025)
Changed paths:
M examples/legacy_apps/machine/zynqmp_r5/CMakeLists.txt
A examples/legacy_apps/machine/zynqmp_r5/generic/gic_init.c
A examples/legacy_apps/machine/zynqmp_r5/helper.c
M examples/legacy_apps/system/generic/machine/zynqmp_r5/CMakeLists.txt
R examples/legacy_apps/system/generic/machine/zynqmp_r5/helper.c
Log Message:
-----------
examples: legacy_apps: zynqmp_r5: move GIC setup logic to baremetal area
As the GIC initialization logic is specific to baremetal R5, move it to
that location.
The logging is common, though, so keep it in the common area.
Signed-off-by: Ben Levinsky <ben.levinsky(a)amd.com>
Commit: 4220a50955b831038895a106da1d06b295c3532e
https://github.com/OpenAMP/openamp-system-reference/commit/4220a50955b83103…
Author: Ben Levinsky <ben.levinsky(a)amd.com>
Date: 2025-06-30 (Mon, 30 Jun 2025)
Changed paths:
M examples/legacy_apps/examples/echo/CMakeLists.txt
A examples/legacy_apps/examples/echo/generic/main.c
M examples/legacy_apps/examples/echo/rpmsg-echo.c
M examples/legacy_apps/examples/echo/rpmsg-echo.h
M examples/legacy_apps/examples/matrix_multiply/CMakeLists.txt
A examples/legacy_apps/examples/matrix_multiply/generic/main.c
M examples/legacy_apps/examples/matrix_multiply/matrix_multiply.h
M examples/legacy_apps/examples/matrix_multiply/matrix_multiplyd.c
M examples/legacy_apps/examples/rpc_demo/CMakeLists.txt
A examples/legacy_apps/examples/rpc_demo/generic/main.c
M examples/legacy_apps/examples/rpc_demo/rpc_demo.c
M examples/legacy_apps/examples/rpc_demo/rpmsg-rpc-demo.h
Log Message:
-----------
examples: legacy_apps: Prepare legacy demos to support other OS's
In the original OpenAMP remote files, define main as weak and then
provide strong main routines in separate files for the echo, matrix and
rpc_demo examples so that other OSes can be added too.
Signed-off-by: Ben Levinsky <ben.levinsky(a)amd.com>
Commit: d0696e30ca11f9acc47ad3c70ac00b9c9ea994b2
https://github.com/OpenAMP/openamp-system-reference/commit/d0696e30ca11f9ac…
Author: Ben Levinsky <ben.levinsky(a)amd.com>
Date: 2025-06-30 (Mon, 30 Jun 2025)
Changed paths:
M examples/legacy_apps/machine/zynqmp_r5/CMakeLists.txt
Log Message:
-----------
examples: legacy_apps: zynqmp_r5: Move library dependency logic
Move the library dependency logic to the zynqmp_r5 common area so that
all OS targets pick up the dependencies.
Signed-off-by: Ben Levinsky <ben.levinsky(a)amd.com>
Commit: c3e2349c6a30d057d6782bcb57da371c525cbe3b
https://github.com/OpenAMP/openamp-system-reference/commit/c3e2349c6a30d057…
Author: Ben Levinsky <ben.levinsky(a)amd.com>
Date: 2025-06-30 (Mon, 30 Jun 2025)
Changed paths:
A examples/legacy_apps/examples/echo/freertos/main.c
A examples/legacy_apps/examples/matrix_multiply/freertos/main.c
A examples/legacy_apps/examples/rpc_demo/freertos/main.c
Log Message:
-----------
examples: legacy_apps: Add main routines for FreeRTOS OS for echo, matrix and rpc_demo
Add main routine implementations for echo, matrix and rpc_demo examples
for FreeRTOS OS.
Signed-off-by: Ben Levinsky <ben.levinsky(a)amd.com>
Commit: e0c9828a0c52f00320a7e932360b416bcfc992fd
https://github.com/OpenAMP/openamp-system-reference/commit/e0c9828a0c52f003…
Author: Ben Levinsky <ben.levinsky(a)amd.com>
Date: 2025-06-30 (Mon, 30 Jun 2025)
Changed paths:
A examples/legacy_apps/system/freertos/CMakeLists.txt
Log Message:
-----------
examples: legacy_apps: freertos: Add link dependency for library
Add a link dependency to ensure that the library is found for FreeRTOS applications.
Signed-off-by: Ben Levinsky <ben.levinsky(a)amd.com>
Commit: 019186f66d6ae55d4c33acef03a1faef345e4ebc
https://github.com/OpenAMP/openamp-system-reference/commit/019186f66d6ae55d…
Author: Ben Levinsky <ben.levinsky(a)amd.com>
Date: 2025-06-30 (Mon, 30 Jun 2025)
Changed paths:
A examples/legacy_apps/machine/zynqmp_r5/freertos/gic_init.c
Log Message:
-----------
examples: legacy_apps: freertos: implement GIC setup
Introduce GIC initialization using the AMD FreeRTOS BSP API
to connect the GIC to the libmetal ISR and the application IRQ.
Signed-off-by: Ben Levinsky <ben.levinsky(a)amd.com>
Commit: 49d74fec0b509cba50f34a353e0a639b80a270f9
https://github.com/OpenAMP/openamp-system-reference/commit/49d74fec0b509cba…
Author: Arnaud Pouliquen <arnaud.pouliquen(a)foss.st.com>
Date: 2025-06-30 (Mon, 30 Jun 2025)
Changed paths:
M examples/legacy_apps/machine/zynqmp/platform_info.c
M examples/legacy_apps/machine/zynqmp_r5/platform_info.c
M examples/legacy_apps/machine/zynqmp_r5/zynqmp_r5_a53_rproc.c
M examples/legacy_apps/system/CMakeLists.txt
M examples/legacy_apps/system/freertos/CMakeLists.txt
A examples/legacy_apps/system/freertos/suspend.c
M examples/legacy_apps/system/generic/CMakeLists.txt
A examples/legacy_apps/system/generic/suspend.c
M examples/legacy_apps/system/linux/CMakeLists.txt
M examples/legacy_apps/system/linux/machine/generic/platform_info.c
A examples/legacy_apps/system/linux/suspend.c
A examples/legacy_apps/system/suspend.h
Log Message:
-----------
examples: legacy_apps: Add default routines for suspend/resume
The application or the CPU can be suspended and resumed while waiting
for new RPMsg messages from the remote processor.
This commit introduces `system_suspend` and `system_resume` functions,
along with their implementations for different systems.
- FreeRTOS: The suspend/resume functionality involves suspending and
resuming the application.
- Linux and Baremetal: By default, `metal_cpu_yield()` is called.
However, these functions can be redefined in the machine layer if
needed.
Signed-off-by: Arnaud Pouliquen <arnaud.pouliquen(a)foss.st.com>
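Based on the description above, the default Linux/baremetal hooks could look roughly like the sketch below. This is an illustrative assumption, not the actual patch: metal_cpu_yield() is replaced by a counting stub so the example is self-contained and observable, and wait_for_rpmsg_demo() is a hypothetical polling loop, not a function from the series.

```c
/* Sketch of the default suspend/resume hooks described in the commit
 * message above (assumed shapes; the real code calls libmetal and
 * lives in system/<os>/suspend.c). */

static int yield_count;

/* Stand-in for libmetal's metal_cpu_yield(), which relaxes the CPU
 * (e.g. WFI on Arm). A counter is used here only for observability. */
static void metal_cpu_yield(void)
{
    yield_count++;
}

/* Default implementations; a machine layer can redefine them if needed. */
void system_suspend(void)
{
    metal_cpu_yield();
}

void system_resume(void)
{
    /* Nothing to do by default: the CPU resumes on the next interrupt. */
}

/* Hypothetical RPMsg wait loop using the hooks; returns how many times
 * the CPU yielded while waiting for `polls` notifications. */
int wait_for_rpmsg_demo(int polls)
{
    int i;

    for (i = 0; i < polls; i++) {
        system_suspend();
        system_resume();
    }
    return yield_count;
}
```

In FreeRTOS the same two hooks would instead suspend and resume the application task, which is why the series routes them through a per-system file.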
Compare: https://github.com/OpenAMP/openamp-system-reference/compare/d0546027cd32...…
Hello all,
I am implementing remoteproc recovery on attach-detach use case.
I have implemented the feature in the platform driver, and it works for
boot recovery.
However, I am stuck at the testing phase.
When should the firmware report the crash? After reboot, or from some
kind of crash handler?
So far I am reporting the crash after rebooting the remote processor, but it
doesn't seem to work, i.e. I don't see rpmsg devices created after recovery.
What is the correct process to test this feature? How are other
platforms testing this?
Thanks,
Tanmay
Hello all,
I found out that there is a discrepancy between the remoteproc state
definitions in the kernel and the open-amp library:
Linux kernel side definition:
https://github.com/torvalds/linux/blob/52da431bf03b5506203bca27fe14a97895c8…
enum rproc_state {
RPROC_OFFLINE = 0,
RPROC_SUSPENDED = 1,
RPROC_RUNNING = 2,
RPROC_CRASHED = 3,
RPROC_DELETED = 4,
RPROC_ATTACHED = 5,
RPROC_DETACHED = 6,
RPROC_LAST = 7,
};
open-amp library side definition:
https://github.com/OpenAMP/open-amp/blob/391671ba24840833d882c1a75c5d730770…
/**
* @brief Remote processor states
*/
enum remoteproc_state {
/** Remote is offline */
RPROC_OFFLINE = 0,
/** Remote is configured */
RPROC_CONFIGURED = 1,
/** Remote is ready to start */
RPROC_READY = 2,
/** Remote is up and running */
RPROC_RUNNING = 3,
/** Remote is suspended */
RPROC_SUSPENDED = 4,
/** Remote has an error; needs recovery */
RPROC_ERROR = 5,
/** Remote is stopped */
RPROC_STOPPED = 6,
/** Just keep this one at the end */
RPROC_LAST = 7,
};
IIUC, the state definitions on both sides should match, so that if the remote
needs to report a crash error, it can use the same name and mapped integer
value in the code. Please let me know if I am missing something here.
Should we fix this?
If so, I believe it should be library side.
I am looking for suggestion to fix this without breaking backward
compatibility:
Approach 1: deprecate the library-side remoteproc_state definition.
Deprecate the current enum (with a __deprecated attribute or a comment) and
introduce a new one with the same definition as in the Linux kernel. Then,
after 2 years (as per the project's policy), we can remove remoteproc_state.
Approach 2: the platform driver uses the library-side remoteproc state definition.
If we don't want to fix this, another approach is for the platform driver
in the Linux kernel to keep the library-side remoteproc state definition and
convert it to the kernel-side definition.
With this second approach we don't have to deprecate anything, but
platform drivers are responsible for maintaining two different
remoteproc_state definitions, which might create confusion.
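To make Approach 2 concrete, a platform driver could carry a small translation helper like the sketch below. The enum names are prefixed K_/O_ here only to avoid the very name clash this thread is about; the values match the two definitions quoted above, but the mapping chosen for CONFIGURED, READY and STOPPED is my own assumption, not a reviewed choice.

```c
/* Linux kernel states (include/linux/remoteproc.h), prefixed K_ */
enum kernel_rproc_state {
    K_RPROC_OFFLINE   = 0,
    K_RPROC_SUSPENDED = 1,
    K_RPROC_RUNNING   = 2,
    K_RPROC_CRASHED   = 3,
    K_RPROC_DELETED   = 4,
    K_RPROC_ATTACHED  = 5,
    K_RPROC_DETACHED  = 6,
    K_RPROC_LAST      = 7,
};

/* open-amp library states, prefixed O_ */
enum oamp_remoteproc_state {
    O_RPROC_OFFLINE    = 0,
    O_RPROC_CONFIGURED = 1,
    O_RPROC_READY      = 2,
    O_RPROC_RUNNING    = 3,
    O_RPROC_SUSPENDED  = 4,
    O_RPROC_ERROR      = 5,
    O_RPROC_STOPPED    = 6,
    O_RPROC_LAST       = 7,
};

/* Hypothetical translation a platform driver could apply when the
 * remote reports a library-side state value. */
enum kernel_rproc_state oamp_to_kernel_state(enum oamp_remoteproc_state s)
{
    switch (s) {
    case O_RPROC_RUNNING:   return K_RPROC_RUNNING;
    case O_RPROC_SUSPENDED: return K_RPROC_SUSPENDED;
    case O_RPROC_ERROR:     return K_RPROC_CRASHED; /* crash report */
    case O_RPROC_OFFLINE:
    case O_RPROC_CONFIGURED:   /* no kernel equivalent: treat as offline */
    case O_RPROC_READY:
    case O_RPROC_STOPPED:   return K_RPROC_OFFLINE;
    default:                return K_RPROC_LAST;
    }
}
```

The helper also shows the cost of Approach 2: every driver doing this has to agree on how the states with no direct kernel equivalent are folded.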
I am open to any other suggestion regarding this.
Thanks,
Tanmay
Hello All,
Presently, for the OpenAMP legacy demos available in the community system
reference repo, the interrupt mechanism involves libmetal APIs directly
writing to control registers. This is an issue because it clearly couples
the demos to vendor-specific logic.
Refactoring the demos to remove the vendor-specific interrupt register
writes would be a useful code cleanup.
[1] One of the libmetal-supported OSes, Zephyr, already has IPM support,
though we have been told upstream that it is being deprecated.
[2][3] That said, Zephyr already has mailbox support, and AMD is upstreaming
support for this too.
To (a) clean up the vendor-specific register writes and (b) add generic
mailbox support to the libmetal libraries, below is a proposal for a patch series.
1. Add mailbox support - lib/mbox.h - describe APIs for init, deinit, send and receive.
2. Add stubs for baremetal for the above APIs.
3. Add stubs for FreeRTOS for the above APIs.
4. Add an implementation for Zephyr with:
- init - as Zephyr today statically defines mailbox structures based on the device tree, this will simply store the mailbox structure [4]
- deinit - free the mailbox
- send/receive - simple wrappers around the Zephyr mbox send/receive APIs
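To make the proposal easier to discuss, here is a sketch of what the lib/mbox.h surface and the baremetal/FreeRTOS stubs (steps 1-3) might look like. All names and signatures below are assumptions inferred from the outline, not the actual patch series.

```c
#include <errno.h>
#include <stddef.h>

/* Opaque handle; each OS layer supplies the real definition. */
struct metal_mbox;

struct metal_mbox_msg {
    void  *data;
    size_t len;
};

/* Proposed lib/mbox.h surface: init, deinit, send, receive. */
int  metal_mbox_init(struct metal_mbox **mbox, const char *name);
void metal_mbox_deinit(struct metal_mbox *mbox);
int  metal_mbox_send(struct metal_mbox *mbox,
                     const struct metal_mbox_msg *msg);
int  metal_mbox_receive(struct metal_mbox *mbox,
                        struct metal_mbox_msg *msg);

/* Baremetal/FreeRTOS-style stubs: report "not supported" until a
 * machine layer provides real implementations. */
int metal_mbox_init(struct metal_mbox **mbox, const char *name)
{
    (void)mbox; (void)name;
    return -ENOSYS;
}

void metal_mbox_deinit(struct metal_mbox *mbox)
{
    (void)mbox;
}

int metal_mbox_send(struct metal_mbox *mbox,
                    const struct metal_mbox_msg *msg)
{
    (void)mbox; (void)msg;
    return -ENOSYS;
}

int metal_mbox_receive(struct metal_mbox *mbox,
                       struct metal_mbox_msg *msg)
{
    (void)mbox; (void)msg;
    return -ENOSYS;
}
```

Under this shape, the Zephyr implementation (step 4) would replace the stub bodies with thin wrappers around Zephyr's mbox driver calls, with init simply storing the device-tree-defined mailbox structure.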
1. https://github.com/OpenAMP/libmetal/tree/main/lib/system/zephyr
2. https://github.com/zephyrproject-rtos/zephyr/tree/main/drivers/mbox
3. (pull request pending)
4. https://github.com/zephyrproject-rtos/zephyr/tree/main/samples/drivers/mbox