rvanderbijl Posted December 16, 2016

Hello,

Perhaps this is not 100% on-topic, but in case people on this forum have come across this before:

I am running the Power PMAC on the dual-core 465 CPU. Besides the normal Delta Tau code, I am also running a Xenomai C++ task. When I started looking at performance and CPU use, I noticed that my task was using about 20% CPU (according to cat /proc/xenomai/stat). I also noticed it was on the same CPU as the PMAC tasks. So I added this to my C++ code:

    cpu_set_t set;
    CPU_ZERO(&set);
    CPU_SET(1, &set);
    if (sched_setaffinity(getpid(), sizeof(set), &set) == -1)
        CPUAffinityFault = true;

This appears to work: when I run cat /proc/xenomai/stat, I now see that my threads are all running on CPU 1, not on CPU 0, which is apparently the default.

Unfortunately, after running my code for 2-3 minutes, it crashes with a segmentation fault. If I remove the affinity code, it runs fine and does not crash.

Does anyone have any thoughts on this? I am accessing PMAC shared memory (pshm->....). Perhaps you can't do that across different cores? Or if you can, is there some synchronization code I need to add?

Thanks!
Robbert
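For readers trying the same thing, here is a minimal sketch of pinning and then verifying the result, using only plain Linux/glibc calls. It says nothing about Xenomai-specific behavior or about the crash described above. The helper names (first_allowed_cpu, allowed_cpu_count, pin_calling_thread_to_cpu) are made up for illustration. One documented detail of sched_setaffinity: a pid of 0 means the calling thread, while getpid() targets the main thread of the process; threads created afterwards inherit their creator's mask.

```cpp
#include <sched.h>  // sched_setaffinity, sched_getaffinity, CPU_* macros (glibc; g++ defines _GNU_SOURCE)

// Hypothetical helper: lowest-numbered CPU the calling thread may run on, or -1 on error.
int first_allowed_cpu() {
    cpu_set_t set;
    CPU_ZERO(&set);
    if (sched_getaffinity(0, sizeof(set), &set) == -1) return -1;
    for (int cpu = 0; cpu < CPU_SETSIZE; ++cpu) {
        if (CPU_ISSET(cpu, &set)) return cpu;
    }
    return -1;
}

// Hypothetical helper: number of CPUs in the calling thread's affinity mask, or -1 on error.
int allowed_cpu_count() {
    cpu_set_t set;
    CPU_ZERO(&set);
    if (sched_getaffinity(0, sizeof(set), &set) == -1) return -1;
    return CPU_COUNT(&set);
}

// Hypothetical helper: restrict the calling thread (pid 0) to a single CPU.
// Returns false if the CPU number is out of range or the kernel rejects the mask.
bool pin_calling_thread_to_cpu(int cpu) {
    if (cpu < 0 || cpu >= CPU_SETSIZE) return false;
    cpu_set_t set;
    CPU_ZERO(&set);
    CPU_SET(cpu, &set);
    return sched_setaffinity(0, sizeof(set), &set) == 0;
}
```

Calling pin_calling_thread_to_cpu(1) at the very start of each thread, and checking the return value, would be one way to make the pinning explicit per thread rather than relying on inheritance from the main thread.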
KEJR Posted December 21, 2016

Hello,

This is probably not the issue, but I wanted to mention it: are you doing your reads and writes with entire words? The reason I ask is that I had an issue with shared memory when accessing P vars byte by byte (i.e. via memcpy()). I rewrote my code to use double-type pointers so that the CPU would read the entire 64-bit double in one CPU instruction. Be careful here, because the byte-wise access could be happening inside a standard library (C or C++), as it was in my case.

THAT SAID... I can't see how this would be DIFFERENT just by changing the processor affinity. I would dive into some Xenomai reading on running on multiple cores with shared memory. Maybe there is something special that needs to happen with shm interfaces?

KEJR
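A minimal sketch of the whole-word idea, assuming the shared value is a naturally aligned double. The function names are hypothetical and a local variable stands in for the shared-memory slot. The volatile qualifier makes the compiler perform each access as a double access rather than merging or eliding it, but whether that becomes a single 64-bit load or store depends on the target CPU and compiler. This is not a synchronization primitive.

```cpp
#include <cstdint>  // std::uintptr_t

// Hypothetical helper: true if p sits on a sizeof(double) boundary.
bool is_naturally_aligned(const volatile double* p) {
    return reinterpret_cast<std::uintptr_t>(p) % sizeof(double) == 0;
}

// Read the value through a double pointer instead of copying it byte by byte.
double read_whole(const volatile double* p) {
    return *p;
}

// Write the value through a double pointer instead of copying it byte by byte.
void write_whole(volatile double* p, double value) {
    *p = value;
}
```

With a stand-in such as `alignas(8) double slot = 0.0;`, the calls would be `write_whole(&slot, 3.5)` and `read_whole(&slot)`; in real code the pointer would refer to the shared-memory element instead.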
rvanderbijl (Author) Posted December 21, 2016

Thanks for your thoughts. I'm not accessing any memory with memcpy. I'm currently only doing assignments, like:

    pshm->P[nnn] = value

or

    pshm->ECAT[n].IO[m].Data = value

So no explicit memcpy calls. I'm not sure what the compiler makes of these assignments, but I would hope I don't have to put special code in place to make sure the data is accessed in 64-bit blocks.

No luck yet googling multi-core Xenomai shared memory issues.
rvanderbijl (Author) Posted December 21, 2016

I'm cautiously optimistic that I have found the reason for the segmentation fault. I hadn't realized that the code that switches CPU affinity was preceded by the allocation of a fairly large object in my code. My guess is that the object was allocated while running on core 0, and the rest of the code that uses it was then running on core 1. Apparently Xenomai does not like that.

After I swapped the order (change affinity first, then allocate), I have not seen the crash yet after about 15 minutes of running. I'll let it run for a few more hours to be sure, but this may have been a non-issue with respect to Delta Tau.
rvanderbijl (Author) Posted December 22, 2016

And another update: I spoke too soon. It looked hopeful, but after approximately half an hour I still got a segmentation fault. On two separate occasions afterwards, this also managed to "crash" the Delta Tau tasks: the firmware just stopped running (no watchdog error popped up). Issuing a $$$ got it running again.

I'm guessing there are some fundamental reasons why this doesn't work. It would still be good to understand why, and how this could potentially be done. Does anyone have any ideas?

Robbert