OMRON Forums

random watchdog error


piefum

Hi all

I am seeing a strange watchdog error that I cannot debug.

The situation is the following:

- I have a complete system (6 degrees of freedom) running with a Power Brick LV IMS

- The system is expanded with an EtherCAT network to read additional serial encoders and temperature sensors

- The system is considered finished, ready to be shipped to customer

- The system is here in my lab for run-in tests, that is, moving the system day and night to try to detect errors or faults

 

The strange part is that watchdog faults appear at random, and I cannot see where they come from. Sometimes the system stays alive for 80 hours and nothing happens; then I reboot, and after a few minutes the watchdog trips and I need to power-cycle the system.

How can I debug a situation like that? Is there a log inside the PMAC that tells us why the watchdog has tripped?

 

Thanks a lot

gigi


When the watchdog trips, what type is it? If it is a "soft" watchdog, there is some debugging that can be done.

 

You can set up a gather of PMAC Status Elements and also of status elements for your program--if you have a PLC that functions as a state machine, for instance, you can record what states you are in--and then in a separate PLC, set Gather.Enable=3 to perform an indefinite gather. Once the status indicates that a watchdog has occurred, the plot can be stopped and the data can be analyzed on the computer.
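The idea of recording continuously and stopping once the watchdog status appears can be sketched on the host side as a rolling buffer. This is only a Python analogy, not PMAC script: it assumes the indefinite gather behaves like a buffer that keeps the most recent samples, and the `RollingRecorder` class, its `capacity`, and the sample fields are all hypothetical names chosen for illustration.

```python
from collections import deque


class RollingRecorder:
    """Keep only the most recent `capacity` samples (hypothetical helper).

    Once a sample is recorded with watchdog_tripped=True, the buffer is
    frozen so the history leading up to the trip can be analyzed.
    """

    def __init__(self, capacity):
        # A deque with maxlen silently drops the oldest entry when full.
        self.buffer = deque(maxlen=capacity)
        self.frozen = False

    def record(self, sample, watchdog_tripped=False):
        if self.frozen:
            return  # Ignore anything after the trip; keep the evidence.
        self.buffer.append(sample)
        if watchdog_tripped:
            self.frozen = True

    def snapshot(self):
        """Return the retained samples, oldest first."""
        return list(self.buffer)
```

In use, each sample would hold whatever status elements and state-machine values are being tracked, so the last entries before the trip show what the program was doing.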

 

The most common cause of a hard watchdog is, ultimately, the 5V supply on the unit dipping too low. On a Clipper, this may be easy to troubleshoot (as the 5V is brought in directly), but on other form factors, it may be harder to address, as the 5V is likely stepped down from an external 24V supply.


When it does watchdog, what type is it? If it is a "soft" watchdog, there is some debugging that can be done.

 

At the moment I am debugging the system without the IDE; only our custom software is connected. We see the front watchdog LED switch on, and a few moments later the system reboots by itself. From our software, I can see that sys.uptime restarts from zero.

Debugging is painfully slow, since the watchdog trips only after some hours of operation: for example, the system has just faulted after 12400 s (about 3.4 hours) of use. I believe that a power supply fault should power off the system immediately, correct?
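Because a trip can take hours to appear, one way to catch it unattended is to log the controller uptime periodically and flag any sample where it goes backwards. The following is a minimal sketch under that assumption; `find_reboots` and the `(host_time_s, uptime_s)` sample layout are hypothetical, and how the uptime is actually read from the controller is left to the existing custom software.

```python
def find_reboots(samples):
    """Detect reboots in a log of (host_time_s, controller_uptime_s) pairs.

    A reboot is assumed whenever the reported uptime is lower than in the
    previous sample. Returns one dict per detected reboot.
    """
    reboots = []
    previous = None
    for host_time, uptime in samples:
        if previous is not None and uptime < previous[1]:
            reboots.append(
                {
                    "detected_at": host_time,            # host clock at detection
                    "last_uptime_before": previous[1],   # how long it had run
                }
            )
        previous = (host_time, uptime)
    return reboots
```

The `last_uptime_before` value gives the run time before each fault, which makes it easier to see whether the trips cluster around a particular duration or are spread at random.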

 

thanks a lot

gigi


Ciao Gigi,

It must be something in the Lecco area (joke).

I experienced a similar problem, albeit with a different CPU (UMAC 465), when using an EtherCAT network.

After some weeks of debugging we came to the conclusion, together with DTCH, that something in the critical interrupt routine causes a kernel panic in these conditions. Initially I thought it was related to the number of EtherCAT axes (16) I was using, but then it happened (apparently at random) on "lighter" machines (just the WD, not the reboot).

So it might be worth turning the critical interrupt off.

 

Ciao

Andrea


Ciao Andrea et al.

 

...

After some weeks of debugging we came to conclusion together with DTCH that there

...

 

I installed the patch that disables the interrupt one week ago, and since then the system has not encountered any WD trip, reboot, or other strange behavior.

I would say now that the problem is fixed.

 

Many thanks for the help; I would never have fixed that in time for delivery.

 

DeltaTau guys: is it possible to make this update "official" and publicly known?

 

Ciao

gigi


Hello guys,

 

I have the same problem here: two PowerBrickAC-based systems with multiple axes randomly falling into a hardware watchdog state.

Having a lot of C code, both in background programs and RTI, I am familiar with DT software watchdogs, but hardware ones? I have no idea how to debug them.

 

I am curious about your solution, disabling the critical interrupt. How did you do this? You mentioned a patch; could you tell me where you found it?

 

Thanks a lot !

 

Johann


OK, I found what my problem was. Using the Linux "top" and "watch -n 0.5 cat /proc/xenomai/stat" commands, I could see that one of my debug processes was overloading the CPU (idle dropped to less than 1%). Disabling this debug process (basically high-frequency logging) let the idle rise back to 40%: no more hardware WD.
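For long unattended runs, the idle figure that "top" shows can also be logged by a small script. The sketch below computes the idle percentage from two snapshots of the standard Linux `/proc/stat` aggregate `cpu` line. It is an assumption that this figure is meaningful on the controller: on a Xenomai system it may not reflect real-time task load the way `/proc/xenomai/stat` does, and that file is deliberately not parsed here because its layout is not given in this thread. The function names are hypothetical.

```python
import time


def parse_cpu_line(stat_text):
    """Return the counters from the aggregate 'cpu' line of /proc/stat text."""
    for line in stat_text.splitlines():
        fields = line.split()
        if fields and fields[0] == "cpu":
            return [int(value) for value in fields[1:]]
    raise ValueError("no aggregate 'cpu' line found")


def idle_percent(before_text, after_text):
    """Idle share (0-100) between two /proc/stat snapshots.

    Counter order is user, nice, system, idle, iowait, irq, softirq, steal.
    Only these first eight are summed, since guest time is already
    included in user time. iowait is counted as busy here.
    """
    before = parse_cpu_line(before_text)[:8]
    after = parse_cpu_line(after_text)[:8]
    deltas = [a - b for a, b in zip(after, before)]
    total = sum(deltas)
    if total <= 0:
        raise ValueError("snapshots are identical or out of order")
    return 100.0 * deltas[3] / total


def measure_idle(interval_s=0.5, path="/proc/stat"):
    """Take two snapshots interval_s apart and return the idle percentage."""
    with open(path) as stat_file:
        before = stat_file.read()
    time.sleep(interval_s)
    with open(path) as stat_file:
        after = stat_file.read()
    return idle_percent(before, after)
```

Calling `measure_idle()` in a loop and writing the result with a timestamp gives a record that can be compared against the time of a watchdog trip, keeping in mind that the logging itself should be infrequent enough not to add to the load.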