daves Posted May 9, 2016
Should I be worried that my usralgo.ko compiles to 600KB when building for a 465 dual-core CPU but only 18KB when building for a 460 (same project, IDE and firmware version)? I am doing a lot of work on custom kernel modules, which compile to 5KB on the 460 and 300KB on the 465. As well as the files being a lot slower to FTP to the PC, I wonder what is in them... I am seeing other performance degradations in the 465 build (like the slow memory allocation erratum). We have a system relying on this that is up for delivery soon. Is this expected?
steve.milici Posted May 9, 2016
Make certain the compile options are set for release; otherwise it's about 155x bigger.
daves Posted May 10, 2016
Thanks Steve, I have been caught out by this in the past, but it doesn't explain it this time. Both builds were in Release, and I think the usralgo.ko is built the same irrespective of configuration. Whether you build in Release or Debug it is the same size; the makefile is identical apart from the output folder.
shansen Posted May 10, 2016
daves: If you are ambitious and really want to get to the bottom of this, upload both files to the Power PMAC and disassemble them (objdump -S [FILE] > source.lst). Then compare the two listings and see what the differences are.
daves Posted May 12, 2016
Thanks shansen, I am and do! The -S comparison looked basically the same (just the functions I expected). I ran 'objdump --section-headers' on them and it appears debug info is being added to the 465 build. This troubles me. The additional sections in the 465 build were:

    Idx Name            Size      VMA       LMA       File off  Algn
     17 .gnu.attributes 00000010  00000000  00000000  00003790  2**0
                        CONTENTS, READONLY
     18 .debug_aranges  00000068  00000000  00000000  000037a0  2**0
                        CONTENTS, RELOC, READONLY, DEBUGGING
     19 .debug_info     00037b09  00000000  00000000  00003808  2**0
                        CONTENTS, RELOC, READONLY, DEBUGGING
     20 .debug_abbrev   00000f00  00000000  00000000  0003b311  2**0
                        CONTENTS, READONLY, DEBUGGING
     21 .debug_line     000024e9  00000000  00000000  0003c211  2**0
                        CONTENTS, RELOC, READONLY, DEBUGGING
     22 .debug_frame    00000588  00000000  00000000  0003e6fc  2**2
                        CONTENTS, RELOC, READONLY, DEBUGGING
     23 .debug_str      000251aa  00000000  00000000  0003ec84  2**0
                        CONTENTS, READONLY, DEBUGGING
     24 .debug_loc      00001a50  00000000  00000000  00063e2e  2**0
                        CONTENTS, RELOC, READONLY, DEBUGGING
     25 .debug_ranges   000004d8  00000000  00000000  0006587e  2**0
                        CONTENTS, RELOC, READONLY, DEBUGGING

So there is over 400KB of debug info stuck in here. Does this mean the libraries being linked are debug builds (my initial suspicion)? We have intensive calculation requirements and cannot afford to be running debug code... I've spent a morning compiling from the command line to hunt down where it comes from, but it's a bit beyond me and the time I can spend right now. How do I get a nice clean build of release code on the 465?
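Editor's sketch: to put a number on the debug overhead, the Size column of an `objdump --section-headers` listing can be summed over every `.debug_*` section. The function below is only a minimal sketch and not something posted in the thread; the name `debug_bytes` is hypothetical, and it assumes the usual GNU objdump row layout (index, section name, hex size, then the remaining columns).

```shell
# Sum the sizes of all .debug_* sections.
# Reads `objdump --section-headers` output on stdin, prints a byte count.
# Hypothetical helper; assumes rows of the form "19 .debug_info 00037b09 ...".
debug_bytes() {
    total=0
    while read -r idx name size rest; do
        case "$name" in
            # The size column is hexadecimal, so prefix it with 0x.
            .debug_*) total=$((total + 0x$size)) ;;
        esac
    done
    echo "$total"
}

# Example use on the target (file name as in the thread):
#   objdump --section-headers usralgo.ko | debug_bytes
```

The eight `.debug_*` sizes quoted above add up to 402868 bytes, which agrees with the "over 400KB" estimate. Header lines and the flag lines ("CONTENTS, ...") are skipped because their second field does not start with `.debug_`.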
shansen Posted May 12, 2016
I would check the CFLAGS in the makefile for the 465 build. Are they passing the '-g' flag? Also check that optimizations are turned on (I think Delta Tau typically uses '-O2').
daves Posted May 12, 2016
The makefile always uses -O2 and no -g. DT builds (or tries to build) the usralgo.ko as Release code, independent of the project configuration. The damning thing is that I see this phenomenon when I build my own LKM from the command line too. I have exactly the same makefile (based on the DT usralgo one) for both CPUs; all I change is the "PMAC_ARCH=ppc465-2" setting. The flags are identical; only the paths, the gcc compiler and the ARCH setting change. I agree the flags should control it, but could the problem be in where a statically linked library is found? Or is the 465 cross-compiler ignoring flags?
steve.milici Posted May 12, 2016
We are looking into this. I do see the same large difference in the ".ko" file. It definitely is not debug code, as that would crash the kernel. It does not affect execution speed, as I am able to get a CapCompISR to run at nearly 1MHz! (Warning: on a 1.2 GHz dual core only.)
daves Posted May 12, 2016
Thanks Steve. It's a relief to know you are on the trail. Maybe it's not executing debug code, but that objdump definitely shows the debug sections have made it into the binary, doesn't it? Anyway, I hope it helps. The performance seems acceptable at the moment. I had my simple simulation running at 8kHz, and we only need 2.2kHz at the moment. But obviously it is the execution time that has to fit in that interval, and that goes up with complexity. A more advanced simulation was taking 80us and we will want to do more, so every microsecond counts, as they say! We also FTP these files back and forth, so smaller = faster! Thanks again.
dro Posted May 12, 2016
Neither the size of a kernel object nor the presence of debugging information in it has any effect on its performance. We build our kernel object with optimization, and the debug info is probably inserted at some point from the specific Linux kernel source that we are using to compile usralgo.ko. The debug symbols of compiled code live in separate sections of the object file and are not loaded at runtime, so those extra symbols do not penalize performance. What you see in your objdump output is the debug symbols in the extra sections. You can easily remove the debug symbols from usralgo.ko with the following command:

    strip --strip-debug usralgo.ko

You will notice that usralgo.ko becomes much smaller, similar in size to a usralgo.ko compiled for a Power PMAC 460. I will check whether our cross compiler supports stripping at compile time, and if it does I will change the makefiles in the IDE to do so. Nonetheless, it has no effect on performance.
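Editor's sketch: one cautious way to check the saving before touching any makefile is to strip into a separate copy and print both sizes. This is a minimal sketch under stated assumptions, not part of the original reply: `strip_report` is a hypothetical helper name, and the `STRIP` variable is an assumed way of selecting the tool (it defaults to plain `strip`; a cross build would need the matching cross strip). It relies on the `--strip-debug` and `-o` options of GNU strip.

```shell
# Strip debug sections into a copy and report the size change.
# Usage: strip_report module.ko [output-file]
# Hypothetical helper; STRIP selects the strip tool (default: strip).
strip_report() {
    in=$1
    out=${2:-$in.stripped}
    tool=${STRIP:-strip}
    # -o leaves the original file untouched and writes the stripped copy.
    "$tool" --strip-debug -o "$out" "$in" || return 1
    # $(( )) normalizes any padding that wc may print.
    before=$(($(wc -c < "$in")))
    after=$(($(wc -c < "$out")))
    echo "$in: $before -> $after bytes"
}

# Example use (file name as in the thread):
#   strip_report usralgo.ko
```

Writing to a copy keeps the unstripped module available in case the debug sections are wanted later for disassembly with source (`objdump -S`).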
daves Posted May 12, 2016
Thanks Dro! Great news! Thanks for looking into this, it is a relief. I suspected everything was OK, otherwise the card would choke pretty quickly. I just panic easily. I think I recall that lsmod lists the modules at a reasonable size, but I may be misremembering. I guess that means only the needed data sections are loaded... I'm working from home tomorrow, where I only have a 460, but I'll check the strip out on Monday. Can I do this from your Windows (cygwin) environment? If I want to speed up the FTP I will need this. I can't see it at first look, but it is late here.
daves Posted May 18, 2016
This works, thanks. I think you must have been aware of this before and someone has tried something in the past. I notice that in the 465 sections of your usralgo makefile you have the line

    #STRIP=i686-meau-linux-gnu-strip

but nothing is done with it afterwards. Anyway, I added the following to my makefile at the end of the all:: section (not very flexible, but it works for me):

    ifeq ($(PMAC_ARCH),ppc465-2)
        powerpc-meau-linux-gnu-strip --strip-debug spmmhils.ko
    endif

Can you confirm I am using the correct strip (it differs from your commented-out one...)?