Opened 16 years ago

Closed 16 years ago

Last modified 16 years ago

#778 closed Defect (worksforme)

Client fails with malloc error, when stopped and started

Reported by: rybakowicki
Owned by: davea
Priority: Major
Milestone: Undetermined
Component: Client - Build
Version: 6.2.15
Keywords:
Cc:

Description (last modified by Nicolas)

When attached to the World Community Grid project (URL: http://www.worldcommunitygrid.org/) and BOINC is shut down, either by ./boinc_cmd --quit or by a signal, the client fails to start again and aborts with a malloc error. To start it again, the client_state*.xml files have to be deleted.

Failed:

21-Nov-2008 23:58:12 [---] Starting BOINC client version 6.2.15 for i686-pc-linux-gnu
21-Nov-2008 23:58:12 [---] log flags: task, file_xfer, sched_ops
21-Nov-2008 23:58:12 [---] Libraries: libcurl/7.18.0 OpenSSL/0.9.8g zlib/1.2.3 c-ares/1.5.1
21-Nov-2008 23:58:12 [---] Data directory: /home/michal/BOINC
21-Nov-2008 23:58:12 [---] Processor: 4 GenuineIntel Intel(R) Core(TM)2 Quad CPU    Q6600  @ 2.40GHz [Family 6 Model 15 Stepping 11]
21-Nov-2008 23:58:12 [---] Processor features: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ssht tm pbe lm constant_tsc arch_perfmon pebs bts pni monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr lahf_lm
21-Nov-2008 23:58:12 [---] OS: Linux: 2.6.25.18-0.2-pae
21-Nov-2008 23:58:12 [---] Memory: 1.98 GB physical, 2.01 GB virtual
21-Nov-2008 23:58:12 [---] Disk: 74.31 GB total, 64.13 GB free
21-Nov-2008 23:58:12 [---] Local time is UTC +1 hours
*** glibc detected *** ./boinc: malloc(): memory corruption: 0xbfd3eed4 ***
======= Backtrace: =========
/lib/libc.so.6[0xb7dd7fc4]
/lib/libc.so.6[0xb7dda6aa]
/lib/libc.so.6(__libc_malloc+0x9c)[0xb7ddc11c]
/usr/lib/libstdc++.so.6(_Znwj+0x27)[0xb7fa9d67]
./boinc[0x8058a60]
======= Memory map: ========
08048000-081d7000 r-xp 00000000 08:07 3465920    /home/michal/BOINC/boinc
081d7000-081e7000 rwxp 0018f000 08:07 3465920    /home/michal/BOINC/boinc
081e7000-08346000 rwxp 081e7000 00:00 0          [heap]
b7800000-b7821000 rwxp b7800000 00:00 0
b7821000-b7900000 ---p b7821000 00:00 0
b791a000-b7cd5000 r-xp 00000000 08:07 3056697    /usr/lib/libcuda.so.177.73
b7cd5000-b7cd7000 rwxp 003ba000 08:07 3056697    /usr/lib/libcuda.so.177.73
b7cd7000-b7d00000 rwxp b7cd7000 00:00 0
b7d00000-b7d07000 r-xp 00000000 08:07 794939     /lib/librt-2.8.so
b7d07000-b7d08000 r-xp 00006000 08:07 794939     /lib/librt-2.8.so
b7d08000-b7d09000 rwxp 00007000 08:07 794939     /lib/librt-2.8.so
b7d09000-b7d50000 r-xp 00000000 08:07 3752336    /usr/local/cuda/lib/libcudart.so.2.0
b7d50000-b7d52000 rwxp 00046000 08:07 3752336    /usr/local/cuda/lib/libcudart.so.2.0
b7d52000-b7d5b000 r-xp 00000000 08:07 794986     /lib/libnss_files-2.8.so
b7d5b000-b7d5c000 r-xp 00008000 08:07 794986     /lib/libnss_files-2.8.so
b7d5c000-b7d5d000 rwxp 00009000 08:07 794986     /lib/libnss_files-2.8.so
b7d5d000-b7d5e000 rwxp b7d5d000 00:00 0
b7d5e000-b7d6a000 r-xp 00000000 08:07 794674     /lib/libgcc_s.so.1
b7d6a000-b7d6b000 r-xp 0000b000 08:07 794674     /lib/libgcc_s.so.1
b7d6b000-b7d6c000 rwxp 0000c000 08:07 794674     /lib/libgcc_s.so.1
b7d6c000-b7ea9000 r-xp 00000000 08:07 794975     /lib/libc-2.8.so
b7ea9000-b7eab000 r-xp 0013d000 08:07 794975     /lib/libc-2.8.so
b7eab000-b7eac000 rwxp 0013f000 08:07 794975     /lib/libc-2.8.so
b7eac000-b7eb0000 rwxp b7eac000 00:00 0
b7eb0000-b7ed4000 r-xp 00000000 08:07 794985     /lib/libm-2.8.so
b7ed4000-b7ed5000 r-xp 00023000 08:07 794985     /lib/libm-2.8.so
b7ed5000-b7ed6000 rwxp 00024000 08:07 794985     /lib/libm-2.8.so
b7ed6000-b7eea000 r-xp 00000000 08:07 794677     /lib/libpthread-2.8.so
b7eea000-b7eeb000 r-xp 00013000 08:07 794677     /lib/libpthread-2.8.so
b7eeb000-b7eec000 rwxp 00014000 08:07 794677     /lib/libpthread-2.8.so
b7eec000-b7eee000 rwxp b7eec000 00:00 0
b7eee000-b7fd3000 r-xp 00000000 08:07 3056523    /usr/lib/libstdc++.so.6.0.10
b7fd3000-b7fd7000 r-xp 000e5000 08:07 3056523    /usr/lib/libstdc++.so.6.0.10
b7fd7000-b7fd8000 rwxp 000e9000 08:07 3056523    /usr/lib/libstdc++.so.6.0.10
b7fd8000-b7fde000 rwxp b7fd8000 00:00 0
b7fde000-b7ff0000 r-xp 00000000 08:07 794762     /lib/libz.so.1.2.3
b7ff0000-b7ff1000 r-xp 00011000 08:07 794762     /lib/libz.so.1.2.3
b7ff1000-b7ff2000 rwxp 00012000 08:07 794762     /lib/libz.so.1.2.3
b7ff2000-b7ff4000 r-xp 00000000 08:07 794980     /lib/libdl-2.8.so
b7ff4000-b7ff5000 r-xp 00001000 08:07 794980     /lib/libdl-2.8.so
b7ff5000-b7ff6000 rwxp 00002000 08:07 794980     /lib/libdl-2.8.so
b7ff6000-b8009000 r-xp 00000000 08:07 794749     /lib/libnsl-2.8.so
b8009000-b800a000 r-xp 00012000 08:07 794749     /lib/libnsl-2.8.so
b800a000-b800b000 rwxp 00013000 08:07 794749     /lib/libnsl-2.8.so
b800b000-b800e000 rwxp b800b000 00:00 0
b8027000-b8042000 r-xp 00000000 08:07 795070     /lib/ld-2.8.so
b8042000-b8043000 r-xp 0001a000 08:07 795070     /lib/ld-2.8.so
b8043000-b8044000 rwxp 0001b000 08:07 795070     /lib/ld-2.8.so
bfd2e000-bfd43000 rwxp bffeb000 00:00 0          [stack]
ffffe000-fffff000 r-xp 00000000 00:00 0          [vdso]
SIGABRT: abort called
Stack trace (11 frames):
./boinc[0x8094e76]
[0xffffe400]
[0xffffe430]
/lib/libc.so.6(gsignal+0x50)[0xb7d96900]
/lib/libc.so.6(abort+0x188)[0xb7d98238]
/lib/libc.so.6[0xb7dd210d]
/lib/libc.so.6[0xb7dd7fc4]
/lib/libc.so.6[0xb7dda6aa]
/lib/libc.so.6(__libc_malloc+0x9c)[0xb7ddc11c]
/usr/lib/libstdc++.so.6(_Znwj+0x27)[0xb7fa9d67]
./boinc[0x8058a60]

Exiting...

and after:

rm client_state*
./boinc
22-Nov-2008 00:18:01 [---] Starting BOINC client version 6.2.15 for i686-pc-linux-gnu
22-Nov-2008 00:18:01 [---] log flags: task, file_xfer, sched_ops
22-Nov-2008 00:18:01 [---] Libraries: libcurl/7.18.0 OpenSSL/0.9.8g zlib/1.2.3 c-ares/1.5.1
22-Nov-2008 00:18:01 [---] Data directory: /home/michal/BOINC
22-Nov-2008 00:18:01 [---] Processor: 4 GenuineIntel Intel(R) Core(TM)2 Quad CPU    Q6600  @ 2.40GHz [Family 6 Model 15 Stepping 11]
22-Nov-2008 00:18:01 [---] Processor features: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ssht tm pbe lm constant_tsc arch_perfmon pebs bts pni monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr lahf_lm
22-Nov-2008 00:18:01 [---] OS: Linux: 2.6.25.18-0.2-pae
22-Nov-2008 00:18:01 [---] Memory: 1.98 GB physical, 2.01 GB virtual
22-Nov-2008 00:18:01 [---] Disk: 74.31 GB total, 64.13 GB free
22-Nov-2008 00:18:01 [---] Local time is UTC +1 hours
22-Nov-2008 00:18:01 [---] Coprocessor: CUDA (1)
22-Nov-2008 00:18:01 [World Community Grid] URL: http://www.worldcommunitygrid.org/; Computer ID: not assigned yet; location: (none); project prefs: default
22-Nov-2008 00:18:01 [---] General prefs: from World Community Grid (last modified 01-Jan-1970 01:00:01)
22-Nov-2008 00:18:01 [---] Host location: none
22-Nov-2008 00:18:01 [---] General prefs: using your defaults
22-Nov-2008 00:18:01 [---] Preferences limit memory usage when active to 1519.66MB
22-Nov-2008 00:18:01 [---] Preferences limit memory usage when idle to 1519.66MB
22-Nov-2008 00:18:01 [---] Preferences limit disk usage to 3.73GB
22-Nov-2008 00:18:01 [---] Running CPU benchmarks
22-Nov-2008 00:18:01 [World Community Grid] Fetching scheduler list
22-Nov-2008 00:18:06 [World Community Grid] Master file download succeeded
22-Nov-2008 00:18:11 [World Community Grid] Sending scheduler request: Project initialization.  Requesting 1 seconds of work, reporting 0 completed tasks
22-Nov-2008 00:18:16 [World Community Grid] Scheduler request succeeded: got 0 new tasks
22-Nov-2008 00:18:16 [World Community Grid] Message from server: Not sending work - last request too recent: 49 sec
22-Nov-2008 00:18:18 [World Community Grid] File stat_v01.png exists already, skipping download
22-Nov-2008 00:18:18 [World Community Grid] File default_00_v01.gif exists already, skipping download
22-Nov-2008 00:18:18 [World Community Grid] File dddt_00_v01.gif exists already, skipping download
22-Nov-2008 00:18:18 [World Community Grid] File dddt_01_v01.png exists already, skipping download

Attachments (2)

client_state.xml (88.3 KB) - added by rybakowicki 16 years ago.
client_state_prev.xml (88.3 KB) - added by rybakowicki 16 years ago.

Change History (11)

comment:1 Changed 16 years ago by Nicolas

Description: modified (diff)

comment:2 Changed 16 years ago by Nicolas

Ugh. The problem is not malloc failing. The problem is something else is corrupting memory, and the next call to malloc (fortunately!) detects it and aborts the program. The real mess is finding what causes the memory corruption...

Can you please attach your client_state.xml to this ticket? (The only "private" information in that file is your hostname)
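As a small aid to reading the crash log in the description, the sketch below checks which mapping an address from the log falls into. It uses three lines copied from the memory map; the helper names (parse_maps, find_region) are made up for illustration and are not part of BOINC. The address glibc reports, 0xbfd3eed4, lies inside the [stack] mapping rather than [heap], and the last ./boinc frame lies in the executable's r-xp mapping. That is only an observation about the log, not a diagnosis.

```python
import re

# Three mappings copied from the memory map in the ticket description.
MEMORY_MAP = """\
08048000-081d7000 r-xp 00000000 08:07 3465920    /home/michal/BOINC/boinc
081e7000-08346000 rwxp 081e7000 00:00 0          [heap]
bfd2e000-bfd43000 rwxp bffeb000 00:00 0          [stack]
"""

# start-end, permissions, offset, device, inode, optional name
MAP_LINE = re.compile(
    r"^([0-9a-f]+)-([0-9a-f]+)\s+(\S+)\s+\S+\s+\S+\s+\S+\s*(.*)$"
)


def parse_maps(text):
    """Return (start, end, perms, name) tuples; end is exclusive."""
    regions = []
    for line in text.splitlines():
        match = MAP_LINE.match(line.strip())
        if match:
            start, end, perms, name = match.groups()
            regions.append((int(start, 16), int(end, 16), perms, name.strip()))
    return regions


def find_region(regions, address):
    """Return the name of the mapping containing address, or None."""
    for start, end, _perms, name in regions:
        if start <= address < end:
            return name or "(anonymous)"
    return None


regions = parse_maps(MEMORY_MAP)
print(find_region(regions, 0xBFD3EED4))  # address from the glibc message
print(find_region(regions, 0x08058A60))  # last ./boinc frame in the backtrace
```

With the three lines above this prints [stack] and then /home/michal/BOINC/boinc.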

Changed 16 years ago by rybakowicki

Attachment: client_state.xml added

Changed 16 years ago by rybakowicki

Attachment: client_state_prev.xml added

comment:3 Changed 16 years ago by rybakowicki

Thanks for correcting the description; both files are attached. I tried deleting just one of them and it didn't help.

comment:4 Changed 16 years ago by rybakowicki

It starts only when both are deleted.
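For anyone hitting the same thing, a possible variation on the workaround is to move the state files aside instead of deleting them, so they stay available for attaching to a ticket. The sketch below is only an illustration: the function name and the state_backup directory are invented here, and it is demonstrated on a temporary directory rather than a real BOINC data directory.

```python
import shutil
import tempfile
from pathlib import Path


def set_aside_state_files(data_dir, backup_name="state_backup"):
    """Move client_state*.xml out of data_dir into a backup subdirectory.

    backup_name is a hypothetical directory name, not a BOINC convention.
    Returns the names of the files that were moved.
    """
    data_dir = Path(data_dir)
    backup_dir = data_dir / backup_name
    backup_dir.mkdir(exist_ok=True)
    moved = []
    # glob() here is not recursive, so files already in the backup
    # subdirectory are not picked up again.
    for path in sorted(data_dir.glob("client_state*.xml")):
        shutil.move(str(path), str(backup_dir / path.name))
        moved.append(path.name)
    return moved


# Demonstration on a throwaway directory.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("client_state.xml", "client_state_prev.xml"):
        (Path(tmp) / name).write_text("<client_state/>\n")
    print(set_aside_state_files(tmp))
```

The client should be stopped before the files are moved; whether it then starts cleanly is exactly what this ticket is about, so treat this as a way to preserve evidence, not a fix.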

comment:5 Changed 16 years ago by Nicolas

Where did you get BOINC from? To interpret the stack trace, I need exactly the same executable file you have.
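To show why the identical executable matters: the ./boinc frame in the backtrace is a raw address, which only means something relative to one specific build. The sketch below pulls the frames out of the glibc backtrace and builds an addr2line command line for the main executable. The function names and the binary path are assumptions for illustration. It also assumes the binary is loaded at a fixed address (as the memory map suggests) and carries symbol information; a stripped release build would most likely resolve to "??". Frames inside shared libraries would need their load base subtracted first, which this sketch does not attempt.

```python
import re

# Backtrace lines copied from the ticket description.
BACKTRACE = """\
/lib/libc.so.6[0xb7dd7fc4]
/lib/libc.so.6[0xb7dda6aa]
/lib/libc.so.6(__libc_malloc+0x9c)[0xb7ddc11c]
/usr/lib/libstdc++.so.6(_Znwj+0x27)[0xb7fa9d67]
./boinc[0x8058a60]
"""

# module, optional (symbol+offset), then [address]
FRAME = re.compile(
    r"^(?P<module>[^\[(]+)(?:\((?P<symbol>[^)]*)\))?\[(?P<address>0x[0-9a-f]+)\]$"
)


def parse_frames(text):
    """Return (module, symbol_or_None, address) for each recognised line."""
    frames = []
    for line in text.splitlines():
        match = FRAME.match(line.strip())
        if match:
            frames.append(
                (
                    match.group("module"),
                    match.group("symbol"),
                    int(match.group("address"), 16),
                )
            )
    return frames


def addr2line_command(frames, module, binary_path):
    """Build an addr2line argument list for the frames of one module."""
    addresses = [hex(addr) for mod, _sym, addr in frames if mod == module]
    # -f prints function names, -C demangles C++ names, -e selects the file.
    return ["addr2line", "-f", "-C", "-e", binary_path] + addresses


frames = parse_frames(BACKTRACE)
print(addr2line_command(frames, "./boinc", "./boinc"))
```

The frame just above it, _Znwj, is the mangled name of operator new(unsigned int), so the abort was triggered by an ordinary C++ allocation made from the client code at 0x8058a60.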

comment:6 Changed 16 years ago by rybakowicki

http://boinc.berkeley.edu/download_all.php

tested with: boinc_6.2.15_i686-pc-linux-gnu.sh

comment:7 Changed 16 years ago by romw

Owner: changed from romw to davea

Might still happen on current builds.

comment:8 Changed 16 years ago by davea

Resolution: set to worksforme
Status: changed from new to closed

I can't reproduce this with 6.6.11 using the given client_state.xml file. Is it still happening with 6.6.11?

comment:9 Changed 16 years ago by Nicolas

I never managed to reproduce this either.
