[Libstoragemgmt-devel] [BUG] 'make check' will consume all disk space of '/tmp' folder for some plugin exception.
Gris Ge
2014-08-04 14:52:34 UTC
Hi Guys,

I found that 'make check' consumes all the disk space in "/tmp" when this
reproducer patch is applied:

===
diff --git a/plugin/sim/simarray.py b/plugin/sim/simarray.py
index d59a1b1..053c16c 100644
--- a/plugin/sim/simarray.py
+++ b/plugin/sim/simarray.py
@@ -1187,6 +1187,8 @@ class SimData(object):

     def iscsi_chap_auth(self, init_id, in_user, in_pass, out_user, out_pass,
                         flags=0):
+        a = []
+        a[1]
         # No iscsi chap query API yet, not need to setup anything
         return None
===

The 'lsmcli iscsi-chap --init iqn.1994-05.com.redhat:test-test-iscsi-02-0'
command fails with a Python exception, as expected.

But 'make check' runs into a loop of 'lsm_connect_close()' calls[1].
Maybe the plugin cannot handle that many socket requests, and the socket
file ends up consuming all the disk space.

I have no idea what causes this bug.

Please take a look.
Best regards.

[1] Guessed by 'strace -p `pidof lt_tester`'.
--
Gris Ge
Tony Asleson
2014-08-04 21:24:08 UTC
Without the complete patch series I'm not able to reproduce.

This patch only appears to raise a 'list index out of range'
exception.
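
For reference, the two added lines can be tried in isolation. This is only a
minimal sketch: iscsi_chap_auth_stub is a hypothetical stand-in for the
patched method, not the real SimData code, and it shows nothing about how
the plugin or the test client react to the exception.

```python
def iscsi_chap_auth_stub():
    """Hypothetical stand-in for the patched SimData.iscsi_chap_auth()."""
    a = []
    a[1]  # indexing an empty list raises IndexError
    return None


try:
    iscsi_chap_auth_stub()
    outcome = "no exception"
except IndexError as exc:
    # str(exc) is the message Python attaches to the IndexError
    outcome = str(exc)

print(outcome)
```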

My only *guess* at the moment is that we are doing opens without a
corresponding close, so we are exceeding the number of socket
descriptors available for the FS.

Regards,
Tony
Tony Asleson
2014-08-07 02:19:20 UTC
I didn't get to root cause today, but I do know the following.

The C unit test library, check, opens a file in /tmp, e.g. /tmp/tmpf8XCYC7,
and deletes it while it is still open. Thus if you terminate the unit
test, the file system gets cleaned up. The file gets created when the
unit test calls srunner_run_all, which makes a call to tmpfile. I don't
know the full reason for this, but the function has this comment in the
source:

* 'Pipe' is implemented as a temporary file to overcome message
* volume limitations outlined in bug #482012. This scheme works well
* with the existing usage wherein the parent does not begin reading
* until the child has done writing and exited.
*
* Pipe life cycle:
* - The parent creates a tmpfile().
* - The fork() call has the effect of duplicating the file descriptor
* and copying (on write) the FILE* data structures.
* - The child writes to the file, and its dup'ed file descriptor and
* data structures are cleaned up on child process exit.
* - Before reading, the parent rewind()'s the file to reset both
* FILE* and underlying file descriptor location data.
* - When finished, the parent fclose()'s the FILE*, deleting the
* temporary file, per tmpfile()'s semantics.
*
* This scheme may break down if the usage changes to asynchronous
* reading and writing.
*/
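
The life cycle described in that comment can be sketched in Python on a
POSIX system. This is an illustration only, not check's implementation:
tempfile.TemporaryFile() is used as a rough counterpart of C tmpfile(), and
run_child_and_collect is a hypothetical helper name. Because the file has no
directory entry, whatever is written to it takes up space in /tmp until the
last descriptor is closed.

```python
import os
import tempfile


def run_child_and_collect(messages):
    """Fork a child that writes messages to an unlinked temp file; read them back."""
    # On POSIX, TemporaryFile() removes the directory entry right away while
    # the descriptor stays open, much like C tmpfile().
    with tempfile.TemporaryFile() as pipe_file:
        pid = os.fork()
        if pid == 0:
            # Child: write through the inherited descriptor, then exit
            # immediately so it never runs the parent's code path.
            try:
                for message in messages:
                    pipe_file.write(message.encode("utf-8") + b"\n")
                pipe_file.flush()
            finally:
                os._exit(0)
        # Parent: do not read until the child has finished writing and exited.
        os.waitpid(pid, 0)
        # Parent and child share one file offset, so rewind before reading.
        pipe_file.seek(0)
        return pipe_file.read().decode("utf-8").splitlines()
    # Leaving the with-block closes the file, which releases its disk space.


collected = run_child_and_collect(["rc = 2", "rc = 101"])
print(collected)
```

If the writer loops forever, as the test client appears to do here, the
parent never reaches the close, and the file keeps growing.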

This temp file is binary and contains libStorageMgmt error messages
(repeatedly) which look like:

0000000: 0000 0000 0000 0002 0000 0002 0000 0008 ................
0000010: 7465 7374 6572 2e63 0000 0788 0000 0002 tester.c........
0000020: 0000 0008 7465 7374 6572 2e63 0000 016c ....tester.c...l
0000030: 0000 0002 0000 0008 7465 7374 6572 2e63 ........tester.c
0000040: 0000 0170 0000 0002 0000 0008 7465 7374 ...p........test
0000050: 6572 2e63 0000 0792 0000 0002 0000 0008 er.c............
0000060: 7465 7374 6572 2e63 0000 0795 0000 0002 tester.c........
0000070: 0000 0008 7465 7374 6572 2e63 0000 0796 ....tester.c....
0000080: 0000 0002 0000 0008 7465 7374 6572 2e63 ........tester.c
0000090: 0000 07a1 0000 0001 0000 0006 7263 203d ............rc =
00000a0: 2032 0000 0000 0000 0003 0000 0002 0000 2..............
00000b0: 0008 7465 7374 6572 2e63 0000 00b5 0000 ..tester.c......
00000c0: 0001 0000 003e 6361 6c6c 3a6c 736d 5f63 .....>call:lsm_c
00000d0: 6f6e 6e65 6374 5f63 6c6f 7365 2072 6320 onnect_close rc
00000e0: 3d20 3430 3020 4e6f 2061 6464 6c2e 2065 = 400 No addl. e
00000f0: 7272 6f72 2069 6e66 6f2e 2028 7768 6963 rror info. (whic
0000100: 6820 3029 0000 0000 0000 0003 0000 0002 h 0)............
0000110: 0000 0008 7465 7374 6572 2e63 0000 00b5 ....tester.c....
0000120: 0000 0001 0000 003e 6361 6c6c 3a6c 736d .......>call:lsm
0000130: 5f63 6f6e 6e65 6374 5f63 6c6f 7365 2072 _connect_close r
0000140: 6320 3d20 3130 3120 4e6f 2061 6464 6c2e c = 101 No addl.
0000150: 2065 7272 6f72 2069 6e66 6f2e 2028 7768 error info. (wh
0000160: 6963 6820 3029 0000 0000 0000 0003 0000 ich 0)..........
0000170: 0002 0000 0008 7465 7374 6572 2e63 0000 ......tester.c..
0000180: 00b5 0000 0001 0000 003e 6361 6c6c 3a6c .........>call:l
0000190: 736d 5f63 6f6e 6e65 6374 5f63 6c6f 7365 sm_connect_close
00001a0: 2072 6320 3d20 3130 3120 4e6f 2061 6464 rc = 101 No add
00001b0: 6c2e 2065 7272 6f72 2069 6e66 6f2e 2028 l. error info. (
00001c0: 7768 6963 6820 3029 0000 0000 0000 0003 which 0)........
00001d0: 0000 0002 0000 0008 7465 7374 6572 2e63 ........tester.c
00001e0: 0000 00b5 0000 0001 0000 003e 6361 6c6c ...........>call
00001f0: 3a6c 736d 5f63 6f6e 6e65 6374 5f63 6c6f :lsm_connect_clo
0000200: 7365 2072 6320 3d20 3130 3120 4e6f 2061 se rc = 101 No a
0000210: 6464 6c2e 2065 7272 6f72 2069 6e66 6f2e ddl. error info.
0000220: 2028 7768 6963 6820 3029 0000 0000 0000 (which 0)
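
Reading the dump above as big-endian 32-bit words suggests a simple record
layout: a type word, followed by either one integer or a length-prefixed
string. The sketch below decodes the bytes at offsets 0x80-0xa9 under that
assumption. The layout is inferred only from the bytes shown, not from
check's documentation, and the record names ("ctx", "fail", "loc") and
decode_records are hypothetical labels.

```python
import struct

# Bytes copied from offsets 0x80-0xa9 of the dump above.
SAMPLE = bytes.fromhex(
    "00000002 00000008 7465737465722e63 000007a1 "
    "00000001 00000006 7263203d2032 "
    "00000000 00000003"
)


def _u32(buf, pos):
    """Read one big-endian unsigned 32-bit word; return (value, next position)."""
    return struct.unpack_from(">I", buf, pos)[0], pos + 4


def _string(buf, pos):
    """Read a 32-bit length followed by that many bytes of text."""
    length, pos = _u32(buf, pos)
    return buf[pos:pos + length].decode("ascii", "replace"), pos + length


def decode_records(buf):
    """Decode records using the layout guessed from the dump (an assumption)."""
    records = []
    pos = 0
    while pos < len(buf):
        kind, pos = _u32(buf, pos)
        if kind == 0:    # guessed: one integer (a context value)
            value, pos = _u32(buf, pos)
            records.append(("ctx", value))
        elif kind == 1:  # guessed: a failure message string
            text, pos = _string(buf, pos)
            records.append(("fail", text))
        elif kind == 2:  # guessed: source file name plus line number
            name, pos = _string(buf, pos)
            line, pos = _u32(buf, pos)
            records.append(("loc", name, line))
        else:
            raise ValueError("unknown record type %d" % kind)
    return records


decoded = decode_records(SAMPLE)
for record in decoded:
    print(record)
```

Read the same way, the repeating part of the dump would be a location record
for tester.c line 181 (0xb5) followed by the 62-byte lsm_connect_close
message, over and over.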

This looks like stdout messages from the test, but the test's stdout and
stderr are going to a different file. There is some bug where the test
client is stuck in a loop, but I'm at a loss at the moment as to why we
don't get an error returned instead of looping. Maybe the fixture
routines are throwing errors; I'm not sure. We are using CK_FORK=no, so I'm
not sure why this pipe exists if we are running the unit test in the
same process instead of a separate process.

I will continue to dig into this as we need to understand what is going
on. We have a looping bug which we need to understand/fix.

Thanks,
Tony
