[dm-devel] Unable to deactivate lv, perhaps due to semaphore problem...
Gianluca Cecchi
2014-11-27 14:26:11 UTC
Hello,
I'm unable to deactivate an LVM logical volume.

My system is RHEL 6.5 with lvm2-2.02.100-8.el6.x86_64 and kernel
2.6.32-431.29.2.el6.x86_64

I get error code 5 with message
Logical volume VG_AAA_TEMP/LV_AAA_TEMP in use.

You can find output of
lvchange -d -d -d -d -d -d -an VG_AAA_TEMP/LV_AAA_TEMP
here:
https://drive.google.com/file/d/0BwoPbcrMv8mvTjlBMkRUbG9nczA/view?usp=sharing

Rebuilding the sources from the corresponding src.rpm, the problem (as the
error message itself says) is in this code:

activate/activate.c:711 Logical volume in use

that is
    while (open_count_check_retries--) {
            if (info->open_count > 0) {
                    if (open_count_check_retries) {
                            usleep(OPEN_COUNT_CHECK_USLEEP_DELAY);
                            log_debug_activation("Retrying open_count check for %s/%s.",
                                                 lv->vg->name, lv->name);
                            if (!lv_info(cmd, lv, 0, info, 1, 0))
                                    return -1;
                            continue;
                    }
                    log_error("Logical volume %s/%s in use.",
                              lv->vg->name, lv->name);
                    return 0;
            } else
                    break;
    }

In fact I get this information querying device:

[***@orapr2 activate]# dmsetup info --checks
360a9800037543544465d424130533177
Name: 360a9800037543544465d424130533177
State: ACTIVE
Read Ahead: 256
Tables present: LIVE
Open count: 1
Event number: 23
Major, minor: 253, 4
Number of targets: 1
UUID: mpath-360a9800037543544465d424130533177

[***@orapr2 activate]# dmsetup table 360a9800037543544465d424130533177
0 2097152 multipath 4 queue_if_no_path pg_init_retries 50
retain_attached_hw_handler 1 alua 2 1 round-robin 0 4 1 66:160 1 69:192 1
133:224 1 8:288 1 round-robin 0 4 1 8:96 1 68:48 1 129:32 1 130:128 1

A few days ago I hit a problem related to semaphores
(maximum number of semaphore sets has been exceeded)
and I suspect that some resource may have been left incorrectly locked.

The device looks like this:
[***@orapr2 ~]# multipath -l 360a9800037543544465d424130533177
360a9800037543544465d424130533177 dm-4 NETAPP,LUN
size=1.0G features='4 queue_if_no_path pg_init_retries 50 retain_attached_hw_handle' hwhandler='1 alua' wp=rw
|-+- policy='round-robin 0' prio=0 status=active
| |- 1:0:1:7 sdaq 66:160 active undef running
| |- 1:0:2:7 sdco 69:192 active undef running
| |- 2:0:2:7 sdho 133:224 active undef running
| `- 2:0:3:7 sdiy 8:288 active undef running
`-+- policy='round-robin 0' prio=0 status=enabled
  |- 1:0:0:7 sdg 8:96 active undef running
  |- 2:0:0:7 sdbp 68:48 active undef running
  |- 1:0:3:7 sdeq 129:32 active undef running
  `- 2:0:1:7 sdfm 130:128 active undef running

and the filesystem on it has been successfully umounted, but

[***@orapr2 ~]# lvs VG_AAA_TEMP/LV_AAA_TEMP
  LV          VG          Attr       LSize    Pool Origin Data%  Move Log Cpy%Sync Convert
  LV_AAA_TEMP VG_AAA_TEMP -wi-ao---- 1020.00m

How can I find out what holds the reference that apparently keeps it
open (Open count: 1), so that I can check and possibly fix it?

Thanks in advance.

Gianluca
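The retry loop quoted from activate.c can be approximated from the shell when testing by hand. This is only a rough sketch, not the lvm2 implementation: the function names are hypothetical, and the retry count and delay are arbitrary defaults rather than lvm2's own constants. It assumes `dmsetup info NAME` prints an "Open count:" line as shown in the message above.

```shell
# Hypothetical helper: print the "Open count" value found in
# `dmsetup info` output supplied on stdin.
open_count_from_info() {
    awk -F: '/^Open count/ { gsub(/[ \t]/, "", $2); print $2; exit }'
}

# Hypothetical helper: poll a device-mapper device until its open count is 0.
#   $1 = device-mapper name
#   $2 = number of attempts (arbitrary default: 5)
#   $3 = delay between attempts in seconds (arbitrary default: 1)
# Returns 0 once the device is closed, 1 if it is still open after all attempts.
wait_until_closed() {
    name=$1
    retries=${2:-5}
    delay=${3:-1}
    while [ "$retries" -gt 0 ]; do
        count=$(dmsetup info "$name" | open_count_from_info)
        if [ "$count" = "0" ]; then
            return 0
        fi
        retries=$((retries - 1))
        sleep "$delay"
    done
    return 1
}
```

For example, `wait_until_closed VG_AAA_TEMP-LV_AAA_TEMP || echo "still in use"` would report the same condition that makes lvchange refuse to deactivate the LV.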
Zdenek Kabelac
2014-11-27 14:33:05 UTC
Post by Gianluca Cecchi
Hello,
I'm unable to deactivate an lvm.
My system is RHEL 6.5 with lvm2-2.02.100-8.el6.x86_64 and kernel
2.6.32-431.29.2.el6.x86_64
I get error code 5 with message
Logical volume VG_AAA_TEMP/LV_AAA_TEMP in use.
You can find output of
lvchange -d -d -d -d -d -d -an VG_AAA_TEMP/LV_AAA_TEMP
https://drive.google.com/file/d/0BwoPbcrMv8mvTjlBMkRUbG9nczA/view?usp=sharing
Not really accessible.

But anyway - if you have a problem with 'semaphore' resources - you could
'recycle' the old ones with

'dmsetup udevcomplete_all'

Of course it's hard to guess what experiments you are doing and what could
lead to uncompleted cookies (stuck udev scans).

Do you happen to have any suspended devices in your table?
(dmsetup info -c should show them)
Post by Gianluca Cecchi
LV VG Attr LSize Pool Origin Data% Move Log
Cpy%Sync Convert
LV_AAA_TEMP VG_AAA_TEMP -wi-ao---- 1020.00m
How can I see the responsible for the reference that apparently keeps it open?
Open count: 1
so I can check and eventually fix??
dmsetup ls --tree

is usually good at showing dependencies between devices (i.e. target A holds target B)

Regards

Zdenek
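A minimal sketch of the suspended-device check suggested above. It assumes the default `dmsetup info -c` layout seen later in this thread, where the fourth column (Stat) holds the attribute string, and it assumes that a suspended device shows an `s` in the third position of that string (for example `L-sw` instead of `L--w`). The function name is hypothetical.

```shell
# Hypothetical helper: read `dmsetup info -c` output on stdin and print the
# names of devices whose Stat attribute has 's' (suspended) in position 3.
# The header line is skipped.
suspended_devices() {
    awk 'NR > 1 && substr($4, 3, 1) == "s" { print $1 }'
}
```

Usage would be `dmsetup info -c | suspended_devices`; empty output means no device is reported as suspended.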
Gianluca Cecchi
2014-11-27 15:01:53 UTC
Post by Zdenek Kabelac
Post by Gianluca Cecchi
Hello,
I'm unable to deactivate an lvm.
My system is RHEL 6.5 with lvm2-2.02.100-8.el6.x86_64 and kernel
2.6.32-431.29.2.el6.x86_64
I get error code 5 with message
Logical volume VG_AAA_TEMP/LV_AAA_TEMP in use.
You can find output of
lvchange -d -d -d -d -d -d -an VG_AAA_TEMP/LV_AAA_TEMP
https://drive.google.com/file/d/0BwoPbcrMv8mvTjlBMkRUbG9nczA/view?usp=sharing
Not really accessible.
Strange, do you mean the Google Docs link?
I tried with a browser that is not logged in to any Gmail account and I was
able to download it....
Post by Zdenek Kabelac
But anyway - if you have a problem with 'semaphore' resources - you could
'recycle' the old ones with
'dmsetup udevcomplete_all'
This is actually a production server with many other LVs... Is there any
drawback in the command above?
Post by Zdenek Kabelac
Of course it's hard to guess what experiments you are doing and what
could lead to uncompleted cookies (stuck udev scans)
Actually no experiment at all.
The node is part of a two-node RHEL production cluster with HA_LVM based
services.
We need to relocate many services to the other node for planned
maintenance, but this node seems able to stop the lvm resources without
being able to cleanly deactivate the LVs. We get messages like

Nov 26 17:35:29 orapr2 rgmanager[5765]: [lvm] Deactivating VG_AAA_TEMP/LV_AAA_TEMP
Nov 26 17:35:29 orapr2 rgmanager[5786]: [lvm] Making resilient : lvchange -an VG_AAA_TEMP/LV_AAA_TEMP
Nov 26 17:35:29 orapr2 rgmanager[5809]: [lvm] Resilient command: lvchange -an VG_AAA_TEMP/LV_AAA_TEMP --config devices{filter=["a|/dev/mapper/360a9800037543544465d424
Nov 26 17:35:34 orapr2 rgmanager[5883]: [lvm] lv_exec_resilient failed
Nov 26 17:35:34 orapr2 rgmanager[5908]: [lvm] lv_activate_resilient stop failed on VG_AAA_TEMP/LV_AAA_TEMP
Nov 26 17:35:34 orapr2 rgmanager[5928]: [lvm] Unable to deactivate VG_AAA_TEMP/LV_AAA_TEMP
Nov 26 17:35:34 orapr2 rgmanager[5948]: [lvm] Failed to stop VG_AAA_TEMP/LV_AAA_TEMP
Nov 26 17:35:34 orapr2 rgmanager[5968]: [lvm] Attempting cleanup of VG_AAA_TEMP
Nov 26 17:35:34 orapr2 rgmanager[5989]: [lvm] VG_AAA_TEMP now consistent
Nov 26 17:35:34 orapr2 rgmanager[6013]: [lvm] Deactivating VG_AAA_TEMP/LV_AAA_TEMP
Nov 26 17:35:34 orapr2 rgmanager[6033]: [lvm] Making resilient : lvchange -an VG_AAA_TEMP/LV_AAA_TEMP
Nov 26 17:35:35 orapr2 rgmanager[6056]: [lvm] Resilient command: lvchange -an VG_AAA_TEMP/LV_AAA_TEMP --config devices{filter=["a|/dev/mapper/360a9800037543544465d424
Nov 26 17:35:39 orapr2 rgmanager[6648]: [lvm] lv_exec_resilient failed
Nov 26 17:35:40 orapr2 rgmanager[6670]: [lvm] lv_activate_resilient stop failed on VG_AAA_TEMP/LV_AAA_TEMP
Nov 26 17:35:40 orapr2 rgmanager[6690]: [lvm] Unable to deactivate VG_AAA_TEMP/LV_AAA_TEMP
Nov 26 17:35:40 orapr2 rgmanager[6710]: [lvm] Failed second attempt to stop VG_AAA_TEMP/LV_AAA_TEMP
Nov 26 17:35:40 orapr2 rgmanager[20260]: stop on lvm "LV_AAA_TEMP" returned 1 (generic error)
Nov 26 17:35:40 orapr2 rgmanager[20260]: Marking service:AAA as 'disabled', but some resources may still be allocated!
Nov 26 17:35:40 orapr2 rgmanager[20260]: Service service:AAA is disabled

And of course the other node is then unable to activate the service because
the LV is still held open by the first one:

Nov 26 17:35:40 orapr1 rgmanager[18596]: Starting disabled service service:AAA
Nov 26 17:35:41 orapr1 rgmanager[31420]: [lvm] Someone else owns this logical volume
Nov 26 17:35:41 orapr1 rgmanager[18596]: start on lvm "LV_AAA_TEMP" returned 1 (generic error)
Nov 26 17:35:41 orapr1 rgmanager[18596]: #68: Failed to start service:AAA; return value: 1

So I'm trying to reproduce the cluster's command by hand to see how to clean
up the situation, using this particular service (named AAA), which is not as
critical as the other ones running on the node.
Post by Zdenek Kabelac
Do you happen to have some suspend devices in your table ?
(dmsetup info -c should show them)
It seems not so. Only (L)ive states...

[***@orapr2 ~]# dmsetup info -c | awk '{print $4}' | sort | uniq -c
77 L--w
1 Stat
Post by Zdenek Kabelac
Post by Gianluca Cecchi
LV VG Attr LSize Pool Origin Data% Move Log
Cpy%Sync Convert
LV_AAA_TEMP VG_AAA_TEMP -wi-ao---- 1020.00m
How can I see the responsible for the reference that apparently keeps it open?
Open count: 1
so I can check and eventually fix??
dmsetup ls --tree
is usually good in shows deps between devs (i.e. target A holds target B)
Regards
Zdenek
It returns nothing unusual for this device:
...
VG_AAA_TEMP-LV_AAA_TEMP (253:49)
 └─360a9800037543544465d424130533177 (253:4)
    ├─ (130:128)
    ├─ (129:32)
    ├─ (68:48)
    ├─ (8:96)
    ├─ (8:288)
    ├─ (133:224)
    ├─ (69:192)
    └─ (66:160)
...

BTW: I'm testing with this one, but the problem seems to be general, in the
sense that every LV shows this behaviour when I try to deactivate it...

Thanks in advance for any other insight, and let me know if I should send you
the debug log of the lvchange command in case you are still unable to access
it...

Gianluca
Zdenek Kabelac
2014-11-27 15:24:16 UTC
Post by Gianluca Cecchi
Hello,
I'm unable to deactivate an lvm.
My system is RHEL 6.5 with lvm2-2.02.100-8.el6.x86_64 and kernel
2.6.32-431.29.2.el6.x86_64
I get error code 5 with message
Logical volume VG_AAA_TEMP/LV_AAA_TEMP in use.
You can find output of
lvchange -d -d -d -d -d -d -an VG_AAA_TEMP/LV_AAA_TEMP
https://drive.google.com/file/d/0BwoPbcrMv8mvTjlBMkRUbG9nczA/view?usp=sharing
Not really accessible.
strange, do you mean the google docs link?
I tried with a browser without access to any gmail account and I'm able to
download it....
Ahh - my adblock filter was too strict - however, the information passed is not that useful.
The preferred output is that of 'lvchange -vvvv -an'
Post by Gianluca Cecchi
But anyway - if you have a problem with 'semaphore' resources - you could
'recycle' the old ones with
'dmsetup udevcomplete_all'
This is actually a production server with many other LVs... Is there any
drawback in the command above?
You could 'complete' all cookies older than, e.g., 1 minute.
Nothing should be holding a cookie for that long.
Post by Gianluca Cecchi
Nov 26 17:35:34 orapr2 rgmanager[5908]: [lvm] lv_activate_resilient stop
failed on VG_AAA_TEMP/LV_AAA_TEMP
Nov 26 17:35:34 orapr2 rgmanager[5928]: [lvm] Unable to deactivate
VG_AAA_TEMP/LV_AAA_TEMP
Well - here we need to know the reason why it has failed -

Is there some user of VG_AAA_TEMP/LV_AAA_TEMP ?

Mounted, opened, used as a device for something else ?
Post by Gianluca Cecchi
LV VG Attr LSize Pool Origin Data%
Move Log
Cpy%Sync Convert
LV_AAA_TEMP VG_AAA_TEMP -wi-ao---- 1020.00m
How can I see the responsible for the reference that apparently keeps
it open?
Open count: 1
so I can check and eventually fix??
lsof - look for users of 253,49

Zdenek
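Grepping lsof output for 253,49 mainly finds files on a filesystem that lives on that device. As a complementary check, here is a minimal sketch that looks for processes holding the block device node itself open. It assumes the fd symlinks under /proc resolve to a node path such as /dev/dm-49 (the path is an assumption; a process could have opened the device under a different name), and it cannot see in-kernel users. The function name is hypothetical.

```shell
# Hypothetical helper: list PIDs that have a file descriptor whose symlink
# target is exactly the given device node path.
#   $1 = node path as it appears in the fd symlinks (e.g. /dev/dm-49)
#   $2 = proc root (defaults to /proc; overridable for testing)
pids_holding_node() {
    node=$1
    root=${2:-/proc}
    for fd in "$root"/[0-9]*/fd/*; do
        # Skip unmatched globs and entries that vanished meanwhile.
        [ -L "$fd" ] || continue
        if [ "$(readlink "$fd")" = "$node" ]; then
            rest=${fd#"$root"/}      # e.g. 1234/fd/7
            echo "${rest%%/*}"       # keep only the PID
        fi
    done | sort -un
}
```

Run as root, `pids_holding_node /dev/dm-49` prints one PID per line, or nothing if no process has the node open directly.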
Gianluca Cecchi
2014-11-27 15:36:52 UTC
Post by Zdenek Kabelac
Ahh - my adblock filter was too strict - however, the information passed is not that useful.
The preferred output is that of 'lvchange -vvvv -an'
Thanks!
here it is:
https://drive.google.com/file/d/0BwoPbcrMv8mvellaS1FtVXpqU2c/view?usp=sharing
Post by Zdenek Kabelac
Post by Zdenek Kabelac
But anyway - if you have a problem with 'semaphore' resources - you could
'recycle' the old ones with
'dmsetup udevcomplete_all'
This is actually a production server with many other LVs... Is there any
drawback in the command above?
You could 'complete' all cookies older than, e.g., 1 minute.
Nothing should be holding a cookie for that long.
So the command would be

dmsetup udevcomplete_all 1
correct?
Post by Zdenek Kabelac
Nov 26 17:35:34 orapr2 rgmanager[5908]: [lvm] lv_activate_resilient stop
failed on VG_AAA_TEMP/LV_AAA_TEMP
Nov 26 17:35:34 orapr2 rgmanager[5928]: [lvm] Unable to deactivate
VG_AAA_TEMP/LV_AAA_TEMP
Well - here we need to know the reason why it has failed -
Is there some user of VG_AAA_TEMP/LV_AAA_TEMP ?
Mounted, opened, used as a device for something else ?
It seems not so.
Post by Zdenek Kabelac
Post by Zdenek Kabelac
LV VG Attr LSize Pool Origin Data%
Move Log
Cpy%Sync Convert
LV_AAA_TEMP VG_AAA_TEMP -wi-ao---- 1020.00m
How can I see the responsible for the reference that apparently keeps
it open?
Open count: 1
so I can check and eventually fix??
lsof - look for user of 253,49
[***@orapr2 ~]# lsof | grep 253,49
[***@orapr2 ~]#

While for example for another LV (with major,minor=253,68) that is part of
a running service I have at this moment:
[***@orapr2 ~]# lsof | grep 253,68
oracle 25298 oracle11se 273uW REG 253,68 988811264 12 /PABX/temp/tempPABX.dbf
oracle 25298 oracle11se 274uW REG 253,68 1319112704 13 /PABX/temp/tempnewPABX.dbf
oracle 25300 oracle11se 269u REG 253,68 988811264 12 /PABX/temp/tempPABX.dbf
oracle 25302 oracle11se 266u REG 253,68 1319112704 13 /PABX/temp/tempnewPABX.dbf
oracle 25302 oracle11se 270u REG 253,68 988811264 12 /PABX/temp/tempPABX.dbf
oracle 25304 oracle11se 266u REG 253,68 1319112704 13 /PABX/temp/tempnewPABX.dbf
oracle 25304 oracle11se 269u REG 253,68 988811264 12 /PABX/temp/tempPABX.dbf
oracle 25306 oracle11se 283u REG 253,68 988811264 12 /PABX/temp/tempPABX.dbf
oracle 25306 oracle11se 284u REG 253,68 1319112704 13 /PABX/temp/tempnewPABX.dbf
oracle 25310 oracle11se 270u REG 253,68 988811264 12 /PABX/temp/tempPABX.dbf
oracle 25310 oracle11se 271u REG 253,68 1319112704 13 /PABX/temp/tempnewPABX.dbf

Gianluca
Zdenek Kabelac
2014-11-27 15:43:22 UTC
Post by Zdenek Kabelac
Ahh - my adblock filter was too strict - however, the information passed is not that useful.
The preferred output is that of 'lvchange -vvvv -an'
Thanks!
https://drive.google.com/file/d/0BwoPbcrMv8mvellaS1FtVXpqU2c/view?usp=sharing
Well yes - it shows the device is being held open - lvm2 retries several
times in case the device gets closed - but it never does, so it refuses to
deactivate it.

So if you are sure there is no user - then maybe kpartx has created
partition mappings on top of this device?

I'd probably need to see your whole 'dmsetup table'
(at least do a grep for the VG_AAA_TEMP/LV_AAA_TEMP minor)
Post by Zdenek Kabelac
You could 'complete' all cookies older than, e.g., 1 minute.
Nothing should be holding a cookie for that long.
So the command would be
dmsetup udevcomplete_all 1
correct?
Yes

Zdenek
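Before completing cookies on a production machine it may help to list them first. A minimal sketch, assuming the udev cookie semaphores are the System V semaphore sets whose keys start with the 0x0d4d prefix that `dmsetup udevcomplete_all` itself reports, and assuming the usual `ipcs -s` column layout (key in column 1, semid in column 2). The function name is hypothetical; `dmsetup udevcookies` should give a similar listing directly.

```shell
# Hypothetical helper: read `ipcs -s` output on stdin and print the key and
# semid of every semaphore set whose key starts with the 0x0d4d prefix.
dm_cookie_semaphores() {
    awk 'tolower($1) ~ /^0x0d4d/ { print $1, $2 }'
}
```

Usage would be `ipcs -s | dm_cookie_semaphores`; empty output suggests there are no leftover cookies to complete.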
Gianluca Cecchi
2014-11-27 15:56:13 UTC
Post by Zdenek Kabelac
Post by Gianluca Cecchi
Thanks!
https://drive.google.com/file/d/0BwoPbcrMv8mvellaS1FtVXpqU2c/view?usp=sharing
Well yes - it shows the device is being held open - lvm2 retries several
times in case the device gets closed - but it never does, so it
refuses to deactivate it.
So if you are sure there is no user - then maybe kpartx has created
partition mappings on top of this device?
No, the PV is the whole disk and the VG is composed only of this LV that
uses all the PEs

[***@orapr2 ~]# pvs /dev/mapper/360a9800037543544465d424130533177
  PV                                            VG          Fmt  Attr PSize    PFree
  /dev/mapper/360a9800037543544465d424130533177 VG_AAA_TEMP lvm2 a--  1020.00m    0

[***@orapr2 ~]# fdisk -l /dev/mapper/360a9800037543544465d424130533177

Disk /dev/mapper/360a9800037543544465d424130533177: 1073 MB, 1073741824 bytes
255 heads, 63 sectors/track, 130 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 65536 bytes
Disk identifier: 0x00000000

[***@orapr2 ~]# vgdisplay -v VG_AAA_TEMP
Using volume group(s) on command line
Finding volume group "VG_AAA_TEMP"
--- Volume group ---
VG Name VG_AAA_TEMP
System ID
Format lvm2
Metadata Areas 1
Metadata Sequence No 56
VG Access read/write
VG Status resizable
MAX LV 0
Cur LV 1
Open LV 1
Max PV 0
Cur PV 1
Act PV 1
VG Size 1020.00 MiB
PE Size 4.00 MiB
Total PE 255
Alloc PE / Size 255 / 1020.00 MiB
Free PE / Size 0 / 0
VG UUID 3sItFO-bvAo-FINS-R6Mj-Fcl7-4kF2-gmzY4M

--- Logical volume ---
LV Path /dev/VG_AAA_TEMP/LV_AAA_TEMP
LV Name LV_AAA_TEMP
VG Name VG_AAA_TEMP
LV UUID azFNtf-L1CY-3rdz-OQLA-7oF4-3rMX-uWfXpH
LV Write Access read/write
LV Creation host, time ,
LV Status available
# open 1
LV Size 1020.00 MiB
Current LE 255
Segments 1
Allocation inherit
Read ahead sectors auto
- currently set to 256
Block device 253:49

--- Physical volumes ---
PV Name /dev/mapper/360a9800037543544465d424130533177
PV UUID LnsWgD-XfYe-axfJ-nBgC-Gew5-jBH3-oXpiIO
PV Status allocatable
Total PE / Free PE 255 / 0
Post by Zdenek Kabelac
I'd probably need to see your whole 'dmsetup table'
(at least do a grep for VG_AAA_TEMP/LV_AAA_TEMP minor)
[***@orapr2 ~]# dmsetup table | egrep "AAA_TEMP|360a9800037543544465d424130533177|253:4"
VG_AAA_TEMP-LV_AAA_TEMP: 0 2088960 linear 253:4 384
360a9800037543544465d424130533177: 0 2097152 multipath 4 queue_if_no_path
pg_init_retries 50 retain_attached_hw_handler 1 alua 2 1 round-robin 0 4 1
66:160 1 69:192 1 133:224 1 8:288 1 round-robin 0 4 1 8:96 1 68:48 1 129:32
1 130:128 1

There seem to be no semaphores to clean up...
[***@orapr2 ~]# dmsetup udevcomplete_all 1
This operation will destroy all semaphores older than 1 minutes with keys
that have a prefix 3405 (0xd4d).
Do you really want to continue? [y/n]: y
0 semaphores with keys prefixed by 3405 (0xd4d) destroyed. 0 skipped.

and still
[***@orapr2 ~]# lvchange -an VG_AAA_TEMP/LV_AAA_TEMP
Logical volume VG_AAA_TEMP/LV_AAA_TEMP in use.

;-(
Zdenek Kabelac
2014-11-27 16:05:32 UTC
Post by Gianluca Cecchi
LV Path /dev/VG_AAA_TEMP/LV_AAA_TEMP
LV Name LV_AAA_TEMP
VG Name VG_AAA_TEMP
LV UUID azFNtf-L1CY-3rdz-OQLA-7oF4-3rMX-uWfXpH
LV Write Access read/write
LV Creation host, time ,
LV Status available
# open 1
LV Size 1020.00 MiB
Current LE 255
Segments 1
Allocation inherit
Read ahead sectors auto
- currently set to 256
Block device 253:49
--- Physical volumes ---
PV Name /dev/mapper/360a9800037543544465d424130533177
PV UUID LnsWgD-XfYe-axfJ-nBgC-Gew5-jBH3-oXpiIO
PV Status allocatable
Total PE / Free PE 255 / 0
I'd probably need to see your whole 'dmsetup table'
(at least do a grep for VG_AAA_TEMP/LV_AAA_TEMP minor)
"AAA_TEMP|360a9800037543544465d424130533177|253:4"
So it's probably time to look through the 'dmesg' output for anything
suspicious.

Any kernel bug report showing erroneous behavior that could lead to an open
descriptor being leaked?

Is there anything in /sys/block/dm-49/holders ?

cat /proc/self/mountinfo | grep 253:49 ?


Zdenek
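The holders check above needs the dm-N name; a minimal sketch that starts from the major:minor pair instead. It assumes the kernel exposes `/sys/dev/block/MAJOR:MINOR` as a link to the device's sysfs directory, which contains the same `holders` subdirectory. The function name is hypothetical.

```shell
# Hypothetical helper: list the kernel-side holders of a block device.
#   $1 = device number as MAJOR:MINOR (e.g. 253:49)
#   $2 = sysfs root (defaults to /sys; overridable for testing)
# Prints one holder name per line; returns 1 if the device is unknown.
holders_of() {
    devno=$1
    sysroot=${2:-/sys}
    dir="$sysroot/dev/block/$devno/holders"
    [ -d "$dir" ] || return 1
    ls "$dir"
}
```

For example, `holders_of 253:49` printing nothing would match the empty /sys/block/dm-49/holders directory, meaning no other block device is stacked on top of the LV.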
Gianluca Cecchi
2014-11-27 16:27:35 UTC
Post by Zdenek Kabelac
So it's probably time to look through the 'dmesg' output for anything
suspicious.
Any kernel bug report showing erroneous behavior that could lead to an open
descriptor being leaked?
Is there anything in /sys/block/dm-49/holders ?
cat /proc/self/mountinfo | grep 253:49 ?
I tried before, but found nothing

[***@orapr2 ~]# ll /sys/block/dm-49/holders
total 0

[***@orapr2 ~]# cat /proc/self/mountinfo | grep 253:49
[***@orapr2 ~]#

last lines of dmesg contain
dlm: closing connection to node 1
dlm: got connection from 1

That refers to when we restarted the passive node yesterday
Nov 26 17:23:58 orapr2 kernel: dlm: closing connection to node 1
Zdenek Kabelac
2014-11-27 16:35:48 UTC
Post by Zdenek Kabelac
So it's probably time to look through the 'dmesg' output for anything
suspicious.
Any kernel bug report showing erroneous behavior that could lead to an open
descriptor being leaked?
Is there anything in /sys/block/dm-49/holders ?
cat /proc/self/mountinfo | grep 253:49 ?
I tried before, but found nothing
total 0
As has been pointed out to me, cgroups could play some role -

so let's repeat with /proc/*/mountinfo

Also check /proc/swaps ?

Zdenek
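A minimal sketch of the per-process mountinfo scan suggested above. It relies on the documented mountinfo layout, where field 3 is the device's major:minor and field 5 is the mount point, and compares field 3 exactly so that 253:49 does not also match 253:490. The function name is hypothetical.

```shell
# Hypothetical helper: print the mount points at which a device is mounted
# in the mount namespace of any process.
#   $1 = device number as MAJOR:MINOR (e.g. 253:49)
#   $2 = proc root (defaults to /proc; overridable for testing)
mounts_of_dev() {
    dev=$1
    root=${2:-/proc}
    # Processes may exit while we read; ignore the resulting cat errors.
    cat "$root"/[0-9]*/mountinfo 2>/dev/null |
        awk -v dev="$dev" '$3 == dev { print $5 }' |
        sort -u
}
```

`mounts_of_dev 253:49` printing nothing means no process sees the LV mounted, including processes in other mount namespaces.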
Gianluca Cecchi
2014-11-27 16:46:23 UTC
Post by Zdenek Kabelac
As has been pointed out to me, cgroups could play some role -
mhh I don't think I'm using Control Groups at all....
[***@orapr2 ~]# rpm -qa |grep group
[***@orapr2 ~]#

[***@orapr2 ~]# ll /etc/cgconfig.conf
ls: cannot access /etc/cgconfig.conf: No such file or directory
Post by Zdenek Kabelac
so lets repeat with /proc/*/mountinfo
[***@orapr2 ~]# cat /proc/*/mountinfo > /tmp/proc.log
cat: /proc/33700/mountinfo: Invalid argument
cat: /proc/33735/mountinfo: No such file or directory

[***@orapr2 ~]# wc -l /tmp/proc.log
55050 /tmp/proc.log

[***@orapr2 ~]# grep 253:49 /tmp/proc.log
[***@orapr2 ~]#


Also check /proc/swaps ?
[***@orapr2 ~]# cat /proc/swaps
Filename Type Size Used Priority
/dev/sda2 partition 16777212 0 -1
Zdenek Kabelac
2014-11-27 17:04:01 UTC
Post by Zdenek Kabelac
As has been pointed out to me, cgroups could play some role -
mhh I don't think I'm using Control Groups at all....
ls: cannot access /etc/cgconfig.conf: No such file or directory
so lets repeat with /proc/*/mountinfo
cat: /proc/33700/mountinfo: Invalid argument
cat: /proc/33735/mountinfo: No such file or directory
55050 /tmp/proc.log
Also check /proc/swaps ?
Filename Type Size Used Priority
/dev/sda2 partition 16777212 0 -1
So I'm afraid you will need to open a RHEL support ticket to have all the
diagnostics (sosreport) analyzed - since, if this is a production system,
any further advice could cause problems.

Zdenek
Gianluca Cecchi
2014-11-27 17:18:50 UTC
Post by Zdenek Kabelac
So I'm afraid you will need to open a RHEL support ticket to have all the
diagnostics (sosreport) analyzed - since, if this is a production system,
any further advice could cause problems.
Zdenek
OK. Can I refer to this thread and the help you provided in the case discussion?
Gianluca
Zdenek Kabelac
2014-11-27 17:21:07 UTC
Post by Zdenek Kabelac
So I'm afraid you will need to open a RHEL support ticket to have all the
diagnostics (sosreport) analyzed - since, if this is a production system,
any further advice could cause problems.
Zdenek
OK. Can I refer to this thread and the help you provided in the case discussion?
Gianluca
Sure, np

Zdenek
Gianluca Cecchi
2014-11-27 17:49:09 UTC
Post by Zdenek Kabelac
Post by Zdenek Kabelac
So I'm afraid you will need to open a RHEL support ticket to have all the
diagnostics (sosreport) analyzed - since, if this is a production system,
any further advice could cause problems.
Zdenek
OK. Can I refer to this thread and the help you provided in the case discussion?
Gianluca
Sure, np
Zdenek
I opened case 01308210. Let's see what happens.
We have a maintenance window on Saturday, so if necessary we will have to stop
the services and power off the node before enabling them again on the other
one....
Thanks for your kindness and for answering so quickly with suggestions.

Gianluca
