Discussion:
[dm-devel] dmitry.kasatkin@huawei.com
Renesanso
2017-07-03 04:44:07 UTC
Hi all.

Dmitry Kasatkin's fork of linux.git contains a dm-integrity patch for the
Linux kernel, along with an example of how to create the dm device:
https://kernel.googlesource.com/pub/scm/linux/kernel/git/kasatkin/linux-digsig/+/2dfa67a1a4c049fd33fcc6abcb1c8ca57b17a268/Documentation/device-mapper/dm-integrity.txt
However, the mainline kernel has a different dm-integrity, which is supposed
to work in ECC mode or as an AEAD backend for dm-crypt:
https://github.com/torvalds/linux/blob/master/Documentation/device-mapper/dm-integrity.txt
https://www.redhat.com/archives/dm-devel/2017-January/msg00028.html

I tried to use dmsetup to set up dm-integrity in ECC mode, but when I change
a block on the backing device, dm-integrity does not react: the upper level
gets a different md5sum, but no error. As for dm-crypt, I cannot understand
how to use AEAD mode.

Please give full instructions for using dm-integrity in ECC mode and with
dm-crypt (including kernel keyring creation).

Thanks, Renesanso.
Milan Broz
2017-07-03 15:05:35 UTC
Post by Renesanso
Hi all.
Dmitry Kasatkin's fork of linux.git contains a dm-integrity patch for the Linux
...

Yes, unfortunately we gave the target the same name (and I realized it too late).

It does something similar, but it is definitely not the same.
Post by Renesanso
I tried to use dmsetup to set up dm-integrity in ECC mode, but when I change
a block on the backing device, dm-integrity does not react: the upper level
gets a different md5sum, but no error. As for dm-crypt, I cannot understand
how to use AEAD mode.
You probably configured it in a mode where it only provides tag space
but does not calculate and verify an internal hash.

(ECC means error correction. This target does not provide error correction,
only error detection. Such a tool could be written on top of dm-integrity, though.)
Post by Renesanso
Please give full instructions for using dm-integrity in ECC mode and with
dm-crypt (including kernel keyring creation).
dm-integrity can work in standalone mode or together with dm-crypt.

For the standalone mode, it is best to use the integritysetup tool
(for now in the master branch of the cryptsetup project).
https://gitlab.com/cryptsetup/cryptsetup

There is some simple documentation in the man page and on this page:
https://gitlab.com/cryptsetup/cryptsetup/wikis/DMIntegrity

(You can set up HMAC integrity protection in standalone mode as well.)
I will update it soon with some more info and prepare some better examples
(the whole userspace is still not finished, though it should work).

For the combination with dm-crypt and AEAD: this is part of the LUKS2 branch
in the same repository, but it is really only for experiments.
Once we have a testing build, I'll write more here. Sorry, it is taking
longer than I expected.

Milan
Milan Broz
2017-07-07 10:47:47 UTC
1. This (
https://kernel.googlesource.com/pub/scm/linux/kernel/git/kasatkin/linux-digsig/+/2dfa67a1a4c049fd33fcc6abcb1c8ca57b17a268/Documentation/device-mapper/dm-integrity.txt
) implementation offers the option of using an external device for metadata
and the journal. I think it really affects performance. Do you plan to add
analogous functionality?
Not yet. The design was meant for authenticated extension and cryptsetup
has only one device as a backend (I really do not want to make it another volume manager :)
(It is not an easy task to handle the situation when the separate devices are out of sync,
and the security implications only complicate it more.)

But I can imagine that if there is a strong use case, it can be added.

(As said, it was not meant as a straight replacement for the module mentioned above;
the name clash is just my stupid coincidence.)
2. Another question: in your implementation, are tags written right after
the data (e.g. data[512b], tag[32b], data[512b], tag[32b]), or is the data
stored in one "half" of the disk and the tags in another (at the end of the
disk, for example)? The second variant gives a VERY HUGE penalty on HDDs.
Tags are written into metadata sectors that are interleaved with normal
data sectors. (This allows us to resize the device later.)

There are multiple tags in one metadata sector and dm-integrity
provides all the infrastructure to perform atomic write of sector+metadata.
3. As I see, there are many options (journal, buffers and others).
Can you give me an example parameter configuration that works fully
correctly in production (KVM VMs -> raw -> EXT4 -> LVM -> MD ->
multiple dm-integrity on multiple physical disks)?
Sorry, I do not have any "good practices" or any configurations yet.
In fact, I am still curious what use cases people can find for it :)

There are a lot of things to fine-tune, though. But the initial
penalty of using a data journal will always be big.
(You can switch journalling off if it is already done on a higher layer.)

The whole integrity/crypt stack was meant for experiments with
authenticated encryption on the sector level (and with expected
performance penalty - security is the first class citizen here).

I will try to put some more info into some blog soon.

Thanks,
Milan
Renesanso
2017-07-12 18:30:06 UTC
Hi. *Please give a link to your blog.*

I successfully created a configuration for virtual machines (and it works
fine) and tested forced reboots many times. (The machine had a problem with
its ACPI table, which is why the SATA controller reset the links for all
disks (now fixed), but only one disk dropped out of the md RAID and the
system kept working. The controller problem served as a heavy I/O test.)
So I think that dm-integrity with journal and crc32 is stable. In one of
the tests I wrote 1 MB with dd (with skip) and then read (cat > /dev/null)
the LVM partition on top of md. dm-integrity successfully detected the
corruption and generated an I/O error, and md fenced the disk but kept working.


I use this configuration: 7 SATA disks => each with its own dm-integrity =>
md raid6 with 6 disks + 1 spare disk -> LVM for virtual machines.

If any disk returns bad data, dm-integrity will generate an I/O error
and md will fence the disk from the RAID, but the system continues working.
You may say that I should use ZFS for this case, but I don't have an SSD
cache, and without an SSD cache ZFS is too slow, and it will be SLOWSLOWSLOW
in the future when the virtual machines' disks fragment because of CoW.
My solution provides integrity without the fragmentation penalty.

Please use this info in your blog, because in production we want not
only to know that data from a disk is bad, but also to read good data and
continue working. If you paste it into your blog, please change the serial
numbers of my disks. :)

*One more thing: after the system boots and the assemble script has run, it
eats ~4 GB of RAM (dm-integrity alone). Which parameter causes this?*


I use this script on boot to assemble the RAID:

cd /root/local/integrity/src

LD_PRELOAD='../lib/.libs/libcryptsetup.so.12' ./integritysetup \
  --integrity=crc32 --buffer-sectors=414096 --journal-watermark=70 \
  --journal-commit-time=60000 open \
  /dev/disk/by-id/ata-WDC_WD5000AADS-00S9B0_WD-WCAV90886975-part1 \
  integra-WCAV90886975

LD_PRELOAD='../lib/.libs/libcryptsetup.so.12' ./integritysetup \
  --integrity=crc32 --buffer-sectors=414096 --journal-watermark=70 \
  --journal-commit-time=60000 open \
  /dev/disk/by-id/ata-WDC_WD5003AZEX-00MK2A0_WD-WCC3F0SJX4T6-part1 \
  integra-WCC3F0SJX4T6

LD_PRELOAD='../lib/.libs/libcryptsetup.so.12' ./integritysetup \
  --integrity=crc32 --buffer-sectors=414096 --journal-watermark=70 \
  --journal-commit-time=60000 open \
  /dev/disk/by-id/ata-WDC_WD5001AALS-00J7B0_WD-WMATV7467797-part1 \
  integra-WMATV7467797

LD_PRELOAD='../lib/.libs/libcryptsetup.so.12' ./integritysetup \
  --integrity=crc32 --buffer-sectors=414096 --journal-watermark=70 \
  --journal-commit-time=60000 open \
  /dev/disk/by-id/ata-WDC_WD5003AZEX-00MK2A0_WD-WCC3F0YY89KN-part1 \
  integra-WCC3F0YY89KN

LD_PRELOAD='../lib/.libs/libcryptsetup.so.12' ./integritysetup \
  --integrity=crc32 --buffer-sectors=414096 --journal-watermark=70 \
  --journal-commit-time=60000 open \
  /dev/disk/by-id/ata-WDC_WD5003AZEX-00MK2A0_WD-WCC3F5HF3NN0-part1 \
  integra-WCC3F5HF3NN0

LD_PRELOAD='../lib/.libs/libcryptsetup.so.12' ./integritysetup \
  --integrity=crc32 --buffer-sectors=414096 --journal-watermark=70 \
  --journal-commit-time=60000 open \
  /dev/disk/by-id/ata-WDC_WD5003AZEX-00K3CA0_WD-WCC6Y4XEA1RS-part1 \
  integra-WCC6Y4XEA1RS

LD_PRELOAD='../lib/.libs/libcryptsetup.so.12' ./integritysetup \
  --integrity=crc32 --buffer-sectors=414096 --journal-watermark=70 \
  --journal-commit-time=60000 open \
  /dev/mapper/vg--super--blue-lv--super--spare integra-spare

cd /

mdadm --assemble --scan

sleep 6

echo 50000 > /proc/sys/dev/raid/speed_limit_min
echo 500000 > /proc/sys/dev/raid/speed_limit_max
echo 32768 > /sys/block/md0/md/stripe_cache_size

And this to create:

LD_PRELOAD='../lib/.libs/libcryptsetup.so.12' ./integritysetup -v \
  --sector-size=512 --integrity=crc32 --tag-size 32 \
  --buffer-sectors=414096 --journal-size=836870912 format \
  /dev/disk/by-id/ata-WDC_WD5000AADS-00S9B0_WD-WCAV90886975-part1  ## -> ../../sda1

LD_PRELOAD='../lib/.libs/libcryptsetup.so.12' ./integritysetup -v \
  --sector-size=512 --integrity=crc32 --tag-size 32 \
  --buffer-sectors=414096 --journal-size=836870912 format \
  /dev/disk/by-id/ata-WDC_WD5003AZEX-00MK2A0_WD-WCC3F0SJX4T6-part1  ## -> ../../sdb1

LD_PRELOAD='../lib/.libs/libcryptsetup.so.12' ./integritysetup -v \
  --sector-size=512 --integrity=crc32 --tag-size 32 \
  --buffer-sectors=414096 --journal-size=836870912 format \
  /dev/disk/by-id/ata-WDC_WD5001AALS-00J7B0_WD-WMATV7467797-part1  ## -> ../../sdc1

LD_PRELOAD='../lib/.libs/libcryptsetup.so.12' ./integritysetup -v \
  --sector-size=512 --integrity=crc32 --tag-size 32 \
  --buffer-sectors=414096 --journal-size=836870912 format \
  /dev/disk/by-id/ata-WDC_WD5003AZEX-00MK2A0_WD-WCC3F0YY89KN-part1  ## -> ../../sdd1

LD_PRELOAD='../lib/.libs/libcryptsetup.so.12' ./integritysetup -v \
  --sector-size=512 --integrity=crc32 --tag-size 32 \
  --buffer-sectors=414096 --journal-size=836870912 format \
  /dev/disk/by-id/ata-WDC_WD5003AZEX-00MK2A0_WD-WCC3F5HF3NN0-part1  ## -> ../../sde1

LD_PRELOAD='../lib/.libs/libcryptsetup.so.12' ./integritysetup -v \
  --sector-size=512 --integrity=crc32 --tag-size 32 \
  --buffer-sectors=414096 --journal-size=836870912 format \
  /dev/disk/by-id/ata-WDC_WD5003AZEX-00K3CA0_WD-WCC6Y4XEA1RS-part1  ## -> ../../sdf1

LD_PRELOAD='../lib/.libs/libcryptsetup.so.12' ./integritysetup -v \
  --sector-size=512 --integrity=crc32 --tag-size 32 \
  --buffer-sectors=414096 --journal-size=836870912 format \
  /dev/mapper/vg--super--blue-lv--super--spare

mdadm --verbose --create /dev/md0 --bitmap=internal \
  --bitmap-chunk=524288 -c 128 --level=6 --raid-devices=6 \
  /dev/mapper/integra-WCAV90886975 /dev/mapper/integra-WCC3F0SJX4T6 \
  /dev/mapper/integra-WMATV7467797 /dev/mapper/integra-WCC3F0YY89KN \
  /dev/mapper/integra-WCC3F5HF3NN0 /dev/mapper/integra-WCC6Y4XEA1RS \
  --spare-devices=1 /dev/mapper/integra-spare
And I have another question: is it ready for production in crc32+journal
mode? I don't mean IBM-grade production, but is it stable now rather than completely buggy? :)
Hi,
well, there are some tests, we found and fixed some issues in 4.12-rcX
(so be sure you are using code from released 4.12.0 or later!).
So I hope it is stable, but it would definitely need more users to test.
If you find any problem, crash or anything related to kernel code, please send
Thanks,
Milan
Mikulas Patocka
2017-07-13 13:22:37 UTC
Hi
And one more thing: after the system boots and the assemble script has run, it
eats ~4 GB of RAM (dm-integrity alone). Which parameter causes this?
The journal must be kept in memory. You have a journal size of 836870912 bytes
on each of 7 devices; that's 5858096384 bytes in total. If you want to reduce
memory consumption, reduce the journal size.
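The arithmetic behind that figure can be reproduced with a small sketch. It assumes, as stated above, that each device keeps its whole journal in memory; the variable names are only illustrative.

```shell
#!/bin/sh
# Rough estimate of the memory held by dm-integrity journals.
# Assumption: every device keeps its full journal in memory.
journal_size=836870912   # bytes, the --journal-size value used at format time
devices=7                # six RAID members plus one spare

total=$((journal_size * devices))
echo "total journal bytes: $total"
echo "whole GiB (rounded down): $((total / 1024 / 1024 / 1024))"
```

Plugging in a smaller journal_size shows how far the total would drop.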
LD_PRELOAD='../lib/.libs/libcryptsetup.so.12' ./integritysetup -v
--sector-size=512 --integrity=crc32 --tag-size 32
--buffer-sectors=414096 --journal-size=836870912 format
/dev/disk/by-id/ata-WDC_WD5000AADS-00S9B0_WD-WCAV90886975-part1 ##->
../../sda1
LD_PRELOAD='../lib/.libs/libcryptsetup.so.12' ./integritysetup -v
--sector-size=512 --integrity=crc32 --tag-size 32
--buffer-sectors=414096 --journal-size=836870912 format
/dev/disk/by-id/ata-WDC_WD5003AZEX-00MK2A0_WD-WCC3F0SJX4T6-part1 ##->
../../sdb1
LD_PRELOAD='../lib/.libs/libcryptsetup.so.12' ./integritysetup -v
--sector-size=512 --integrity=crc32 --tag-size 32
--buffer-sectors=414096 --journal-size=836870912 format
/dev/disk/by-id/ata-WDC_WD5001AALS-00J7B0_WD-WMATV7467797-part1 ##->
../../sdc1
LD_PRELOAD='../lib/.libs/libcryptsetup.so.12' ./integritysetup -v
--sector-size=512 --integrity=crc32 --tag-size 32
--buffer-sectors=414096 --journal-size=836870912 format
/dev/disk/by-id/ata-WDC_WD5003AZEX-00MK2A0_WD-WCC3F0YY89KN-part1 ##->
../../sdd1
LD_PRELOAD='../lib/.libs/libcryptsetup.so.12' ./integritysetup -v
--sector-size=512 --integrity=crc32 --tag-size 32
--buffer-sectors=414096 --journal-size=836870912 format
/dev/disk/by-id/ata-WDC_WD5003AZEX-00MK2A0_WD-WCC3F5HF3NN0-part1 ##->
../../sde1
LD_PRELOAD='../lib/.libs/libcryptsetup.so.12' ./integritysetup -v
--sector-size=512 --integrity=crc32 --tag-size 32
--buffer-sectors=414096 --journal-size=836870912 format
/dev/disk/by-id/ata-WDC_WD5003AZEX-00K3CA0_WD-WCC6Y4XEA1RS-part1 ##->
../../sdf1
LD_PRELOAD='../lib/.libs/libcryptsetup.so.12' ./integritysetup -v
--sector-size=512 --integrity=crc32 --tag-size 32
--buffer-sectors=414096 --journal-size=836870912 format
/dev/mapper/vg--super--blue-lv--super--spare
Note that --tag-size is in bytes and it is useless to have 32 bytes for
the crc32 function because crc32 returns just 4 bytes. These extra bytes
just take up disk space and reduce performance. You should omit the
--tag-size argument entirely - so that it defaults to the size of the
crc32 function (4 bytes).
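To put rough numbers on the wasted space, here is a small sketch. It assumes one tag is stored per data sector and ignores the journal and any padding inside metadata sectors, so treat the results as an approximation only. The helper name overhead_bp is made up for this example.

```shell
#!/bin/sh
# Approximate tag overhead in hundredths of a percent of the data size.
# Assumption: one tag per data sector; journal and padding are ignored.
overhead_bp() {
    tag_bytes=$1
    sector_bytes=$2
    echo $((tag_bytes * 10000 / sector_bytes))
}

echo "tag 32 B, sector 512 B:  $(overhead_bp 32 512) / 10000"
echo "tag  4 B, sector 512 B:  $(overhead_bp 4 512) / 10000"
echo "tag  4 B, sector 4096 B: $(overhead_bp 4 4096) / 10000"
```

Integer division rounds down, so the smaller values are slightly understated.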

Also - for better performance, use sector size 4096 (you must use
filesystem block size 4096 bytes as well). It makes sense to use small
sector size only if you need small filesystem block size (if you intend to
use the filesystem for storing many small files).
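Putting both suggestions together, a format command could look like the dry run sketched below. It only prints the commands and does not touch any device. The device paths and the smaller journal size are placeholders chosen for illustration, not values from this thread; the options themselves are the ones already used in the commands quoted above.

```shell
#!/bin/sh
# Dry run: print (do not execute) format commands that follow the advice above.
# Device paths and JOURNAL_SIZE are placeholders, not values from this thread.
SECTOR_SIZE=4096          # pair this with a 4096-byte filesystem block size
JOURNAL_SIZE=134217728    # hypothetical smaller journal (128 MiB)

build_format_cmd() {
    # --tag-size is deliberately omitted so it defaults to the crc32 size.
    echo "integritysetup -v --sector-size=$SECTOR_SIZE --integrity=crc32 --journal-size=$JOURNAL_SIZE format $1"
}

for dev in /dev/sdX1 /dev/sdY1; do
    build_format_cmd "$dev"
done
```

Review the printed commands before running any of them by hand, since formatting destroys the existing contents of the device.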
mdadm --verbose --create /dev/md0  --bitmap=internal --bitmap-chunk=524288 -c 128  --level=6  --raid-devices=6 /dev/mapper/integra-WCAV90886975
/dev/mapper/integra-WCC3F0SJX4T6 /dev/mapper/integra-WMATV7467797 /dev/mapper/integra-WCC3F0YY89KN /dev/mapper/integra-WCC3F5HF3NN0 /dev/mapper/integra-WCC6Y4XEA1RS
--spare-devices=1 /dev/mapper/integra-spare
And I have another question: is it ready for production in crc32+journal
mode? I don't mean IBM-grade production, but is it stable now rather than completely buggy? :)
There were no data corruption reports from users. But this is new code, so
not many users have tried it. Back up your data and then you can try it :)

Mikulas
Renesanso
2017-07-12 18:36:38 UTC
I have another question: why don't you use the AEAD idea from Red Hat for
dm-crypt (cryptsetup, which works, as they presented), which implements
AES-GCM (as ZFS does, for example)? Why do you want to merge dm-integrity
and dm-crypt?
https://mbroz.fedorapeople.org/talks/DevConf2017/devconf2017-aead.pdf
Milan Broz
2017-07-13 10:35:54 UTC
Post by Renesanso
I have another question: why don't you use the AEAD idea from Red Hat for
dm-crypt (cryptsetup, which works, as they presented), which implements
AES-GCM (as ZFS does, for example)? Why do you want to merge dm-integrity
and dm-crypt?
https://mbroz.fedorapeople.org/talks/DevConf2017/devconf2017-aead.pdf
Sorry? You mean my own talk? That exactly describes how it is implemented now.

We use AEAD when used together with encryption (dm-crypt) but this requires
LUKS2 userspace branch and this is not something I would like to use until
it is more stable.

As said in the slides you linked, dm-integrity can operate in two modes:

- standalone [parity only] (configured through integritysetup) and

- in cooperation with dm-crypt [for AEAD - authenticated encryption]
(will be configured through cryptsetup, but it is not yet in master branch)

Milan