factoring out the mode code and more
mark powers
mark.powers at sun.com
Wed Mar 5 08:45:59 PST 2008
Anthony Scarpino wrote:
> mark powers wrote:
>> I would like to propose the factoring out and parameterization of
>> algorithm mode code, e.g. ECB and CBC, that is duplicated in softtoken
>> as well as various kernel providers (both hardware and software). For
>> example, aes_cbc_encrypt_contiguous_blocks() would become
>> cbc_encrypt_contiguous_blocks() with additional parameters for an
>> algorithm specific block encryption routine, a block copy routine, and
>> a block XOR routine.
>
> I'm all for that..
>
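To make the proposal concrete, here is a minimal sketch of what a
parameterized CBC routine could look like. It is not the working code
mentioned further down; the callback types, the MAX_BLOCK_SIZE limit, the
argument order, and the toy_* stand-in cipher are all hypothetical and
exist only so the example is self-contained.

```c
#include <stddef.h>
#include <stdint.h>

#define MAX_BLOCK_SIZE 16   /* hypothetical upper bound (AES block size) */
#define TOY_BLOCK_SIZE 8    /* block size of the stand-in cipher below */

/* Hypothetical callback types; real provider signatures may differ. */
typedef void (*block_encrypt_fn)(const void *keysched,
    const uint8_t *in, uint8_t *out);
typedef void (*block_copy_fn)(const uint8_t *src, uint8_t *dst);
typedef void (*block_xor_fn)(const uint8_t *src, uint8_t *dst); /* dst ^= src */

/*
 * Generic CBC encryption over a contiguous buffer.  The algorithm-specific
 * work is delegated to the three callbacks.  On return, iv holds the last
 * ciphertext block so a later call can continue the chain.
 * Returns 0 on success, -1 if len is not a whole number of blocks.
 */
int
cbc_encrypt_contiguous_blocks(const void *keysched, uint8_t *iv,
    const uint8_t *in, uint8_t *out, size_t len, size_t block_size,
    block_encrypt_fn encrypt_block, block_copy_fn copy_block,
    block_xor_fn xor_block)
{
    uint8_t tmp[MAX_BLOCK_SIZE];

    if (block_size == 0 || block_size > MAX_BLOCK_SIZE ||
        len % block_size != 0)
        return (-1);

    for (size_t off = 0; off < len; off += block_size) {
        copy_block(in + off, tmp);               /* tmp = P[i] */
        xor_block(iv, tmp);                      /* tmp ^= C[i-1] (or IV) */
        encrypt_block(keysched, tmp, out + off); /* C[i] = E(tmp) */
        copy_block(out + off, iv);               /* chain for next block */
    }
    return (0);
}

/* Stand-in "cipher" for illustration only: adds the key bytes. NOT secure. */
static void
toy_encrypt_block(const void *keysched, const uint8_t *in, uint8_t *out)
{
    const uint8_t *k = keysched;

    for (int i = 0; i < TOY_BLOCK_SIZE; i++)
        out[i] = (uint8_t)(in[i] + k[i]);
}

static void
toy_copy_block(const uint8_t *src, uint8_t *dst)
{
    for (int i = 0; i < TOY_BLOCK_SIZE; i++)
        dst[i] = src[i];
}

static void
toy_xor_block(const uint8_t *src, uint8_t *dst)
{
    for (int i = 0; i < TOY_BLOCK_SIZE; i++)
        dst[i] ^= src[i];
}
```

Because each plaintext block is copied into a temporary before encryption,
the sketch also works when in and out point to the same buffer. Whether the
extra indirect calls cost anything measurable is the question the numbers
below try to answer.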
>>
>> Routines that handle data formats could also be made generic by adding
>> additional parameters. For example aes_cipher_update_iov() could become
>> cipher_update_iov() by adding two additional parameters for an
>> algorithm specific block encryption routine and block copy routine.
>>
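As a rough illustration of the data-format side, the sketch below walks a
scatter/gather list and hands whole blocks to a caller-supplied routine.
It is a userland stand-in, not the kernel code: it uses the POSIX struct
iovec rather than the kernel uio structures, it takes only the block
encryption callback and uses memcpy() in place of the block copy routine,
and the signature, the 64-byte staging buffer, and toy_encrypt_block() are
assumptions made for the example.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <sys/uio.h>    /* struct iovec */

/* Hypothetical callback type; real provider signatures may differ. */
typedef void (*block_encrypt_fn)(const void *keysched,
    const uint8_t *in, uint8_t *out);

/*
 * Gather input from an iovec array, one block at a time, and write the
 * processed blocks to a contiguous output buffer.  A block may straddle
 * two or more iovec segments.  Returns the number of bytes written, or -1
 * if the block size is unsupported or the total input length is not a
 * whole number of blocks.
 */
long
cipher_update_iov(const void *keysched, const struct iovec *iov, int iovcnt,
    uint8_t *out, size_t block_size, block_encrypt_fn encrypt_block)
{
    uint8_t buf[64];            /* staging area for one block */
    size_t filled = 0;          /* bytes currently staged in buf */
    size_t written = 0;         /* bytes emitted to out so far */

    if (block_size == 0 || block_size > sizeof (buf))
        return (-1);

    for (int i = 0; i < iovcnt; i++) {
        const uint8_t *p = iov[i].iov_base;
        size_t left = iov[i].iov_len;

        while (left > 0) {
            size_t n = block_size - filled;

            if (n > left)
                n = left;
            memcpy(buf + filled, p, n);
            filled += n;
            p += n;
            left -= n;

            if (filled == block_size) {
                encrypt_block(keysched, buf, out + written);
                written += block_size;
                filled = 0;
            }
        }
    }
    return (filled == 0 ? (long)written : -1);
}

/* Stand-in 8-byte "cipher" for illustration only: adds 1 to each byte. */
static void
toy_encrypt_block(const void *keysched, const uint8_t *in, uint8_t *out)
{
    (void) keysched;
    for (int i = 0; i < 8; i++)
        out[i] = (uint8_t)(in[i] + 1);
}
```

Nothing in the walker depends on the algorithm apart from the block size
and the callback, which is the point of the proposal.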
>> Cut/paste copies abound, e.g. copy_key_to_ctx().
>>
>> This is not the most interesting work but it is work that needs to be
>> done. Modularization is always good and duplicate routines are almost
>> always bad. This would result in smaller and more modular providers.
>
> Good stuff! hopefully this will make our lives easier in the future..
>
>>
>> I have working code for CBC mode and have obtained the following
>> numbers. Positive numbers represent a percent increase in execution
>> time (bad); negative numbers represent a decrease. Tests were run on a
>> sun4v using the stc2 kernel performance test, e.g.
>> 'cryperf_cmd -m CKM_AES_CBC -i 1000 -s 1600'.
>>
>> Test Description                                 AES    Blowfish         DES
>> ----------------------------------------------------------------------------
>> decrypt MBLK no ctx_tmpl in-place         -0.7056294   -0.155349   0.3140838
>> decrypt MBLK no ctx_tmpl not in-place      -0.596724   0.8790499   1.9297353
>> decrypt MBLK with ctx_tmpl in-place        0.3312157  -0.5511493  -0.1187767
>> decrypt MBLK with ctx_tmpl not in-place    0.3557175   3.1672403    1.437967
>> decrypt RAW no ctx_tmpl in-place          -0.6246451  -0.1557791   0.3585965
>> decrypt RAW no ctx_tmpl not in-place      -0.6174762   0.8615217   2.2032619
>> decrypt RAW with ctx_tmpl in-place         0.0713436  -0.5460843   -0.222882
>> decrypt RAW with ctx_tmpl not in-place    -0.1375983   3.1271017   1.3236677
>> decrypt UIO no ctx_tmpl in-place          -0.7118715  -0.1385048    0.270597
>> decrypt UIO no ctx_tmpl not in-place      -0.3315865   0.7249463   1.8685502
>> decrypt UIO with ctx_tmpl in-place           0.10926  -0.5672195   0.0704959
>> decrypt UIO with ctx_tmpl not in-place     0.4491429   2.6794898   1.5629129
>> encrypt MBLK no ctx_tmpl in-place         -0.0652814  -0.0782052   0.5574273
>> encrypt MBLK no ctx_tmpl not in-place     -0.5442802   1.0515774   2.4623328
>> encrypt MBLK with ctx_tmpl in-place       -0.1152394  -0.2715053    0.334237
>> encrypt MBLK with ctx_tmpl not in-place   -0.1140231   3.8085545   1.9412041
>> encrypt RAW no ctx_tmpl in-place          -0.0227794  -0.0743264   0.6902638
>> encrypt RAW no ctx_tmpl not in-place       0.2266514    0.944803   2.5581755
>> encrypt RAW with ctx_tmpl in-place         0.0765947  -0.3225215  -0.0158841
>> encrypt RAW with ctx_tmpl not in-place     0.3697104   3.3197377   1.9408673
>> encrypt UIO no ctx_tmpl in-place          -0.1278967  -0.0602069   0.5402997
>> encrypt UIO no ctx_tmpl not in-place       0.3321964   0.9407582   2.0355516
>> encrypt UIO with ctx_tmpl in-place         0.0967105  -0.2433845   0.3593826
>> encrypt UIO with ctx_tmpl not in-place     0.9618294   3.3013999   1.8216312
>
> I know a bit about the performance tests in the stc2 gate. Have you run
> them a number of times to make sure these increases and decreases are
> consistent? I seem to remember variations in the performance numbers in
> the past, but that may not be specific to this particular test.
Although I didn't mention it, I ran cryperf_cmd three times and averaged
the results.
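For reference, one plausible way to turn averaged runs into a percentage
like those in the table is sketched below. The exact formula behind the
table is not stated in this thread, so treat the function names and the
mean-then-compare order as assumptions.

```c
#include <stddef.h>

/* Arithmetic mean of n timing samples; returns 0.0 for an empty set. */
static double
mean(const double *runs, size_t n)
{
    double sum = 0.0;

    if (n == 0)
        return (0.0);
    for (size_t i = 0; i < n; i++)
        sum += runs[i];
    return (sum / (double)n);
}

/*
 * Percent change in execution time from the original code (before) to the
 * refactored code (after), each averaged over n runs.  Positive means the
 * refactored code is slower.  Returns 0.0 if the baseline mean is zero.
 */
double
percent_change(const double *before, const double *after, size_t n)
{
    double b = mean(before, n);
    double a = mean(after, n);

    if (b == 0.0)
        return (0.0);
    return (100.0 * (a - b) / b);
}
```

With only three runs per case, differences well under one percent are
probably within run-to-run noise.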
>
> Is there any visible reason why Blowfish should consistently suffer a 3%
> performance loss with ctx_tmpl, not in-place? Is the current Blowfish
> CBC module assuming something that makes it faster? It seems that
> generally most of the not in-place operations performed worse than the
> in-place ones.
Yes, I noticed that. As Ferenc noted in his reply, it's probably due to
cache collisions. This is something I need to pay close attention to.
>
>>
>> Taking this one step further, it might be possible to synthesize the
>> various modes totally within the kef module. The only problem is that
>> it would probably result in poor performance for hardware providers
>> handling asynchronous requests, since there would be an asynchronous
>> request/reply for every block.
>>
>> _______________________________________________
>> crypto-discuss mailing list
>> crypto-discuss at opensolaris.org
>> http://mail.opensolaris.org/mailman/listinfo/crypto-discuss
>