The TLS layer is checking for mode, such as GCM, CCM, CBC, STREAM. ChachaPoly
needs to have its own mode, even if it's used just one cipher, in order to
allow consistent handling of mode in the TLS layer.
* development: (182 commits)
Change the library version to 2.11.0
Fix version in ChangeLog for fix for #552
Add ChangeLog entry for clang version fix. Issue #1072
Compilation warning fixes on 32b platfrom with IAR
Revert "Turn on MBEDTLS_SSL_ASYNC_PRIVATE by default"
Fix for missing len var when XTS config'd and CTR not
ssl_server2: handle mbedtls_x509_dn_gets failure
Fix harmless use of uninitialized memory in ssl_parse_encrypted_pms
SSL async tests: add a few test cases for error in decrypt
Fix memory leak in ssl_server2 with SNI + async callback
SNI + SSL async callback: make all keys async
ssl_async_resume: free the operation context on error
ssl_server2: get op_name from context in ssl_async_resume as well
Clarify "as directed here" in SSL async callback documentation
SSL async callbacks documentation: clarify resource cleanup
Async callback: use mbedtls_pk_check_pair to compare keys
Rename mbedtls_ssl_async_{get,set}_data for clarity
Fix copypasta in the async callback documentation
SSL async callback: cert is not always from mbedtls_ssl_conf_own_cert
ssl_async_set_key: detect if ctx->slots overflows
...
For the situation where the mbedTLS device has limited RAM, but the
other end of the connection doesn't support the max_fragment_length
extension. To be spec-compliant, mbedTLS has to keep a 16384 byte
incoming buffer. However the outgoing buffer can be made smaller without
breaking spec compliance, and we save some RAM.
See comments in include/mbedtls/config.h for some more details.
(The lower limit of outgoing buffer size is the buffer size used during
handshake/cert negotiation. As the handshake is half-duplex it might
even be possible to store this data in the "incoming" buffer during the
handshake, which would save even more RAM - but it would also be a lot
hackier and error-prone. I didn't really explore this possibility, but
thought I'd mention it here in case someone sees this later on a mission
to jam mbedTLS into an even tinier RAM footprint.)
Fix compilation warnings with IAR toolchain, on 32 bit platform.
Reported by rahmanih in #683
This is based on work by Ron Eldor in PR #750, some of which was independently
fixed by Azim Khan and already merged in PR #1646.
The AES XTS self-test was using a variable len, which was declared only when CTR
was enabled. Changed the declaration of len to be conditional on CTR and XTS.
The AES OFB self-test made use of a variable `offset` but failed to have a
preprocessor condition around it, so unless CTR and CBC were enabled, the
variable would be undeclared.
In ssl_parse_encrypted_pms, some operational failures from
ssl_decrypt_encrypted_pms lead to diff being set to a value that
depended on some uninitialized unsigned char and size_t values. This didn't
affect the behavior of the program (assuming an implementation with no
trap values for size_t) because all that matters is whether diff is 0,
but Valgrind rightfully complained about the use of uninitialized
memory. Behave nicely and initialize the offending memory.
THe function `mbedtls_gf128mul_x_ble()` doesn't multiply by x, x^4, and
x^8. Update the function description to properly describe what the function
does.
mbedtls_aes_crypt_xts() currently takes a `bits_length` parameter, unlike
the other block modes. Change the parameter to accept a bytes length
instead, as the `bits_length` parameter is not actually ever used in the
current implementation.
Add a new context structure for XTS. Adjust the API for XTS to use the new
context structure, including tests suites and the benchmark program. Update
Doxgen documentation accordingly.
AES-XEX is a building block for other cryptographic standards and not yet a
standard in and of itself. We'll just provide the standardized AES-XTS
algorithm, and not AES-XEX. The AES-XTS algorithm and interface provided
can be used to perform the AES-XEX algorithm when the length of the input
is a multiple of the AES block size.
If we're unlucky with memory placement, gf128mul_table_bbe may spread over
two cache lines and this would leak b >> 63 to a cache timing attack.
Instead, take an approach that is less likely to make different memory
loads depending on the value of b >> 63 and is also unlikely to be compiled
to a condition.
XTS mode is fully known as "xor-encrypt-xor with ciphertext-stealing".
This is the generalization of the XEX mode.
This implementation is limited to an 8-bits (1 byte) boundary, which
doesn't seem to be what was thought considering some test vectors [1].
This commit comes with tests, extracted from [1], and benchmarks.
Although, benchmarks aren't really nice here, as they work with a buffer
of a multiple of 16 bytes, which isn't a challenge for XTS compared to
XEX.
[1] http://csrc.nist.gov/groups/STM/cavp/documents/aes/XTSTestVectors.zip
As seen from the first benchmark run, AES-XEX was running pourly (even
slower than AES-CBC). This commit doubles the performances of the
current implementation.
XEX mode, known as "xor-encrypt-xor", is the simple case of the XTS
mode, known as "XEX with ciphertext stealing". When the buffers to be
encrypted/decrypted have a length divisible by the length of a standard
AES block (16), XTS is exactly like XEX.
When MBEDTLS_PLATFORM_MEMORY is defined but MBEDTLS_PLATFORM_FREE_MACRO or
MBEDTLS_PLATFORM_CALLOC_MACRO are not defined then the actual functions
used to allocate and free memory are stored in function pointers.
These pointers are exposed to the caller, and it means that the caller
and the library have to share a data section.
In TF-A, we execute in a very constrained environment, where some images
are executed from ROM and other images are executed from SRAM. The
images that are executed from ROM cannot be modified. The SRAM size
is very small and we are moving libraries to the ROM that can be shared
between the different SRAM images. These SRAM images could import all the
symbols used in mbedtls, but it would create an undesirable hard binary
dependency between the different images. For this reason, all the library
functions in ROM are accesed using a jump table whose base address is
known, allowing the images to execute with different versions of the ROM.
This commit changes the function pointers to actual functions,
so that the SRAM images only have to use the new exported symbols
(mbedtls_calloc and mbedtls_free) using the jump table. In
our scenario, mbedtls_platform_set_calloc_free is called from
mbedtls_memory_buffer_alloc_init which initializes the function pointers
to the internal buffer_alloc_calloc and buffer_alloc_free functions.
No functional changes to mbedtls_memory_buffer_alloc_init.
Signed-off-by: Roberto Vargas <roberto.vargas@arm.com>
Adds error handling into mbedtls_aes_crypt_ofb for AES errors, a self-test
for the OFB mode using NIST SP 800-38A test vectors and adds a check to
potential return errors in setting the AES encryption key in the OFB test
suite.
* development: (97 commits)
Updated version number to 2.10.0 for release
Add a disabled CMAC define in the no-entropy configuration
Adapt the ARIA test cases for new ECB function
Fix file permissions for ssl.h
Add ChangeLog entry for PR#1651
Fix MicroBlaze register typo.
Fix typo in doc and copy missing warning
Fix edit mistake in cipher_wrap.c
Update CTR doc for the 64-bit block cipher
Update CTR doc for other 128-bit block ciphers
Slightly tune ARIA CTR documentation
Remove double declaration of mbedtls_ssl_list_ciphersuites
Update CTR documentation
Use zeroize function from new platform_util
Move to new header style for ALT implementations
Add ifdef for selftest in header file
Fix typo in comments
Use more appropriate type for local variable
Remove useless parameter from function
Wipe sensitive info from the stack
...
Motivation is similar to NO_UDBL_DIVISION.
The alternative implementation of 64-bit mult is straightforward and aims at
obvious correctness. Also, visual examination of the generate assembly show
that it's quite efficient with clang, armcc5 and arm-clang. However current
GCC generates fairly inefficient code for it.
I tried to rework the code in order to make GCC generate more efficient code.
Unfortunately the only way to do that is to get rid of 64-bit add and handle
the carry manually, but this causes other compilers to generate less efficient
code with branches, which is not acceptable from a side-channel point of view.
So let's keep the obvious code that works for most compilers and hope future
versions of GCC learn to manage registers in a sensible way in that context.
See https://bugs.launchpad.net/gcc-arm-embedded/+bug/1775263
Allowing DECRYPT with crypt_and_tag is a risk as people might fail to check
the tag correctly (or at all). So force them to use auth_decrypt() instead.
See also https://github.com/ARMmbed/mbedtls/pull/1668
When MBEDTLS_TIMING_C was not defined in config.h, but the MemSan
memory sanitizer was activated, entropy_poll.c used memset without
declaring it. Fix this by including string.h unconditionally.
As a protection against the Lucky Thirteen attack, the TLS code for
CBC decryption in encrypt-then-MAC mode performs extra MAC
calculations to compensate for variations in message size due to
padding. The amount of extra MAC calculation to perform was based on
the assumption that the bulk of the time is spent in processing
64-byte blocks, which is correct for most supported hashes but not for
SHA-384. Correct the amount of extra work for SHA-384 (and SHA-512
which is currently not used in TLS, and MD2 although no one should
care about that).
Fix IAR compiler warnings
Two warnings have been fixed:
1. code 'if( len <= 0xFFFFFFFF )' gave warning 'pointless integer comparison'.
This was fixed by wraping the condition in '#if SIZE_MAX > 0xFFFFFFFF'.
2. code 'diff |= A[i] ^ B[i];' gave warning 'the order of volatile accesses is undefined in'.
This was fixed by read the volatile data in temporary variables before the computation.
Explain IAR warning on volatile access
Consistent use of CMAKE_C_COMPILER_ID
The cast to void was motivated by the assumption that the functions only
return non-zero when passed bad arguments, but that might not be true of
alternative implementation, for example on hardware failure.
- need HW failure codes too
- re-use relevant poly codes for chachapoly to save on limited space
Values were chosen to leave 3 free slots at the end of the NET odd range.
That's what it is. So we shouldn't set a block size != 1.
While at it, move call to chachapoly_update() closer to the one for GCM, as
they are similar (AEAD).
This reduces clutter, making the functions more readable.
Also, it makes lcov see each line as covered. This is not cheating, as the
lines that were previously seen as not covered are not supposed to be reached
anyway (failing branches of the selftests).
Thanks to this and previous test suite enhancements, lcov now sees chacha20.c
and poly1305.c at 100% line coverage, and for chachapoly.c only two lines are
not covered (error returns from lower-level module that should never happen
except perhaps if an alternative implementation returns an unexpected error).