A new test for mbedtls_timing_alarm(0) was introduced in PR 1136, which also
fixed it on Unix. Apparently test results on MinGW were not checked at that
point, so we missed that this new test was also failing on this platform.
generate add ctest test-suites, with the --verbose argument to be given
to the test suites.
The verbose output will be shown **only** if ctest is run with `-v` parameter
The verbose argument is to the test-suites, only when run through `ctest`
The race goes this way:
1. ssl_recv() succeeds (ie no signal received yet)
2. processing the message leads to aborting handshake with ret != 0
3. reset ret if we were signaled
4. print error if ret is still non-zero
5. go back to net_accept() which can be interrupted by a signal
We print the error message only if the signal is received between steps 3 and
5, not when it arrives between steps 1 and 3.
This can cause failures in ssl-opt.sh where we check for the presence of "Last
error was..." in the server's output: if we perform step 2, the client will be
notified and exit, then ssl-opt.sh will send SIGTERM to the server, but if it
didn't get a chance to run and pass step 3 in the meantime, we're in trouble.
The purpose of step 3 was to avoid spurious "Last error" messages in the
output so that ssl-opt.sh can check for a successful run by the absence of
that message. However, it is enough to suppress that message when the last
error we get is the one we expect from being interrupted by a signal - doing
more could hide real errors.
Also, improve the messages printed when interrupted to make it easier to
distinguish the two cases - this could be used in a testing script wanted to
check that the server doesn't see the client as disconnecting unexpectedly.
If lsof is not available, wait_server_start uses a fixed timeout,
which can trigger a race condition if the timeout turns out to be too
short. Emit a warning so that we know this is going on from the test
logs.
- Some of the CI machines don't have lsof installed yet, so rely on an sleeping
an arbitrary number of seconds while the server starts. We're seeing
occasional failures with the current delay because the CI machines are highly
loaded, which seems to indicate the current delay is not quite enough, but
hopefully not to far either, so double it.
- While at it, also double the watchdog delay: while I don't remember seeing
much failures due to client timeout, this change doesn't impact normal
running time of the script, so better err on the safe side.
These changes don't affect the test and should only affect the false positive
rate coming from the test framework in those scripts.
1) The MPI test for prime generation missed a return value
check for a call to `mbedtls_mpi_shift_r`. This is neither
critical nor new but should be fixed.
2) The RSA keygeneration example program contained code
initializing an RSA context after a potentially failing
call to CTR DRBG initialization, leaving the corresponding
RSA context free call in the cleanup section orphaned.
The commit fixes this by moving the initializtion of the
RSA context prior to the first potentially failing call.
* mbedtls-2.1:
selftest: fix build error in some configurations
Timing self test: shorten redundant tests
Timing self test: increased duration
Timing self test: increased tolerance
selftest: allow excluding a subset of the tests
selftest: allow running a subset of the tests
selftest: fixed an erroneous return code
selftest: refactor to separate the list of tests from the logic
Timing self test: print some diagnosis information
mbedtls_timing_get_timer: don't use uninitialized memory
timing interface documentation: minor clarifications
Timing: fix mbedtls_set_alarm(0) on Unix/POSIX
* public/pr/1223:
selftest: fix build error in some configurations
Timing self test: shorten redundant tests
Timing self test: increased duration
Timing self test: increased tolerance
selftest: allow excluding a subset of the tests
selftest: allow running a subset of the tests
selftest: fixed an erroneous return code
selftest: refactor to separate the list of tests from the logic
Timing self test: print some diagnosis information
mbedtls_timing_get_timer: don't use uninitialized memory
timing interface documentation: minor clarifications
Timing: fix mbedtls_set_alarm(0) on Unix/POSIX
Increase the duration of the self test, otherwise it tends to fail on
a busy machine even with the recently upped tolerance. But run the
loop only once, it's enough for a simple smoke test.
mbedtls_timing_self_test fails annoyingly often when running on a busy
machine such as can be expected of a continous integration system.
Increase the tolerances in the delay test, to reduce the chance of
failures that are only due to missing a deadline on a busy machine.
Print some not-very-nice-looking but helpful diagnosis information if
the timing selftest fails. Since the failures tend to be due to heavy
system load that's hard to reproduce, this information is necessary to
understand what's going on.
mbedtls_timing_get_timer with reset=1 is called both to initialize a
timer object and to reset an already-initialized object. In an
initial call, the content of the data structure is indeterminate, so
the code should not read from it. This could crash if signed overflows
trap, for example.
As a consequence, on reset, we can't return the previously elapsed
time as was previously done on Windows. Return 0 as was done on Unix.
The POSIX/Unix implementation of mbedtls_set_alarm did not set the
mbedtls_timing_alarmed flag when called with 0, which was inconsistent
with what the documentation implied and with the Windows behavior.
Add --keep-going mode to all.sh. In this mode, if a test fails, keep
running the subsequent tests. If a build fails, skip any tests of this
build and move on to the next tests. Errors in infrastructure, such as
git or cmake runs, remain fatal. Print an error summary at the end of
the run, and return a nonzero code if there was any failure.
In known terminal types, use color to highlight errors.
On a fatal signal, interrupt the run and report the errors so far.
Port wait_server_start from ssl-opt.sh to compat.sh, instead of just
using "sleep 1". This solves the problem that on a heavily loaded
machine, sleep 1 is sometimes not enough (we had CI failures because
of this). This is also faster on a lightly-loaded machine (execution
time reduced from ~8min to ~6min on my machine).
In wait_server_start, fork less. When lsof is present, call it on the
expected process. This saves a few percent of execution time on a
lightly loaded machine. Also, sleep for a short duration rather than
using a tight loop.