"for co_await" was in the coroutine-ts but did not make it into
the C++20 standard. This patch translates the "for co_await" loops
that were used into standard for loops using "co_await".
This is necessary to compile on MSVC 1928 with /std:c++latest.
* Iterator type needed to support equality comparisons in both directions: it == sentinel and sentinel == it.
* difference_type needed to be a signed integer type (it was the unsigned size_t)
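The two requirements above can be sketched with a plain iterator. This is a minimal illustration only: counting_iterator, end_sentinel and sum_up_to are hypothetical names and are not the library's types.

```cpp
#include <cstddef>
#include <iterator>
#include <type_traits>

// Hypothetical sentinel type marking the end of the sequence.
struct end_sentinel {};

// Hypothetical input iterator that counts from `current` up to `limit`.
class counting_iterator {
public:
    using iterator_category = std::input_iterator_tag;
    using value_type = int;
    using difference_type = std::ptrdiff_t;  // signed, unlike std::size_t
    using reference = const int&;
    using pointer = const int*;

    counting_iterator(int current, int limit) noexcept
        : m_current(current), m_limit(limit) {}

    reference operator*() const noexcept { return m_current; }
    counting_iterator& operator++() noexcept { ++m_current; return *this; }
    void operator++(int) noexcept { ++m_current; }

    // Both argument orders are provided explicitly, so `it == sentinel`
    // and `sentinel == it` compile regardless of language version.
    friend bool operator==(const counting_iterator& it, end_sentinel) noexcept {
        return it.m_current >= it.m_limit;
    }
    friend bool operator==(end_sentinel s, const counting_iterator& it) noexcept {
        return it == s;
    }
    friend bool operator!=(const counting_iterator& it, end_sentinel s) noexcept {
        return !(it == s);
    }
    friend bool operator!=(end_sentinel s, const counting_iterator& it) noexcept {
        return !(it == s);
    }

private:
    int m_current;
    int m_limit;
};

static_assert(std::is_signed_v<counting_iterator::difference_type>,
              "difference_type must be a signed integer type");

// Sums the values 0 .. limit-1 by iterating until the sentinel is reached.
inline int sum_up_to(int limit) {
    int total = 0;
    for (counting_iterator it{0, limit}; it != end_sentinel{}; ++it) {
        total += *it;
    }
    return total;
}
```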
Use a separate 'generator_sentinel' type for the end() iterator.
This avoids some mutation of the coroutine_handle member of the
iterator and turns the comparison into a call to .done().
This seems to allow better allocation elision of the coroutine frame
in some cases when building under Clang.
The awaiting coroutine will be resumed using the scheduler if it
is suspended waiting for a sequence number. This simplifies code
as the caller doesn't need to remember to manually reschedule.
Forcing a scheduler to be provided also prevents the producer/consumer
coroutines from effectively becoming single-threaded.
This will allow them to be run on Linux platforms.
Change producer in multi-producer examples to reschedule onto the
thread-pool if it was possibly suspended. This allows the producer and
consumer to run concurrently rather than the consumer executing
the producer inline.
Declare win32_overlapped_operation's await_suspend() methods as
noinline to work around a bad codegen bug under x86/x64 optimised
builds under MSVC 2017.7 and 2017.8.
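A noinline annotation of this kind can be expressed with a small portability macro. The sketch below is illustrative only: EXAMPLE_NOINLINE, example_operation and example_await_suspend are hypothetical names, and the method is a plain function standing in for a real await_suspend().

```cpp
// Hypothetical portability macro; expands to nothing on unknown compilers.
#if defined(_MSC_VER)
#  define EXAMPLE_NOINLINE __declspec(noinline)
#elif defined(__GNUC__) || defined(__clang__)
#  define EXAMPLE_NOINLINE __attribute__((noinline))
#else
#  define EXAMPLE_NOINLINE
#endif

// Hypothetical stand-in for an awaitable operation. Only the placement of
// the noinline annotation is the point here.
struct example_operation {
    int m_suspendCount = 0;

    // The annotation asks the compiler not to inline this call into its
    // caller, which keeps the function's locals out of the caller's frame.
    EXAMPLE_NOINLINE bool example_await_suspend(int resumeToken) noexcept {
        ++m_suspendCount;
        return resumeToken != 0;
    }
};
```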
Mark task<T>'s final_awaitable::await_suspend() method as noinline.
This fixes a crash in the multi-threaded async_auto_reset_event
tests under x86 optimised builds under MSVC 2017.8.
MSVC 2017.8 generates bad code for when_all() and sync_wait()
that causes a crash due to it resuming the coroutine at the
wrong suspend-point in the expression 'co_yield co_await x'.
Work around this by manually calling promise.yield_value()
instead of using co_yield.
MSVC was inlining local variables in await_suspend() into the
coroutine frame of the caller which was breaking the guarantees
required by the for_each_async() test.
- Worker threads now spin for a short while before putting
themselves to sleep. This reduces the overhead for enqueueing
items as the enqueueing thread doesn't have to call into the OS
to wake up the thread so often. It should also improve the
responsiveness of worker threads.
- Keep track of the number of sleeping threads in an atomic integer
so that enqueueing threads only need to look in one place to check
whether any threads need to be woken rather than scanning the
thread-state of each worker thread.
- Make m_globalQueueHead atomic so that worker threads that are
spinning waiting for new work can perform an approximate check
for new work without needing to acquire the mutex lock.
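The three changes above can be sketched together in a much simplified queue. This is an illustration under stated assumptions, not the thread-pool's actual implementation: example_work_queue, m_approxSize (standing in for the atomic queue head), m_sleepingCount and kSpinCount are all hypothetical names, and the spin count of 64 is arbitrary.

```cpp
#include <atomic>
#include <condition_variable>
#include <cstddef>
#include <deque>
#include <mutex>
#include <thread>

// Hypothetical simplified work queue.
class example_work_queue {
public:
    // Adds an item and wakes a worker only if one is recorded as sleeping.
    void enqueue(int item) {
        {
            std::lock_guard<std::mutex> lock(m_mutex);
            m_items.push_back(item);
            m_approxSize.store(m_items.size(), std::memory_order_release);
        }
        // A single atomic load replaces scanning per-thread state.
        if (m_sleepingCount.load(std::memory_order_acquire) > 0) {
            m_wake.notify_one();
        }
    }

    // Blocks until an item is available, spinning briefly before sleeping.
    int dequeue() {
        for (int spin = 0; spin < kSpinCount; ++spin) {
            // Approximate check that does not take the mutex.
            if (m_approxSize.load(std::memory_order_acquire) > 0) {
                std::lock_guard<std::mutex> lock(m_mutex);
                if (!m_items.empty()) {
                    return pop_locked();
                }
            }
            std::this_thread::yield();
        }
        // Spinning found nothing: register as sleeping while holding the
        // mutex so an enqueuer cannot miss this thread.
        std::unique_lock<std::mutex> lock(m_mutex);
        m_sleepingCount.fetch_add(1, std::memory_order_relaxed);
        m_wake.wait(lock, [this] { return !m_items.empty(); });
        m_sleepingCount.fetch_sub(1, std::memory_order_relaxed);
        return pop_locked();
    }

    int sleeping_count() const noexcept {
        return m_sleepingCount.load(std::memory_order_relaxed);
    }

private:
    static constexpr int kSpinCount = 64;

    // Removes the front item; the caller must hold m_mutex.
    int pop_locked() {
        int item = m_items.front();
        m_items.pop_front();
        m_approxSize.store(m_items.size(), std::memory_order_release);
        return item;
    }

    std::mutex m_mutex;
    std::condition_variable m_wake;
    std::deque<int> m_items;
    std::atomic<std::size_t> m_approxSize{0};
    std::atomic<int> m_sleepingCount{0};
};
```

In this sketch the sleeping count is only modified while the mutex is held, so an enqueuer that pushes after a worker has gone to sleep is guaranteed to observe a non-zero count and issue the wake-up.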