
Fix race in concurrent_vector::grow_by() #1532

Open · wants to merge 4 commits into master
Conversation

aleksei-fedotov (Contributor)

Description

When a thread tries to grow a concurrent_vector, it first locally creates a new, extended segment table and then tries to install it as the global one. Before this patch, the thread simply overwrote the global table without taking into account that other threads might be doing the same thing. This patch makes the update aware of its possibly concurrent environment.

Fixes #1531
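
For context, here is a minimal self-contained sketch of the publication pattern the fix relies on; the names are illustrative, not the actual TBB internals. Each growing thread builds its extended table locally and publishes it with a single compare-and-swap, so exactly one thread wins and the losers deallocate their own copy instead of clobbering the winner's:

    #include <atomic>
    #include <cassert>
    #include <thread>
    #include <vector>

    struct Table { /* extended segment table payload */ };

    std::atomic<Table*> global_table{nullptr};

    void grow() {
        Table* local = new Table{};   // build the extended table locally
        Table* expected = nullptr;
        // Publish atomically; only one thread can succeed.
        if (!global_table.compare_exchange_strong(expected, local)) {
            delete local;             // another thread won the race; discard ours
        }
    }

    int main() {
        std::vector<std::thread> threads;
        for (int i = 0; i < 8; ++i) threads.emplace_back(grow);
        for (auto& t : threads) t.join();
        assert(global_table.load() != nullptr);
        delete global_table.load();
    }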

Type of change

Choose one or multiple; leave empty if none of the choices apply.

Add the respective label(s) to the PR if you have permissions.

  • bug fix - change that fixes an issue
  • new feature - change that adds functionality
  • tests - change in tests
  • infrastructure - change in infrastructure and CI
  • documentation - documentation update

Tests

  • added - required for new features and some bug fixes
  • updated
  • not needed

Documentation

  • updated in # - add PR number
  • needs to be updated
  • not needed

Breaks backward compatibility

  • Yes
  • No
  • Unknown

Notify the following users

@npotravkin

Other information

Resolved review threads (outdated):

  • include/oneapi/tbb/concurrent_vector.h
  • test/tbb/test_concurrent_vector.cpp
kboyarinov (Contributor) left a comment:

Other than this, the patch looks good to me.
I would still wait for feedback from @pavelkumbrasev to make sure we did not miss anything.

In include/oneapi/tbb/detail/_segment_table.h:
    } else if (new_table) {
        // Another thread was the first to replace the segment table. The current
        // thread's table is not needed anymore, so destroy it.
        destroy_and_deallocate_table(new_table, pointers_per_long_table);
    }
}).on_exception([&] {
    my_segment_table_allocation_failed.store(true, std::memory_order_relaxed);
A contributor left a comment:

Would this logic work correctly when one thread allocates the long table and successfully stores it into my_segment_table, while another thread receives a bad_alloc while trying to allocate its own long table?
It seems like we need to double-check this here.

aleksei-fedotov (Contributor, Author):

It depends on what logic we consider correct. Perhaps not reporting an allocation error when another thread has succeeded makes sense, and double-checking my_segment_table for an updated state first can improve the algorithm's robustness in some cases, although I am not sure how frequent those cases would be. It is not full coverage anyway, since a successful allocation can still sneak in after that double check. Full coverage would require recording the failure state in a single source of information such as my_segment_table, or some non-trivial synchronization between multiple information sources, and we do not have that kind of logic throughout the whole vector right now. So, at most we could write something like the following instead:

Suggested change:

    -    my_segment_table_allocation_failed.store(true, std::memory_order_relaxed);
    +    // Last chance to overcome the failure, hoping that another thread has
    +    // succeeded in extending the table.
    +    table = get_table();
    +    if (table == my_embedded_table) {
    +        my_segment_table_allocation_failed.store(true, std::memory_order_relaxed);
    +    }

Does it suffice though?
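
To make the trade-off concrete, here is a minimal self-contained sketch of that double-check idea, with illustrative names rather than the actual TBB internals; the comment marks the remaining window the reply above refers to:

    #include <atomic>

    struct Table {};
    Table embedded_table;                        // the small built-in table
    std::atomic<Table*> segment_table{&embedded_table};
    std::atomic<bool> allocation_failed{false};

    void on_allocation_failure() {
        // Double check: another thread may have already published an extended
        // table, in which case the failure does not need to be reported.
        if (segment_table.load(std::memory_order_acquire) == &embedded_table) {
            // Not full coverage: a successful publication can still land between
            // the load above and this store; that is the remaining window.
            allocation_failed.store(true, std::memory_order_relaxed);
        }
    }

    int main() {
        on_allocation_failure();    // no extended table yet, so failure is reported
        return allocation_failed.load() ? 0 : 1;
    }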


Successfully merging this pull request may close these issues:

  • Deadlock in tbb::concurrent_vector (#1531)