Thread-unsafe access to lock->owner in PR_Lock

classic Classic list List threaded Threaded
4 messages Options
Reply | Threaded
Open this post in threaded view
|

Thread-unsafe access to lock->owner in PR_Lock

Alexander Potapenko
Hi NSPR developers,

while running Chromium tests under ThreadSanitizer (https://code.google.com/p/data-race-test/) we've encountered a report about a data race between PR_Lock and PR_Unlock: http://crbug.com/328521

It turns out that the lock->owner variable is being read by PR_Lock at the same time other threads read or write it in PR_Lock or PR_Unlock:

189 PR_IMPLEMENT(void) PR_Lock(PRLock *lock)
190 {
191     PRThread *me = _PR_MD_CURRENT_THREAD();
192     PRIntn is;
193     PRThread *t;
194     PRCList *q;
195
196     PR_ASSERT(me != suspendAllThread);
197     PR_ASSERT(!(me->flags & _PR_IDLE_THREAD));
198     PR_ASSERT(lock != NULL);
199 #ifdef _PR_GLOBAL_THREADS_ONLY
200     PR_ASSERT(lock->owner != me);
201     _PR_MD_LOCK(&lock->ilock);
202     lock->owner = me;
203     return;
204 #else  /* _PR_GLOBAL_THREADS_ONLY */

(code from nspr-4.9.5/mozilla/nsprpub/pr/src/threads/combined/prulock.c in Ubuntu Precise)

In the case the optimizer is conservative enough nothing will break (during the race lock->owner will be either another thread or NULL, but not |me|), but this is undefined behavior in C++11 and can possibly break in newer compilers.

Is there a reason to do this check before the lock is taken? I'm guessing it's there to avoid reentrancy, so it should be fine to swap lines 200 and 201.

WBR,
Alexander Potapenko
Software Engineer
Google Moscow
_______________________________________________
dev-tech-nspr mailing list
[hidden email]
https://lists.mozilla.org/listinfo/dev-tech-nspr
Reply | Threaded
Open this post in threaded view
|

Re: Thread-unsafe access to lock->owner in PR_Lock

Dave Hylands
Hi Alexander,

----- Original Message -----

> From: "Alexander Potapenko" <[hidden email]>
> To: [hidden email]
> Sent: Wednesday, December 18, 2013 3:46:30 AM
> Subject: Thread-unsafe access to lock->owner in PR_Lock
>
> Hi NSPR developers,
>
> while running Chromium tests under ThreadSanitizer
> (https://code.google.com/p/data-race-test/) we've encountered a report about
> a data race between PR_Lock and PR_Unlock: http://crbug.com/328521
>
> It turns out that the lock->owner variable is being read by PR_Lock at the
> same time other threads read or write it in PR_Lock or PR_Unlock:
>
> 189 PR_IMPLEMENT(void) PR_Lock(PRLock *lock)
> 190 {
> 191     PRThread *me = _PR_MD_CURRENT_THREAD();
> 192     PRIntn is;
> 193     PRThread *t;
> 194     PRCList *q;
> 195
> 196     PR_ASSERT(me != suspendAllThread);
> 197     PR_ASSERT(!(me->flags & _PR_IDLE_THREAD));
> 198     PR_ASSERT(lock != NULL);
> 199 #ifdef _PR_GLOBAL_THREADS_ONLY
> 200     PR_ASSERT(lock->owner != me);
> 201     _PR_MD_LOCK(&lock->ilock);
> 202     lock->owner = me;
> 203     return;
> 204 #else  /* _PR_GLOBAL_THREADS_ONLY */
>
> (code from nspr-4.9.5/mozilla/nsprpub/pr/src/threads/combined/prulock.c in
> Ubuntu Precise)
>
> In the case the optimizer is conservative enough nothing will break (during
> the race lock->owner will be either another thread or NULL, but not |me|),
> but this is undefined behavior in C++11 and can possibly break in newer
> compilers.
>
> Is there a reason to do this check before the lock is taken? I'm guessing
> it's there to avoid reentrancy, so it should be fine to swap lines 200 and
> 201.

If my understanding is correct, if you swap the 2 lines, then you may deadlock yourself rather than see the assert.

http://mozilla.6506.n7.nabble.com/Are-PR-Locks-thread-re-entrant-td244962.html

Dave Hylands
_______________________________________________
dev-tech-nspr mailing list
[hidden email]
https://lists.mozilla.org/listinfo/dev-tech-nspr
Reply | Threaded
Open this post in threaded view
|

Re: Thread-unsafe access to lock->owner in PR_Lock

Wan-Teh Chang-3
In reply to this post by Alexander Potapenko
On Wed, Dec 18, 2013 at 3:46 AM, Alexander Potapenko <[hidden email]> wrote:

> Hi NSPR developers,
>
> while running Chromium tests under ThreadSanitizer (https://code.google.com/p/data-race-test/) we've encountered a report about a data race between PR_Lock and PR_Unlock: http://crbug.com/328521
>
> It turns out that the lock->owner variable is being read by PR_Lock at the same time other threads read or write it in PR_Lock or PR_Unlock:
>
> 189 PR_IMPLEMENT(void) PR_Lock(PRLock *lock)
> 190 {
> 191     PRThread *me = _PR_MD_CURRENT_THREAD();
> 192     PRIntn is;
> 193     PRThread *t;
> 194     PRCList *q;
> 195
> 196     PR_ASSERT(me != suspendAllThread);
> 197     PR_ASSERT(!(me->flags & _PR_IDLE_THREAD));
> 198     PR_ASSERT(lock != NULL);
> 199 #ifdef _PR_GLOBAL_THREADS_ONLY
> 200     PR_ASSERT(lock->owner != me);
> 201     _PR_MD_LOCK(&lock->ilock);
> 202     lock->owner = me;
> 203     return;
> 204 #else  /* _PR_GLOBAL_THREADS_ONLY */
>
> (code from nspr-4.9.5/mozilla/nsprpub/pr/src/threads/combined/prulock.c in Ubuntu Precise)
>
[...]
>
> Is there a reason to do this check before the lock is taken? I'm guessing
> it's there to avoid reentrancy, so it should be fine to swap lines 200 and 201.

Hi Alexander,

PRLock is a non-recursive lock. So the PR_ASSERT(lock->owner != me) on
line 200 is trying to detect an attempt to lock a PRLock recursively.
It is intended as a debugging aid for NSPR users.

This debugging aid is useful, so ideally we should find a way to do it
the right way.

If we swap lines 200 and 201, it will only work for a _PR_MD_LOCK
implementation that allows a lock to be acquired recursively. As Dave
Hylands pointed out, in a _PR_MD_LOCK implementation that deadlocks,
we won't be able to see the assertion failure. However, since we only
use the PR_Lock code in prulock.c on Windows, where _PR_MD_LOCK
(EnterCriticalSection) is recursive, we can make the change you
suggested.

I checked the other PR_Lock implementation in ptsynch.c:
http://mxr.mozilla.org/nspr/source/pr/src/pthreads/ptsynch.c#170

It doesn't have the equivalent assertion, and nobody complained, so we
can also just remove the assertion.

Wan-Teh
_______________________________________________
dev-tech-nspr mailing list
[hidden email]
https://lists.mozilla.org/listinfo/dev-tech-nspr
Reply | Threaded
Open this post in threaded view
|

Re: Thread-unsafe access to lock->owner in PR_Lock

Wan-Teh Chang-3
On Thu, Dec 19, 2013 at 3:47 PM, Wan-Teh Chang <[hidden email]> wrote:

>
> Hi Alexander,
>
> PRLock is a non-recursive lock. So the PR_ASSERT(lock->owner != me) on
> line 200 is trying to detect an attempt to lock a PRLock recursively.
> It is intended as a debugging aid for NSPR users.
>
> This debugging aid is useful, so ideally we should find a way to do it
> the right way.
>
> If we swap lines 200 and 201, it will only work for a _PR_MD_LOCK
> implementation that allows a lock to be acquired recursively. As Dave
> Hylands pointed out, in a _PR_MD_LOCK implementation that deadlocks,
> we won't be able to see the assertion failure. However, since we only
> use the PR_Lock code in prulock.c on Windows, where _PR_MD_LOCK
> (EnterCriticalSection) is recursive, we can make the change you
> suggested.

I filed an NSPR bug report and wrote a patch:
https://bugzilla.mozilla.org/show_bug.cgi?id=952621

Wan-Teh Chang
_______________________________________________
dev-tech-nspr mailing list
[hidden email]
https://lists.mozilla.org/listinfo/dev-tech-nspr