Concurrent access with Freeze Map

I use several connections and only one map file.
(I use several threads, each with its own connection and a map associated with that connection, and all map operations run within their own transaction's scope.)
Concurrent reads work well, but concurrent writes do not: a deadlock exception is thrown.

Is locking needed for concurrent access to a map file? Should I write an explicit lock in my code (such as RWRecLock), or does Freeze do this?

Comments

  • matthew
    matthew NL, Canada
    Berkeley DB files (in the database format that we use) are page-locked, and you cannot predict which data goes on which page, so there is nothing your application can do to prevent deadlock exceptions. Since a deadlock exception is therefore always possible, the correct approach is to retry the database access when one occurs.
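
    A minimal sketch of the retry approach described above, assuming the whole database access (ideally the whole transaction) is repeated on each attempt. `DeadlockError` is a hypothetical stand-in for the deadlock exception raised by the database layer, and `retryOnDeadlock` and `maxAttempts` are illustrative names that do not belong to Freeze or Berkeley DB:

    ```cpp
    #include <cassert>
    #include <stdexcept>

    // Hypothetical stand-in for the deadlock exception thrown by the database layer.
    struct DeadlockError : std::runtime_error {
        DeadlockError() : std::runtime_error("deadlock") {}
    };

    // Runs `op` until it completes without a deadlock, at most `maxAttempts` times.
    // Returns the number of attempts used; rethrows the last DeadlockError if
    // every attempt failed. Any other exception propagates immediately.
    template <typename Op>
    int retryOnDeadlock(Op op, int maxAttempts)
    {
        assert(maxAttempts > 0);
        for (int attempt = 1; ; ++attempt) {
            try {
                op();              // the complete database access, restarted from scratch
                return attempt;
            } catch (const DeadlockError&) {
                if (attempt == maxAttempts) {
                    throw;         // give up and let the caller decide
                }
            }
        }
    }
    ```

    The bound on attempts is a design choice of this sketch: it keeps a persistently conflicting writer from spinning forever, at the cost of making the caller handle the final failure.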
  • What if I write an explicit lock in my code (such as RWRecLock)?

    Can this avoid the problem?
  • matthew
    matthew NL, Canada
    If you lock explicitly outside of the database connections, then yes, you can avoid the issue. However, I don't think that is a very good idea, since it allows no concurrency whatsoever among writers. The locking and deadlock detection that Berkeley DB provides allows concurrency as long as the writers don't write the same page. The problem arises when writer 1 locks page A, writes, and then wants to lock page B, while writer 2 locks page B, writes, and then wants to lock page A. Hilarity ensues and you get a deadlock exception, causing one writer to give up its locks and retry.
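
    For illustration, a sketch of what an explicit lock outside the database could look like, assuming all writers live in one process. `writerMutex`, `ActiveWriter` and `serializedWrite` are hypothetical names, and the sketch covers writers only; the read-write lock mentioned in the question would additionally keep readers out while a write is in progress:

    ```cpp
    #include <cassert>
    #include <mutex>

    // Hypothetical process-wide lock held around every write transaction.
    static std::mutex writerMutex;
    static bool writerActive = false;

    // Marks a writer as active for the lifetime of the object and checks that
    // no other writer is active at the same time.
    class ActiveWriter {
    public:
        explicit ActiveWriter(bool& flag) : flag_(flag) { assert(!flag_); flag_ = true; }
        ~ActiveWriter() { flag_ = false; }
    private:
        bool& flag_;
    };

    // Runs `write` while holding the global writer lock. Only one writer can be
    // inside at a time, so two writers can never acquire page locks in
    // conflicting order, but writers touching unrelated pages wait as well.
    template <typename Write>
    void serializedWrite(Write write)
    {
        std::lock_guard<std::mutex> guard(writerMutex);
        ActiveWriter active(writerActive);
        write();
    }
    ```

    This makes the trade-off visible: the deadlock between writers disappears, but so does all write concurrency, which is why retrying on a deadlock exception is the approach recommended in this thread.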
  • And now I have found a more serious problem.

    Concurrent reads on the same data file do not work! (Each "read" thread has its own connection.)

    What's the matter?

    Must I serialize "read" operations on the same data file, let alone writes?

    { // each thread
        vector<FileID> file_list;
        file_list.push_back(FileID("1:539D4000-0000-0000-B65B-640000000000"));
        file_list.push_back(FileID("1:539D4000-0000-0000-5C9F-9F0000000000"));
        file_list.push_back(FileID("1:539D4000-0000-0000-09EE-770000000000"));
        file_list.push_back(FileID("1:539D4000-0000-0000-BAE7-940000000000"));
        file_list.push_back(FileID("1:539D4000-0000-0000-43AF-160000000000"));

        {
            // %p is the correct format for a pointer (%x truncates on 64-bit).
            printf("%p: come into 1\n", (void*)this);
            FileMap file_map(conn, "file");
            DataNodeMap datanode_map(conn, "datanode");

            {
                // Inner scope: the iterators are destroyed before the maps.
                FileMap::iterator file_iter = file_map.find(file_list[thread_index]);
                // Guard against a missing key before dereferencing the iterator.
                if (file_iter != file_map.end()) {
                    FileInfo info = file_iter->second;
                    LocationList::iterator location_iter = info.locations.begin();
                    for (; location_iter != info.locations.end(); ++location_iter) {
                        if (datanode_map.find(location_iter->first) == datanode_map.end()) {
                            printf("%d: check datanode fail\n", thread_index);
                        }
                    }
                }
            }

            printf("%p: try leave 1\n", (void*)this);
        }
    }
    
  • I run this program with 5 threads (one connection per thread).

    Sometimes the test program runs OK, but sometimes it hangs after "try leave 1" (the program does not exit, it just stops there), and sometimes it dumps core.
  • bernard
    bernard Jupiter, FL
    You can read and write the same Freeze map concurrently, as long as each thread uses its own connection and associated map objects.

    If you think there is a bug, please provide a small yet complete test case. Please also describe your operating system, Ice version, Berkeley DB version, C++ compiler, etc.

    Cheers,
    Bernard
  • matthew
    matthew NL, Canada
    Please see the attached example that shows that, as expected, concurrent reads work. I suggest that you look at this example to see where you are going wrong.

    If you want us to look further you must provide a complete working compilable example that demonstrates your problem.
  • Thanks for your example and your patience.

    The example you gave me seems to work well on my PC.

    The first time I ran the program, the db files were created under the "db" directory.

    Then I changed your code: I commented out lines 104-126 in Client.cpp, so that the program only reads the data in the "db" directory instead of regenerating it on every run.

    Then I ran the program several times in quick succession, and sometimes it dumped core.
    uname -a
    Linux bogon 2.6.9-55.3.ELsmp #1 SMP Thu May 17 18:31:38 EDT 2007 x86_64 x86_64 x86_64 GNU/Linux
    
    gcc -v
    Reading specs from /usr/lib/gcc/x86_64-redhat-linux/3.4.4/specs
    Configured with: ../configure --prefix=/usr --mandir=/usr/share/man --infodir=/usr/share/info --enable-shared --enable-threads=posix --disable-checking --with-system-zlib --enable-__cxa_atexit --disable-libunwind-exceptions --enable-java-awt=gtk --host=x86_64-redhat-linux
    Thread model: posix
    gcc version 3.4.4 20050721 (Red Hat 3.4.4-2)
    
    /lib/libc.so.6 
    GNU C Library stable release version 2.3.4, by Roland McGrath et al.
    Copyright (C) 2005 Free Software Foundation, Inc.
    This is free software; see the source for copying conditions.
    There is NO warranty; not even for MERCHANTABILITY or FITNESS FOR A
    PARTICULAR PURPOSE.
    Compiled by GNU CC version 3.4.4 20050721 (Red Hat 3.4.4-2).
    Compiled on a Linux 2.4.20 system on 2005-08-19.
    Available extensions:
            GNU libio by Per Bothner
            crypt add-on version 2.1 by Michael Glad and others
            linuxthreads-0.10 by Xavier Leroy
            The C stubs add-on version 2.1.2.
            BIND-8.2.3-T5B
            NIS(YP)/NIS+ NSS modules 0.19 by Thorsten Kukuk
            Glibc-2.0 compatibility add-on by Cristian Gafton 
            GNU Libidn by Simon Josefsson
            libthread_db work sponsored by Alpha Processor Inc
    Thread-local storage support included.
    For bug reporting instructions, please see:
    <http://www.gnu.org/software/libc/bugs.html>.
    

    Ice: I installed Ice 3.2.1 and BDB from the RPM packages for RHEL 4 on your website.
  • bernard
    bernard Jupiter, FL
    I've run this test in a loop (with the write section commented out after the first run), and everything works fine -- no crash. This was on RHEL4 x86_64 with the Ice 3.2.1 RPMs. (I also tried on i386, same results).

    Maybe you could extract the stack trace from your core dump and post it?

    Cheers,
    Bernard
  • The stack trace:
    I moved the for(;;) loop outside the construction of the map, as you can see in the attached file, so that each loop iteration constructs and destroys the map many times.

    (gdb) thread apply all bt full

    Thread 9 (process 6660):
    #0 0x00000039c9c8ed65 in __nanosleep_nocancel () from /lib64/tls/libc.so.6
    No symbol table info available.
    #1 0x00000039c9c8ebd0 in sleep () from /lib64/tls/libc.so.6
    No symbol table info available.
    #2 0x0000000000404499 in TestApp::run (this=Variable "this" is not available.
    ) at ./Client.cpp:135
    i = {
    locations = {<std::_Vector_base<std::basic_string<char, std::char_traits<char>, std::allocator<char> >,std::allocator<std::basic_string<char, std::char_traits<char>, std::allocator<char> > > >> = {
    _M_impl = {<std::allocator<std::basic_string<char, std::char_traits<char>, std::allocator<char> > >> = {<__gnu_cxx::new_allocator<std::basic_string<char, std::char_traits<char>, std::allocator<char> > >> = {<No data fields>}, <No data fields>},
    _M_start = 0x51c9d0, _M_finish = 0x51c9f0, _M_end_of_storage = 0x51c9f0}}, <No data fields>}}
    ids = {<std::_Vector_base<std::basic_string<char, std::char_traits<char>, std::allocator<char> >,std::allocator<std::basic_string<char, std::char_traits<char>, std::allocator<char> > > >> = {
    _M_impl = {<std::allocator<std::basic_string<char, std::char_traits<char>, std::allocator<char> > >> = {<__gnu_cxx::new_allocator<std::basic_string<char, std::char_traits<char>, std::allocator<char> > >> = {<No data fields>}, <No data fields>},
    _M_start = 0x51c1a0, _M_finish = 0x51c1c8, _M_end_of_storage = 0x51c1e0}}, <No data fields>}
    p = {_M_current = 0x51c1c8}
    threads = {<std::_Vector_base<IceUtil::Handle<TestThread>,std::allocator<IceUtil::Handle<TestThread> > >> = {
    _M_impl = {<std::allocator<IceUtil::Handle<TestThread> >> = {<__gnu_cxx::new_allocator<IceUtil::Handle<TestThread> >> = {<No data fields>}, <No data fields>}, _M_start = 0x51d140, _M_finish = 0x51d168, _M_end_of_storage = 0x51d180}}, <No data fields>}
    #3 0x0000003d1d39740f in Ice::Application::main () from /usr/lib64/libIce.so.32
    No symbol table info available.
    #4 0x0000003d1d39802d in Ice::Application::main () from /usr/lib64/libIce.so.32
    No symbol table info available.
    #5 0x0000000000403bda in main (argc=1, argv=0x7fbffff998) at ./Client.cpp:151
    app = {<> = {<No data fields>}, <No data fields>}

    Thread 8 (process 6661):
    #0 0x00000039c9c2ea97 in do_sigwait () from /lib64/tls/libc.so.6
    No symbol table info available.
    #1 0x00000039c9c2eb2d in sigwait () from /lib64/tls/libc.so.6
    No symbol table info available.
    #2 0x00000038fac12a15 in IceUtil::CtrlCHandler::getCallback () from /usr/lib64/libIceUtil.so.32
    No symbol table info available.
    #3 0x00000039ca5060aa in start_thread () from /lib64/tls/libpthread.so.0
    No symbol table info available.
    #4 0x00000039c9cc5b43 in clone () from /lib64/tls/libc.so.6
    No symbol table info available.
    #5 0x0000000000000000 in ?? ()
    No symbol table info available.

    Thread 7 (process 6662):
    #0 0x00000039ca508acf in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/tls/libpthread.so.0
    No symbol table info available.
    #1 0x0000003d1d3d03bb in IceInternal::ConnectionMonitor::run () from /usr/lib64/libIce.so.32
    No symbol table info available.
    #2 0x00000038fac2af06 in IceUtil::Thread::start () from /usr/lib64/libIceUtil.so.32
    No symbol table info available.
    #3 0x00000039ca5060aa in start_thread () from /lib64/tls/libpthread.so.0
    No symbol table info available.
    #4 0x00000039c9cc5b43 in clone () from /lib64/tls/libc.so.6
    No symbol table info available.
    #5 0x0000000000000000 in ?? ()
    No symbol table info available.

    Thread 6 (process 6663):
    #0 0x00000039ca50ae2d in __lll_mutex_unlock_wake () from /lib64/tls/libpthread.so.0
    No symbol table info available.
    #1 0x0000002a95792218 in ?? ()
    No symbol table info available.
    #2 0x0000000000000064 in ?? ()
    No symbol table info available.
    #3 0x00000039ca507e58 in pthread_mutex_unlock () from /lib64/tls/libpthread.so.0
    No symbol table info available.
    #4 0x00000038fbdcee8f in __lock_vec () from /usr/lib64/libdb_cxx-4.5.so
    No symbol table info available.
    #5 0x00000038fbdf6eeb in __txn_set_timeout () from /usr/lib64/libdb_cxx-4.5.so
    No symbol table info available.
    #6 0x00000038fbdf797f in __txn_commit () from /usr/lib64/libdb_cxx-4.5.so
    No symbol table info available.
    #7 0x00000038fbdf7ffe in __txn_prepare () from /usr/lib64/libdb_cxx-4.5.so
    No symbol table info available.
    #8 0x00000038fbd346da in DbTxn::commit () from /usr/lib64/libdb_cxx-4.5.so
    No symbol table info available.
    #9 0x0000003d1dd4fadc in Freeze::IteratorHelperI::Tx::~Tx$delete () from /usr/lib64/libFreeze.so.32
    No symbol table info available.
    #10 0x0000003d1dd52193 in Freeze::IteratorHelperI::cleanup () from /usr/lib64/libFreeze.so.32
    No symbol table info available.
    #11 0x0000003d1dd522bd in Freeze::IteratorHelperI::close () from /usr/lib64/libFreeze.so.32
    No symbol table info available.
    #12 0x0000003d1dd5266c in Freeze::IteratorHelperI::~IteratorHelperI$delete () from /usr/lib64/libFreeze.so.32
    No symbol table info available.
    #13 0x0000000000404bc6 in ~Iterator (this=0x41e01fa0)
    at /usr/lib/gcc/x86_64-redhat-linux/3.4.4/../../../../include/c++/3.4.4/memory:260
    No locals.
    #14 0x0000000000406013 in TestThread::run (this=0x51cb50) at /usr/include/Freeze/Map.h:920
    q = {_M_current = 0x2a95803c58}
    __PRETTY_FUNCTION__ = "virtual void TestThread::run()"
    #15 0x00000038fac2af06 in IceUtil::Thread::start () from /usr/lib64/libIceUtil.so.32
    No symbol table info available.
    #16 0x00000039ca5060aa in start_thread () from /lib64/tls/libpthread.so.0
    No symbol table info available.
    #17 0x00000039c9cc5b43 in clone () from /lib64/tls/libc.so.6
    No symbol table info available.
    #18 0x0000000000000000 in ?? ()
    No symbol table info available.

    Thread 5 (process 6664):
    #0 0x00000039ca50ad1b in __lll_mutex_lock_wait () from /lib64/tls/libpthread.so.0
    No symbol table info available.
    #1 0x0000000000599d50 in ?? ()
    No symbol table info available.
    #2 0x0000000000068f38 in ?? ()
    No symbol table info available.
    #3 0x00000039ca507b04 in pthread_mutex_lock () from /lib64/tls/libpthread.so.0
    No symbol table info available.
    #4 0x00000039c9e2f620 in __malloc_initialize_hook () from /lib64/tls/libc.so.6
    No symbol table info available.
    #5 0x0000000000597f00 in ?? ()
    No symbol table info available.
    #6 0x0000000000597f00 in ?? ()
    No symbol table info available.
    #7 0x0000000042802cb8 in ?? ()
    No symbol table info available.
    #8 0x0000000042803140 in ?? ()
    No symbol table info available.
    #9 0x0000000042803160 in ?? ()
    No symbol table info available.
    #10 0x00000039c9c689b6 in free () from /lib64/tls/libc.so.6
    No symbol table info available.
    #11 0x00000039cbaae0ae in operator delete () from /usr/lib64/libstdc++.so.6
    No symbol table info available.
    #12 0x0000003d1dd586c2 in std::list<Freeze::IteratorHelperI*, std::allocator<Freeze::IteratorHelperI*> >::remove ()
    from /usr/lib64/libFreeze.so.32
    No symbol table info available.
    #13 0x0000003d1dd68f55 in Freeze::SharedDb::__decRef () from /usr/lib64/libFreeze.so.32
    No symbol table info available.
    #14 0x0000003d1dd549de in Freeze::MapHelperI::close () from /usr/lib64/libFreeze.so.32
    No symbol table info available.
    #15 0x0000003d1dd54a4e in Freeze::MapHelperI::~MapHelperI$delete () from /usr/lib64/libFreeze.so.32
    No symbol table info available.
    #16 0x0000000000405b2d in TestThread::run (this=0x51cc80) at /usr/include/Freeze/Map.h:848
    ex = Variable "ex" is not available.

    Thread 4 (process 6665):
    #0 0x00000039ca50ae2d in __lll_mutex_unlock_wake () from /lib64/tls/libpthread.so.0
    No symbol table info available.
    #1 0x0000002a95792298 in ?? ()
    No symbol table info available.
    #2 0x0000000000000065 in ?? ()
    No symbol table info available.
    #3 0x00000039ca507e58 in pthread_mutex_unlock () from /lib64/tls/libpthread.so.0
    No symbol table info available.
    #4 0x0000000000000000 in ?? ()
    No symbol table info available.
  • Thread 3 (process 6667):
    #0 0x00000039c9cd11ed in __lll_mutex_unlock_wake () from /lib64/tls/libc.so.6
    No symbol table info available.
    #1 0x000000000059c160 in ?? ()
    No symbol table info available.
    #2 0x0000000000598360 in ?? ()
    No symbol table info available.
    #3 0x00000039c9c6d1de in posix_memalign () from /lib64/tls/libc.so.6
    No symbol table info available.
    #4 0x0000000000000001 in ?? ()
    No symbol table info available.
    #5 0x0000000044606100 in ?? ()
    No symbol table info available.
    #6 0x0000000044606120 in ?? ()
    No symbol table info available.
    #7 0x0000000000510b30 in std::__ioinit ()
    No symbol table info available.
    #8 0x0000000000510b38 in __StringIntDictValueCodec_typeId ()
    No symbol table info available.
    #9 0x0000003d1dd55510 in Freeze::MapHelper::create () from /usr/lib64/libFreeze.so.32
    No symbol table info available.
    #10 0x0000000000405596 in TestThread::run (this=0x0) at /usr/include/IceUtil/Handle.h:177
    sintdict = {_helper = {_M_ptr = 0x59bf90}, _communicator = {<IceUtil::HandleBase<Ice::Communicator>> = {
    _ptr = 0x51bc00}, <No data fields>}}
    finfo = {_helper = {_M_ptr = 0x59bf90}, _communicator = {<IceUtil::HandleBase<Ice::Communicator>> = {
    _ptr = 0x51bc00}, <No data fields>}}
    __PRETTY_FUNCTION__ = "virtual void TestThread::run()"
    #11 0x00000038fac2af06 in IceUtil::Thread::start () from /usr/lib64/libIceUtil.so.32
    No symbol table info available.
    #12 0x00000039ca5060aa in start_thread () from /lib64/tls/libpthread.so.0
    No symbol table info available.
    #13 0x00000039c9cc5b43 in clone () from /lib64/tls/libc.so.6
    No symbol table info available.
    #14 0x0000000000000000 in ?? ()
    No symbol table info available.

    Thread 2 (process 6668):
    #0 0x00000039ca508acf in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/tls/libpthread.so.0
    No symbol table info available.
    #1 0x0000003d1dd650ba in Freeze::CheckpointThread::run () from /usr/lib64/libFreeze.so.32
    No symbol table info available.
    #2 0x00000038fac2af06 in IceUtil::Thread::start () from /usr/lib64/libIceUtil.so.32
    No symbol table info available.
    #3 0x00000039ca5060aa in start_thread () from /lib64/tls/libpthread.so.0
    No symbol table info available.
    #4 0x00000039c9cc5b43 in clone () from /lib64/tls/libc.so.6
    No symbol table info available.
    #5 0x0000000000000000 in ?? ()
    No symbol table info available.

    Thread 1 (process 6666):
    #0 0x00000039c9e2f858 in main_arena () from /lib64/tls/libc.so.6
    No symbol table info available.
    #1 0x0000003d1dd6933c in Freeze::SharedDb::__decRef () from /usr/lib64/libFreeze.so.32
    No symbol table info available.
    #2 0x0000003d1dd549de in Freeze::MapHelperI::close () from /usr/lib64/libFreeze.so.32
    No symbol table info available.
    #3 0x0000003d1dd54a4e in Freeze::MapHelperI::~MapHelperI$delete () from /usr/lib64/libFreeze.so.32
    No symbol table info available.
    #4 0x0000000000405b2d in TestThread::run (this=0x51cee0) at /usr/include/Freeze/Map.h:848
    ex = Variable "ex" is not available.
  • matthew
    matthew NL, Canada
    We'll look into this problem. In the meantime, you should arrange to keep the database and connection alive for the lifetime of your application. For example, in the demo you can open a connection to the database and the map in main, and keep those variables in scope until the threads terminate. Caching the connection and database in this manner also improves performance, so it's not a bad thing to do anyway.
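
    The effect of this workaround can be sketched with standard C++ only. This models just the lifetime pattern and makes no claim about how Freeze manages its shared database internally; `SharedHandle`, `HandleCache` and `countOpens` are hypothetical names:

    ```cpp
    #include <cassert>
    #include <memory>
    #include <mutex>

    // Hypothetical stand-in for the shared database state behind a map.
    struct SharedHandle {};

    // Hands out one shared handle and reopens it only when nobody holds it.
    class HandleCache {
    public:
        std::shared_ptr<SharedHandle> acquire()
        {
            std::lock_guard<std::mutex> lock(mutex_);
            std::shared_ptr<SharedHandle> handle = cached_.lock();
            if (!handle) {
                handle = std::make_shared<SharedHandle>(); // simulated "open"
                cached_ = handle;
                ++opens_;
            }
            return handle;
        }

        int opens() const
        {
            std::lock_guard<std::mutex> lock(mutex_);
            return opens_;
        }

    private:
        mutable std::mutex mutex_;
        std::weak_ptr<SharedHandle> cached_;
        int opens_ = 0;
    };

    // Simulates `iterations` short-lived users of the handle and returns how
    // often the handle had to be opened.
    int countOpens(bool keepAlive, int iterations)
    {
        assert(iterations > 0);
        HandleCache cache;
        std::shared_ptr<SharedHandle> longLived;
        if (keepAlive) {
            longLived = cache.acquire(); // held for the whole run, as in main
        }
        for (int i = 0; i < iterations; ++i) {
            // Released again at the end of each iteration.
            std::shared_ptr<SharedHandle> shortLived = cache.acquire();
        }
        return cache.opens();
    }
    ```

    With the long-lived reference in place, the shared state is opened once and never torn down while short-lived users come and go; without it, every iteration opens and closes it again, which is the pattern the test program in this thread exercises.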
  • Thanks for your reply!

    For performance reasons, we can keep the connection and database open. I just want to know whether the case I described can really happen, or whether there is a weakness in Freeze or BDB itself, i.e. whether the core dump is caused by Ice internals rather than by my source code.

    I do not know whether you can reproduce this situation.

    In any case, I can do it the way you suggested, and then there is no problem.
  • matthew
    matthew NL, Canada
    I suspect it's caused by a bug in Freeze. We'll confirm once we've fully examined the issue.