
FreezeMap Use Clarification

In the Freeze documentation in Section 16.3.1 (page 371), one paragraph reads:
Next, the code instantiates the StringIntMap on the database. A Freeze map always assumes that it is the exclusive user of the database, and that all entries in the database were created by it. In other words, a database must be dedicated to a single Freeze map.
Why must a Freeze map assume that it is the exclusive user of the database? Why can't another thread (on the same machine or a different one) create a Freeze map for the same database? Particularly if the other thread(s) are only reading from the Freeze map.

Also, why must a Freeze map assume that it created all the entries in the database?

Thanks,


Ken Carpenter

Comments

  • matthew (NL, Canada)
    Hi Ken,

    I'm not sure whether this is possible with Berkeley DB, and even if it is, I've never tried it and it will not work with a Freeze map. The map itself must be protected from concurrent access, so if you had two processes altering the database concurrently, it could lead to inconsistencies in the maps (consider one process iterating over its map while the other process modifies the underlying database).
    Why can't another thread (on the same machine or a different one) create a Freeze map for the same database?

    You can access the SAME map object within the same process from multiple threads (as long as the map is protected, just as you would expect with an STL map). But you cannot create two map objects on the same database and expect it to work.
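
    By "protected" I just mean ordinary external locking, the same as you would use for a plain STL map shared between threads. A trivial illustration (not Freeze code, just the analogy):

    #include <map>
    #include <mutex>
    #include <string>

    std::map<std::string, int> sharedMap;    // stands in for the single shared map object
    std::mutex sharedMapMutex;

    void insert(const std::string& key, int value)
    {
        std::lock_guard<std::mutex> lock(sharedMapMutex);   // serialize all access
        sharedMap[key] = value;
    }

    int lookup(const std::string& key)
    {
        std::lock_guard<std::mutex> lock(sharedMapMutex);   // readers need the lock too
        std::map<std::string, int>::const_iterator p = sharedMap.find(key);
        return p == sharedMap.end() ? -1 : p->second;
    }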

    Why do you want to do this anyway? It was certainly pretty common to have a shared database prior to nice middleware like Ice, so that two processes could exchange data. However, I think it's much more convenient to use Ice and Freeze as they are intended than to muck around with shared databases like this :)

    Regards, Matthew
  • Perhaps I'm still thinking too data-centrically rather than middleware-centrically.

    I plan to have a database that stores Commands. There will be many net-facing servers that need to write Commands to the database. Originally this would have been done by performing an INSERT SQL statement from each of the hosts. The database's concurrency mechanisms keep things straight.

    With Ice, I would instead define an insertCommand() interface in Slice, implement the servant, and run it on the database server. I would also create a FreezeMap for the Commands, which insertCommand() would insert into. The net-facing servers then simply call insertCommand().
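
    Something like this is what I have in mind (just a sketch; CommandsI, CommandMap, Command, and the locking are invented names for illustration, not real generated code):

    #include <mutex>
    #include <utility>

    // Servant for the hypothetical Slice interface Commands.
    class CommandsI : public Commands
    {
    public:
        CommandsI(CommandMap& commands) :
            _commands(commands)
        {
        }

        virtual void insertCommand(const Command& cmd, const Ice::Current&)
        {
            std::lock_guard<std::mutex> lock(_mutex);       // one writer at a time
            _commands.insert(std::make_pair(cmd.id, cmd));  // FreezeMap used like a std::map
        }

    private:
        CommandMap& _commands;   // the single FreezeMap, owned by the database server
        std::mutex _mutex;
    };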

    From what you've described, I get the impression that as long as I only have one FreezeMap instantiated at any given time (i.e., no possibility of concurrent write access), then things will be fine, which is what I would expect.

    Just in case I need it (maybe I don't), if I create one FreezeMap within a read-only transaction and another within an update transaction, then there shouldn't be a problem, right? As long as I don't try to update the database.

    Thanks,


    Ken Carpenter
  • matthew (NL, Canada)
    Hi,
    Just in case I need it (maybe I don't), if I create one FreezeMap within a read-only transaction and another within an update transaction, then there shouldn't be a problem, right? As long as I don't try to update the database.

    At present there is no transaction system available with Freeze. This is a planned future addition - up to this point it hasn't been necessary.
    From what you've described, I get the impression that as long as I only have one FreezeMap instantiated at any given time (i.e., no possibility of concurrent write access), then things will be fine, which is what I would expect.

    This isn't quite correct. You can have as many maps as you want instantiated - but each must refer to its own database.
    With Ice, I would instead define an insertCommand() interface in Slice, implement the servant, and run it on the database server. I would also create a FreezeMap for the Commands, which insertCommand() would insert into. The net-facing servers then simply call insertCommand().

    I still think this isn't OO enough. I wouldn't use the Freeze map at all, but instead use a Freeze evictor. With this, you define a Slice class Command, define some public members on it, and the evictor does everything else for you! If you don't want to expose public data members, then you can use the approach outlined by Marc: http://www.zeroc.com/vbulletin/showthread.php?s=&threadid=37

    Regards, Matthew
  • At present there is no transaction system available with Freeze. This is a planned future addition - up to this point it hasn't been necessary.
    From the examples in the Ice docs, and from looking at DBI.h, it looks like if I call dbEnv->openDBWithTxn(), then I can pass the returned DBPtr into my Freeze map constructor.

    Will this not work?

    I believe I will need to do this or something that has a similar effect.

    I have a set of net-facing servers to answer queries, and a set of batch-oriented back-end servers doing wads of processing on the data that the net-facing servers answer queries on. This back-end processing is called a "turn" (the project is a massively multiplayer game).

    After each turn (10-30 seconds), I want the net-facing servers to "see" the results of the back-end processing. This must be transactional in nature. In other words, as far as the net-facing servers are concerned, all changes happened at the same time, so there is no inconsistency.

    When I was considering using a relational database, the net-facing servers would all begin read-only transactions on all the databases they might access. The back-end servers would then begin processing the next "turn". There would be many update transactions happening; however, the net-facing servers would not see any changes, due to the multi-version concurrency control (MVCC) features of the database. Berkeley DB claims to have the same feature.

    Once back-end processing has finished, the net-facing servers will then close their previous read transactions and begin new ones. The job of telling the machines to begin transactions, and begin processing is coordinated by a Turn Controller server.

    I think my net-facing servers might be good candidates for evictors, but my batch-oriented back-end servers don't seem to be.

    Thanks,


    Ken Carpenter
  • matthew (NL, Canada)
    From the examples in the Ice docs, and from looking at DBI.h, it looks like if I call dbEnv->openDBWithTxn(), then I can pass the returned DBPtr into my Freeze map constructor.

    No, this doesn't work. A transaction object must be provided to each Berkeley DB method invocation, and at present we don't do this. This is a planned future addition. That being said, having transactions in one server is only somewhat useful. It's far more useful to be able to coordinate a transaction across multiple cooperating servers.
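
    For illustration, this is roughly what that looks like at the Berkeley DB level, using its C++ API from memory (so treat the details as approximate), given an already-opened environment and database:

    #include <db_cxx.h>
    #include <string>

    void putWithTxn(DbEnv& env, Db& db, const std::string& k, const std::string& v)
    {
        DbTxn* txn = 0;
        env.txn_begin(0, &txn, 0);                   // start the transaction

        Dbt key((void*)k.data(), (u_int32_t)k.size());
        Dbt value((void*)v.data(), (u_int32_t)v.size());
        try
        {
            db.put(txn, &key, &value, 0);            // the txn handle goes with every call
            txn->commit(0);
        }
        catch(...)
        {
            txn->abort();
            throw;
        }
    }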

    An important thing to note here is that transactional systems are FAR, FAR slower than non-transactional systems. If you want entirely safe transactions (that is, a guarantee of no loss), then you have to live with the fact that you can get at best around 70 transactions a second with standard hardware (due to the number of writes and syncs required). If you can stand to lose some amount of data, then you can use other transactional models and get better throughput.
    I think my net-facing servers might be good candidates for evictors, but my batch-oriented back-end servers don't seem to be.

    I think this is more because you have a particular model in mind to which you are trying to adapt Ice. This model doesn't appear to be particularly object oriented, which makes adapting to the Ice object model more difficult. If your back-end batch-oriented system was viewed as a series of cooperating objects, my guess is that you'd find that the persistence model and the non-transactional nature of Freeze wouldn't be an issue. I bet the performance and ease of implementation would also be a pleasant surprise :)

    Of course, you can use Ice as a simple RPC mechanism, ignoring the Ice object model completely, but then you lose a lot of functionality.

    Regards, Matthew
  • Originally posted by matthew
    An important thing to note here is that transactional systems are FAR, FAR slower than non-transactional systems. If you want entirely safe transactions (that is, a guarantee of no loss), then you have to live with the fact that you can get at best around 70 transactions a second with standard hardware (due to the number of writes and syncs required). If you can stand to lose some amount of data, then you can use other transactional models and get better throughput.
    Actually, I've been testing database performance fairly extensively lately, and I know that databases can do better than that. Here is a chunk of code for a very simple Oracle update test. On a P4 1.7 GHz system with 512 MB, it gives me about 420 transactions per second (over a 100 Mbit LAN).
    for (int i = 0; i < NUM_ITEMS; i++)
    {
        // One statement, one round trip, and one commit per row.
        Statement* stmt = conn->createStatement("UPDATE SHIPS SET NAME=:1 WHERE SHIPID=:2");
        char name[100] = "updatetest";

        stmt->setString(1, name);
        stmt->setInt(2, i);
        stmt->executeUpdate();
        conn->commit();
        conn->terminateStatement(stmt);   // OCCI statements are released via the connection, not delete
    }
    
    Originally posted by matthew
    I think this is more because you have a particular model in mind to which you are trying to adapt Ice. This model doesn't appear to be particularly object oriented, which makes adapting to the Ice object model more difficult. If your back-end batch-oriented system was viewed as a series of cooperating objects, my guess is that you'd find that the persistence model and the non-transactional nature of Freeze wouldn't be an issue. I bet the performance and ease of implementation would also be a pleasant surprise :)
    I think you're probably right about my trying to fit Ice to my current model. Do you have any suggestions for books or online resources I might read to help reform my thinking? :)

    Thanks again Matthew,


    Ken Carpenter
  • matthew (NL, Canada)
    Actually, I've been testing database performance fairly extensively lately, and I know that databases can do better than that. Here is a chunk of code for a very simple Oracle update test. On a P4 1.7 GHz system with 512 MB, it gives me about 420 transactions per second (over a 100 Mbit LAN).

    It's possible to get better transaction rates if you have very strict control over the hardware. It's also possible to get better rates if you cheat (i.e., relax the atomicity guarantees that the transaction typically gives you) -- I don't know exactly what Oracle is doing behind the scenes.

    However, consider that for a two-phase commit transaction you need three atomic disk writes (txn_begin, txn_prepare, txn_commit) plus the actual data write per transaction. For a disk with a 10 ms seek time, that equates, best case, to 30 ms per transaction (if you have the data on a separate disk), or ~33 transactions a second. If you start to relax the guarantees, or play other tricks (like not flushing the transactions, batching the transactions, or using a different, less safe model), then you can get better rates.
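
    Back of the envelope (assuming each of those log writes is a separate synchronous disk operation):

    #include <cstdio>

    int main()
    {
        const double writeMs = 10.0;     // one synchronous disk write ~ one seek ~ 10 ms
        const int writesPerTxn = 3;      // txn_begin, txn_prepare, txn_commit
        const double msPerTxn = writesPerTxn * writeMs;                  // 30 ms per transaction
        std::printf("~%.0f transactions/second\n", 1000.0 / msPerTxn);   // ~33
        return 0;
    }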

    Anyway, don't take my word for it -- see the authoritative guide on this subject (more than you'd EVER want to know): http://www.amazon.com/exec/obidos/tg/detail/-/1558601902/qid=1046483645/sr=8-3/ref=sr_8_3/102-5138017-1740124?v=glance&s=books&n=507846

    The point is, if you have a transactional system you have a slower, but potentially (not guaranteed) safer, system. I assure you that you can get much better than 420 invocations per second with Ice :)
    I think you're probably right about my trying to fit Ice to my current model. Do you have any suggestions for books or online resources I might read to help reform my thinking?

    I'm sorry, but other than the classics (such as Distributed Systems: Principles and Paradigms, http://www.amazon.com/exec/obidos/ASIN/0130888931/qid=1046483999/sr=2-2/ref=sr_2_2/102-5138017-1740124), I don't have any good books to recommend to you.

    Michi, perhaps you have some good books?

    Regards, Matthew
  • Originally posted by matthew
    Michi, perhaps you have some good books?

    Hmm... I think you pretty much wrapped it up. Patterns for Concurrent and Networked Objects is good: http://www.amazon.com/exec/obidos/ASIN/0471606952/qid=1046485570/sr=2-2/ref=sr_2_2/103-5507552-5991835

    Cheers,

    Michi.
  • Originally posted by matthew

    However, consider that for a two-phase commit transaction you need three atomic disk writes (txn_begin, txn_prepare, txn_commit) plus the actual data write per transaction. For a disk with a 10 ms seek time, that equates, best case, to 30 ms per transaction (if you have the data on a separate disk), or ~33 transactions a second. If you start to relax the guarantees, or play other tricks (like not flushing the transactions, batching the transactions, or using a different, less safe model), then you can get better rates.
    A DBMS (including Berkeley DB) can achieve far higher transaction rates than 70/sec as long as your server is multi-threaded (which it would be for an Ice server). The DBMS does it by performing "group commits", which I assume is what you were referring to as "batching the transactions". A group commit amortizes the cost of the disk write over all of the transactions that are being committed as a group.

    There is nothing about "batching the transactions" that is "less safe" than committing transactions serially. It is possible to deliver the full ACID transaction properties while performing group commits. The actual transaction throughput that a server achieves in practice will be affected by the level of contention among the active transactions: if they are all attempting to read/write the same data (high contention), then you won't get very good throughput, but if the transactions are not contending for the same data, then the achievable transaction rates are much, much higher. To give an example, at my previous employer (Bullant Technologies) we were building a transactional virtual machine. We managed to get around 50,000 transactions/second on an 8-processor PC for a benchmark that was specifically written to have no contention.
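
    To make the idea concrete, here is a toy sketch of group commit (illustrative only, not any particular DBMS's internals): each committing thread queues its log record, the first one to act becomes the "leader" and syncs the log once for the whole queue, and then everybody in that group returns.

    #include <condition_variable>
    #include <cstdint>
    #include <mutex>
    #include <string>
    #include <vector>

    class GroupCommitLog
    {
    public:
        // Called on each transaction's thread; returns once the record is durable.
        void commit(const std::string& record)
        {
            std::unique_lock<std::mutex> lock(_mutex);
            const std::uint64_t myGroup = _currentGroup;
            _pending.push_back(record);
            while (_flushedGroup < myGroup)
            {
                if (!_flushing)
                {
                    // Become the leader: flush everything queued so far as one group.
                    _flushing = true;
                    std::vector<std::string> batch;
                    batch.swap(_pending);
                    const std::uint64_t group = _currentGroup++;   // later arrivals form the next group
                    lock.unlock();
                    writeAndSync(batch);      // one log append + one disk sync for the whole batch
                    lock.lock();
                    _flushedGroup = group;
                    _flushing = false;
                    _cond.notify_all();       // wake everyone whose record was in this group
                }
                else
                {
                    _cond.wait(lock);         // a leader is already flushing; wait for it
                }
            }
        }

    private:
        void writeAndSync(const std::vector<std::string>&)
        {
            // Append all records to the log file and fsync exactly once.
        }

        std::mutex _mutex;
        std::condition_variable _cond;
        std::vector<std::string> _pending;
        bool _flushing = false;
        std::uint64_t _currentGroup = 1;
        std::uint64_t _flushedGroup = 0;
    };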

    cheers,
    mick
  • Originally posted by matthew
    The point is, if you have a transactional system you have a slower, but potentially (not guaranteed) safer, system. I assure you that you can get much better than 420 invocations per second with Ice :)
    By batching multiple objects into a transaction on the client side, I was able to achieve over 25,000 object updates per second with Oracle on the same system using the OCI C interface, which in my case is more or less equivalent to an Ice invocation.

    The code is basically the same as in my previous post, but prior to the loop you set a max iterations value, and at the bottom of the loop you call addIteration() instead of commit(). I'm at home right now, otherwise I'd just post the code.
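
    From memory, though, it looks roughly like this (the details may be slightly off since I don't have the code in front of me):

    Statement* stmt = conn->createStatement("UPDATE SHIPS SET NAME=:1 WHERE SHIPID=:2");
    stmt->setMaxIterations(NUM_ITEMS);    // allow up to NUM_ITEMS rows in a single execute
    stmt->setMaxParamSize(1, 100);        // maximum size of the variable-length NAME bind

    for (int i = 0; i < NUM_ITEMS; i++)
    {
        stmt->setString(1, "updatetest");
        stmt->setInt(2, i);
        if (i < NUM_ITEMS - 1)
        {
            stmt->addIteration();         // just queues the row; no round trip yet
        }
    }
    stmt->executeUpdate();                // one round trip for the whole batch
    conn->commit();                       // one commit for the whole batch
    conn->terminateStatement(stmt);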

    Even 25,000 object updates per second might be too slow for our application. We will probably have over 50 million objects to update every 10 seconds or so (5 million per second). Obviously this is spread over many servers, but still, it's a lot to ask!


    Ken Carpenter
  • matthew (NL, Canada)
    Hi Ken,

    I agree that those are certainly high demands. In my opinion, a cluster of distributed, independent databases composed of cooperating objects is the best way to achieve that degree of scalability.

    Matthew