
About Freeze::Map::iterator

Hi!

I am beginning to write an academic application using Freeze and I find that Ice in general is absolutely amazing.

But I still have one more question about Freeze::Map.

Suppose I have, e.g., a string key and a StringSeq value (with many strings in the StringSeq). If I want to read a particular key-value pair from the database, add a string to the StringSeq, and then write it back, I have to do the following:

MyMap::iterator p = mymap.find("this one");

if(p != mymap.end())
{
    StringSeq ss = p->second; // a lot of copying occurs here
    ss.push_back("one more string");
    p.set(ss);
}

That is the best I can do, right? My problem is that at "ss = p->second" the whole StringSeq is copied from the iterator's internal value into my "ss", and this can take a very long time.

As I suppose you have a very good reason for this scheme, where the iterator returns a pair of const objects, I would like to understand why you didn't choose, for example, the following approach:
- an iterator type that returns a non-const reference to the "value" of the internal key-value pair object;
- a function like "validate_changes()", instead of "set()", that internally calls set() with the internal value as its argument.

This way, the above code would become:

MyMap::iterator p = mymap.find("this one");

if(p != mymap.end())
{
    p->second.push_back("one more string");
    p.validate_changes(); // the modified internal value is flushed to the database
}

Of course, the state of the internal value object changes before its database counterpart does, so the two are out of sync until validate_changes() is called, and if an exception occurs somewhere during the modification, the iterator is left in a "dirty", non-synchronized state. Is that the reason for your choice? It seems to me that in some cases this would be less problematic than a lot of copying (for example, when the exception is caught a little outside the scope where the iterator is constructed: the iterator will be destroyed anyway, so its value is of no relevance).
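To make this concrete, here is the kind of scenario I have in mind (a sketch only: it assumes the hypothetical validate_changes() above, and I am guessing Freeze::DatabaseException as the exception type):

try
{
    MyMap::iterator p = mymap.find("this one");
    if(p != mymap.end())
    {
        p->second.push_back("one more string"); // hypothetical non-const value
        p.validate_changes(); // may throw, leaving p "dirty"
    }
} // p is destroyed here whether or not an exception escaped
catch(const Freeze::DatabaseException&)
{
    // p no longer exists at this point, so its non-synchronized
    // internal state cannot be observed anymore
}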

In fact, I could make these changes in the source for my personal use, but I would like to know what you think about it, because I am pretty sure there are some problems I haven't thought of.

Thanks a lot for your answer (and for the others you have already given me).

Comments

  • bernard (Jupiter, FL)
    Hi Sylvain,

    The only reason for the const value is that we felt non-const values would be too error-prone.
    It would be too easy to write:
    p->second.push_back("one more string");
    and incorrectly believe that the update was saved to the database. This is particularly true if you use a generic algorithm for such updates: with const values you get a compile-time error, not surprising runtime behavior.
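    For example, a generic helper along these lines (just a sketch; appendToAll is an illustration, not part of Ice or Freeze) shows the difference:

    #include <map>
    #include <string>
    #include <vector>

    typedef std::vector<std::string> StringSeq;

    template<typename Map>
    void appendToAll(Map& m, const std::string& s)
    {
        for(typename Map::iterator p = m.begin(); p != m.end(); ++p)
        {
            // Compiles and persists for std::map<std::string, StringSeq>.
            // With non-const Freeze values it would also compile, but the
            // change would never reach the database. With the actual const
            // values, this line is a compile-time error instead.
            p->second.push_back(s);
        }
    }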

    Of course, in-place updates would save memory allocations in some situations. If you can think of an API that is at the same time efficient and safe, I'd be happy to implement it!
    It would also be interesting to check how valuable such an improvement would be: maybe you could measure this extra memory-allocation cost with your example and compare it with the time it takes to write to disk?
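    For instance, something along these lines would give a first idea (just a sketch: it reuses "mymap" and "p" from your example, uses IceUtil::Time for wall-clock timing, and a single measurement is noisy, so in practice you would loop over many updates and average):

    #include <IceUtil/Time.h>
    #include <cstdio>

    // Time the copy and the write-back separately; wall-clock time matters
    // here because the write to disk is mostly I/O wait, not CPU time.
    IceUtil::Time t0 = IceUtil::Time::now();
    StringSeq ss = p->second;  // the extra copy
    IceUtil::Time t1 = IceUtil::Time::now();
    ss.push_back("one more string");
    p.set(ss);                 // the write to the database
    IceUtil::Time t2 = IceUtil::Time::now();

    std::printf("copy: %ld us, set: %ld us\n",
                static_cast<long>((t1 - t0).toMicroSeconds()),
                static_cast<long>((t2 - t1).toMicroSeconds()));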

    Cheers,
    Bernard
  • Thanks a lot for your reply.

    Just to be sure I understand your sentence correctly:
    Originally posted by bernard

    It would also be interesting to check how valuable such an improvement would be: maybe you could measure this extra memory-allocation cost with your example and compare it with the time it takes to write to disk?

    Do you mean that if the time consumed by "StringSeq ss = p->second" is (very) small compared to the time consumed by "p.set(ss)", then there is no need for further optimization, because the overall time will not change much? (And so the safety of the "slow" implementation is a bigger overall benefit.)

    It is very interesting; I was not used to thinking this way, but I think you're really right! I am learning every day with you...

    I suppose there is some kind of rule of thumb for setting a threshold on the ratio between the times consumed by two parts of an algorithm. If the ratio is below the threshold, there is no need to optimize the faster part.
  • bernard (Jupiter, FL)
    Hi Sylvain,

    Yes: if optimizing the memory allocation saves 5% of the overall time it takes to update a value, it would be more meaningful to implement than if the saving is only 0.1%.
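    (This is essentially Amdahl's law: if the copy accounts for a fraction f of the total update time, removing it entirely speeds up updates by at most a factor of 1/(1 - f), i.e. about 1.05x when f = 5%, and only about 1.001x when f = 0.1%.)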

    Cheers,
    Bernard
  • I should run some tests and let you know if I find anything relevant. But I suppose it depends a lot on the hardware...

    Thanks again.