> < ^ Date: Mon, 10 Jan 1994 10:10:00 +0100
> < ^ From: Joachim Neubueser <joachim.neubueser@math.rwth-aachen.de >
> ^ Subject: AddSet again

Forwarded message:
>From math.rwth-aachen.de!fceller Mon Jan 10 09:04:51 1994
Message-Id: <m0pJHYi-0000UKC@rowlf.Math.RWTH-Aachen.DE>
Date: Mon, 10 Jan 1994 09:00 +0100
From: Frank Celler <frank.celler@math.rwth-aachen.de>
To: J.Neubueser@math.rwth-aachen.de
In-reply-to: Dima Pasechnik's message of 7 Jan 94 20:44 WST <m0pIHQC-000LsCC@samson.Math.RWTH-Aachen.DE>
Subject: Meine Antwort
Content-Type: text

Dear Dima Pasechnik,

To be efficient, this function must involve the machinery for efficient
addition of elements in the set, e.g. AVL-trees etc.

yes, I completely agree with you on that point, unfortunately, as I
tried to explain, this is of little help with your orignal question:

Some time ago we noticed that the function Orbit(G,S,OnSets) can be
hopelessly slow when |S| and in particular the length of the resulting
orbit are big enough.

if <S> is a big set the comparison between sets will be relative
expensive, together with the fact that 'AddSet' has to check all
elements this is taking the time. A short example: Take a set of 4000
vectors of length 450 and sort them in reversed order. Now start with
the first vector and build up a new set using 'AddSet'. The insertion
process during the addition of elements is as bad as can be. On a
NeXT this process will indeed be incredibly slow taking nearly an hour
to complete. However, if your remove the test in 'AddSet' that the
first argument is really a set, it will take 20sec, 10 of those - I
would guess - being used in the insertion process. Using Trees,
hashing or whatever will reduce this 10sec to maybe less than 1sec.
So you have a gain of 10sec in an 1 hour procedure. If on the other
hand you simply drop the test, you will speed up the whole thing by a
factor of 200.

Once you have decided that sets are sorted lists without holes (I
explained why we didn't choose the other options, in this situation
things would have been different), your suggestion

call a function (say) Upgrade(l). One could suggest even better
ways, by taking some extra care *before* changing ....

will create dangerous traps. If one is purely thinking in the context
of orbits, I agree that

The example is rather strange, since it's more or less clear that any
set can be treated as a list with an ordering attached. The example in
question seems to violate this, i.e. make some elements of the set
incomparable. I doubt that in any serious computation one would make
things like that:

But unfortunately that point of view can only be taken in standalone
programs but not in a system like GAP. The example I gave you is not
a wild construct, it is a simplification of a real bug. When changing
from GAP 2.4 to 3.0 we experimented for a while with that. And
because of the tempting example above we tried if we can do without
the test in 'AddSet'. But this created a bug in one of the
representation theory programs, where sets were used in a completely
different seting (and sets there are very small). It took the
programmer a long time to find this really nasty bug. That was the
reason why we decided against a faster but insecure 'AddSet' and are
now using a slow but stable one.

On the other hand, and that was the point (e) I tried to explain, this
does not mean that one has to live with an (in some cases slow)
implementation of 'Orbit' (even now, where in most cases the operation
is costly, it is not too bad). But the example above shows that with
very little work involved one can easily speed up things, in that
example it does not even seem necessary to use trees or other
techniques. But in a situtation as Steve has described, which is in
some sense the other extreme because comparison and operation is very
cheap, 4-3-trees, red/black-trees, avl-trees, hashing, or bit-lists
could and should be used. But doing this in 'AddSet' will not help in
your sitution. So one has to do it in 'Orbit' (in which it will work
for both your and Steve's situation). This does not mean that one has
to do it in such a way that it can only be used in 'Orbit'. For
instance, if one would write a small package implementing functions to
create a tree structure, to add a new element, to test membership, and
to test and add, this could easily be used in various places.
Presumably all of this can be done with the existing datastructures
lists and records. But as I said, at the moment none of us here in
Aachen has the time to do it, so experiments/comments (like Steve's
'FastOrbit') from elsewhere are more than welcome. Wouldn't you like
to write such a package that would perform optimally for applications
like yours? We would surely appreaciate that.

best wishes
  Frank Celler and Joachim Neubueser
            ^
            ^

> < [top]