Message 65030

Author
nicoe
Date
2021-03-01.09:32:52
Message id
65030

Content

* C├ędric Krier  [2021-02-27 14:32 +0100]: 

>> >The Model definition can not depend on the backend.
>>
>> From my point of view the Model definition does not depend on the
>> backend even with this addition.
>
>It is as soon as you store on it an object that are different
>depending on the backend.

What I mean by definition is what you read.
What you mean is what is executed.

>> Because it is only when the definition is realized (ie executed), that
>> the index it contains turn into their implementations.
>
>No because the class which store the index object is different per
>backend. It is not the execution that is different.

The class is different as soon as the python interpreter has executed
the __setup__. From my POV, it is the execution that is different.

>> >> >Indeed by looking at syntax of both backends they share almost the
>> >> >same syntax/parameters.
>> >>
>> >> SQLite is like the minimum of index creation.
>> >> I also looked at MariaDB and Oracle's version of CREATE INDEX (that's
>> >> why 'WHERE' is in the kwargs instead of the arguments). The similarity
>> >> of the syntax is kind of an artefact of SQLite simplicity.
>> >
>> >There is no point to have arguments that depend on the backend. They can
>> >never be used when defining a Model because it does not depend on the
>> >backend.
>>
>> In a perfect world there would be no index information in the models
>> and this work would be left to DBAs.
>
>I do not agree with this statement. The DBA is the last resort but
>developer create queries that must be optimized by indexes. So it is
>the developer who knows what index must be created.

It would be a priori knowledge that have to be confronted to reality.

So if we really want purity we shouldn't rely on those and let the
reality speak because in the end we don't know.

>> So since we are compromising with the purity of the model we might as
>> well make this explicit.
>
>I do not want to compromise on purity. Model must never depend directly
>on the backend.

We compromise on purity on a lot of things: using lists instead of
sets in a lot of places, the TableHandler class which could be split,
the 2 phase commit protocol, in the GTK client we do not adhere to
GTK's way of doing stuffs in some places, etc.

And it's not an issue, if we have good reasons to do so. And btw
purity could be restored by clearing the list after it has done its
job (although I don't see the point in doing it so I consider it a bad
idea).

>> >> First I think that the TableHandler class is already way too big. It's
>> >> why I chose to make another class.
>> >
>> >I do not think it simplify anything to move two methods from one class
>> >to another in the same file.
>>
>> At least it's a start. Having those backend information structured
>> instead of having everything packed in the same class seems like a
>> good idea.
>
>I see no improvement there, just backward compatibility breakage.

Yeah sometimes people update their code when there is a release. It's
not a big deal.

Moreover it will break anyway because we can not keep the same
signature for the function handling the index.

>Also with this separation we loose the optimisation that prevent to run
>index creation query if the index already exist. Of course there is the
>"IF NOT EXISTS" but it is an extra query. (Now we could argue that
>retrieving the indexes can be complex).

The database is probably way better than us at managing this also.

>But as index is indeed a manipulation of the table (index can not be
>cross-table), it is logical that the table handler handles them.

I don't say that it does something else than manipulating the table.
What I say is that there's something like 20 public methods it could
be way more simple.

>> >But I really think it is not good to have to define the method to use
>> >but we must define which properties we need for the index and the
>> >backend should choose the type that works the best.
>>
>> I don't think that creating another abstraction will help. It will
>> just add another indirection and the mapping between our "langage" and
>> the different indexes implementation will be awkward.
>
>So with such reasoning let's just throw away the ORM, it is just an
>indirection to SQL.

You're comparing apples and pears. The ORM is a central piece of the
framework but indeed it's not a perfect mapping and that's why there
is python-sql.

>This is precisely the added value of the framework to provide a good
>abstract to such thing.

What is the added value of such abstraction? What use case does it
solve?

>> Moreover it will add another point where beginners will have to learn
>> how we do stuffs and make the learning curve steeper.
>
>I do not see why a beginner will know better the specific type of
>indexes of PostgreSQL and in which case which one should be used.

Because he might know already SQL but not Tryton.

>> >> >I would say that only the method is a specific that is good to
>> >> >support but it is mainly related to the kind of operation for which
>> >> >the index it created.
>> >>
>> >> I don't understand. Are you saying that you don't think the people
>> >> should specify the method but they should rather define the operation?
>> >
>> >Indeed, I can see this kind of operations:
>> >
>> >    - equality
>> >    - range (<, >, etc)
>> >    - begin pattern
>> >    - 2d or vector
>> >    - multi (inverted)
>> >
>> >For postgresql it would result in:
>> >
>> >    ['equality'] -> HASH
>> >    ['equality', 'range'] -> BTREE
>> >    ['begin pattern'] -> BTREE
>> >    ['vector'] -> GIST
>> >    ['multi'] -> GIN
>>
>> I think this is trying to be more clever than what the feature
>> requires.
>
>This is the added value of the framework.

As said, I don't think it's the role of Tryton to add value wrt the
indexes. Our strength is somewhere else.

>Probably the naming and definition must be refined. We could also have
>good default value depending of the type column used.

And it's even more code for something that could be simple. With more
code comes more bugs. Moreover if there is a default how will it work
when it will be an index on multiple columns each with different
defaults?

I don't think we could / should do this.

>> >Now if we have ['range', 'vector'], this may be difficult to choose.
>> >We have three options:
>> >
>> >    - create no index
>> >    - create multi index for each group of cases
>> >    - create one index as best effort
>> >
>> >I think we should do the last one, just find the type of index that
>> >provide the most of operations (+ internal preferences for equality
>> >ex: we prefer BTREE over HASH).
>>
>> This is precisely the kind of difficulty I want to avoid so that we
>> don't have to make a guess. If we let the people decide according to
>> their knowledge of their Model than this question is not there at all.
>
>Again this is the added value otherwise we just let developer create the
>indexes themselves.

Indeed it's somehow the idea: allow to create the indexes themselves
while at the same time being kind of database independent.
History
Date User Action Args
2021-03-01 09:32:52nicoesetrecipients: + ced, yangoon, albertca, pokoli, jcavallo, reviewbot
2021-03-01 09:32:52nicoelinkissue5757 messages
2021-03-01 09:32:52nicoecreate

Showing 10 items. Show all history (warning: this could be VERY long)