[Regression] No easy way to access IndexMetaData from a ColumnMetaData

Description

In version 2.1.x and earlier, we could access an IndexMetadata from a ColumnMetadata very easily: columnMeta.getIndex()

In versions 2.2.x and 3.0.x, it's nearly impossible. With ColumnMetadata we can get the name of the column. From the TableMetadata object we can get a list of IndexMetadata.

However, there is no easy way to match an IndexMetadata with a ColumnMetadata (to answer the very simple question: does this column have an index on it or not?).

One solution could be to match columnMetadata.getName() with indexMetadata.getTarget(), but depending on the type of index (KEYS, FULL, ENTRIES), the index target contains not only the column name but also the type of index.

Furthermore, it is not easy to rely on indexMetadata.Kind because it only offers 3 values: KEY, CUSTOM and COMPOSITES, which do not reflect the KEYS, FULL and ENTRIES types of index.

This issue is critical for frameworks and libraries that use the driver to parse, validate, or display C* table metadata.

Environment

None

Pull Requests

None

Activity

Alexandre Dutra
December 3, 2015, 9:56 AM
Edited

When I implemented with the assistance of (which I'm pinging just in case he has more accurate info on this) we had to make some difficult choices.

Secondary indexes have been completely redesigned in 3.0 and the association between an index and a column is gone. The relationship now is really a one-to-many between a table and its indexes.

The "target" field in IndexMetadata comes directly from the schema tables (index_options) and is basically the CQL string used to create the index. For built-in indexes, the target is a single column, but a custom index can target multiple columns or even (I think) the entire row, or functions as well; e.g. CREATE CUSTOM INDEX my_index ON ks.t1(a, keys(b), foo(c)) USING 'indexclass' – here the value for "target" would be a, keys(b), foo(c).

In fact, not even Cassandra would know that such an index is on the given columns; it's down to the Index implementation to parse the target and figure out how to deal with it. The driver made the same choice: we do not parse the target column. For built-in indexes, however, the parsing should be straightforward.

Another problem is that the driver cannot support both models at the same time, so we chose to adopt the new model and to retrofit legacy models (pre-3.0) into this new model.

So in fact the question "does this column have an index on it or not?" is really not simple anymore.

DOAN DuyHai
December 5, 2015, 1:09 PM
Edited

So in fact the question "does this column have an index on it or not?" is really not simple anymore

I disagree. For your complex index example

Even in this case, we can establish a 1:N relationship between the index and the column:

  • index -> a

  • index -> b (with its key)

  • index -> c (using UDF foo)
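The mapping listed above could be sketched roughly as follows. This is only a best-effort illustration: it assumes the custom index target happens to be a comma-separated list in the CQL-like form of the example (a, keys(b), foo(c)), which nothing guarantees for a custom index. CustomTargetSplitter and split are hypothetical names, not part of the driver.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper (not a driver API): splits a target string on top-level commas. */
final class CustomTargetSplitter {

    /** Splits "a, keys(b), foo(c)" into ["a", "keys(b)", "foo(c)"]. */
    static List<String> split(String target) {
        List<String> parts = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        int depth = 0;          // nesting level of parentheses
        boolean quoted = false; // inside a double-quoted identifier
        for (char ch : target.toCharArray()) {
            if (ch == '"') {
                quoted = !quoted;
            }
            if (!quoted) {
                if (ch == '(') {
                    depth++;
                } else if (ch == ')') {
                    depth--;
                } else if (ch == ',' && depth == 0) {
                    // A comma outside parentheses and quotes ends one element.
                    parts.add(current.toString().trim());
                    current.setLength(0);
                    continue;
                }
            }
            current.append(ch);
        }
        parts.add(current.toString().trim());
        return parts;
    }
}
```

Each element would still need to be interpreted separately (plain column, collection wrapper such as keys(b), or function call such as foo(c)).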

I'm going to create a JIRA on C*. You can close this one as Not A Bug since the issue is on the server side.

Sylvain Lebresne
December 5, 2015, 1:32 PM

I'm going to create a JIRA on C*. You can close this one as Not A Bug since the issue is at server side

It's not, so please don't.

Even in this case, we can establish a 1:N relationship between the index and the column

In that example, maybe, but Alexandre is right in the sense that this kind of definition is allowed:

because the index target for a custom index is basically opaque to the server and passed as-is to the custom index implementation. And this is a feature, not a bug.

But back to the more general problem, it could make sense for the driver to parse the target for non-custom indexes, which is fairly simple right now, and to expose the column on which the index is.
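As a rough sketch of what such parsing might look like for non-custom indexes, assuming the target is either a bare column name or a column wrapped in keys(...), values(...), entries(...) or full(...), possibly with a double-quoted identifier. IndexTargets, columnOf and typeOf are hypothetical names, not driver APIs, and "simple" is just a label chosen here for the unwrapped case.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper (not a driver API): parses the target of a built-in index. */
final class IndexTargets {

    // Assumed forms: keys(col), values(col), entries(col), full(col), or a bare col.
    private static final Pattern WRAPPED =
            Pattern.compile("^(keys|values|entries|full)\\((.+)\\)$");

    /** Returns the column name, e.g. "b" for "keys(b)" and "a" for "a". */
    static String columnOf(String target) {
        String trimmed = target.trim();
        Matcher m = WRAPPED.matcher(trimmed);
        return unquote(m.matches() ? m.group(2).trim() : trimmed);
    }

    /** Returns "keys", "values", "entries" or "full", or "simple" for a bare column. */
    static String typeOf(String target) {
        Matcher m = WRAPPED.matcher(target.trim());
        return m.matches() ? m.group(1) : "simple";
    }

    /** Strips the double quotes of a case-sensitive identifier and unescapes "". */
    private static String unquote(String id) {
        if (id.length() >= 2 && id.startsWith("\"") && id.endsWith("\"")) {
            return id.substring(1, id.length() - 1).replace("\"\"", "\"");
        }
        return id;
    }
}
```

With something like this, checking whether a column has a built-in index would come down to comparing the column name with columnOf(target) for each index of the table, skipping custom indexes.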

That said, you'll want to keep in mind that sooner or later we will support functional indexes as well as indexes on multiple columns, so we'll likely have definitions like this in the future:

It's of course still possible to extract that the index is on a and b, but it's less clear how to expose the full relationship with the UDFs in a way that is programmatically convenient (I mean, it's possible, but it gets a bit heavy).

Alex Popescu
December 7, 2015, 11:20 PM

While the driver offers a uniform API to access metadata, this isn't its primary use, and thus the complexity of supporting such features should always be weighed against what can be done. I'd prefer a simple API that is stable rather than trying to expose everything across all versions.

Alexandre Dutra
June 21, 2020, 5:51 PM

This ticket has been closed on 2020-06-22 due to prolonged inactivity, as part of an automatic housekeeping procedure.

If you think that this was inappropriate, feel free to re-open the ticket. If possible, please provide any context that could explain why the issue described in this ticket is still relevant.

Also, please note that enhancements and feature requests cannot be accepted anymore for legacy driver versions (OSS 1.x, 2.x, 3.x and DSE 1.x).

Thank you for your understanding.

Assignee

Unassigned

Reporter

DOAN DuyHai

Labels

None

PM Priority

None

Reproduced in

2.2.0-rc3
3.0.0alpha5

Affects versions

Fix versions

None

Pull Request

None

Doc Impact

None

Size

None

External issue ID

None

Components

Priority

Major