Add metadata.schema.ignored-keyspaces option, and ignore all system keyspaces by default
The token map is often a source of memory issues with large clusters. The refreshed-keyspaces configuration option can help, but it's opt-in: the driver loads everything by default. It would be better to take a more proactive approach.
It's probably safe to assume that applications don't usually need metadata about system keyspaces. So one thing we could do is ignore them by default. We need a new option that is the opposite of refreshed-keyspaces, proposed name: advanced.metadata.schema.ignored-keyspaces. By default it should be set to every system keyspace in Cassandra and DSE.
If a keyspace is present in both refreshed-keyspaces and ignored-keyspaces, we should include it, but log a warning.
From an implementation perspective, I don't think we can handle excludes in the WHERE clause like we do for includes. But we can filter on the client side, possibly in CassandraSchemaRows.Builder.
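To make the intended semantics concrete, here is a minimal sketch of the client-side check, including the "present in both lists" rule. It is not the driver's code: the class name KeyspaceFilterSketch is hypothetical, it assumes that an empty refreshed-keyspaces list means "all keyspaces", and it writes the warning to System.err instead of the driver's logger. How this would be wired into CassandraSchemaRows.Builder is left open.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper, for illustration only (not part of the driver).
final class KeyspaceFilterSketch {
  private final Set<String> refreshed;
  private final Set<String> ignored;
  private final Set<String> conflicts;

  KeyspaceFilterSketch(List<String> refreshed, List<String> ignored) {
    this.refreshed = new LinkedHashSet<>(refreshed);
    this.ignored = new LinkedHashSet<>(ignored);
    // Names listed on both sides: they are kept, but reported once at startup.
    this.conflicts = new LinkedHashSet<>(this.refreshed);
    this.conflicts.retainAll(this.ignored);
    for (String keyspace : conflicts) {
      System.err.println(
          "WARN: keyspace '" + keyspace
              + "' is in both refreshed-keyspaces and ignored-keyspaces; including it");
    }
  }

  // True if metadata for this keyspace should be kept on the client.
  boolean includes(String keyspace) {
    if (refreshed.contains(keyspace)) {
      return true; // an explicit inclusion wins over an exclusion
    }
    if (!refreshed.isEmpty()) {
      return false; // assumption: a non-empty inclusion list is exhaustive
    }
    return !ignored.contains(keyspace);
  }

  Set<String> conflicts() {
    return conflicts;
  }
}
```

With refreshed = [system, app] and ignored = [system, system_schema], this keeps system (with a warning) and app, and drops everything else. With an empty refreshed list, only the ignored names are dropped.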
For refreshed_keyspaces I'd like to keep the server-side filtering in order to avoid fetching too much data.
Hrmm I don't like the asymmetry of having patterns on one side and not the other... OK, I'll allow it, but I'll add a recommendation in the docs to prefer name inclusions if possible.
No. One risk is that the pattern could be too eager, but with something like system, system_.* and dse_.* it should be pretty safe.
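A quick way to check how eager a set of patterns is: the sketch below matches keyspace names against a list of regular expressions using java.util.regex. The class name IgnoredKeyspacePatterns is hypothetical, and whole-name matching (Matcher.matches) is an assumption; the ticket does not specify the matching mode.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical helper, for illustration only (not part of the driver).
final class IgnoredKeyspacePatterns {
  private final List<Pattern> patterns = new ArrayList<>();

  IgnoredKeyspacePatterns(List<String> regexes) {
    for (String regex : regexes) {
      patterns.add(Pattern.compile(regex));
    }
  }

  // True if the whole keyspace name matches at least one pattern.
  boolean isIgnored(String keyspace) {
    for (Pattern pattern : patterns) {
      // matches() requires the entire name to match, not just a prefix
      if (pattern.matcher(keyspace).matches()) {
        return true;
      }
    }
    return false;
  }
}
```

Under these assumptions, the patterns system, system_.* and dse_.* ignore system, system_schema and dse_system but leave a user keyspace named systematic alone, whereas the single pattern system.* would also swallow systematic. That is the kind of over-eager match to watch for.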
For refreshed_keyspaces I'd like to keep the server-side filtering in order to avoid fetching too much data.
That wouldn’t future-proof the token map against more system keyspaces that could be added in the future, would it?
It's an exclusion, we can add all the names that have ever existed.
By default it should be set to every system keyspace in Cassandra and DSE.
The exact contents and names of system keyspaces evolved across C* versions (e.g. system_schema appeared in C* 3.0).
I think it would be safer and more future-proof to introduce the ability to filter by regular expressions, that is, ignored-keyspaces = [ "system.*" ].
I’d also add that ability to refreshed-keyspaces for consistency. However, in this case I guess all the filtering will have to be done client-side.