Member

@jedrazb jedrazb commented Dec 9, 2024

Soft deletes for connectors

Add support for soft deletes of connectors. Why?

  • To improve the UX of connectors on agentless, we need to track which connectors were deleted, so that we can detect the state change and propagate it to Fleet to add or remove the corresponding policies

Changes

  • define a new system index, .connectors-deleted, that stores deleted connectors
  • adapt the logic of the delete, get and list operations to support this feature
  • this change is backward compatible; nothing will break for existing users
  • permission-wise it's also backward compatible; we still rely on index-level access to the .elastic-connectors-* pattern
    • although this is likely to change soon, once we actually enforce the manage_connector and monitor_connector privileges
  • update docs
  • add yaml tests and unit tests

Why a new system index and not a new is_deleted flag in the existing index

The existing connector index is not a system index. Instead, we rely on index templates with non-dynamic mappings. This means that even if we updated the template and added logic to use the is_deleted field in the current index, it wouldn’t be backward compatible: the new field would be missing from the mappings of users who created their connector indices in the past. The is_deleted field would be included in the source, but we wouldn’t be able to run any queries against it, making it useless.

By using a separate index, we maintain backward compatibility for users who depend on direct access to the connector index. Deleted connectors will actually disappear from the original connector index and be moved into the deleted connectors index.
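
The move-on-delete behaviour described above can be pictured with a small in-memory sketch. This is only an illustration under stated assumptions: the two maps stand in for the connector index and the proposed .connectors-deleted index, and the class and method names are hypothetical, not the PR's actual ConnectorIndexService code.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical in-memory model; the maps stand in for the two indices.
class SoftDeleteSketch {
    private final Map<String, String> live = new HashMap<>();    // connector index
    private final Map<String, String> deleted = new HashMap<>(); // .connectors-deleted

    void put(String id, String doc) {
        live.put(id, doc);
    }

    // Soft delete: remove the document from the live store and keep a copy
    // in the deleted store. Returns false if there was nothing to delete.
    boolean delete(String id) {
        String doc = live.remove(id);
        if (doc == null) {
            return false;
        }
        deleted.put(id, doc);
        return true;
    }

    // Same shape as getConnector(connectorId, isDeleted, ...): the flag picks the store.
    Optional<String> get(String id, boolean isDeleted) {
        return Optional.ofNullable((isDeleted ? deleted : live).get(id));
    }

    public static void main(String[] args) {
        SoftDeleteSketch store = new SoftDeleteSketch();
        store.put("c1", "{\"name\":\"my-connector\"}");
        store.delete("c1");
        System.out.println(store.get("c1", false).isPresent()); // false
        System.out.println(store.get("c1", true).isPresent());  // true
    }
}
```

After a delete, the connector is no longer visible through the live lookup, which is what keeps direct readers of the original index unaffected.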

Followup work:

  • update specification for 9.0 CLI

Other work we might need to do in the future:

  • support force deletes, e.g. pass force=true to the delete endpoint to completely remove the connector

Validation

  • tested locally
  • yaml e2e tests
  • unit tests

@jedrazb jedrazb changed the title from "[Connector API] Add interface for soft-deletes" to "[Connector API] Support soft deletes of connectors" Dec 10, 2024
@jedrazb
Member Author

jedrazb commented Dec 11, 2024

@elasticmachine merge upstream

@elasticsearchmachine
Collaborator

Hi @jedrazb, I've created a changelog YAML for you.

@jedrazb jedrazb marked this pull request as ready for review December 11, 2024 10:45
@elasticsearchmachine elasticsearchmachine added the Team:SearchOrg label Dec 11, 2024
@elasticsearchmachine
Collaborator

Pinging @elastic/search-eng (Team:SearchOrg)

@elasticsearchmachine
Collaborator

Pinging @elastic/search-extract-and-transform (Team:Search - Extract & Transform)

Member

@artem-shelkovnikov artem-shelkovnikov left a comment

I don't feel I can review the Java code in this PR well, but I left a couple of comments/questions about the features it introduces, plus some other minor points.

Comment on lines -100 to -101
- class: org.elasticsearch.xpack.application.connector.ConnectorIndexServiceTests
issue: https://github.com/elastic/elasticsearch/issues/116087
Member

What about unmuting the tests?

Member

++

Member Author

@jedrazb jedrazb Dec 11, 2024

It got muted due to a couple of flaky runs recently... I checked the logs and it seemed to be a weird ES error. It has been running fine for almost a year now.

"deleted": {
"type": "boolean",
"default": false,
"description": "A flag indicating whether to list connectors that have been soft-deleted."
Member

Should we say that it's listing "only" deleted connectors, not all together?

@@ -0,0 +1,307 @@
{
Member

Is it possible to have a test that checks that both deleted and non-deleted indices have the same schema?

Additionally, does it make sense to make a lot of properties here searchable? Or do we want to have filters on deleted connectors too?

cluster:
- manage_search_application
- manage_behavioral_analytics
- manage_connector
Member

Can you add a comment on why this permission and the ones below are needed? What does it practically mean for users, if it does anything?

- do:
    connector.list: {}

- match: { count: 2 }
Member

Curious - why 2? Is it some sort of initial setup?


- match: { count: 3 }

# Alphabetical order by index_name for results
Member

Curious - is it possible to do the same matching without relying on the sorting being stable?

* @param listener The action listener to invoke on response/failure.
*/
- public void getConnector(String connectorId, ActionListener<ConnectorSearchResult> listener) {
+ public void getConnector(String connectorId, Boolean isDeleted, ActionListener<ConnectorSearchResult> listener) {
Member

One interesting side effect of this is that it's possible to have a non-deleted and a deleted connector with the same id that actually point to completely different connectors. That can cause some confusion, but probably nothing flow-breaking, other than a customer manually restoring a deleted connector and breaking something, but that's on them? Should we document this more?

Member

Also, in the same scenario, deleting a live connector would erase the previous record of a deleted connector. I guess that's also fine given our limitations, but it might be nice to document.
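
The id-collision scenario from the two comments above can be sketched as follows. This assumes, purely for illustration, that connector ids can be reused after a delete and that the deleted store is keyed by connector id; the class and method names are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical model: both stores are keyed by connector id.
class IdCollisionSketch {
    final Map<String, String> live = new HashMap<>();
    final Map<String, String> deleted = new HashMap<>();

    void create(String id, String doc) {
        live.put(id, doc);
    }

    void softDelete(String id) {
        String doc = live.remove(id);
        if (doc != null) {
            // Overwrites any earlier deleted record stored under the same id.
            deleted.put(id, doc);
        }
    }

    public static void main(String[] args) {
        IdCollisionSketch s = new IdCollisionSketch();
        s.create("c1", "first");
        s.softDelete("c1");
        s.create("c1", "second"); // same id reused: live and deleted records now differ
        System.out.println(s.live.get("c1") + " / " + s.deleted.get("c1")); // second / first
        s.softDelete("c1");
        System.out.println(s.deleted.get("c1")); // second
    }
}
```

Under these assumptions, the second soft delete silently replaces the record of the first deleted connector.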

Member

@kderusso kderusso left a comment

I found this to be a little confusing. As a user I would assume that passing a deleted flag into a list call would have included all of the connectors including the soft deleted ones - ideally with metadata indicating which ones were deleted. Or, I would have expected a different API call to get only deletions. Why did we choose to combine these in an exclusive way?

Another option we could have taken is to add a deleted flag to the connectors index. Was this determined to be unmanageable?

Comment on lines -100 to -101
- class: org.elasticsearch.xpack.application.connector.ConnectorIndexServiceTests
issue: https://github.com/elastic/elasticsearch/issues/116087
Member

++

@jedrazb
Member Author

jedrazb commented Dec 12, 2024

We’ve discussed this with Jim, and there might be a better way to handle soft-deleted documents, via a mappings update. I’m implementing his suggestion in a separate branch of this PR. The suggested approach involves updating the template and applying the new mappings, if needed, using the onClusterChange listener.

@jedrazb
Member Author

jedrazb commented Dec 12, 2024

@kderusso Thank you for looking into this!

> Another option we could have taken is to add a deleted flag to the connectors index. Was this determined to be unmanageable?

Jim made me aware that there is a way to upgrade the connector index template mappings in the clusterChanged logic, so we can just add a new field to the existing index. I’m following this approach and will have something to share soon.

> As a user I would assume that passing a deleted flag into a list call would have included all of the connectors including the soft deleted ones - ideally with metadata indicating which ones were deleted

I don’t have a strong opinion here. For me, passing deleted=true could be interpreted as either:

  1. “Show me (filter) only the deleted ones”, or
  2. “Include the deleted ones in the response, with a flag indicating they were deleted.”

Do you know if ES endpoint conventions prefer either approach?
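
The two readings of the flag can be made concrete with a minimal sketch. The map of connector id to is_deleted state, the method names, and the includeDeleted parameter name are all hypothetical; nothing here reflects the actual Connector API implementation.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical model: connector id -> whether it has been soft-deleted.
class ListSemanticsSketch {

    // Reading 1 (exclusive filter): deleted=true returns only soft-deleted
    // connectors, deleted=false returns only live ones.
    static List<String> listExclusive(Map<String, Boolean> all, boolean deleted) {
        List<String> ids = new ArrayList<>();
        for (Map.Entry<String, Boolean> e : all.entrySet()) {
            if (e.getValue() == deleted) {
                ids.add(e.getKey());
            }
        }
        return ids;
    }

    // Reading 2 (inclusive): includeDeleted=true returns every connector,
    // each entry carrying its own deleted flag.
    static Map<String, Boolean> listInclusive(Map<String, Boolean> all, boolean includeDeleted) {
        Map<String, Boolean> out = new LinkedHashMap<>();
        for (Map.Entry<String, Boolean> e : all.entrySet()) {
            if (includeDeleted || !e.getValue()) {
                out.put(e.getKey(), e.getValue());
            }
        }
        return out;
    }

    public static void main(String[] args) {
        Map<String, Boolean> all = new LinkedHashMap<>();
        all.put("c1", false);
        all.put("c2", true);
        all.put("c3", false);
        System.out.println(listExclusive(all, true));  // [c2]
        System.out.println(listInclusive(all, true));  // {c1=false, c2=true, c3=false}
    }
}
```

With the flag unset, both readings return the same live connectors; they only diverge when the flag is true.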

@jedrazb
Member Author

jedrazb commented Dec 13, 2024

Closing in favour of #118669

@jedrazb jedrazb closed this Dec 13, 2024

Labels

  • >feature
  • :SearchOrg/Extract&Transform (Label for the Search E&T team)
  • Team:Search - Extract & Transform
  • Team:SearchOrg (Meta label for the Search Org (Enterprise Search))
  • v9.0.0

5 participants