-
Notifications
You must be signed in to change notification settings - Fork 6
[WIP] Permanently delete files #24
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks for contributing / working on this. I will also say if you add this, it'd be good to add a single hash purge as well just to keep the api consistent.
src/index.ts
Outdated
const [rootHash, generation] = await this.#getRootHash(refresh); | ||
// Get the raw text of the root entry | ||
|
||
const lines = rootHash.split("\n"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the root hash should be just a string hash. Similar to other bulk actions, you should need to get the list of entries at that hash with:
const entries = await this.raw.getEntries(rootHash)
then to do a bulk purge you just need to remove the hashes from that entry set, and create a new entry with only the remaining ones, a. la. the last part of the bulk modify actions.
This way you can take advantage of the api parsing, rather than doing the raw line manipulation yourself, which is what it looks like you're doing.
This also does the v3 check (or if it doesn't, should, so some more code you can rely on general utilities.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was not able to figure out how to get the entries list back to the root hash, is there a function that I missed? Which is why I settled for the raw manipulation, because it is just a matter of removing lines, not recalculating anything.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
just look a few lines up:
const [rootEntry, uploadRoot] = await this.raw.putEntries(
"root",
newEntries,
);
await uploadRoot;
await this.#putRootHash(rootEntry.hash, generation);
the api may not be super clear, but the rootEntry
has the new root hash, and uploadRoot
is a promise that resolves when everything is actually uploaded. By splitting these up it allows more efficient async operations, but since you're only doing one upload, you might as well just await uploadRoot immediately.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ahh, now I realized why I did not go this route. the put entries also uploads the docSchema file, which the delete operation did not do, I just calls the putRootHash
I have made a new small method that creates the hashstring, to skip the putEntries so it functions just like the remarkable app. I will do some testing today on it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
leaving a comment inline, after you change
…sh method, added interface definitions
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
some more thoughts. I'm also happy to wait until you take this out of draft. I get notified, and so I have an urge to see what's up.
const newEntries = entries.filter( | ||
(entry) => !hashes.includes(entry.hash)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would make hashes a Set
so this check is O(1) not O(n)
src/index.ts
Outdated
const [rootHash, generation] = await this.#getRootHash(refresh); | ||
// Get the raw text of the root entry | ||
|
||
const lines = rootHash.split("\n"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
leaving a comment inline, after you change
} | ||
|
||
/** permanently delete an entry */ | ||
async purge(hash: string, refresh: boolean = false): Promise<void> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would make this have a similar return type, e.g. [SimpleEntry, string]
although I left a comment on the bulk purge return type as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The reason I did not return the simpleentry is because it no longer exists, but it would be a simple change.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah, lets wait until the more important details are hashed out.
async bulkPurge( | ||
hashes: readonly string[], | ||
refresh: boolean = false, | ||
): Promise<[SimpleEntry[], string]> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the similar methods return the entries that were modified. Similarly it seems to make sense to return something relating to the entries that were actually deleted. These don't tend to return the root hash.
This API has evolved over a bit of time, so I'm open to an argument of why it helps to return something different, but baring that, I lean towards a consistent interface.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have no strong feelings on the return type. I was thinking about what could be useful data to return in this scenario and the only thing I could come up with was the root hash.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One benefit of doing something like the the other returns is that it'll tell you which were actually removed, which would indicate that maybe things were already purged.
const hash = this.raw.makeListHash(newEntries); | ||
await this.#putRootHash(hash, generation); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
from this, it looks like your setting the roothash to a raw entries file, rather than the hash to a file. I've never seen a rootHash specified this way. It may work, but if writing the entries first and then writing that hash to that file also works, I strongly lean towards that option.
I also want you to be careful since this api while somewaht being non-destructive, can be destructive if you're not careful, also ideally keep backups of any documents, and a backup of the current rootHash before attempting any of this.
In short, I'm skeptical that this is the right thing to do.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am doing this exactly like the remarkable app does, I added fiddler to my system in order to make sure I did it the same way. I can double check and possibly make a dump of the files so there is nothing I have missed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah, I guess try double checking. Does the rest of the api work as intended? If so, I can't imagine that the root hash would change from a hash to a raw file.
Looking at the beginning of your function, the rootHash is a normal hash. You pass that into raw.getEntries
which reads the entries list at that hash. If that step works, then presumably rootHash should contain a hash and not the full entries list. If root hash did contain an entries file, then raw.getEntries
should fail, probably at validation before even making the request since I think it checks for a hash of the appropriate length.
* create a list hash from a set of entries | ||
* @param entries - the entries to hash | ||
*/ | ||
makeListHash(entries: RawEntry[]): string; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think there's much utility in exposing this, and it doesn't rely on the state of raw, so I'd move to a utility function. It could still be placed within this file, I just wouldn't put it in the RawApi.
That said, I wouldn't make any changes until there's agreement on the rootHash, as this would be moot.
WARNING!!! UNTESTED
I did a little reverse engineering of the Remarkable Application, and I am pretty sure that this is the way the windows app permanently deletes files from the remarkable cloud (and from all devices)
I have a script that updates a pdf file, and the only way right now is delete create, which ends up filling the trash folder completely.
I will run a few tests, but before I wanted to do that I wanted to see if anyone had looked at this previously and came to the same solution?