feat(koplugin): bulk sync annotations #84

tku137 · 2026-01-11T14:25:55Z

This adds the possibility to bulk-sync annotations. It is probably a rarely used feature, I used it to have an initial sync of all book annotations without the need to open each book individually.

Unfortunately, getting annotations is only available for the currently opened book in koreaders api, see #79. Crawling the whole library directory for sidecar files would be pretty heavy. So in this PR, I use koreaders history api, which gives a list of books opened in the past. This includes read, reading, paused and so on books. So my thinking is: for a book to have highlights, it must have been opened at one point and thus is listed in the history.

Having a list of books from koreaders history, we then can loop over all these books and sync their annotations one by one.

Several improvements have been made along the way, for example:

only open docsettings once (instead of three times)
reduce function calling (extract common functions)
use md5 from sidecar files, no need to query the statistics database for the md5 value

This reverts commit f7ec420.

We do not need to query the statistics database for book md5 values. We can just use the ones from the sidecar files, which we already have anyways.

…ionReader.cleanAnnotations

tku137 · 2026-01-11T14:37:23Z

Hm, need to look into why the test was failing. Not sure why any of the changes I made here affected ths.

GeorgeSG · 2026-01-11T14:40:44Z

The test is flaky :/ I hit this earlier this morning.

GeorgeSG · 2026-01-11T14:42:24Z

plugins/koinsight.koplugin/annotation_reader.lua

+  local logger = require("logger")
+  local DocSettings = require("docsettings")


This is me being unfamiliar with lua, but can't we define these once outside the functions and use them? Seems counter-intuitive to require every time.

Ah true, good catch! I had an issue with ReaderUI, which is only available when reading a book. So I guess I put all requires inside functions for no reason. Just improved it.

I agree that requiring ReaderUI in functions is sub-optimal. I found the reason for my issues with it and implemented lazy-loading it, works perfectly fine on my side.

While at it, I found a better KoReader API for doc_settings, we use that now.

GeorgeSG · 2026-01-11T14:50:48Z

@tku137 can we chat directly on another platform? Discord or something else?

tku137 · 2026-01-11T15:04:02Z

@tku137 can we chat directly on another platform? Discord or something else?

Sure, I have a Discord account. I could send you a mail with my handle if you like? Or just send me an invite.

GeorgeSG · 2026-01-11T15:18:23Z

@tku137 can we chat directly on another platform? Discord or something else?

Sure, I have a Discord account. I could send you a mail with my handle if you like? Or just send me an invite.

Yup, perfect! Shoot an email at [email protected]

GeorgeSG · 2026-01-11T16:48:57Z

plugins/koinsight.koplugin/annotation_reader.lua

+    highlights = (stats and stats.highlights) or 0,
+    notes = (stats and stats.notes) or 0,
+    last_open = (summary and summary.modified) or os.time(),
+    total_read_time = 0,


Seems like total_read_time will always be 0? Wouldn't we lose that data if we sync like that?

This took some time. I now remember where this comes from: I have a user-patch that stores total_read_time and total_read_pages to the sidecar files because I wanted to use these for something. Thus I added those while looking at my sidecar files. HOWEVER, this is actually dangerous to sync during annotation sync and I will remove it.

Please check my solution. In my understanding, I now need to distinguish between statistics.db sync and annotations sync, thus total_read_time and total_read_pages in the upload service need to be updated only on the statistics.db sync path, right?

GeorgeSG · 2026-01-11T16:51:50Z

The current sync method also bulk syncs all books, it just doesn't bulk sync with highlights, correct? I'm wondering if we should be keeping both of the syncs - seems kinda confusing. If this implementation bulk syncs all books with annotations, we can probably keep only that and remove the old one?

Also, maybe it's a nice idea to sync only the current book on suspend? So maybe it's worth implementing a "Sync current book" alongside "Sync all books with all highlights"?

…for bulk operations There were too many requires inside functions all over the place. Also, I saw a nicer Koreader API to get doc_settings. But I encountered an error once, turns out that after rebooting KoReader, there is a chance that ReaderUI is not yet there, failing requires at module level. I used lazy loading, which seems to run just fine. Its still nicer to require once, not for every function call, ESPECIALLY during bulk sync! This prevents crashes when requiring the module outside the reader, preserves live in-memory settings during normal usage, and makes bulk annotation reads safe and non-destructive.

tku137 · 2026-01-21T19:19:34Z

The current sync method also bulk syncs all books, it just doesn't bulk sync with highlights, correct? I'm wondering if we should be keeping both of the syncs - seems kinda confusing. If this implementation bulk syncs all books with annotations, we can probably keep only that and remove the old one?

Also, maybe it's a nice idea to sync only the current book on suspend? So maybe it's worth implementing a "Sync current book" alongside "Sync all books with all highlights"?

This is an excellent idea. I think we should have a single "Synchronize data" button in the menu that syncs statistics (only available in bulk) and annotations in bulk, while keeping a "sync statistics as usual and annotations for opened book" version in the sync-on-suspend path.

I am still somehow worried about the performance of bulk-syncing annotations, but given how "bulky" a statistics sync already is and how basically instantaneous an annotations sync is, I guess even for a couple hundred books, this still is pretty fast, since we utilize the Koreader history of opened books without crawling through a huge library.

coveralls · 2026-01-21T19:24:54Z

coverage: 53.861% (+0.2%) from 53.632%
when pulling 21ae613 on tku137:bulk-sync
into 1eb7dcd on GeorgeSG:master.

…tistics in full sync - rename sync functions for clarity: sync() → syncCurrentBook(), bulkSync() → syncAllBooks() - merge "Bulk Sync All Books" into single "Synchronize data" menu option - fix syncAllBooks() to include full statistics sync from database (was only syncing annotations) - sync-on-suspend now explicitly use syncCurrentBook() for fast, targeted sync Manual sync now properly syncs all statistics + all annotations. Suspend sync remains fast, syncing all statistics + current book only. I am unsure about the performance impact on syncing all anotations actually is though...

GeorgeSG

Thanks again, this looks very good! Added some questions, this is more for my understanding of what's going on :D

GeorgeSG · 2026-01-24T15:46:05Z

packages/common/types/book.ts

@@ -1,5 +1,5 @@
 export type KoReaderBook = {
-  id: number;
+  id: number; // Optional for annotation-only sync


should we mark as ?: number then?

Nah, I thought this is optional, removed the ? but not the comment, will do this.

GeorgeSG · 2026-01-24T15:48:31Z

plugins/koinsight.koplugin/annotation_reader.lua

+  -- when inside reader
+  local ui = get_live_ui()
+  if ui and ui.doc_settings then
+    return ui.doc_settings:readSetting("partial_md5_checksum")


I'm concerned about #90, #91, #92. Do you think this will address the problem? Are the book md5s identical when syncing?

Should have read these comments before spending a significant amount of time :D Yes, I suspect this could be in fact the culprit.

Specifically: there might be a mismatch between the statistics and annotation sync path, which would explain the duplication of books, one with correct statistics and one with correct annotations, as some users reported. I don't know why not all users end up with such duplications, though. Unfortunately: "it works in my machine" makes debugging this really hard here :D

Actually, I was wrong: the mismatch between statistics and annotations path happens on master! Using md5 should absolutely solve the title matching. I was looking on the bulk-sync branch, but that is of course not the code users have reported matching issues with 🤦‍♂️

However, there are other issues I found that could still lead to "ghost books" and will investigate...

GeorgeSG · 2026-01-24T15:51:38Z

plugins/koinsight.koplugin/annotation_reader.lua


  if not book_md5 then
    logger.warn("[KoInsight] Could not determine MD5 for current book, skipping annotations")
    return annotations_by_book


we should return nil or {} to be sure this is hardcoded empty (just in case something somewhere pushes data to annotations_by_book.

GeorgeSG · 2026-01-24T15:53:33Z

plugins/koinsight.koplugin/annotation_reader.lua

+  -- Get MD5
+  local md5 = doc_settings:readSetting("partial_md5_checksum")
+  if not md5 then
+    logger.warn("[KoInsight] No MD5 found in sidecar for:", file_path)


🤔 Idea for the future:

Should we push these logs (warnings & errors) to KoInsight server? I'm thinking it will be useful to debug issues from users - easier to get them to share docker logs than KoReader logs.

Thats a good idea. Would make bug-hunting a lot easier.

GeorgeSG · 2026-01-24T15:58:21Z

plugins/koinsight.koplugin/main.lua

    sorting_hint = "tools",
    sub_item_table = {
-      -- 1) Synchronize data
+      -- 1) Synchronize data (all books)


Should we add a menu entry for sync current book?

I think its ok to have only one sync button that syncs all books, statistics is all books anyways, and annotations are synced for all books. So the "(all books)" makes sense for the user. But we use a sync current book function internally for sync on suspend, where we sync all statistics as usual, and annotations for the currently opened book. Thats how I implemented it in the latest version of this PR.

However, I can happily add a sync current book back (it was there till one of the latest commits).

GeorgeSG · 2026-01-24T16:00:53Z

We should also probably bump the plugin version

tku137 added 9 commits January 11, 2026 13:34

feat(koplugin): have bulk-syncing

7b47cad

feat(koplugin): use sidecar file md5

f7ec420

Revert "feat(koplugin): use sidecar file md5"

b4d356a

This reverts commit f7ec420.

fix(koplugin): flush currently opened book befure bulk sync

b487b83

refactor(koplugin): use sidecar file md5

a33ea8b

We do not need to query the statistics database for book md5 values. We can just use the ones from the sidecar files, which we already have anyways.

refactor(koplugin): single docsetting open in getAllBooksWithAnnotations

94909f3

refactor(koplugin): extract annotation cleaning into KoInsightAnnotat…

f81faba

…ionReader.cleanAnnotations

refactor(koplugin): getCurrentBookMd5 directly opens sidecar

b2896e5

feat(koplugin): register BulkSync as koreader action (for gestures)

bb2095e

GeorgeSG reviewed Jan 11, 2026

View reviewed changes

perf(koplugin): require docsettings and logger once in annotationreader

58b65f5

GeorgeSG reviewed Jan 11, 2026

View reviewed changes

GeorgeSG mentioned this pull request Jan 12, 2026

Cannot see any annotations for any book #89

Open

tku137 added 3 commits January 21, 2026 20:08

fix: trick flawed test to run

c78d074

fix: remove statistics values from annotation sync

bc56b34

Merge branch 'master' into bulk-sync

0c64c90

tku137 requested a review from GeorgeSG January 21, 2026 19:52

style(koplugin): add separator to sync button

21ae613

tku137 mentioned this pull request Jan 22, 2026

Book with annotations doubling and showing annotations on wrong book #91

Open

GeorgeSG reviewed Jan 24, 2026

View reviewed changes

		local logger = require("logger")
		local DocSettings = require("docsettings")

feat(koplugin): bulk sync annotations #84

Are you sure you want to change the base?

feat(koplugin): bulk sync annotations #84

Uh oh!

Conversation

tku137 commented Jan 11, 2026

Uh oh!

tku137 commented Jan 11, 2026

Uh oh!

GeorgeSG commented Jan 11, 2026

Uh oh!

Choose a reason for hiding this comment

Uh oh!

tku137 Jan 11, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

GeorgeSG commented Jan 11, 2026

Uh oh!

tku137 commented Jan 11, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

GeorgeSG commented Jan 11, 2026

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

GeorgeSG commented Jan 11, 2026

Uh oh!

tku137 commented Jan 21, 2026

Uh oh!

coveralls commented Jan 21, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

GeorgeSG left a comment

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

GeorgeSG commented Jan 24, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

tku137 Jan 11, 2026 •

edited

Loading

tku137 commented Jan 11, 2026 •

edited

Loading

coveralls commented Jan 21, 2026 •

edited

Loading