-
Notifications
You must be signed in to change notification settings - Fork 128
dbbackup: A minimal database backup plugin #40
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Awesome work, I'll review this asap ^^ |
Just tested in production and it works. Now yes, someone more capable should review it. |
That's something that I have been wanting to look into. Basically the I'm sorry you had to jump through these hoops to get testing working, but great work getting testing working 👍 |
f0c6b63
to
7ebc996
Compare
Modified README.md (nothing else) and squashed everything into two commits. The As a side note, I run this plugin with Toshiba 16GB usb flash drive for the backup-file. Little concerned about flash (p/e cycles) lifetimes, |
'db-backup-file': backup}) | ||
|
||
# rename backup file will cause db_write failure and crash lightningd | ||
os.rename(backup, backup + '_') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How does this cause a failure writing to the DB? I thought sqlite3 keeps an fd open to the db file, which is unaffected by moves.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have no idea, but it works . The error returned by sqlite3: attempt to write a readonly database
Any other ideas how to emulate a disk-full scenario? (maybe loop device)
How can we use this to backup the db to another computer? |
Since I've started to run this plugin I've been seeing hundreds of crashes per day:
|
You'd likely have to mount a remote disk over the network and write the backup file to that disk. |
@fiatjaf Can you add a print of the executed SQL command before it is being applied to the replica? Might be a difference in the sqlite3 version used by |
…ned write error As suggested by @cdecker Fix README
Incorporated the suggestions and some fixups in the tests (some
@fiatjaf On a write failure, the returned error now includes the SQL statement that was tried to be written. Maybe this helps in debugging? |
@plugin.hook('db_write') | ||
def db_write(plugin, writes): | ||
if not plugin.initted: | ||
plugin.sqlite_pre_init_cmds += writes |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is a potential (but unlikely) race condition where db_write
is invoked before init
, then C-lightning crashes before it is able to init
the plugins. It may be safer to save these in a temporary file that you clear out once init
is invoked.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch, though writes before the init
call will all be migrations iirc, so these are not catastrophic.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not keeping up with C-lightning upgrades is "not catastrophic"? Upgrades are a risk by themselves as they represent changed code, and the user might be running on some obscure combination of system components that happen to work on older releases of C-lightning but not on newer ones, so a new release (which is when we are likely to have migrations running) is at increased risk of crashes before init
... shrug
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@ZmnSCPxj Thanks, there is also a potential dead-lock when the plugin makes the rpc.stop()
call from init
while at the same time lightningd
is waiting for a db_write
hook to return. TBH I was totally unaware that plugin with (sync) hooks shouldn't make any (sync) RPC calls, at least not from the same threat that also handles the hook.
If I understand correctly, RPC calls can be send after jsonrpc_listen but are not be answered until the main io_loop_with_timers
To solve these and similar issues, I am exploring a sync init in plugins_config where lightningd
waits for the init
response of certain (perhaps all static?) plugins before continue startup, in a similar way as it does for getmanifest
. And maybe shutdown CL when a plugin's init
returns False? Will make a PR if that doesn't break too much.
os.rename(backup_copy, backup) | ||
plugin.log("Existing db-backup-file OK and successfully synced") | ||
else: | ||
plugin.log("Existing db-backup-file differs from original database, i.e. applying" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A better handling of this would be to just outright copy the original over the backup. Though it would not be safe to do so in the init
as that is async. Instead, on the first db_write
after init
, that is when you do your overwriting the backup with the original. Though this is sensitive to the order in which C-Lightning writes to its own original copy (whether it does the write before db_write
or after db_write
). Might be more complicated than what you are willing to implement and I suppose the operator can manually do the copy in this edge case while C-Lightning is not running.
Just a heads up that with PR ElementsProject/lightning#2924 a couple of things will change:
This was done partially because it allows us to safe some data transferred, simplifies the flush logic (empty list of statements means we don't call |
I am closing this PR, because I am not really happy about the design, mostly the use of Or maybe a |
Here a simple plugin that keeps a synchronized backup of CL's database.
I heard rumors that similar (more advanced ) backup plugins are being developed. But I needed this now to sleep better, please comment if that is justified or not.
Some tests are included that can be run in c-lightnings
pytest
framework by linking this plugin directory into CL's/tests
. It is not ideal, but couldn't find another solution.Attempts to fix #2400
Edit: Last
test_dbbackup_plugin_kill
requires a logline in CL that's not there yet (will create PR)