[New servers] - Migrate data from Knopi to Ramashka #512
Reference: Disroot/Disroot-Project#512
There are two major filestores that need to be migrated: email and Nextcloud.
Email
Estimated migration time is about 4 days. We are rsyncing each individual mbox to speed up the calculation process, and we sync mboxes in parallel (64 at a time) to make things go faster. Because the initial sync takes about 4 days, the copy will have drifted from the source by the time it finishes. This may not be much of a problem, since a secondary sync after the initial migration should take less time (we still need to test that). If the secondary sync does turn out to take too long and we want to avoid mbox state discrepancies, we could implement the idea from the Nextcloud sync.
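The parallel per-mailbox sync described above could look roughly like the sketch below. It is only an illustration, not the script actually used: the function names, the paths, and the assumption that each mailbox is a directory under a common root are all hypothetical. The `runner` parameter exists so the logic can be tried without touching real data.

```python
from concurrent.futures import ThreadPoolExecutor
import subprocess
from typing import Callable, Dict, Iterable, List


def build_rsync_cmd(src_root: str, dst_root: str, mailbox: str) -> List[str]:
    """Build one rsync invocation for a single mailbox directory (assumed layout)."""
    src = f"{src_root.rstrip('/')}/{mailbox}/"
    dst = f"{dst_root.rstrip('/')}/{mailbox}/"
    # -a keeps permissions and timestamps; --delete removes files from the
    # destination that no longer exist on the source.
    return ["rsync", "-a", "--delete", src, dst]


def run_rsync(cmd: List[str]) -> int:
    """Run the command and return its exit code instead of raising."""
    return subprocess.run(cmd, check=False).returncode


def sync_mailboxes(
    mailboxes: Iterable[str],
    src_root: str,
    dst_root: str,
    workers: int = 64,  # the thread mentions 64 parallel syncs
    runner: Callable[[List[str]], int] = run_rsync,
) -> Dict[str, int]:
    """Sync every mailbox with up to `workers` rsync processes at once."""
    names = list(mailboxes)
    cmds = [build_rsync_cmd(src_root, dst_root, name) for name in names]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        codes = list(pool.map(runner, cmds))
    # Map each mailbox to its exit code so failed ones can be retried.
    return dict(zip(names, codes))
```

Collecting the exit codes per mailbox makes it easy to re-run only the mailboxes that failed before measuring the next pass.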
Nextcloud
The Nextcloud file store is huge. Not only that, it is also tightly coupled to the database, as the filecache is stored in it. This, in combination with server-side per-user key encryption, is very much prone to problems. From our experience, this is a very fragile and delicate matter. For example:
Some years ago we had a situation where a user accidentally wiped his Nextcloud storage by removing a directory from the Nextcloud sync client. We had the data (files) in the backup, so we wanted to test how easy it is to restore. However, this turned out to be rather a disaster. We had the data, which we could restore, but all the keys and the filecache in the db had been removed. We had a daily db backup, but the user reported the issue too late and we could not recover the db state from the day the files vanished. It took us days to recover only a portion of the data.
Moving the datastore to another server does bring up some problems we need to face. There are two ways to do the move.
With long downtime - We could do a sync similar to the mailbox one in terms of procedure: move files per user in parallel, measure the time, repeat the move. After at least two runs, this could give us an idea of how long we would need to put Nextcloud offline (or at least disable the files app) before we can do the last move and re-enable everything.
With no or minimal per user downtime - This approach would require both new and old datastores mounted on the cloud server. Then we should commence the following per user:
This procedure would mean that although a user can experience a short downtime (the time it takes to re-sync the data, which can take minutes), the service as a whole can work without disturbance for everyone else. However, we do need to consider a few things (and probably quite a few more that weren't thought of yet).
In both cases we should start testing the migration to see how long it will take. For the second option, we need to do a lot of tests before we commence this operation in production.
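Measuring how long repeated sync passes take, as proposed above, can be sketched like this. It is a hypothetical helper, not part of any existing tooling: `sync_once` stands for whatever performs one full pass, and the injectable `clock` is only there so the timing logic can be checked without waiting for a real sync.

```python
import time
from typing import Callable, List


def measure_passes(
    sync_once: Callable[[], None],
    passes: int,
    clock: Callable[[], float] = time.monotonic,
) -> List[float]:
    """Run `sync_once` several times and return the duration of each pass."""
    durations = []
    for _ in range(passes):
        start = clock()
        sync_once()
        durations.append(clock() - start)
    return durations


def pass_ratios(durations: List[float]) -> List[float]:
    """Ratio of each pass to the previous one; values below 1 mean it got faster."""
    return [
        later / earlier
        for earlier, later in zip(durations, durations[1:])
        if earlier > 0
    ]
```

Once the ratios settle close to 1, further passes no longer shorten the sync, and the last measured duration is a reasonable estimate for the final downtime window.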
Sounds like a pain in the butt!
Obviously the second proposal for NC sounds great: if a user can have no more than a day without using his/her NC, that would be awesome.
Concerning email, does that mean that there is no stop of the service?
I have started the migration of mailboxes. It goes pretty fast. In two or three days the initial sync should be complete, and then we can run and measure the secondary syncs. This will give us insight into how long this could take. Once we know this, we can make a decision on how to do the switch from one filestore to the other.
Mailbox migration is done. Initial sync is complete. Re-syncing now takes about 5 hours. We should get it down to just 2-3 hours.
So on the weekend (Saturday the 15th) we should be able to migrate to the new server.
Time to get back to the Nextcloud migration.
Initial sync of nextcloud data has started.
The second sync after the initial one took about 11.5 hours. I think this can be lowered by running the sync in a loop for a few days. Once we have a more realistic idea of how long the last sync will take, we can decide how to do it. I think there are three options, though the third one seems to be overkill at this moment and could cause a ton of issues, so I wouldn't even take it into consideration:
The reason this is a rather delicate operation is mainly that the files are encrypted, and a possible inconsistency between database, keys and files could leave files undecryptable. Having had to deal with such issues in the past, I would rather go for option 1 or 2, as it will be stressful as is.
Probably the safest bet is disabling the entire Nextcloud for the duration of the move, but that could cause several hours of downtime. Disabling only the files app could be a good compromise, as only files would be affected. However, we need to make sure no cleanup cronjob or anything like that would trigger changes to the filecache and other file-related tables in the database.
In the end it all depends on how long the last sync will take. So let's wait a week or so to have a better idea.
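The "disable the entire Nextcloud for the duration of the move" variant could be wrapped in a small script along these lines. This is a sketch under assumptions: the `occ` path and the `www-data` user are hypothetical and depend on the installation, and `sync_once` stands for whatever performs the last sync. Only `occ maintenance:mode --on` and `--off` are relied on; whether maintenance mode alone quiets every background job that touches the filecache still has to be verified.

```python
import subprocess
from typing import Callable, List, Sequence

# Hypothetical invocation; adjust the user and the path to the real installation.
OCC = ["sudo", "-u", "www-data", "php", "/var/www/nextcloud/occ"]


def run_checked(cmd: List[str]) -> None:
    """Run a command and raise if it exits with a non-zero status."""
    subprocess.run(cmd, check=True)


def final_sync(
    sync_once: Callable[[], None],
    run: Callable[[List[str]], None] = run_checked,
    occ: Sequence[str] = OCC,
) -> None:
    """Put Nextcloud into maintenance mode, do the last sync, then re-enable it."""
    run([*occ, "maintenance:mode", "--on"])
    # If the sync raises, maintenance mode deliberately stays on so that
    # database, keys and files can be inspected before anyone writes again.
    sync_once()
    run([*occ, "maintenance:mode", "--off"])
```

Leaving maintenance mode on after a failed sync is a design choice made here for safety, given how fragile the combination of encryption keys, filecache and files is described to be.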
I also think that the "safest bet is disabling entire nextcloud for the duration of the move", even if it takes several hours. I prefer that to the issues the other options could generate.
I agree too. As long as we warn users about it, I guess that is something they can understand.
Got it down to 7.20h. I think we could get it down to 5 easily, maybe more. I will do some more tweaking.
We are almost ready, but decided at the last meeting to move it to January to prevent downtime around the end of the year, when people would rather have access to their data and it's generally not the best period to do this kind of stuff.
Ufffff.... finally. Looks like things are done!
I will keep Knopi in the rack for a few more days to make sure things are doing fine. Once we think it's ok to take it down, we can hang Simo in its place.