I use rsnapshot for all of the *nix backups at our company. It's perfect for us.
http://rsnapshot.org/
But from reading the rsnapshot link, it has a problem that is all too familiar to me.
Let me explain.
I have a web server that obviously is facing the world. With my fairly limited knowledge I've made every attempt to secure it, and so far I have been successful. Part of the securing process has been to adopt very long, random passwords that would hopefully take considerable effort to break.
I need to back up that server on a regular basis to another computer on my network. At the moment, I do it manually, using ssh, which requires me to enter my password at the appropriate time. And yes, I fall into all the pitfalls that Chad has outlined, including not doing it as regularly as I should.
Now, it seems to me that not only rsnapshot but all the automatic backups I've researched so far need key-based logins without a passphrase or password, and I feel really uncomfortable with that.
Am I missing or misunderstanding something, and becoming paranoid about a problem that perhaps exists only in my mind?
Using public key authentication for SSH basically authenticates one machine with another, because (generally speaking) SSH keys are generated on a given machine, rather than (presumably) carried around with you between machines over time like OpenPGP keys. The public keys for SSH are generally meant to authenticate a machine rather than a person, in other words, which makes them ideal for purposes like backups and less ideal for purposes like secure communication between people.
SSH establishes an encrypted connection with a given remote system before exchanging authentication data, and the key itself is not actually exchanged; rather, the private key operates on some data that is sent to the remote machine, and if the remote machine possesses the corresponding public key it can then reverse the operation performed by the private key to authenticate the machine that possesses the private key.
Does that make sense? Do you have concerns that are not addressed like this -- such as the ability for a local user to use a machine's private key to access the remote machine via SSH?
If you are running automated backups, the remote machine must by definition "trust" the machine connecting to it. Otherwise, either the backup will not happen, or you will have to be on-hand to authenticate as a user every time a backup runs.
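For anyone uneasy about a passphrase-less key, one common way to narrow that trust is to restrict what the key may do in the authorized_keys file of the account being connected to. The sketch below only builds such an entry. The `from=`, `command=`, and `no-*` options are standard OpenSSH authorized_keys options, but the host name, the forced command path, and the key text in the usage comment are placeholders made up for illustration, not anything from this thread.

```shell
# Build one authorized_keys line that limits a passphrase-less backup key.
# Arguments: allowed source host, forced command, public key text.
restricted_key_entry() {
    from_host="$1"
    forced_cmd="$2"
    pubkey="$3"
    printf 'from="%s",command="%s",no-pty,no-port-forwarding,no-agent-forwarding,no-X11-forwarding %s\n' \
        "$from_host" "$forced_cmd" "$pubkey"
}

# Example (every value here is a hypothetical placeholder):
#   restricted_key_entry backup.example.lan /usr/local/bin/backup-only \
#       "ssh-ed25519 AAAA... backup@central" >> ~/.ssh/authorized_keys
```

With an entry like this, a stolen copy of the key should only work from the named host and should only be able to run the one forced command, rather than open an interactive shell.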
Thank you so much for that personalised reply.
I think the second sentence in your first paragraph addressed the concern I had, which it seems was unfounded.
My concern was that a malicious user could somehow use the automated process to gain access to the server, something that I've jealously guarded against.
In terms of a security risk from within - there is none, as it's a family operation and no one from outside has access.
As I think I may have said before, you have a wonderful ability to take complex subjects and break them down to an understandable level (even for me :-)) And yes, your post made perfect sense.
Many thanks indeed. I shall progress along this path further.
I'm quite pleased that you found my response so helpful. A grateful reply like yours makes my day.
I think it's looking for certificate login over ssh. In general, certificate login is better; the password check happens on your local machine and the private certificate does not get used unless the password is correct.
With automated stuff, the issue is the password. Normally programs want a password-less certificate rather than having to use the cert and store the password for it. In that case, the challenge becomes protecting your private certificate.
The central server gets a dedicated "remote connecting" user with ssh certs. Both public and private certs can be set Read Only by Owner ("- r-- --- ---" or "chmod 400"). I go so far as to then remove the password from the connecting user; one must log in as a regular user, su to root, then su to the connecting user (and then all they get is that user's certificates, since it hasn't rights to do much of anything on the system).
On each remote system, create a dedicated "incoming connections" user. I don't believe this user needs ssh certificates of its own since it is only for receiving connections. Use ssh-copy-id to put the connecting user's public certificate into each remote system's incoming user. Once the certificate is in place, delete the incoming user's password and confirm that you can still connect by certificate. Again, one must log in as a regular user, su to root, then su to the incoming user, since this user should never need to log in by password.
Each remote system has a dedicated incoming user and the central user's public certificate. The core server has the connecting user's public and private certificates. On both central and remote systems, the dedicated user has no password; one must su to the user. The only account that can log into the remote users is the central user with its private certificate.
It sounds like your setup would have two machines; central server and webserver. Your central server would have the public and private keys. You want it to reach out to the webserver to harvest backups not wait for webserver to reach in and feed it backups. On your webserver, you have the "backup connection user" account and the central server's public certificate.
If your webserver is broken, they may get your central server's public certificate. They can't reverse the private cert out of it or reverse out a usable password. They can't initiate connections from the public cert side. You just need to keep your central server protected since it contains the private cert which would let one initiate connections with other related systems.
In the end, you do have to decide if you trust unprotected certificates in your chosen OS's file permissions. Can you limit the valuable cert to one machine and can you get at that certificate without being Root or the valid certificate owner?
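The "Read Only by Owner" step above can be checked as well as applied. This is a minimal sketch, assuming GNU coreutils (for `stat -c`, as on Debian); the function name and the key path in the usage comment are hypothetical.

```shell
# Set a key file to owner-read-only (mode 400) and confirm the result.
# Returns non-zero if the chmod fails or the mode is not what we asked for.
lock_down_key() {
    key_file="$1"
    chmod 400 "$key_file" || return 1
    # stat -c '%a' prints the octal permission bits (GNU coreutils).
    mode=$(stat -c '%a' "$key_file")
    [ "$mode" = "400" ]
}

# Example (hypothetical path):
#   lock_down_key /home/backupconn/.ssh/id_ed25519 || echo "key is not locked down" >&2
```

File permissions only help against other unprivileged users; as noted above, root on the central server can still read the file.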
Thank you so much for this very comprehensive reply regarding certificate protection.
The idea of creating a dedicated backup user is superb and something I hadn't considered.
That, along with severely restricting permissions, is a path well worth taking, and something I will build into the overall process as I progress this.
In essence, although I remain sceptical about all security issues, both are Debian computers, which remain (in my view) really solid machines that I don't have any major concerns about.
Thanks again for taking the time to reply. It's most appreciated.
Rsync is fantastic and only gets better when you add in ssh. Transfers between OS X, Debian, the NAS box, my remote servers... it "just works" (tm). The primary reason I continue to mention the lack of native SSH service in Windows is how slick SSH and rsync over SSH are.
The only thing I miss with rsync is synchronization. It will go from A to B. It'll delete things at B which do not exist at A. In a single processing step, it won't copy in both directions between A and B to balance out what has changed in both locations. I've even put time into trying to figure out ways to sync using an intermediary location, but no such luck. Am I missing something? Can rsync do a true sync between two locations rather than a one-way sync with "--delete"?
For now, Unison provides a cross-platform tool with synchronization between both sides, but it means having the GUI layer too.
But... for one-way A to B transfers, rsync is fantastic. I use it to sync media to my palmtop and even to sync PIM data from the palmtop to desktop and notebook (GPE). I'm not yet to the point where rsync is my choice for copying around on the same machine, but it's used where possible between machines.
Unfortunately (for you), rsync is a backup tool -- which assumes a one-way relationship. If you want to be able to synchronize between multiple locations, especially without ugly kludges and an intermediary server, you need something like Unison or a DVCS such as Mercurial.
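To make the one-way behaviour concrete, here is a minimal sketch of an A to B mirror using rsync's `-a` and `--delete` options. The function name and the paths in the usage comment are made up for illustration. Anything that exists only at B is removed, which is exactly why this is a mirror and not a two-way sync.

```shell
# Mirror the contents of directory A into directory B, one way only.
# Files that exist only in B are deleted; changes made at B are overwritten.
mirror_one_way() {
    src="$1"
    dest="$2"
    # The trailing slash on the source copies its contents, not the directory itself.
    rsync -a --delete "${src%/}/" "${dest%/}/"
}

# Example (hypothetical paths):
#   mirror_one_way /home/me/media /mnt/palmtop/media
```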
I've been considering Subversion but that means keeping a central server rather than more arbitrary relationships.
Unison is my tool of choice. When I was last comparing, it had the best management of sync and conflict resolution. It's also available across *nix and Windows with a portableapps version.
I was hoping I'd missed an rsync command switch, but so be it. Rsync is great for what it does.
"The primary reason I continue to mention the lack of native SSH service in Windows is how slick SSH and rsync over SSH are."
I have said exactly this a bazillion times. I'm pretty sure Microsoft is actively interested in preventing such an incredibly useful, free, cross platform functionality as ssh and rsync.
Windows backup solutions suck, unless you want to fork over to a third party vendor. If ssh were easier to implement on Windows this would be a much easier occupation...
I tried unison across differing OSs, but never got over what appears to be very strict versioning. I couldn't find the same version # for both Linux and Windows, and any attempt to use it threw an error to the effect that the versions must be the same.
Unison should just be taking two directories and keeping them in sync. I've not relied on its rsync, just its sync between two locations.
With Windows, I use portable Unison; it keeps its profiles clean and in one place, and I normally have it on the flash drive it's syncing folders to/from. With *nix, I install Unison and then create its relevant profiles. I've never had a conflict over versions, but I'm not sharing the profiles config between versions.
Since I'd given up on unison I was unaware of 'portable unison.' Looks like the solution to those pesky version errors.
Thanks for the tip... as usual.
What would be the most effective way to maintain incremental backups? I'd like to be able to keep a monthly backup for a year, weekly backups for a quarter, and incremental daily backups from the prior weekly backup for a month. But I don't have enough space for all those daily backups in full.
Great article, BTW.
With my scripted backup, I squash directories into tarballs. It includes the date as part of the filename.tar.bz2. For April, cleanup will involve removing March; "rm *-201003*.tar.bz2"
Now the part I haven't automated is the actual file removal. My backup script does not yet remove the outdated copies.
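The missing removal step could be sketched along these lines. This assumes the naming scheme described above (a date stamp such as 20100315 preceded by a hyphen, ending in .tar.bz2) and a `find` that supports `-maxdepth` and `-delete`, as GNU find does; the function name and the directory in the usage comment are hypothetical.

```shell
# Remove the dated tarballs for one month from a backup directory.
# Arguments: backup directory, month stamp as YYYYMM (e.g. 201003).
prune_month() {
    backup_dir="$1"
    yyyymm="$2"
    # Accept only a six-digit stamp so a bad argument cannot widen the pattern.
    case "$yyyymm" in
        [0-9][0-9][0-9][0-9][0-9][0-9]) ;;
        *) echo "usage: prune_month DIR YYYYMM" >&2; return 1 ;;
    esac
    find "$backup_dir" -maxdepth 1 -type f -name "*-${yyyymm}*.tar.bz2" -delete
}

# Example (hypothetical directory): during April 2010, drop March's archives.
#   prune_month /srv/backups 201003
```

Passing the month explicitly keeps the function easy to test; a cron job would have to work out the previous month itself before calling it.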
Alternatively, I know some of the rsync-based backup options use links. This is to save on space; if a file hasn't changed, simply include a link from the current backup to that older file.
backup 1 = file1.ext
backup 2 = link-file1.ext, file2.ext
backup 3 = link-file1.ext, file3.ext
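One way to get that linking behaviour is rsync's `--link-dest` option, which hard-links unchanged files to a previous snapshot instead of copying them again. This is only a sketch under assumptions: the function name, directory layout, and snapshot names are invented here, and the snapshot root should be an absolute path because a relative `--link-dest` is interpreted relative to the destination directory.

```shell
# Create a new snapshot of SRC under SNAP_ROOT/NAME.
# If PREV names an existing snapshot, unchanged files are hard-linked to it,
# so only new or changed files take up additional space.
snapshot_with_links() {
    src="$1"
    snap_root="$2"   # use an absolute path
    name="$3"
    prev="$4"        # may be empty for the very first snapshot
    if [ -n "$prev" ] && [ -d "$snap_root/$prev" ]; then
        rsync -a --link-dest="$snap_root/$prev" "${src%/}/" "$snap_root/$name/"
    else
        rsync -a "${src%/}/" "$snap_root/$name/"
    fi
}

# Example (hypothetical paths and names):
#   snapshot_with_links /var/www /srv/snapshots backup2 backup1
```

Because the links are hard links, deleting an old snapshot does not break the newer ones; the data goes away only when the last snapshot referencing it is removed.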
If you have enough room for three full backups plus the incremental diffs, you can just maintain three rsync backup lineages -- one of them updated monthly, another weekly, and another daily.
I've been using a cp command in the scripts to copy the last daily sync to a backup that rsync won't touch, before doing the sync.
So there's the 'live' copy the sync will update, and a rolling one-day-old copy. The -f switch overwrites the day-old copy without prompting (e.g., when automated with cron).
I do the same with a weekly, monthly or any other requirement. Just run the "monthly" script once a month, which cp's the current to a separate "month" copy.
Arguments in the cp can point the given archival copy anywhere, off site, on a removable device etc.
Now that you all got me thinking, I should just use a separate rsync for weekly, monthly etc. as needed, rather than copying to an archive. It'll use a lot less overhead, and I can dump the 'wait' I have to put between the cp and rsync to ensure the cp completes before the rsync changes the source.
Duh, duh and duh. The only reason to use a cp would be to place the archive on a Windows system, where rsync and ssh aren't welcome.
Thankfully all the backup solutions I've deployed are on Linux.
Funny how the obvious can sit there unseen, right in front of you basically indefinitely, until Chad writes an article about it. =D
Problem is the cp also just plain works(tm). A heck of a lot more inefficient...
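The "separate rsync per interval" idea might look roughly like this: one function that updates a named lineage, called on different schedules. The function name, directory layout, script path, and cron times are all assumptions for illustration, not anyone's actual setup from this thread.

```shell
# Update one backup lineage (daily, weekly or monthly) from SRC.
# Each lineage is an independent full mirror under BACKUP_ROOT.
update_lineage() {
    src="$1"
    backup_root="$2"
    lineage="$3"
    case "$lineage" in
        daily|weekly|monthly) ;;
        *) echo "unknown lineage: $lineage" >&2; return 1 ;;
    esac
    mkdir -p "$backup_root/$lineage" &&
        rsync -a --delete "${src%/}/" "$backup_root/$lineage/"
}

# Hypothetical crontab entries, assuming a wrapper script that calls
# update_lineage with fixed source and backup paths:
#   30 2 * * *  /usr/local/bin/update-lineage.sh daily
#   45 2 * * 0  /usr/local/bin/update-lineage.sh weekly
#   0  3 1 * *  /usr/local/bin/update-lineage.sh monthly
```

Since each lineage is synced straight from the source, there is no cp step to wait for, at the cost of holding three full copies.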
Just ran into this, and it seems remotely related, so I thought I'd post the link in case it is of interest to anyone with Linux servers.
http://www.r1soft.com/tools/linux-hot-copy/