For a company creating a disaster recovery (DR) plan, one often-overlooked question is how much data it really needs at the DR site. The easy answer is all of it, but that means paying for storage at the DR site and for a replication solution that gets ALL of the data there and keeps it updated. A more diligent DR planner works with the business units and the IT department to determine what data the company really needs to run its daily operations after a disaster.
The fact is, not all the data on the servers, SAN, NAS, etc., is required for the business to run its daily operations. Most data is simply being kept as a CYA measure or because regulations require it. This type of data should be stored on low-cost disk in an archive or on tape for long-term storage. That way it's available when needed, just not immediately. You don't need this "long-term" data taking up valuable space in your replication solution.
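One rough way to find archive candidates is to list files that have not changed in a long time. The sketch below is only an illustration: the function name `find_stale_files` and the age threshold are assumptions, not something the process above prescribes. Last-modified time is also only a stand-in for "not needed for daily operations", so the business units still have to confirm the result.

```python
import os
import time


def find_stale_files(root, max_age_days, now=None):
    """Return a sorted list of files under root not modified in max_age_days.

    Assumption: last-modified time is a reasonable stand-in for staleness.
    """
    if now is None:
        now = time.time()
    cutoff = now - max_age_days * 86400  # 86400 seconds per day
    stale = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                mtime = os.stat(path).st_mtime
            except OSError:
                continue  # file vanished or is unreadable; skip it
            if mtime < cutoff:
                stale.append(path)
    return sorted(stale)
```

Calling `find_stale_files` on a file share with a threshold of, say, 365 days gives a first list to review with the data owners before anything is moved to archive or tape.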
Data that is required in the case of a disaster can usually be split into tiers. The first tier is made up of the most critical systems, which generally have to be running within 24 hours. The second tier consists of the systems that need to be running within 72 hours, and the third tier covers systems that can wait 96 hours or more. Deciding which systems belong in which tier can involve a lengthy, contested discussion. But simply put, every business needs to work this out for itself.
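The tiering above can be captured in a few lines once each system has an agreed recovery-time target. This is a minimal sketch: the system names and hour values are hypothetical, and placing anything slower than 72 hours in the third tier is an assumption, since the thresholds above leave the 72-to-96-hour range open.

```python
from dataclasses import dataclass

# Tier boundaries in hours, taken from the tiers described above.
TIER1_MAX_HOURS = 24
TIER2_MAX_HOURS = 72


@dataclass
class System:
    name: str
    recovery_hours: int  # agreed time by which the system must be running


def assign_tier(system):
    """Return 1, 2, or 3 based on how quickly the system must be running."""
    if system.recovery_hours <= TIER1_MAX_HOURS:
        return 1
    if system.recovery_hours <= TIER2_MAX_HOURS:
        return 2
    return 3  # assumption: everything slower than 72 hours lands in tier 3


# Hypothetical inventory, for illustration only.
systems = [
    System("order-entry", 8),
    System("email", 48),
    System("reporting", 120),
]

tiers = {s.name: assign_tier(s) for s in systems}
```

The code is the easy part. The recovery-time numbers that feed it are what the business units and IT have to negotiate.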
The data, no matter which tier it belongs to, must be stored somewhere at the DR site. The storage at the two sites doesn't have to be the same, and the systems can actually be different. (This, of course, depends on the type of replication you use.) You could replicate a production server that has local storage to a virtual server at the DR site that uses SAN or NAS storage. You could also replicate many production servers' data to a single server at the DR site. It depends on what resources you have and how much you really need to have at the DR site.
The goal for anyone in charge of storage is to make sure that data gets to the DR site and is accessible. The systems team will have their hands full getting systems up and running, and you don't want them looking at you and wondering where the data is. And you don't want the boss looking at the team and wondering why the systems are running but he can't do anything.
So what's the simple solution? Replicate the data to an always-on storage solution. This could be a few servers with a lot of local storage or a NAS/SAN device that has storage space available. When you limit the amount of data you need available at the DR site by leaving out the stale data, you cut down both the capacity you need there and the bandwidth you need to move the data to the DR site.
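A back-of-the-envelope calculation shows how much excluding stale data can save. The sketch below is an estimate only: the function names are hypothetical, it uses decimal units (1 GB = 8,000 megabits), and it ignores compression, deduplication, and protocol overhead.

```python
def dr_capacity_gb(total_gb, stale_fraction):
    """Capacity needed at the DR site once stale data is excluded."""
    if not 0 <= stale_fraction <= 1:
        raise ValueError("stale_fraction must be between 0 and 1")
    return total_gb * (1 - stale_fraction)


def initial_sync_hours(data_gb, link_mbps):
    """Hours to copy data_gb over a link running at link_mbps megabits/s."""
    if link_mbps <= 0:
        raise ValueError("link_mbps must be positive")
    megabits = data_gb * 8 * 1000  # decimal units: 1 GB = 8,000 megabits
    return megabits / link_mbps / 3600


# Hypothetical figures: 10 TB in production, 60% of it stale, 100 Mbps link.
everything = initial_sync_hours(10_000, 100)
active_only = initial_sync_hours(dr_capacity_gb(10_000, 0.6), 100)
```

With these made-up figures, seeding the DR site with everything takes roughly 222 hours, while seeding only the active data takes roughly 89 hours and needs 4 TB of capacity instead of 10 TB.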