|Name||WANdisco LiveMigrator Demo, Paul Scott-Murphy|
|Description||Join WANdisco's Paul Scott-Murphy for a demo of the LiveMigrator feature, the world's first non-blocking technology that seamlessly migrates petabytes of unstructured data from on-premises data centers to any cloud vendor in a single pass. Rather than implementing costly and risky multiple passes to migrate data, applications can continue to access the on-premises environment, even as data moves to the cloud, while users direct new workloads and queries at cloud assets.|
Hello, my name's Paul Scott-Murphy. I'm the VP of Product Management for WANdisco, and what I'm going to show you today is our LiveMigrator product, an extension to our WANdisco Fusion platform that explicitly targets the migration of large-scale data from one environment to another. We've built this product to support customers who want to take their on-premises data at scale and bring it into the cloud, but to do that without disrupting business operations against that data while it's in migration. One of the key challenges in migrating large volumes of data from one environment to another is dealing with the changes that occur on the donor side of that migration. Any organization that wants to move massive quantities of information from an on-premises environment to one of the public cloud providers needs to deal with this problem in some way. It's not always feasible to stop your business operations on-premises while the migration is underway, so you need technology that can cater explicitly for the migration of data from one environment to another while that data undergoes change, and that's exactly what the LiveMigrator product is designed to do.
The concept behind the LiveMigrator product deals with the replication of data from a donor environment to a beneficiary. It uses a single-scan iterator that visits every file or object once and once only, replicating that content to the beneficiary environment and making sure that it has a complete, equivalent copy of that data. But what does the technology do if data is undergoing change? How does LiveMigrator handle changing data? If an application modifies content that the iterator hasn't seen yet, that change will eventually be reached by the iterator and replicated to the target environment; the end result is a completely consistent copy at the completion of the migration task. If the data changes in an area of the donor environment that has already been visited by the iterator, LiveMigrator transfers that change as soon as it occurs. With these two behaviors in combination, a change to data in a location that has already been visited, or in one that is yet to be visited, will always be replicated to the beneficiary environment, and you have a completely consistent copy of your data there on completion of the migration task.

To show you the LiveMigrator product in operation, I'm going to demonstrate migration of a large collection of data from an on-premises Hadoop cluster to cloud storage. My demonstration environment consists of my HDFS cluster and the cloud storage environment. Within my LiveMigrator configuration I've got a collection of replication rules; we're going to use one of these as the location of our migration task. If I navigate to this folder, ian_3gb, it shows that we've got about 10 gigabytes of data in our on-premises cluster. We can view that from the command line in my Hadoop client to show a listing of directories, each of which contains subdirectories with many, many files within them, making up this 10 gigabytes of data.
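The two-part rule described above, resync immediately behind the scanner, rely on the scan itself ahead of it, can be sketched in a few lines of Python. This is a hypothetical illustration, not WANdisco's implementation: it assumes a fixed set of paths whose contents (but not names) can change mid-scan, and it models replication as a dictionary copy.

```python
# Hypothetical sketch of the single-pass migration strategy (not
# WANdisco's actual implementation). A fixed set of paths is scanned
# exactly once; changes behind the scanner are resynced immediately,
# while changes ahead of it are left for the scan to pick up.

store = {"A": "v1", "B": "v1", "Z": "v1"}  # donor: path -> content

def migrate(store, on_visit):
    """Replicate `store` to a new target in one pass.

    on_visit(path) simulates external applications mutating `store`
    while the scanner sits at `path`; it returns the changed paths.
    """
    target = {}
    visited = set()
    for path in sorted(store):          # the single-scan iterator
        target[path] = store[path]      # copy this path once and once only
        visited.add(path)
        for changed in on_visit(path):
            if changed in visited:
                # change behind the scanner: replicate it right away
                target[changed] = store[changed]
            # change ahead of the scanner: do nothing; the scan will
            # reach that path later and copy the new content anyway
    return target

def on_visit(path):
    """While the scanner is at B, modify one already-visited path and
    one not-yet-visited path, like the demo's A and Z folders."""
    if path == "B":
        store["A"] = "v2"   # behind the scanner
        store["Z"] = "v2"   # ahead of the scanner
        return {"A", "Z"}
    return set()

print(migrate(store, on_visit))  # both changes end up in the target copy
```

However the changes interleave with the scan, the target finishes with the latest content for every path, which is the consistency guarantee the product narration describes.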
The target environment, the beneficiary of all of this migration, is a particular bucket in my cloud storage account called PSM hyper migrator demo. At the moment this bucket is completely devoid of content; it contains no files. We initiate the migration task by navigating to my replication rule in the Fusion user interface, selecting the LiveMigrator tab, and just clicking the start button. Once I've done that, the migration task is in progress, and we get visibility of the current status of that migration, showing where replication is occurring. Refreshing the bucket shows that it's making progress against the collection of directories.

The more interesting thing, though, is how LiveMigrator handles changing data in the donor environment. I can modify content in the donor environment. Here's an example: I'll put a new hundred-megabyte file into a directory that has already been visited by the iterator that LiveMigrator is progressing through to conduct its replication. We'll place a new file in that A folder, and we should see that particular file replicated in the target storage environment. If I navigate into the A folder, alongside the existing directories the new 100-megabyte file is being copied across, and we can see it in the copying state at the moment. Refreshing that bucket will eventually show that the full content of that 100-megabyte file has been replicated to my target environment. Equivalently, I can place files into locations that the iterator has not yet reached, in this case the Z folder. Placing a new 100-megabyte file into a location the iterator is yet to get to simply means that the replication doesn't take place immediately; the live migration task waits until the iterator reaches the Z folder, at which point that change will be replicated.
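The donor-side steps of the demo can be reproduced from any Hadoop client with the standard HDFS shell. The source path `/repl/ian_3gb` and the file names below are illustrative assumptions, not taken from the actual demo environment:

```shell
# Illustrative commands only -- the /repl/ian_3gb path and file names
# are assumptions standing in for the demo's environment.

# List the source data set on the on-premises cluster
hdfs dfs -ls /repl/ian_3gb

# Create a 100 MB file and drop it into a directory the scanner has
# already visited (A): LiveMigrator replicates it as soon as it lands
dd if=/dev/urandom of=/tmp/new_100mb.bin bs=1M count=100
hdfs dfs -put /tmp/new_100mb.bin /repl/ian_3gb/A/

# Drop another copy into a directory the scanner has not reached yet
# (Z): the single pass carries it across when the iterator gets there
hdfs dfs -put /tmp/new_100mb.bin /repl/ian_3gb/Z/
```

Either way, no second scan is needed: the A file is pushed eagerly, and the Z file rides along with the one pass already in flight.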
The outcome of all of this is that on completion of my migration, which I can monitor through the status page on the LiveMigrator tab in the Fusion user interface, I'll have a completely consistent copy of the data that existed in my Hadoop cluster present in my storage account. Once the live migration task reaches completion, we see here its status is finishing. We can review the content that has been migrated to our cloud storage bucket, and we should see the full collection of directories, and in my Z directory, in addition to all of the directories within it that contain content, the additional 100-megabyte file that was placed there in my Hadoop cluster. So the migration task is complete at this stage, and what we have in our cloud storage bucket is a completely consistent copy of the data that existed previously in the on-premises Hadoop cluster. To summarize what LiveMigrator has done: it's allowed us to migrate a large collection of data from an on-premises environment into the cloud, even while applications continue to work against that data in the on-premises storage system. In our case we've migrated from a Hadoop cluster into cloud storage, we've done that for a volume of data of about 10 gigabytes, and while that migration has been underway we've modified the application data in our on-premises environment and demonstrated that those changes are also made present in the cloud storage.