Migrate to Google Cloud: Transfer your large datasets  |  Cloud Architecture Center

Last reviewed 2023-11-13 UTC

For many customers, the first step in adopting a Google Cloud product is getting their data into Google Cloud. This document explores that process, from planning a data transfer to using best practices in implementing a plan.

Transferring large datasets involves building the right team, planning early, and testing your transfer plan before implementing it in a production environment. Although these steps can take as much time as the transfer itself, such preparations can help minimize disruption to your business operations during the transfer.

This document is part of the following multi-part series about migrating to Google Cloud:

  • Migrate to Google Cloud: Get started
  • Migrate to Google Cloud: Assess and discover your workloads
  • Migrate to Google Cloud: Plan and build your foundation
  • Migrate to Google Cloud: Transfer your large datasets (this document)
  • Migrate to Google Cloud: Deploy your workloads
  • Migrate to Google Cloud: Migrate from manual deployments to automated, containerized deployments
  • Migrate to Google Cloud: Optimize your environment
  • Migrate to Google Cloud: Best practices for validating a migration plan
  • Migrate to Google Cloud: Minimize costs

What is data transfer?

For the purposes of this document, data transfer is the process of moving data without transforming it, for example, moving files as they are into objects.

Data transfer isn't as simple as it sounds

It's tempting to think of data transfer as one giant FTP session, where you put your files in one side and wait for them to come out the other side. However, in most enterprise environments, the transfer process involves many factors such as the following:

  • Devising a transfer plan that accounts for administrative time, including time to decide on a transfer option, get approvals, and deal with unanticipated issues.
  • Coordinating people in your organization, such as the team that executes the transfer, personnel who approve the tools and architecture, and business stakeholders who are concerned with the value and disruptions that moving data can bring.
  • Choosing the right transfer tool based on your resources, cost, time, and other project considerations.
  • Overcoming data transfer challenges, including "speed of light" issues (insufficient bandwidth), moving datasets that are in active use, protecting and monitoring the data while it's in flight, and ensuring the data is transferred successfully.

This document aims to help you get started on a successful transfer initiative.

Other projects related to data transfer

The following list includes resources for other types of data transfer projects not covered in this document:

  • If you need to transform your data (such as combining rows, joining datasets, or filtering out personally identifiable information), you should consider an extract, transform, and load (ETL) solution that can deposit data into a Google Cloud data warehouse.
  • If you need to migrate a database and related apps (for example, to migrate a database app), see Database migration: Concepts and principles.

Step 1: Assembling your team

Planning a transfer typically requires personnel with the following roles and responsibilities:

  • Enabling resources needed for a transfer: Storage, IT, and network admins, an executive sponsor, and other advisors (for example, a Google Account team or integration partners)
  • Approving the transfer decision: Data owners or governors (for internal policies on who is allowed to transfer what data), legal advisors (for data-related regulations), and a security administrator (for internal policies on how data access is protected)
  • Executing the transfer: A team lead, a project manager (for executing and tracking the project), an engineering team, and on-site receiving and shipping (to receive appliance hardware)

It's crucial to identify who owns the preceding responsibilities for your transfer project and to include them in planning and decision meetings when appropriate. Poor organizational planning is often the cause of failed transfer initiatives.

Gathering project requirements and input from these stakeholders can be challenging, but making a plan and establishing clear roles and responsibilities pays off. You can't be expected to know all the details of your data. Assembling a team gives you greater insight into the needs of the business. It's a best practice to identify potential issues before you invest time, money, and resources to complete the transfers.

Step 2: Collecting requirements and available resources

When you design a transfer plan, we recommend that you first collect requirements for your data transfer and then decide on a transfer option. To collect requirements, you can use the following process:

  1. Identify what datasets you need to move.
    • Select tools like Data Catalog to organize your data into logical groupings that are moved and used together.
    • Work with teams within your organization to validate or update these groupings.
  2. Identify what datasets you can move.
    • Consider whether regulatory, security, or other factors prohibit some datasets from being transferred.
    • If you need to transform some of your data before you move it (for example, to remove sensitive data or reorganize your data), consider using a data integration product like Dataflow or Cloud Data Fusion, or a workflow orchestration product like Cloud Composer.
  3. For datasets that are movable, determine where to transfer each dataset.
    • Record which storage option you select to store your data. Typically, the target storage system on Google Cloud is Cloud Storage. Even if you need more complex solutions after your applications are up and running, Cloud Storage is a scalable and durable storage option.
    • Understand what data access policies must be maintained after migration.
    • Determine if you need to store this data in specific regions.
    • Plan how to structure this data at the destination. For example, will it be the same as the source or different?
    • Determine if you need to transfer data on an ongoing basis.
  4. For datasets that are movable, determine what resources are available to move them.
    • Time: When does the transfer need to be completed?
    • Cost: What is the budget available for the team and transfer costs?
    • People: Who is available to execute the transfer?
    • Bandwidth (for online transfers): How much of your available bandwidth for Google Cloud can be allocated for a transfer, and for what period of time?

Before you evaluate and select transfer options in the next phase of planning, we recommend that you assess whether any part of your IT model can be improved, such as data governance, organization, and security.

Your security model

Many members of the transfer team might be granted new roles in your Google Cloud organization as part of your data transfer project. Data transfer planning is a great time to review your Identity and Access Management (IAM) permissions and best practices for using IAM securely. These issues can affect how you grant access to your storage. For example, you might place strict limits on write access to data that has been archived for regulatory reasons, but you might allow many users and applications to write data to your test environment.

Your Google Cloud organization

How you structure your data on Google Cloud depends on how you plan to use Google Cloud. Storing your data in the same Google Cloud project where you run your application may work, but it might not be optimal from a management perspective. Some of your developers might not have privilege to view the production data. In that case, a developer could develop code on sample data, while a privileged service account could access production data. Thus, you might want to keep your entire production dataset in a separate Google Cloud project, and then use a service account to allow access to the data from each application project.

Google Cloud is organized around projects. Projects can be grouped into folders, and folders can be grouped under your organization. Roles are established at the project level, and access permissions are added to these roles at the Cloud Storage bucket level. This structure aligns with the permissions structure of other object store providers.
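
For illustration, the following is a minimal sketch of the separate-data-project pattern using the google-cloud-storage Python client library. The project, bucket, and service account names are hypothetical, and you would choose a role that matches your own data access policies.

  # Sketch: grant a per-application service account read access to a bucket
  # that lives in a separate, dedicated data project. Names are hypothetical.
  from google.cloud import storage

  DATA_PROJECT = "example-prod-data-project"       # project that holds the dataset
  BUCKET_NAME = "example-prod-dataset-bucket"      # bucket with production data
  APP_SERVICE_ACCOUNT = (
      "serviceAccount:app-backend@example-app-project.iam.gserviceaccount.com"
  )

  client = storage.Client(project=DATA_PROJECT)
  bucket = client.bucket(BUCKET_NAME)

  # Read the current bucket IAM policy and append a read-only binding for the
  # application's service account.
  policy = bucket.get_iam_policy(requested_policy_version=3)
  policy.bindings.append(
      {"role": "roles/storage.objectViewer", "members": {APP_SERVICE_ACCOUNT}}
  )
  bucket.set_iam_policy(policy)
  print(f"Granted objectViewer on gs://{BUCKET_NAME} to {APP_SERVICE_ACCOUNT}")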

For best practices to structure a Google Cloud organization, see Decide a resource hierarchy for your Google Cloud landing zone.

Step 3: Evaluating your transfer options

To evaluate your data transfer options, the transfer team needs to consider several factors, including the following:

  • Cost
  • Transfer time
  • Offline versus online transfer options
  • Transfer tools and technologies
  • Security

Cost

Most of the costs associated with transferring data include the following:

  • Networking costs
    • Ingress to Cloud Storage is free. However, if you're hosting your data on a public cloud provider, you can expect to pay an egress charge and potentially storage costs (for example, read operations) for transferring your data. This charge applies for data coming from Google or another cloud provider.
    • If your data is hosted in a private data center that you operate, you might also incur added costs for setting up more bandwidth to Google Cloud.
  • Storage and operation costs for Cloud Storage during and after the transfer of data
  • Product costs (for example, a Transfer Appliance)
  • Personnel costs for assembling your team and acquiring logistical support

Transfer time

Few things in computing highlight the hardware limitations of networks as clearly as transferring large amounts of data. Ideally, you can transfer 1 GB in eight seconds over a 1 Gbps network. If you scale that up to a huge dataset (for example, 100 TB), the transfer time is 12 days. Transferring huge datasets can test the limits of your infrastructure and potentially cause problems for your business.

You can use the Google Cloud Data Transfer Calculator to understand how much time a transfer might take, given the size of the dataset you're moving and the bandwidth available for the transfer. A certain percentage of management time is factored into the calculations. Additionally, an effective bandwidth efficiency is included, so the resulting numbers are more realistic.
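
If you prefer a quick back-of-the-envelope estimate, the following Python sketch reproduces the same kind of calculation. The 75% effective-bandwidth factor and 10% management overhead are illustrative assumptions, not the values used by the calculator.

  # Sketch: rough transfer-time estimate for a dataset over a given link.
  # The efficiency and overhead factors below are illustrative assumptions.

  def estimate_transfer_days(dataset_tb: float, link_gbps: float,
                             efficiency: float = 0.75,
                             mgmt_overhead: float = 0.10) -> float:
      """Return the estimated number of days to move dataset_tb terabytes."""
      dataset_bits = dataset_tb * 1e12 * 8           # terabytes -> bits
      effective_bps = link_gbps * 1e9 * efficiency   # usable throughput
      transfer_seconds = dataset_bits / effective_bps
      transfer_seconds *= (1 + mgmt_overhead)        # add management time
      return transfer_seconds / 86400                # seconds -> days

  # 100 TB over a 1 Gbps link: roughly 13-14 days with these assumptions.
  print(f"{estimate_transfer_days(100, 1):.1f} days")
  # The same 100 TB over a 100 Mbps link: well over 100 days.
  print(f"{estimate_transfer_days(100, 0.1):.1f} days")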

You might not want to transfer large datasets out of your company network during peak work hours. If the transfer overloads the network, nobody else will be able to get necessary or mission-critical work completed. For this reason, the transfer team needs to consider the factor of time.

After the data is transferred to Cloud Storage, you can use a number of technologies to process the new files as they arrive, such as Dataflow.

Increasing network bandwidth

How you increase network bandwidth depends on how you connect to Google Cloud.

In a cloud-to-cloud transfer between Google Cloud and other cloud providers, Google provisions the connection between cloud vendor data centers, requiring no setup from you.

If you're transferring data between your private data center and Google Cloud, there are several approaches, such as:

  • A public internet connection by using a public API
  • Direct Peering by using a public API
  • Cloud Interconnect by using a private API

When evaluating these approaches, it's helpful to consider your long-term connectivity needs. You might conclude that it's cost prohibitive to acquire bandwidth solely for transfer purposes, but when factoring in long-term use of Google Cloud and the network needs across your organization, the investment might be worthwhile. For more information about how to connect your networks to Google Cloud, see Choose a Network Connectivity product.

If you opt for an approach that involves transferring data over the public internet, we recommend that you check with your security administrator on whether your company policy forbids such transfers. Also, check whether the public internet connection is used for your production traffic. Finally, consider that large-scale data transfers might negatively impact the performance of your production network.

Online versus offline transfer

A critical decision is whether to use an offline or online process for your data transfer. That is, you must choose between transferring over a network, whether it's a Cloud Interconnect or the public internet, or transferring by using storage hardware.

To help with this decision, we provide a transfer calculator to help you estimate the time and cost differences between these two options. The following chart also shows some transfer speeds for various dataset sizes and bandwidths. A certain amount of management overhead is built into these calculations.

[Chart: estimated transfer times for various dataset sizes and network bandwidths]

As noted earlier, you might need to consider whether the cost to achieve lower latencies for your data transfer (such as acquiring network bandwidth) is offset by the value of that investment to your organization.

Options available from Google

Google offers several tools and technologies to help you perform a data transfer.

Deciding among Google's transfer options

Choosing a transfer option depends on your use case, as the following table shows.

Where you're moving data from | Scenario | Suggested products
Another cloud provider (for example, Amazon Web Services or Microsoft Azure) to Google Cloud |  | Storage Transfer Service
Cloud Storage to Cloud Storage (two different buckets) |  | Storage Transfer Service
Your private data center to Google Cloud | Enough bandwidth to meet your project deadline | gcloud storage command
Your private data center to Google Cloud | Enough bandwidth to meet your project deadline | Storage Transfer Service for on-premises data
Your private data center to Google Cloud | Not enough bandwidth to meet your project deadline | Transfer Appliance

gcloud storage command for smaller transfers of on-premises data

The gcloud storage command is the standard tool for small- to medium-sized transfers over a typical enterprise-scale network, from a private data center or from another cloud provider to Google Cloud. While gcloud storage supports uploading objects up to the maximum Cloud Storage object size, transfers of large objects are more likely to experience failures than short-running transfers. For more information about transferring large objects to Cloud Storage, see Storage Transfer Service for large transfers of on-premises data.

The gcloud storage command is especially useful in the following scenarios:

  • Your transfers need to be executed on an as-needed basis, or during command-line sessions by your users.
  • You're transferring only a few files or very large files, or both.
  • You're consuming the output of a program (streaming output to Cloud Storage).
  • You need to watch a directory with a moderate number of files and sync any updates with very low latencies.
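
For illustration, the following is a minimal sketch of an ad hoc upload wrapped in Python. It assumes the Google Cloud CLI is installed and authenticated; the source directory and destination bucket are hypothetical.

  # Sketch: ad hoc recursive upload with the gcloud storage command.
  # Assumes the Google Cloud CLI is installed and authenticated.
  import subprocess

  SOURCE_DIR = "/data/exports"                       # local directory to upload
  DESTINATION = "gs://example-migration-bucket/exports"

  result = subprocess.run(
      ["gcloud", "storage", "cp", "--recursive", SOURCE_DIR, DESTINATION],
      capture_output=True,
      text=True,
  )
  if result.returncode != 0:
      raise RuntimeError(f"Upload failed: {result.stderr}")
  print("Upload finished")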

Storage Transfer Service for large transfers of on-premises data

Like the gcloud storage command, Storage Transfer Service for on-premises data enables transfers from network file system (NFS) storage to Cloud Storage. Storage Transfer Service for on-premises data is designed for large-scale transfers (up to petabytes of data, billions of files). It supports full copies or incremental copies, and it works on all transfer options listed earlier in Deciding among Google's transfer options. It also has a managed graphical user interface; even non-technically savvy users (after setup) can use it to move data.

Storage Transfer Service for on-premises data is especially useful in the following scenarios:

  • You have sufficient available bandwidth to move the data volumes (see the Google Cloud Data Transfer Calculator).
  • You support a large base of internal users who might find a command-line tool challenging to use.
  • You need robust error-reporting and a record of all files and objects that are moved.
  • You need to limit the impact of transfers on other workloads in your data center (this product can stay under a user-specified bandwidth limit).
  • You want to run recurring transfers on a schedule.

You set up Storage Transfer Service for on-premises data by installing on-premises software (known as agents) onto computers in your data center.

After setting up Storage Transfer Service, you can initiate transfers in the Google Cloud console by providing a source directory, destination bucket, and time or schedule. Storage Transfer Service recursively crawls subdirectories and files in the source directory and creates objects with a corresponding name in Cloud Storage (the file /dir/foo/file.txt becomes an object in the destination bucket named /dir/foo/file.txt). Storage Transfer Service automatically re-attempts a transfer when it encounters any transient errors. While the transfers are running, you can monitor how many files are moved and the overall transfer speed, and you can view error samples.
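
If you prefer to create the transfer job programmatically rather than through the console, you can call the Storage Transfer Service API from a client library. The sketch below uses the google-cloud-storage-transfer Python client; the project, directory, and bucket names are hypothetical, and you should treat the exact field layout (and any agent pool configuration your setup requires) as an assumption to verify against the current API reference.

  # Sketch: create an on-premises (POSIX) to Cloud Storage transfer job with
  # the google-cloud-storage-transfer client library. Names are hypothetical;
  # verify field names and agent pool settings against the API reference.
  from google.cloud import storage_transfer

  client = storage_transfer.StorageTransferServiceClient()

  transfer_job = {
      "project_id": "example-project",
      "description": "Sync /mnt/nfs/exports to Cloud Storage",
      "status": storage_transfer.TransferJob.Status.ENABLED,
      "transfer_spec": {
          "posix_data_source": {"root_directory": "/mnt/nfs/exports"},
          "gcs_data_sink": {"bucket_name": "example-migration-bucket"},
      },
  }

  response = client.create_transfer_job({"transfer_job": transfer_job})
  print("Created transfer job:", response.name)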

When Storage Transfer Service completes a transfer, it generates a tab-delimited file (TSV) with a full record of all files touched and any error messages received. Agents are fault tolerant, so if an agent goes down, the transfer continues with the remaining agents. Agents are also self-updating and self-healing, so you don't have to worry about patching the latest versions or restarting the process if it goes down because of an unanticipated issue.

Things to consider when using Storage Transfer Service:

  • Use an identical agent setup on every machine. All agents should see the same Network File System (NFS) mounts in the same way (same relative paths). This setup is a requirement for the product to function.
  • More agents result in more speed. Because transfers are automatically parallelized across all agents, we recommend that you deploy many agents so that you use your available bandwidth.
  • Bandwidth caps can protect your workloads. Your other workloads might be using your data center bandwidth, so set a bandwidth cap to prevent transfers from impacting your SLAs.
  • Plan time for reviewing errors. Large transfers can often result in errors requiring review. Storage Transfer Service lets you see a sample of the errors encountered directly in the Google Cloud console. If needed, you can load the full record of all transfer errors to BigQuery to check on files or evaluate errors that remained even after retries. These errors might be caused by running apps that were writing to the source while the transfer occurred, or the errors might reveal an issue that requires troubleshooting (for example, a permissions error).
  • Set up Cloud Monitoring for long-running transfers. Storage Transfer Service lets Monitoring monitor agent health and throughput, so you can set alerts that notify you when agents are down or need attention. Acting on agent failures is important for transfers that take several days or weeks, so that you avoid significant slowdowns or interruptions that can delay your project timeline.

Transfer Appliance for larger transfers

For large-scale transfers, especially transfers with limited network bandwidth, Transfer Appliance is an excellent option, particularly when a fast network connection is unavailable and it's too costly to acquire more bandwidth.

Transfer Appliance is especially useful in the following scenarios:

  • Your data center is in a remote location with limited or no access to bandwidth.
  • Bandwidth is available, but cannot be acquired in time to meet your deadline.
  • You have access to logistical resources to receive and connect appliances to your network.

With this option, consider the following:

  • Transfer Appliance requires that you're able to receive and ship back the Google-owned hardware.
  • Depending on your internet connection, the latency for transferring data into Google Cloud is typically higher with Transfer Appliance than online.
  • Transfer Appliance is available only in certain countries.

The two main criteria to consider with Transfer Appliance are cost and speed. With reasonable network connectivity (for example, 1 Gbps), transferring 100 TB of data online takes over 10 days to complete. If this rate is acceptable, an online transfer is likely a good solution for your needs. If you only have a 100 Mbps connection (or worse from a remote location), the same transfer takes over 100 days. At this point, it's worth considering an offline-transfer option such as Transfer Appliance.

Acquiring a Transfer Appliance is straightforward. In the Google Cloud console, you request a Transfer Appliance, indicate how much data you have, and then Google ships one or more appliances to your requested location. You're given a number of days to transfer your data to the appliance ("data capture") and ship it back to Google.

Storage Transfer Service for cloud-to-cloud transfers

Storage Transfer Service is a fully managed, highly scalable service to automate transfers from other public clouds into Cloud Storage. For example, you can use Storage Transfer Service to transfer data from Amazon S3 to Cloud Storage.

For HTTP, you can give Storage Transfer Service a list of public URLs in a specified format. This approach requires that you write a script providing the size of each file in bytes, along with a Base64-encoded MD5 hash of the file contents. Sometimes the file size and hash are available from the source website. If not, you need local access to the files, in which case, it might be easier to use the gcloud storage command, as described earlier.
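
The sketch below shows one way to generate such a URL list when you do have local copies of the files. The TsvHttpData-1.0 header and the URL, size, and Base64 MD5 columns follow the documented URL-list format; the file-to-URL mapping is a hypothetical example.

  # Sketch: build a Storage Transfer Service URL list (TSV) for files that are
  # also published at public HTTP URLs. Local paths and URLs are hypothetical.
  import base64
  import hashlib
  import pathlib

  FILES = {
      "/data/exports/2023-archive.tar": "https://downloads.example.com/2023-archive.tar",
      "/data/exports/2024-archive.tar": "https://downloads.example.com/2024-archive.tar",
  }

  def md5_base64(path: str) -> str:
      """Return the Base64-encoded MD5 digest of a file's contents."""
      digest = hashlib.md5()
      with open(path, "rb") as f:
          for chunk in iter(lambda: f.read(1024 * 1024), b""):
              digest.update(chunk)
      return base64.b64encode(digest.digest()).decode("ascii")

  lines = ["TsvHttpData-1.0"]
  for local_path, url in FILES.items():
      size = pathlib.Path(local_path).stat().st_size
      lines.append(f"{url}\t{size}\t{md5_base64(local_path)}")

  pathlib.Path("url-list.tsv").write_text("\n".join(lines) + "\n")
  print("Wrote url-list.tsv with", len(FILES), "entries")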

If you have a transfer in place, Storage Transfer Service is a great way to get data and keep it, particularly when transferring from another public cloud.

If you would like to move data from another cloud not supported by Storage Transfer Service, you can use the gcloud storage command from a cloud-hosted virtual machine instance.

Security

For many Google Cloud users, security is their primary focus, and there are different levels of security available. A few aspects of security to consider include protecting data at rest (authorization and access to the source and destination storage system), protecting data while in transit, and protecting access to the transfer product. The following table outlines these aspects of security by product.

Product | Data at rest | Data in transit | Access to transfer product
Transfer Appliance | All data is encrypted at rest. | Data is protected with keys managed by the customer. | Anyone can order an appliance, but to use it they need access to the data source.
gcloud storage command | Access keys required to access Cloud Storage, which is encrypted at rest. | Data is sent over HTTPS and encrypted in transit. | Anyone can download and run the Google Cloud CLI. They must have permissions to buckets and local files in order to move data.
Storage Transfer Service for on-premises data | Access keys required to access Cloud Storage, which is encrypted at rest. The agent process can access local files as OS permissions allow. | Data is sent over HTTPS and encrypted in transit. | You must have object editor permissions to access Cloud Storage buckets.
Storage Transfer Service | Access keys required for non-Google Cloud resources (for example, Amazon S3). Access keys are required to access Cloud Storage, which is encrypted at rest. | Data is sent over HTTPS and encrypted in transit. | You must have IAM permissions for the service account to access the source and object editor permissions for any Cloud Storage buckets.

To achieve baseline security enhancements, online transfers to Google Cloud using the gcloud storage command are accomplished over HTTPS, data is encrypted in transit, and all data in Cloud Storage is, by default, encrypted at rest. If you use Transfer Appliance, security keys that you control can help protect your data. Generally, we recommend that you engage your security team to ensure that your transfer plan meets your company and regulatory requirements.

Third-party transfer products

For advanced network-level optimization or ongoing data transfer workflows, you might want to use more advanced tools. For information about more advanced tools, see Google Cloud partners.

Step 4: Evaluating data migration approaches

When migrating data, you can follow these general steps:

  1. Transfer data from the legacy site to the new site.
  2. Resolve any data integration issues that arise—for example, synchronizing the same data from multiple sources.
  3. Validate the data migration.
  4. Promote the new site to be the primary copy.
  5. When you no longer need the legacy site as a fallback option, retire it.

You should base your data migration approach on the following questions:

  • How much data do you need to migrate?
  • How often does this data change?
  • Can you afford the downtime represented by a cut-over window while migrating data?
  • What is your current data consistency model?

There is no best approach; choosing one depends on the environment and on yourrequirements.

The following sections present four data migration approaches:

  • Scheduled maintenance
  • Continuous replication
  • Y (writing and reading)
  • Data-access microservice

Each approach tackles different issues, depending on the scale and the requirements of the data migration.

The data-access microservice approach is the preferred option in a microservices architecture. However, the other approaches are useful for data migration. They're also useful during the transition period that's necessary in order to modernize your infrastructure to use the data-access microservice approach.

The following graph outlines the respective cut-over window sizes, refactoring effort, and flexibility properties of each of these approaches.

[Chart: cut-over window size, refactoring effort, and flexibility of each data migration approach]

Before following any of these approaches, make sure that you've set up the required infrastructure in the new environment.

Scheduled maintenance

The scheduled maintenance approach is ideal if your workloads can afford a cut-over window. It's scheduled in the sense that you can plan when your cut-over window occurs.

In this approach, your migration consists of these steps:

  1. Copy data that's in the legacy site to the new site. This initial copy minimizes the cut-over window; after this initial copy, you need to copy only the data that has changed during this window.
  2. Perform data validation and consistency checks to compare data in the legacy site against the copied data in the new site.
  3. Stop the workloads and services that have write access to the copied data, so that no further changes occur.
  4. Synchronize changes that occurred after the initial copy.
  5. Refactor workloads and services to use the new site.
  6. Start your workloads and services.
  7. When you no longer need the legacy site as a fallback option, retire it.

The scheduled maintenance approach places most of the burden on the operations side, because minimal refactoring of workloads and services is needed.

Continuous replication

Because not all workloads can afford a long cut-over window, you can build on the scheduled maintenance approach by providing a continuous replication mechanism after the initial copy and validation steps. When you design a mechanism like this, you should also take into account the rate at which changes are applied to your data; it might be challenging to keep two systems synchronized.

The continuous replication approach is more complex than the scheduled maintenance approach. However, the continuous replication approach minimizes the time for the required cut-over window, because it minimizes the amount of data that you need to synchronize. The sequence for a continuous replication migration is as follows:

  1. Copy data that's in the legacy site to the new site. This initial copy minimizes the cut-over window; after the initial copy, you need to copy only the data that changed during this window.
  2. Perform data validation and consistency checks to compare data in the legacy site against the copied data in the new site.
  3. Set up a continuous replication mechanism from the legacy site to the new site.
  4. Stop the workloads and services that have access to the data to migrate (that is, to the data involved in the previous step).
  5. Refactor workloads and services to use the new site.
  6. Wait for the replication to fully synchronize the new site with the legacy site.
  7. Start your workloads and services.
  8. When you no longer need the legacy site as a fallback option, retire it.

As with the scheduled maintenance approach, the continuous replication approach places most of the burden on the operations side.

Y (writing and reading)

If your workloads have hard high-availability requirements and you cannot afford the downtime represented by a cut-over window, you need to take a different approach. For this scenario, you can use an approach that in this document is referred to as Y (writing and reading), which is a form of parallel migration. With this approach, the workload is writing and reading data in both the legacy site and the new site during the migration. (The letter Y is used here as a graphic representation of the data flow during the migration period.)

This approach is summarized as follows:

  1. Refactor workloads and services to write data both to the legacy site and to the new site and to read from the legacy site.
  2. Identify the data that was written before you enabled writes in the new site and copy it from the legacy site to the new site. Along with the preceding refactoring, this ensures that the data stores are aligned.
  3. Perform data validation and consistency checks that compare data in the legacy site against data in the new site.
  4. Switch read operations from the legacy site to the new site.
  5. Perform another round of data validation and consistency checks to compare data in the legacy site against the new site.
  6. Disable writing in the legacy site.
  7. When you no longer need the legacy site as a fallback option, retire it.

Unlike the scheduled maintenance and continuous replication approaches, the Y (writing and reading) approach shifts most of the efforts from the operations side to the development side due to the multiple refactorings.

Data-access microservice

If you want to reduce the refactoring effort necessary to follow the Y (writing and reading) approach, you can centralize data read and write operations by refactoring workloads and services to use a data-access microservice. This scalable microservice becomes the only entry point to your data storage layer, and it acts as a proxy for that layer. Of the approaches discussed here, this gives you the maximum flexibility, because you can refactor this component without impacting other components of the architecture and without requiring a cut-over window.

Using a data-access microservice is much like the Y (writing and reading) approach. The difference is that the refactoring efforts focus on the data-access microservice alone, instead of having to refactor all the workloads and services that access the data storage layer. This approach is summarized as follows:

  1. Refactor the data-access microservice to write data both in the legacy site and the new site. Reads are performed against the legacy site.
  2. Identify the data that was written before you enabled writes in the new site and copy it from the legacy site to the new site. Along with the preceding refactoring, this ensures that the data stores are aligned.
  3. Perform data validation and consistency checks comparing data in the legacy site against data in the new site.
  4. Refactor the data-access microservice to read from the new site.
  5. Perform another round of data validation and consistency checks comparing data in the legacy site against data in the new site.
  6. Refactor the data-access microservice to write only in the new site.
  7. When you no longer need the legacy site as a fallback option, retire it.

Like the Y (writing and reading) approach, the data-access microservice approach places most of the burden on the development side. However, it's significantly lighter compared to the Y (writing and reading) approach, because the refactoring efforts are focused on the data-access microservice.
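
As an illustration of the dual-write phase of this approach, the following is a minimal sketch of a data-access layer that writes to both sites and reads from whichever site is currently authoritative. The storage interfaces and site names are hypothetical; a real microservice would add authentication, retries, and conflict handling.

  # Sketch: dual-write data-access layer used during migration. The backends
  # are hypothetical; real code would add retries, auth, and conflict handling.
  from typing import Protocol


  class Store(Protocol):
      def write(self, key: str, value: bytes) -> None: ...
      def read(self, key: str) -> bytes: ...


  class DataAccessService:
      """Single entry point to the data layer during the migration."""

      def __init__(self, legacy: Store, new: Store, read_from_new: bool = False):
          self.legacy = legacy
          self.new = new
          self.read_from_new = read_from_new  # flip after validation passes

      def write(self, key: str, value: bytes) -> None:
          # Dual-write phase: write to both sites to keep them aligned.
          self.legacy.write(key, value)
          self.new.write(key, value)

      def read(self, key: str) -> bytes:
          # Switch reads to the new site by toggling read_from_new,
          # without refactoring the calling workloads.
          return self.new.read(key) if self.read_from_new else self.legacy.read(key)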

Step 5: Preparing for your transfer

For a large transfer, or a transfer with significant dependencies, it's important to understand how to operate your transfer product. Customers typically go through the following steps:

  1. Pricing and ROI estimation. This step provides many options to aid in decision making.
  2. Functional testing. In this step, you confirm that the product can be successfully set up and that network connectivity (where applicable) is working. You also test that you can move a representative sample of your data (including accompanying non-transfer steps, like moving a VM instance) to the destination.

    You can usually do this step before allocating all resources such as transfer machines or bandwidth. The goals of this step include the following:

    • Confirm that you can install and operate the transfer.
    • Surface potential project-stopping issues that block data movement (for example, network routes) or your operations (for example, training needed on a non-transfer step).
  3. Performance testing. In this step, you run a transfer on a large sample of your data (typically 3–5%) after production resources are allocated to do the following:

    • Confirm that you can consume all allocated resources and can achieve the speeds you expect.
    • Surface and fix bottlenecks (for example, slow source storage system).

Step 6: Ensuring the integrity of your transfer

To help ensure the integrity of your data during a transfer, we recommend taking the following precautions:

  • Enable versioning and backup on your destination to limit the damage of accidental deletes.
  • Validate your data before removing the source data.
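
For example, the following is a minimal sketch for enabling object versioning on a destination Cloud Storage bucket with the google-cloud-storage Python client; the bucket name is hypothetical.

  # Sketch: turn on object versioning for the destination bucket so that
  # accidental deletes or overwrites during the transfer are recoverable.
  from google.cloud import storage

  client = storage.Client()
  bucket = client.get_bucket("example-migration-bucket")
  bucket.versioning_enabled = True
  bucket.patch()  # persist the change
  print("Versioning enabled:", bucket.versioning_enabled)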

For large-scale data transfers (with petabytes of data and billions of files), a baseline latent error rate of the underlying source storage system as low as 0.0001% still results in a data loss of thousands of files and gigabytes. Typically, applications running at the source are already tolerant of these errors, in which case, extra validation isn't necessary. In some exceptional scenarios (for example, long-term archive), more validation is necessary before it's considered safe to delete data from the source.

Depending on the requirements of your application, we recommend that you run some data integrity tests after the transfer is complete to ensure that the application continues to work as intended. Many transfer products have built-in data integrity checks. However, depending on your risk profile, you might want to do an extra set of checks on the data and the apps reading that data before you delete data from the source. For example, you might want to confirm whether a checksum that you recorded and computed independently matches the data written at the destination, or confirm that a dataset used by the application transferred successfully.
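
As one example of such a check, the following sketch compares a locally computed MD5 digest with the MD5 that Cloud Storage records for the uploaded object, using the google-cloud-storage Python client. The file and bucket names are hypothetical, and objects created as composite objects (for example, by parallel composite uploads) may not carry an MD5, in which case you would compare CRC32C values instead.

  # Sketch: verify that an uploaded object matches the local source file by
  # comparing Base64-encoded MD5 digests. Names are hypothetical.
  import base64
  import hashlib

  from google.cloud import storage

  LOCAL_PATH = "/data/exports/2024-archive.tar"
  BUCKET_NAME = "example-migration-bucket"
  OBJECT_NAME = "exports/2024-archive.tar"

  def local_md5_base64(path: str) -> str:
      digest = hashlib.md5()
      with open(path, "rb") as f:
          for chunk in iter(lambda: f.read(1024 * 1024), b""):
              digest.update(chunk)
      return base64.b64encode(digest.digest()).decode("ascii")

  client = storage.Client()
  blob = client.bucket(BUCKET_NAME).get_blob(OBJECT_NAME)  # fetches metadata

  if blob is None:
      raise RuntimeError(f"{OBJECT_NAME} not found in {BUCKET_NAME}")
  if blob.md5_hash != local_md5_base64(LOCAL_PATH):
      raise RuntimeError(f"Checksum mismatch for {OBJECT_NAME}")
  print("Checksums match; safe to consider deleting the source copy.")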

What's next

  • Learn how to deploy your workloads.
  • Learn how to find help for your migrations.
  • Explore reference architectures, diagrams, and best practices about Google Cloud. Take a look at our Cloud Architecture Center.