Data Replication

Sanjana Yadav
Last Updated: May 13, 2022

Introduction

Many modern applications run on a distributed database, in which data is stored and processed across a cluster of machines rather than on a single one.

Assume that an application user wants to save some information to a database.

The data is divided into fragments, each of which is stored on a different node of the distributed system.

When the user later reads the data, the database gathers the fragments from those nodes and reassembles them.

In such a setup, the failure of a single node can make it impossible to retrieve the data in full, because the fragments held on that node are unavailable.

Data replication comes to the rescue in this situation.

With data replication, copies of a fragment are stored on more than one node, so the data stays reachable when a node fails and read and write operations can be spread across the network.
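
To make this concrete, here is a minimal Python sketch of the idea, not any particular database's implementation. The node names, the fragment names, and the helpers `place_fragments` and `readable` are hypothetical, and the round-robin placement is an assumption chosen only for simplicity.

```python
def place_fragments(fragments, nodes, copies):
    """Assign each fragment to `copies` consecutive nodes, round-robin.

    Assumes copies <= len(nodes), so a fragment never lands twice on one node.
    """
    placement = {}
    for i, fragment in enumerate(fragments):
        placement[fragment] = [nodes[(i + k) % len(nodes)] for k in range(copies)]
    return placement


def readable(placement, failed):
    """A fragment is readable if at least one node holding it is still up."""
    return {
        fragment: any(node not in failed for node in holders)
        for fragment, holders in placement.items()
    }


nodes = ["node-a", "node-b", "node-c"]
fragments = ["frag-1", "frag-2", "frag-3"]

no_replicas = place_fragments(fragments, nodes, copies=1)
two_replicas = place_fragments(fragments, nodes, copies=2)

# Simulate the failure of a single node in both layouts.
print(readable(no_replicas, failed={"node-b"}))
print(readable(two_replicas, failed={"node-b"}))
```

With one copy per fragment, losing node-b makes frag-2 unreadable. With two copies per fragment, every fragment still has a live holder.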

Data Replication 

  • Data replication is the technique of storing the same data at more than one location or node.
  • It increases data availability.
  • It copies data from a database on one server to another server so that all users share the same data without inconsistencies.
  • As a result, users can access the data relevant to their work without interfering with the work of others in a distributed database.
  • Data replication involves the continuous duplication of transactions, so that each replica is kept up to date and in sync with the source.
  • With replication, the same relation is available at several locations; without it, a given relation resides at just one location.

Types of Data Replication

Transactional Replication

With transactional replication, subscribers receive a complete initial copy of the database, followed by updates as the data changes.

Changes are replicated in near real time from the publisher to the receiving database (the subscriber), in the same order in which they occur at the publisher, which ensures transactional consistency.

Transactional replication is commonly used in server-to-server environments.

It does not simply copy the net result of the changes; it replays each individual modification, in order, at the subscriber.
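
The flow described above (an initial full copy, then changes replayed in publisher order) can be sketched in a few lines of Python. This is an illustrative toy, not how any specific product implements it; the `Publisher` and `Subscriber` classes and their methods are hypothetical names, and the in-memory list stands in for a real transaction log.

```python
class Publisher:
    def __init__(self, data):
        self.data = dict(data)
        self.log = []  # committed changes, in commit order

    def write(self, key, value):
        self.data[key] = value
        self.log.append(("set", key, value))

    def delete(self, key):
        self.data.pop(key, None)
        self.log.append(("delete", key, None))


class Subscriber:
    def __init__(self, publisher):
        # Step 1: receive a complete initial copy.
        self.data = dict(publisher.data)
        self.applied = len(publisher.log)  # log position already reflected

    def sync(self, publisher):
        # Step 2: replay every change not yet applied, in publisher order.
        for op, key, value in publisher.log[self.applied:]:
            if op == "set":
                self.data[key] = value
            else:
                self.data.pop(key, None)
        self.applied = len(publisher.log)


pub = Publisher({"a": 1})
sub = Subscriber(pub)   # initial full copy
pub.write("a", 2)
pub.write("b", 3)
pub.delete("a")
sub.sync(pub)           # replays the three changes in commit order
print(sub.data == pub.data)
```

Replaying the log in order matters here: applying the delete before the first write to "a" would leave the subscriber with a different result than the publisher.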

Snapshot Replication

Snapshot replication delivers data exactly as it appears at a single point in time and does not track subsequent changes.

The whole snapshot is generated and sent to the subscribers.

Snapshot replication is commonly used when data changes infrequently.

It is slower than transactional replication, since each run moves the entire data set rather than only the changes.

Snapshot replication is also an effective way to perform the initial synchronization between the publisher and the subscriber.
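
As a rough illustration, a snapshot can be modelled in Python as a deep copy of the publisher's state. The `take_snapshot` helper and the sample records are hypothetical; a real system would write the snapshot to files and ship them to the subscriber.

```python
import copy


def take_snapshot(source):
    """Return a full, independent copy of the source as it is right now."""
    return copy.deepcopy(source)


publisher = {"users": [{"id": 1, "name": "Asha"}]}
subscriber = take_snapshot(publisher)

# A change at the publisher is NOT seen by the subscriber...
publisher["users"].append({"id": 2, "name": "Ravi"})
stale_count = len(subscriber["users"])

# ...until the next snapshot replaces the subscriber's entire copy.
subscriber = take_snapshot(publisher)
fresh_count = len(subscriber["users"])

print(stale_count, fresh_count)
```

Note that the second snapshot copies every record again, including the one that did not change, which is why this approach suits data that changes rarely.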

Merge Replication

In merge replication, data from two or more databases is combined into a single database.

Merge replication is the most complex type of replication, since it allows both the publisher and the subscriber to make changes to the database independently.

Merge replication is commonly used in server-to-client scenarios.

It allows changes to be distributed from a single publisher to multiple subscribers.
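
Because both sides can change the same data, merging needs a rule for resolving conflicts. The sketch below assumes a "last writer wins" rule based on a timestamp, with the publisher winning ties. That rule is only one possible policy chosen for illustration; the `merge` function and the sample keys are hypothetical, and deletions are not handled.

```python
def merge(publisher, subscriber):
    """Merge two copies that were changed independently.

    Each side maps key -> (value, timestamp). On conflict the newer
    timestamp wins; on a tie the publisher's value is kept.
    """
    merged = dict(publisher)
    for key, (value, timestamp) in subscriber.items():
        if key not in merged or timestamp > merged[key][1]:
            merged[key] = (value, timestamp)
    return merged


publisher = {"price": (100, 1), "stock": (5, 4)}
subscriber = {"price": (120, 3), "note": ("fragile", 2)}

merged = merge(publisher, subscriber)
print(merged)
```

Here "price" was changed on both sides, so the newer subscriber value is kept, while "stock" and "note" were each changed on one side only and are simply carried over.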

       

Replication Schemes

Full Replication 

The most extreme case is full replication, in which the whole database is replicated at every site of the distributed system.

This improves availability, since the system can continue to run as long as at least one site is operational.

Advantages of Full Replication

  • Data availability is high.
  • Global queries are retrieved more efficiently, since the result can be obtained locally at any site.
  • Queries execute faster.

Disadvantages of Full Replication

  • Concurrency control is difficult to achieve.
  • Updates are slow, because a single update must be applied to every copy of the database to keep the copies consistent.
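
The trade-off between fast local reads and slow updates can be seen in a small Python sketch. The `FullyReplicatedStore` class, the site names, and the write counter are hypothetical and exist only to illustrate the point; a real system would also need concurrency control, which is omitted here.

```python
class FullyReplicatedStore:
    def __init__(self, sites):
        # Every site holds a complete copy of the data.
        self.sites = {name: {} for name in sites}
        self.site_writes = 0  # counts individual per-site writes

    def write(self, key, value):
        # One logical update must be applied at every site.
        for replica in self.sites.values():
            replica[key] = value
            self.site_writes += 1

    def read(self, key, local_site):
        # A read is answered from the local copy only.
        return self.sites[local_site][key]


store = FullyReplicatedStore(["delhi", "mumbai", "pune"])
store.write("order-42", "shipped")

print(store.read("order-42", local_site="pune"))
print(store.site_writes)
```

A single logical update turned into three physical writes, one per site, while the read touched only one site.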

 

No Replication 

The opposite extreme is no replication, in which each fragment is stored at exactly one site.

Advantages of No Replication 

  • Recovery is simple, since there is only one copy of each fragment to restore.
  • Concurrency control is simpler, since no copies need to be kept in sync.

Disadvantages of No Replication

  • Query execution may be slow, because many users access the same server.
  • Availability is low: each fragment exists in only one place, so the data cannot be accessed if that site is down.

Partial Replication 

In partial replication, some fragments of the database are replicated while others are not.

The number of copies of each fragment can range from one up to the total number of sites in the distributed system.

The description of how the fragments are replicated is called the replication schema.

Advantages of Partial Replication

  • The number of copies of each fragment can be chosen according to how important the data is.
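
One simple way to picture a replication schema is as a mapping from each fragment to the set of sites that hold a copy. The Python sketch below uses that representation; the fragment names, site names, and the `classify` helper are hypothetical.

```python
ALL_SITES = {"site-1", "site-2", "site-3"}

# Replication schema: fragment -> sites holding a copy of it.
replication_schema = {
    "customers": {"site-1", "site-2", "site-3"},  # copied everywhere
    "orders": {"site-1", "site-2"},               # copied to some sites
    "audit_log": {"site-3"},                      # single copy
}


def classify(schema, all_sites):
    """Label each fragment by how widely it is replicated."""
    labels = {}
    for fragment, sites in schema.items():
        if len(sites) == len(all_sites):
            labels[fragment] = "full"
        elif len(sites) == 1:
            labels[fragment] = "none"
        else:
            labels[fragment] = "partial"
    return labels


print(classify(replication_schema, ALL_SITES))
```

In this example the more important fragment ("customers") gets a copy at every site, while the less critical "audit_log" is kept at one site only.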

Advantages of Data Replication 

  • Keeps the same copy of the data on every database node.
  • Increases data availability.
  • Improves the reliability of the data, since it is duplicated.
  • Supports a large number of users, since requests can be served by more than one copy.
  • Databases can be merged, and secondary (slave) databases that hold outdated or incomplete data are brought up to date.
  • Because copies exist at several sites, there is a good chance that the data is available at the site where a transaction is running.
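
The availability and reliability gain can be estimated with a back-of-the-envelope calculation. The sketch below assumes that each copy is unavailable independently with the same probability, which is a simplification that real failures (shared networks, shared power) often violate. The `availability` function and the 10% failure figure are hypothetical.

```python
def availability(p_fail, copies):
    """Probability that at least one of `copies` replicas is reachable,
    assuming each replica fails independently with probability p_fail."""
    return 1 - p_fail ** copies


for n in (1, 2, 3):
    print(n, round(availability(0.1, n), 4))
```

Under these assumptions, going from one copy to three raises availability from 90% to 99.9%.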

Disadvantages of Data Replication 

  • More storage space is required, since the same data is kept in several places.
  • Updates become expensive, since every copy at every site has to be updated.
  • Keeping the data consistent across all sites requires complex coordination.

FAQs

  1. How do you handle data replication?
    Identify the source and the destination of the data.
    Choose the tables and columns to copy from the source.
    Decide how frequently updates are needed.
    Choose between full-table, key-based, and log-based replication.
    For key-based replication, identify the replication keys: columns that, when they change in the source, cause the records containing them to be replicated.
    Run the replication process using custom code or a replication tool.
    Monitor the extraction and loading steps for quality assurance.
     
  2. What makes data available for replication?
    The publisher, which is the source database where replication starts, makes data available for replication.
     
  3. Does replication encrypt data?
    No. Replication does not carry encryption over to the copies. If we encrypt our data at the source, we also need to encrypt it at the subscribers.
     
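  4. Does replication affect latency?
    With asynchronous replication, read and write latency on the primary server is not meaningfully affected. With synchronous replication, a write must wait for the replicas to acknowledge it, so write latency increases.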
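
The key-based option mentioned in the first answer can be sketched with Python's built-in sqlite3 module. This is a minimal illustration under stated assumptions: the `items` table, the `updated_at` column used as the replication key, and the `make_db` and `replicate` helpers are all hypothetical, and two in-memory databases stand in for the source and the destination.

```python
import sqlite3


def make_db():
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE items (id INTEGER PRIMARY KEY, name TEXT, updated_at INTEGER)"
    )
    return db


def replicate(source, dest, last_key):
    """Copy rows whose replication key is greater than the saved bookmark."""
    rows = source.execute(
        "SELECT id, name, updated_at FROM items "
        "WHERE updated_at > ? ORDER BY updated_at",
        (last_key,),
    ).fetchall()
    dest.executemany(
        "INSERT OR REPLACE INTO items (id, name, updated_at) VALUES (?, ?, ?)",
        rows,
    )
    dest.commit()
    # The new bookmark is the highest key seen so far.
    return max((row[2] for row in rows), default=last_key)


source, dest = make_db(), make_db()
source.executemany(
    "INSERT INTO items VALUES (?, ?, ?)", [(1, "pen", 10), (2, "ink", 11)]
)

bookmark = replicate(source, dest, last_key=0)   # first run copies both rows
source.execute("UPDATE items SET name = 'gel pen', updated_at = 12 WHERE id = 1")
bookmark = replicate(source, dest, bookmark)     # second run copies only row 1

dest_rows = dest.execute(
    "SELECT id, name, updated_at FROM items ORDER BY id"
).fetchall()
print(dest_rows, bookmark)
```

One limitation of this approach is that a row deleted at the source leaves no changed key behind, so the deletion is not picked up; log-based replication is the usual choice when deletes must be captured.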

Key Takeaways

  • Cheers if you reached here! In this blog, we learned about data replication in databases.
  • We covered the basic idea of data replication in a DBMS.
  • We also saw how it works.
  • Further, we saw the types of data replication and the replication schemes, with their pros and cons.
  • We also looked at the advantages and disadvantages of data replication.


Don't stop here, Ninja; check out the Top 100 SQL Problems.

On the other hand, learning never ceases, and there is always more to learn. So, keep learning and keep growing, ninjas!

Good luck with your preparation!
