What are the advantages of GFS?

30 Dec 2024

Google File System

Google Inc. developed the Google File System (GFS), a scalable distributed file system (DFS), to meet the company's growing data processing needs. GFS offers fault tolerance, reliability, scalability, availability, and high performance to large networks of connected nodes. GFS is built from many storage machines assembled from inexpensive commodity hardware components. It is customized to Google's data use and storage requirements, such as the search engine, which generates enormous volumes of data that must be stored.

The Google File System tolerates frequent hardware failures while retaining the cost advantages of commercially available commodity servers.

GoogleFS is another name for GFS. It manages two types of data: file metadata and file data.

The GFS node cluster consists of a single master and several chunk servers that are regularly accessed by various client systems. Chunk servers store data as Linux files on local disks. Stored data is split into large (64 MB) chunks, each of which is replicated at least three times across the network. The large chunk size reduces network overhead.

GFS is designed to meet Google's huge cluster requirements without burdening applications. Files are stored in hierarchical directories identified by path names. The master is in charge of metadata, including the namespace, access control information, and the mapping of files to chunks. It communicates with each chunk server through timed heartbeat messages and keeps track of their status.

The largest GFS clusters contain more than 1,000 nodes with 300 TB of disk storage capacity and are accessed continuously by hundreds of clients.

Components of GFS

GFS runs on clusters of computers. A cluster is simply a group of connected machines, and each cluster may contain hundreds or even thousands of computers. Every GFS cluster contains three basic entities:

  • GFS Clients:

    These are computer programs or applications that request files. Requests may access or modify existing files or add new files to the system.

  • GFS Master Server:

    It serves as the cluster's coordinator. It maintains an operation log recording the cluster's activity and keeps track of metadata, which describes the chunks. The metadata tells the master server which files the chunks belong to and where they fit within the overall file.

  • GFS Chunk Servers:

    They are the workhorses of GFS. They store file chunks of 64 MB in size. Chunk servers do not send chunks to the master server; instead, they deliver the requested chunks directly to clients. To ensure reliability, GFS stores multiple copies of each chunk on different chunk servers; the default is three copies. Each copy is referred to as a replica (see the sketch just below).
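
To make the replication idea concrete, here is a minimal Python sketch of how a master-like component might choose three chunk servers to hold a new chunk's replicas. The function name, the server list, and the random placement policy are illustrative assumptions; the real placement policy weighs additional factors not covered here.

```python
import random

# Illustrative default from the article: three copies of every chunk.
REPLICATION_FACTOR = 3

def place_replicas(chunk_handle: int, chunkservers: list[str]) -> list[str]:
    """Pick REPLICATION_FACTOR distinct chunk servers to hold a new chunk.
    A toy policy: the real master also considers load and placement spread."""
    if len(chunkservers) < REPLICATION_FACTOR:
        raise RuntimeError("not enough chunk servers for full replication")
    return random.sample(chunkservers, REPLICATION_FACTOR)

# Example: a newly created chunk is assigned three replica locations.
servers = ["cs-01", "cs-02", "cs-03", "cs-04", "cs-05"]
print(place_replicas(0x1A2B, servers))
```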

Features of GFS

  • Namespace management and locking.

  • Fault tolerance.

  • Reduced client and master interaction because of the large chunk size.

  • High availability.

  • Critical data replication.

  • Automatic and efficient data recovery.

  • High aggregate throughput.

Advantages of GFS

  1. High availability: data remains accessible even if a few nodes fail, thanks to replication. As the GFS designers put it, component failures are the norm rather than the exception.

  2. High aggregate throughput: many nodes operate concurrently.

  3. Reliable storage: corrupted data can be detected and restored from a valid replica.

Disadvantages of GFS

  1. Not the best fit for small files.

  2. Master may act as a bottleneck.

  3. Random writes are not well supported.

  4. Best suited for data that is written once and then only read (or appended to) afterward.


All you need to know about the Google File System

Assumptions

Google describes the assumptions made when designing GFS:

  • The system is built from many commodity components that often fail.
  • The system stores large files. They expect a few million files, each typically 100 MB or larger.
  • The read workloads are typically divided into large streaming reads (subsequent operations read through a contiguous region of a file) and small random reads (a read at some arbitrary file offset).
  • The workloads have many large, sequential writes that append data to files.
  • The system must efficiently implement semantics for multiple clients that append to the same file concurrently.
  • High bandwidth is more important than low latency.

Interface

Like other file systems, GFS provides a familiar interface but does not support standard APIs like POSIX. It organizes files in directories and identifies them by pathnames. GFS supports operations to create, delete, open, close, read, and write files. GFS also implements snapshot and record append operations:

  • Snapshot cheaply creates a copy of a file or a directory tree.
  • Record append allows multiple clients to append data to the same file concurrently while guaranteeing the atomicity of each client's append.

Architecture

GFS's high-level architecture. Image created by the author.

A GFS cluster has a single master, multiple chunkservers, and multiple clients. GFS divides files into fixed-size chunks. The system identifies each chunk by an immutable and globally unique 64-bit chunk handle assigned by the master at the creation time. Chunkservers store chunks on local disks and read/write chunks using a chunk handle and byte range. GFS replicates chunks on multiple chunkservers (three replicas by default) for reliability.

The master handles all file system metadata, including the namespace, access control information, mapping from files to chunks, and chunk locations. It also controls chunk lease management, garbage collection, and chunk migration between chunkservers. The master communicates with each chunkserver periodically through HeartBeat messages.
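
As a rough sketch of the HeartBeat idea, the snippet below shows how a master-like process could track chunkserver liveness. The timeout value and the class and method names are assumptions made for illustration; the paper only says that the master and chunkservers exchange periodic HeartBeat messages.

```python
import time

HEARTBEAT_TIMEOUT_S = 60.0  # assumed value; not specified in the paper

class ChunkserverMonitor:
    """Tracks when each chunkserver last sent a HeartBeat message."""

    def __init__(self) -> None:
        self.last_heartbeat: dict[str, float] = {}

    def on_heartbeat(self, chunkserver_id: str) -> None:
        # In GFS the HeartBeat also carries state such as which chunks the
        # server holds; here we only record the arrival time.
        self.last_heartbeat[chunkserver_id] = time.monotonic()

    def suspected_dead(self) -> list[str]:
        now = time.monotonic()
        return [cs for cs, t in self.last_heartbeat.items()
                if now - t > HEARTBEAT_TIMEOUT_S]
```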

Lease management and garbage collection operations will be covered in upcoming sections.

The GFS client communicates with the master and chunkservers to read or write data. Clients interact with the master only for metadata operations; they communicate directly to the chunkservers for data-related operations.

Single Master

Having a single master simplifies the GFS design and allows the master to make sophisticated decisions using global knowledge. To prevent the master from becoming a bottleneck, Google minimizes its involvement in reads and writes. A client never reads or writes file data through the master. Instead, it asks the master, "Hey, which chunkservers should I contact?", caches this information, and interacts with the chunkservers directly for subsequent operations.

The interactions for a simple read. Image created by the author.
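
The simple read in the figure can be sketched as follows. The `master.lookup` and replica `read` calls are hypothetical stand-ins for the RPCs GFS uses, and the sketch assumes the read does not cross a chunk boundary.

```python
CHUNK_SIZE = 64 * 1024 * 1024  # 64 MB chunks

class GFSClient:
    """Hypothetical client: one metadata request per chunk to the master,
    then data flows directly between the client and a chunkserver."""

    def __init__(self, master):
        self.master = master
        self.location_cache = {}  # (filename, chunk_index) -> (handle, replica list)

    def read(self, filename: str, offset: int, length: int) -> bytes:
        chunk_index = offset // CHUNK_SIZE            # which chunk holds this offset
        key = (filename, chunk_index)
        if key not in self.location_cache:
            # Ask the master for the chunk handle and replica locations, then cache.
            self.location_cache[key] = self.master.lookup(filename, chunk_index)
        handle, replicas = self.location_cache[key]
        # Read directly from one replica; the master never sees file data.
        return replicas[0].read(handle, offset % CHUNK_SIZE, length)
```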

Chunk Size

Google decided on a chunk size of 64 MB, which was much larger than typical file system block sizes at the time. GFS stores each chunk replica as a plain Linux file on a chunkserver. A large chunk size has several advantages:

  • Reducing clients&#; need to interact with the master because operations on the same chunk require only one initial request to the master for chunk location.
  • Reducing network overhead.
  • Reducing the size of the metadata stored on the master.

Still, the large chunk size has a disadvantage: a small file consists of only a few chunks (perhaps just one), so the chunkservers storing those chunks may become hot spots if many clients access the same file.
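
A back-of-the-envelope calculation illustrates the metadata benefit of the large chunk size. The 64 bytes of master metadata per chunk is the rough figure the paper quotes; the 4 KB block size is just an assumed point of comparison.

```python
# Rough master-metadata arithmetic for a single 1 TB file.
TB = 1024 ** 4
CHUNK_64MB = 64 * 1024 * 1024
BLOCK_4KB = 4 * 1024                        # assumed "traditional" block size
METADATA_BYTES_PER_CHUNK = 64               # rough per-chunk cost on the master

chunks = TB // CHUNK_64MB                   # 16,384 chunks
blocks = TB // BLOCK_4KB                    # 268,435,456 blocks

print(chunks * METADATA_BYTES_PER_CHUNK)    # ~1 MB of metadata with 64 MB chunks
print(blocks * METADATA_BYTES_PER_CHUNK)    # ~16 GB if tracked at 4 KB granularity
```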

Metadata

There are three major types of metadata. Image created by the author.

The master stores three major types of metadata:

  • The file and chunk namespaces
  • The files-to-chunks mapping
  • The chunks' replica locations

The master keeps the metadata in memory. It also persists the namespaces and file-to-chunk mapping metadata by logging the mutations to an operation log, which is stored on the master's disk and replicated on remote machines. The log lets Google update the master state simply and reliably when a master crashes.

The master does not persist the chunk location metadata. Instead, it asks each chunkserver for this information when the master starts up and whenever a new chunkserver joins the cluster.
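
Here is a minimal sketch of the three metadata types as plain Python containers. The paper does not prescribe concrete data structures, so the types and field names below are assumptions; the key point is that only the first two kinds are persisted through the operation log.

```python
from dataclasses import dataclass, field

@dataclass
class MasterMetadata:
    # 1. File and chunk namespaces (persisted via the operation log).
    namespace: set[str] = field(default_factory=set)
    # 2. Mapping from each file to its ordered chunk handles (also persisted).
    file_to_chunks: dict[str, list[int]] = field(default_factory=dict)
    # 3. Chunk handle -> replica locations. Not persisted: rebuilt by polling
    #    chunkservers at startup and kept fresh through HeartBeat messages.
    chunk_locations: dict[int, list[str]] = field(default_factory=dict)

meta = MasterMetadata()
meta.namespace.add("/logs/web.log")
meta.file_to_chunks["/logs/web.log"] = [0x1A2B]
meta.chunk_locations[0x1A2B] = ["cs-01", "cs-04", "cs-07"]
```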

In-Memory Data Structures

Master operations are fast because the metadata is stored in memory. This also allows the master to periodically scan its entire state in the background; Google uses this scanning to implement chunk garbage collection, re-replication, and chunk migration. However, keeping all metadata in memory means the system's capacity is limited by the amount of the master's memory. Google argues that the cost of adding extra memory to the master is a small price to pay for the simplicity, reliability, and performance gained by keeping the metadata in memory.

Chunk Locations

The master does not persistently store the chunk location metadata. It polls the chunkservers for this information at startup and can keep it up to date thereafter because it controls all chunk placement and monitors chunkservers with regular HeartBeat messages. This approach eliminates the need to keep the master and chunkservers in sync when chunkserver memberships change.

Operation Log

The operation log contains a historical record of the metadata changes. It is the only persistent record of metadata (stored on the master's local disks) and serves as a logical timeline that records the order of concurrent operations. Due to its importance, Google stores the log redundantly outside the master; they replicate the log on multiple remote machines and respond to a client operation only after flushing the corresponding log record to the master's local disk and the remote machines' disks.

The master recovers its state by replaying the operation log. Google keeps the log small to minimize the startup time. The master writes the checkpoints of its state whenever the log grows beyond a certain threshold. This helps the master recover by only loading the latest checkpoint from the disk and replaying for only a limited number of log records afterward. The checkpoint has a B-tree-like data structure that can be directly mapped into memory.

The Operation Log. Image created by the author.

The master's internal state is carefully structured so a new checkpoint can be created without affecting incoming metadata mutations. The master switches to a new log file and creates the new checkpoint in a separate thread. The new checkpoint has all mutations before the switch. When completed, it is written to disk both locally and remotely. Recovery needs only the latest complete checkpoint and subsequent log files. Older checkpoints and log files can be deleted. A failure during checkpointing does not affect the correctness because the recovery process detects and skips incomplete checkpoints.
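
The checkpoint-plus-log recovery described above can be summarized as follows. The `checkpoints` and `log` objects and their methods are hypothetical helpers, since the paper describes the mechanism only at this level of detail.

```python
def recover_master_state(checkpoints, log):
    """Rebuild the master's in-memory metadata after a restart:
    load the latest complete checkpoint, then replay newer log records."""
    state = checkpoints.load_latest_complete()      # incomplete checkpoints are skipped
    for record in log.records_after(state.last_applied_record_id):
        state.apply(record)                         # replay mutations in log order
    return state

def commit_mutation(state, record, local_log, remote_logs):
    """A metadata mutation is acknowledged to the client only after its log
    record is flushed to the master's disk and to the remote log replicas."""
    local_log.append_and_flush(record)
    for remote in remote_logs:
        remote.append_and_flush(record)
    state.apply(record)
```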

Consistency Model

GFS has a consistency model that supports distributed applications well yet remains simple and efficient to implement.

Guarantees by GFS

The state of a file region after a mutation depends on two factors:

  • Whether it succeeds or fails
  • Whether there are concurrent mutations.

Here is how the paper summarizes the state of a file region after a mutation:

File Region State After Mutation (Table 1 in the GFS paper).

  • A file region is consistent if all clients see the same data, regardless of which replicas they read from.
  • A region is defined after a file data mutation if it is consistent and clients see what the mutation wrote in its entirety. In other words, a defined region is always also consistent.

When a mutation succeeds without interference from concurrent writers, the affected region is defined (and therefore also consistent). Concurrent successful mutations leave the region consistent but undefined: all clients see the same data, but it may not reflect what any single mutation wrote; typically it consists of mingled fragments from multiple mutations. A failed mutation makes the region inconsistent: different clients may see different data at different times.

Data mutations may be writes or record appends. A write puts data at a file offset specified by the application. A record append appends the record atomically at least once, at an offset of GFS's choosing, even in the presence of concurrent mutations; the system then returns that offset to the client, and it marks the beginning of a defined region containing the record. After a sequence of successful mutations, the mutated file region is guaranteed to be defined. GFS achieves this by:

  • Applying mutations to a chunk in the same order on all its replicas
  • Using chunk version numbers to detect any replica that has become stale because it missed mutations while its chunkserver was down. Stale replicas are never involved in a mutation or returned to clients asking the master for chunk locations; GFS garbage collects them as soon as possible.

Component failures can still corrupt or destroy data after a successful mutation. GFS detects failed chunkservers through regular handshakes between the master and chunkservers, and it detects data corruption through checksumming. Once a problem surfaces, the data is restored from a valid replica as soon as possible.
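
To illustrate the two detection mechanisms mentioned above, here is a hedged sketch: chunk version numbers catch stale replicas, and block checksums catch corruption. The CRC32 choice and the function and field names are assumptions, not details taken from the paper.

```python
import zlib

def is_stale(replica_version: int, master_version: int) -> bool:
    # A replica that missed mutations while its chunkserver was down ends up
    # with an older version number than the one the master records.
    return replica_version < master_version

def is_corrupted(block: bytes, stored_checksum: int) -> bool:
    # Chunkservers checksum blocks of each chunk; a mismatch means the data
    # was corrupted and must be restored from a valid replica.
    return zlib.crc32(block) != stored_checksum
```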

Implications for Applications

Applications can adapt to GFS's consistency model with a few simple techniques: relying on appends rather than overwrites, checkpointing, and writing self-validating, self-identifying records.

In one typical use case, a writer generates a file from beginning to end. It renames the file to a permanent name after writing all the data, or it periodically checkpoints how much has been successfully written. Checkpoints may also include application-level checksums. Readers verify and process only the file region up to the last checkpoint. Checkpointing lets writers restart incrementally and keeps readers from processing file data that has been written but is still incomplete from the application's perspective.

In the other typical use case, many writers concurrently append to a file, for example as a producer-consumer queue. GFS's record append preserves each writer's output. Each record carries extra information such as a checksum so that readers can verify its validity; the checksums let a reader detect and discard extra padding and record fragments.
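
Here is a minimal sketch of such self-validating records, assuming a simple length-prefixed framing with a CRC32 checksum; the actual record format is application-defined and not specified in the paper.

```python
import struct
import zlib

def encode_record(payload: bytes) -> bytes:
    # Each record carries its length and a checksum so a reader can verify it.
    return struct.pack("<II", len(payload), zlib.crc32(payload)) + payload

def decode_records(data: bytes):
    """Yield valid payloads, skipping padding and record fragments that fail
    verification."""
    pos = 0
    while pos + 8 <= len(data):
        length, checksum = struct.unpack_from("<II", data, pos)
        payload = data[pos + 8 : pos + 8 + length]
        if 0 < length == len(payload) and zlib.crc32(payload) == checksum:
            yield payload
            pos += 8 + length
        else:
            pos += 1  # resync: advance one byte and look for the next valid record
```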
