Google Inc. developed the Google File System (GFS), a scalable distributed file system (DFS), to meet the company's growing data processing needs. GFS provides fault tolerance, reliability, scalability, availability, and high performance across large networks of connected nodes. It is built from a large number of storage machines assembled from inexpensive commodity hardware. GFS is tailored to Google's varied data use and storage requirements; the search engine, which generates enormous volumes of data that must be stored, is just one example.
The Google File System is designed to tolerate frequent hardware failures while taking advantage of the low cost of commercially available servers.
GFS is also known as GoogleFS. It manages two types of data: file metadata and file data.
A GFS node cluster consists of a single master and several chunk servers that are accessed regularly by various client systems. Chunk servers keep data on local disks in the form of Linux files. The stored data is split into large chunks (64 MB), each of which is replicated at least three times across the network. The large chunk size reduces network overhead.
GFS is designed to meet Google's huge cluster requirements without burdening applications. Files are stored in hierarchical directories and identified by path names. The master is responsible for managing metadata, including the namespace, access control information, and mapping data. The master communicates with each chunk server through periodic heartbeat messages and keeps track of its status.
The largest GFS clusters contain more than 1,000 nodes with 300 TB of disk storage capacity, accessed continuously by hundreds of clients.
GFS runs on a group of computers. A cluster is simply a group of connected machines, and each cluster may contain hundreds or even thousands of computers. Every GFS cluster contains three basic entities:
GFS clients: computer programs or applications that request files. Requests may be to access and modify existing files or to add new files to the system.
The master server: the cluster's coordinator. It preserves a record of the cluster's activity in an operation log, and it keeps track of metadata, the data that describes chunks. The metadata tells the master which files the chunks belong to and where they fit within those files.
Chunk servers: the workhorses of GFS. They store file chunks of 64 MB each. Chunk servers do not send chunks through the master server; instead, they deliver the requested chunks directly to the client. To ensure reliability, GFS stores multiple copies of each chunk on different chunk servers; the default is three copies. Each copy is referred to as a replica.
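To make the division of labor concrete, here is a minimal, hypothetical sketch in Python of the metadata such a master might track. The class and field names (ChunkInfo, MasterMetadata, register_chunk) are invented for illustration and are not GFS's actual data structures; only the 64 MB chunk size and the default of three replicas come from the description above.

```python
from dataclasses import dataclass, field
from typing import Dict, List

CHUNK_SIZE = 64 * 1024 * 1024   # 64 MB chunks, as described above
REPLICATION_FACTOR = 3          # default number of replicas per chunk

@dataclass
class ChunkInfo:
    handle: int                                                 # unique chunk identifier
    replica_servers: List[str] = field(default_factory=list)    # chunk server addresses

@dataclass
class MasterMetadata:
    # file path -> ordered list of chunk handles making up the file
    file_to_chunks: Dict[str, List[int]] = field(default_factory=dict)
    # chunk handle -> replica locations and other per-chunk state
    chunks: Dict[int, ChunkInfo] = field(default_factory=dict)

    def register_chunk(self, path: str, handle: int, servers: List[str]) -> None:
        """Record a new chunk for a file and the chunk servers holding its replicas."""
        self.file_to_chunks.setdefault(path, []).append(handle)
        self.chunks[handle] = ChunkInfo(handle, servers[:REPLICATION_FACTOR])

# Example: a file made of two chunks, each replicated on three chunk servers.
meta = MasterMetadata()
meta.register_chunk("/logs/web-00", handle=1, servers=["cs-1", "cs-2", "cs-3"])
meta.register_chunk("/logs/web-00", handle=2, servers=["cs-2", "cs-4", "cs-5"])
print(meta.file_to_chunks["/logs/web-00"])   # [1, 2]
```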
GFS provides the following key features:
Namespace management and locking.
Fault tolerance.
Reduced client and master interaction because of the large chunk size.
High availability.
Critical data replication.
Automatic and efficient data recovery.
High aggregate throughput.
Advantages of GFS:
High availability: data remains accessible even if a few nodes fail, thanks to replication. As the paper puts it, component failures are the norm rather than the exception.
High aggregate throughput: many nodes operate concurrently.
Reliable storage: corrupted data can be detected and re-replicated from a valid copy.
Disadvantages of GFS:
Not the best fit for small files.
Master may act as a bottleneck.
No support for random writes.
Best suited to data that is written once and then only read (or appended to) afterward.
Google describes its design assumptions in more detail: the system is built from many inexpensive commodity components that fail often; it stores a modest number of large files; workloads consist mostly of large streaming reads, small random reads, and large sequential appends; many clients often append to the same file concurrently; and high sustained bandwidth matters more than low latency.
Like other file systems, GFS provides a familiar interface, but it does not implement a standard API such as POSIX. It organizes files in directories and identifies them by pathnames. GFS supports the usual operations to create, delete, open, close, read, and write files. It also implements two special operations: snapshot, which creates a copy of a file or directory tree at low cost, and record append, which allows multiple clients to append data to the same file concurrently while guaranteeing the atomicity of each client's append.
A GFS cluster has a single master, multiple chunkservers, and multiple clients. GFS divides files into fixed-size chunks. The system identifies each chunk by an immutable and globally unique 64-bit chunk handle assigned by the master at the creation time. Chunkservers store chunks on local disks and read/write chunks using a chunk handle and byte range. GFS replicates chunks on multiple chunkservers (three replicas by default) for reliability.
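Because chunks are fixed-size, a client can translate a byte offset within a file into a chunk index with simple arithmetic; the small sketch below illustrates that calculation (the function name locate_in_file is invented for the example).

```python
CHUNK_SIZE = 64 * 1024 * 1024  # fixed chunk size

def locate_in_file(byte_offset: int) -> tuple[int, int]:
    """Translate a byte offset within a file into (chunk index, offset within that chunk)."""
    return byte_offset // CHUNK_SIZE, byte_offset % CHUNK_SIZE

# Example: byte 150,000,000 falls in the file's third chunk (index 2).
index, within = locate_in_file(150_000_000)
print(index, within)  # 2 15782272
```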
The master handles all file system metadata, including the namespace, access control information, mapping from files to chunks, and chunk locations. It also controls chunk lease management, garbage collection, and chunk migration between chunkservers. The master communicates with each chunkserver periodically through HeartBeat messages.
Lease management and garbage collection operations will be covered in upcoming sections.
The GFS client communicates with the master and chunkservers to read or write data. Clients interact with the master only for metadata operations; they communicate directly with the chunkservers for all data-related operations.
Having a single master simplifies the GFS design and allows the master to make sophisticated decisions using global knowledge. To prevent the master from becoming a bottleneck, Google minimizes its involvement in reads and writes. A client never reads or writes file data through the master. Instead, it asks the master which chunkservers it should contact, caches this information, and interacts with those chunkservers directly for subsequent operations.
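A rough sketch of that read path, assuming hypothetical RPC stubs (ask_master, read_from_chunkserver) in place of the real protocol; only the overall flow of "ask the master once, cache, then read from a chunkserver directly" reflects the description above.

```python
CHUNK_SIZE = 64 * 1024 * 1024

# Hypothetical RPC stubs standing in for the real GFS protocol.
def ask_master(path: str, chunk_index: int) -> dict:
    """Return the chunk handle and replica locations for (path, chunk index)."""
    return {"handle": 42, "replicas": ["cs-1", "cs-2", "cs-3"]}

def read_from_chunkserver(server: str, handle: int, offset: int, length: int) -> bytes:
    """Read a byte range of a chunk directly from a chunkserver."""
    return b"..."

location_cache: dict[tuple[str, int], dict] = {}   # client-side cache of chunk locations

def read(path: str, offset: int, length: int) -> bytes:
    chunk_index, chunk_offset = divmod(offset, CHUNK_SIZE)
    key = (path, chunk_index)
    if key not in location_cache:                   # contact the master only on a cache miss
        location_cache[key] = ask_master(path, chunk_index)
    info = location_cache[key]
    replica = info["replicas"][0]                   # e.g. pick the closest replica
    return read_from_chunkserver(replica, info["handle"], chunk_offset, length)

print(read("/logs/web-00", 10, 100))
```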
The interactions for a simple read. Image created by the author.
Google chose a chunk size of 64 MB, much larger than typical file system block sizes at the time. GFS stores each chunk replica as a plain Linux file on a chunkserver. A large chunk size has several advantages: it reduces the clients' need to interact with the master, since reads and writes on the same chunk require only one initial request for chunk location information; it makes it likely that a client performs many operations on a given chunk, reducing network overhead by keeping a persistent TCP connection to the chunkserver; and it shrinks the amount of metadata the master must store.
Still, the large-size chunk approach has a disadvantage: with a small file consisting of a few chunks, the chunkservers storing those chunks may become hot spots if many clients access the same file.
The master stores three major types of metadata: the file and chunk namespaces, the mapping from files to chunks, and the locations of each chunk's replicas.
The master keeps the metadata in memory. It also persists the namespace and file-to-chunk mapping metadata by logging mutations to an operation log, which is stored on the master's disk and replicated on remote machines. The log lets Google update the master's state simply and reliably when a master crashes.
The chunk location metadata is not stored on the master itself; instead, the master asks each chunkserver for this information at startup and whenever a new chunkserver joins the cluster.
Master operations are fast thanks to the metadata being stored in memory. This allows the master to scan its entire state in the background; Google uses this scanning to implement chunk garbage collection, re-replication, and chunk migration. However, keeping all the metadata in memory means the system's capacity is limited by the amount of the master's memory. Google states that the cost of adding extra memory to the master is an insignificant price for the simplicity, reliability, and performance gained by storing the metadata in memory.
The master does not initially store the metadata of the chunk locations. It polls the chunkservers for this information at startup. The master can keep itself up to date after that because it controls all chunk placement and monitors chunkservers with HeartBeat messages. This approach eliminates the need to keep the master and chunkservers in sync when chunkserver membership changes.
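The sketch below illustrates the general idea of rebuilding chunk locations from chunkserver reports. The function and variable names are invented, and real HeartBeat messages carry more state than this; this is only a sketch of how a report can refresh the master's view of replica locations.

```python
from typing import Dict, Set

# chunk handle -> set of chunkservers currently reporting a replica (master-side view)
chunk_locations: Dict[int, Set[str]] = {}

def handle_heartbeat(server: str, reported: Set[int], known: Dict[str, Set[int]]) -> None:
    """Refresh the master's view of which chunks a chunkserver holds.

    Called when a chunkserver starts up or answers a periodic HeartBeat;
    this sketch simply replaces the server's previous report.
    """
    previous = known.get(server, set())
    for handle in previous - reported:               # replicas the server no longer reports
        chunk_locations.get(handle, set()).discard(server)
    for handle in reported:                          # replicas the server currently reports
        chunk_locations.setdefault(handle, set()).add(server)
    known[server] = set(reported)

servers: Dict[str, Set[int]] = {}
handle_heartbeat("cs-1", {1, 2}, servers)
handle_heartbeat("cs-1", {2, 3}, servers)            # chunk 1 dropped, chunk 3 added
print(chunk_locations)                               # {1: set(), 2: {'cs-1'}, 3: {'cs-1'}}
```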
The operation log contains a historical record of the metadata changes. It is the only persistent record of metadata (stored on the master's local disks) and serves as a logical timeline that records the order of concurrent operations. Due to its importance, Google stores the log redundantly outside the master: the log is replicated on multiple remote machines, and the system responds to a client operation only after flushing the corresponding log record to the master's local disk and the remote machines' disks.
The master recovers its state by replaying the operation log. Google keeps the log small to minimize the startup time. The master writes a checkpoint of its state whenever the log grows beyond a certain threshold. This lets the master recover by loading the latest checkpoint from disk and replaying only the limited number of log records written afterward. The checkpoint has a B-tree-like data structure that can be mapped directly into memory.
The Operation Log. Image created by the author.
The master's internal state is carefully structured so a new checkpoint can be created without delaying incoming metadata mutations. The master switches to a new log file and creates the new checkpoint in a separate thread. The new checkpoint includes all mutations before the switch. When completed, it is written to disk both locally and remotely. Recovery needs only the latest complete checkpoint and the subsequent log files; older checkpoints and log files can be deleted. A failure during checkpointing does not affect correctness because the recovery process detects and skips incomplete checkpoints.
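As a toy illustration of recovery from a checkpoint plus operation log, the following sketch loads a checkpointed state and replays the log records written after it. The record format and the operations (create, add_chunk, delete) are invented for the example and are not GFS's actual log format.

```python
# A toy, in-memory stand-in for the master's recovery path: load the latest
# checkpoint, then replay only the log records written after it.
def recover(checkpoint: dict, log_records: list[dict]) -> dict:
    state = dict(checkpoint)                          # start from the latest complete checkpoint
    for record in log_records:                        # replay subsequent mutations in order
        if record["op"] == "create":
            state[record["path"]] = []
        elif record["op"] == "add_chunk":
            state[record["path"]].append(record["handle"])
        elif record["op"] == "delete":
            state.pop(record["path"], None)
    return state

checkpoint = {"/logs/web-00": [1, 2]}                 # state at checkpoint time
log = [
    {"op": "add_chunk", "path": "/logs/web-00", "handle": 3},
    {"op": "create", "path": "/logs/web-01"},
]
print(recover(checkpoint, log))
# {'/logs/web-00': [1, 2, 3], '/logs/web-01': []}
```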
GFS has a consistency model that supports distributed applications well yet remains simple and efficient to implement.
The state of a file region after a mutation depends on two factors: whether the mutation succeeds or fails, and whether there are concurrent mutations.
Here is the file region state notation from the paper:
File Region State After Mutation (Table 1 of the GFS paper).
When a mutation succeeds without concurrent writers, the affected region is defined (and therefore also consistent). Concurrent successful mutations leave the region consistent but undefined: all clients see the same data, but it may not reflect what any single mutation has written; typically it consists of mingled fragments from multiple mutations. A failed mutation makes the region inconsistent: clients may see different data at different times.
Data mutations may be writes or record appends. A write stores data at a file offset specified by the application. A record append appends data atomically at least once at an offset of GFS's choosing, even in the presence of concurrent mutations; the system returns that offset to the client, and it marks the beginning of a defined region containing the record. After a sequence of successful mutations, the mutated file region is guaranteed to be defined; GFS achieves this by applying mutations to a chunk in the same order on all its replicas and by using chunk version numbers to detect replicas that have become stale by missing mutations.
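The sketch below imitates the at-least-once behavior of record append with an in-memory buffer and a simulated transient failure; the retry loop shows why a record may end up in the file more than once. Everything here is illustrative rather than GFS's actual implementation.

```python
import random

# Toy chunk storage: a single growing byte buffer standing in for a replica.
chunk_data = bytearray()

def record_append(record: bytes) -> int:
    """Append a record at an offset of the system's choosing and return that offset.

    On a simulated transient failure the client retries, so a record can appear
    in the file more than once: at-least-once semantics, as described above.
    """
    while True:
        offset = len(chunk_data)
        chunk_data.extend(record)                 # the append itself
        if random.random() < 0.2:                 # simulated failure after a duplicated write
            continue                              # retry; the earlier copy may remain
        return offset                             # offset of a defined region holding the record

off = record_append(b"event-1\n")
print(off, bytes(chunk_data))
```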
Component failures can corrupt or destroy data after a successful mutation. GFS detects the failed chunkservers by regular handshakes between master and chunkservers and detects corruption by checksumming. GFS restores the data from a valid replica as soon as possible after the failures occur.
Applications can adapt to GFS's consistency model with a few simple techniques: relying on appends rather than overwrites, checkpointing, and writing self-validating, self-identifying records.
In one typical use case, a writer creates a file from beginning to end. It renames the file to a permanent name after writing all the data, or it periodically checkpoints how much has been successfully written. Checkpoints may also include application-level checksums. Readers verify and process only the file region up to the last checkpoint. Checkpointing allows writers to restart incrementally and keeps readers from processing file data that has been successfully written but is still incomplete from the application's point of view.
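Here is a minimal sketch of that checkpointing pattern, with invented file names and helpers (append_and_checkpoint, read_up_to_checkpoint): the checkpoint records how many bytes are valid, and readers trust only the region it covers.

```python
import os

DATA_FILE = "output.log"          # hypothetical file the writer appends to
CHECKPOINT_FILE = "output.ckpt"   # stores the byte offset successfully written so far

def append_and_checkpoint(record: bytes) -> None:
    """Append a record, then durably record how much of the file is valid."""
    with open(DATA_FILE, "ab") as f:
        f.write(record)
        f.flush()
        os.fsync(f.fileno())
        valid_length = f.tell()
    with open(CHECKPOINT_FILE, "w") as c:         # checkpoint written only after the data is safe
        c.write(str(valid_length))

def read_up_to_checkpoint() -> bytes:
    """Readers process only the region covered by the last checkpoint."""
    try:
        with open(CHECKPOINT_FILE) as c:
            valid_length = int(c.read())
    except FileNotFoundError:
        return b""
    with open(DATA_FILE, "rb") as f:
        return f.read(valid_length)

append_and_checkpoint(b"row-1\n")
print(read_up_to_checkpoint())    # b'row-1\n'
```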
In the other use case, writers concurrently append to a file as a producer-consumer queue, and GFS preserves each writer's output. Each record carries extra information such as checksums so that readers can verify its validity; the checksums allow a reader to detect and discard extra padding and record fragments.
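A small sketch of such self-validating records: each record carries its length and a checksum, and the reader skips anything that does not verify, such as padding or fragments. The on-disk layout here (4-byte length plus MD5 digest plus payload) is invented for illustration and is not GFS's actual record format.

```python
import hashlib
import struct

def encode_record(payload: bytes) -> bytes:
    """Prefix a payload with its length and an MD5 checksum so readers can validate it."""
    digest = hashlib.md5(payload).digest()
    return struct.pack(">I", len(payload)) + digest + payload

def decode_records(data: bytes) -> list[bytes]:
    """Scan a byte stream, keeping valid records and skipping padding or fragments."""
    records, pos = [], 0
    while pos + 20 <= len(data):
        (length,) = struct.unpack(">I", data[pos:pos + 4])
        payload = data[pos + 20:pos + 20 + length]
        if len(payload) == length and hashlib.md5(payload).digest() == data[pos + 4:pos + 20]:
            records.append(payload)               # checksum matches: a valid record
            pos += 20 + length
        else:
            pos += 1                              # corrupt data or padding: skip forward one byte
    return records

blob = encode_record(b"a") + b"\x00" * 7 + encode_record(b"bb")   # two records with padding between
print(decode_records(blob))                       # [b'a', b'bb']
```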