Google File System (GFS)
Definition - What does Google File System (GFS) mean?
Google File System (GFS) is a scalable distributed file system (DFS) created by Google Inc. to accommodate Google’s expanding data processing requirements. GFS provides fault tolerance, reliability, scalability, availability and performance across large clusters of networked nodes. A GFS cluster is built from many storage machines made of low-cost commodity hardware components. It is optimized for Google’s data use and storage needs, such as its search engine, which generates huge amounts of data that must be stored.
The Google File System capitalizes on the strengths of off-the-shelf servers while compensating for their hardware weaknesses.
GFS is also known as GoogleFS.
Techopedia explains Google File System (GFS)
A GFS cluster consists of a single master and multiple chunk servers that are continuously accessed by different client systems. Chunk servers store data as Linux files on local disks. Stored data is divided into large chunks (64 MB), each of which is replicated on several chunk servers (three replicas by default). The large chunk size reduces network overhead because clients need to contact the master less often.
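The chunking scheme described above can be illustrated with a minimal Python sketch. It is not Google's implementation: the function names and server names are hypothetical, and the round-robin placement is a simplifying assumption, since real placement also weighs factors such as disk usage. It only shows how a byte offset maps to a 64 MB chunk and how three replicas might be assigned.

```python
from typing import List

CHUNK_SIZE = 64 * 1024 * 1024  # 64 MB chunk size, as described in the text
REPLICATION = 3                # three replicas per chunk (default)


def chunk_index(byte_offset: int) -> int:
    """Return the index of the chunk that contains the given byte offset."""
    if byte_offset < 0:
        raise ValueError("offset must be non-negative")
    return byte_offset // CHUNK_SIZE


def chunks_for_range(offset: int, length: int) -> range:
    """Return the chunk indexes touched by reading `length` bytes at `offset`."""
    if length <= 0:
        return range(0)
    first = chunk_index(offset)
    last = chunk_index(offset + length - 1)
    return range(first, last + 1)


def place_replicas(chunk_id: int, servers: List[str],
                   copies: int = REPLICATION) -> List[str]:
    """Pick `copies` distinct servers for a chunk.

    Assumption: simple round-robin placement, chosen only for illustration.
    """
    if len(servers) < copies:
        raise ValueError("not enough chunk servers for the requested replicas")
    start = chunk_id % len(servers)
    return [servers[(start + i) % len(servers)] for i in range(copies)]
```

For example, a 10 MB read starting at offset 60 MB crosses a chunk boundary, so `chunks_for_range(60 * 1024 * 1024, 10 * 1024 * 1024)` covers chunks 0 and 1. A client would need location information for only those two chunks, which is why large chunks keep master traffic low.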
GFS is designed to accommodate Google’s large cluster requirements without burdening applications. Files are stored in hierarchical directories identified by path names. Metadata - such as the namespace, access control data, and the mapping from files to chunks - is controlled by the master, which communicates with each chunk server and monitors its status through periodic heartbeat messages.
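The master's role can be sketched in a few lines of Python. This is a hedged illustration rather than the actual GFS design: the `Master` class, its method names, and the timeout value are all hypothetical. It shows the general idea of holding the file-to-chunk mapping in one place and treating a chunk server as unavailable once its heartbeats stop arriving.

```python
import time
from typing import Dict, List, Optional, Set, Tuple


class Master:
    """Toy master: keeps metadata and tracks chunk server heartbeats."""

    def __init__(self, timeout: float = 60.0) -> None:
        self.timeout = timeout                          # hypothetical timeout (seconds)
        self.chunks_by_path: Dict[str, List[int]] = {}  # path name -> chunk handles
        self.locations: Dict[int, Set[str]] = {}        # chunk handle -> server ids
        self.last_seen: Dict[str, float] = {}           # server id -> last heartbeat

    def heartbeat(self, server: str, held_chunks: List[int],
                  now: Optional[float] = None) -> None:
        """Record a heartbeat and the chunks the server reports holding."""
        now = time.monotonic() if now is None else now
        self.last_seen[server] = now
        for handle in held_chunks:
            self.locations.setdefault(handle, set()).add(server)

    def live_servers(self, now: Optional[float] = None) -> List[str]:
        """Return servers whose last heartbeat is within the timeout."""
        now = time.monotonic() if now is None else now
        return sorted(s for s, t in self.last_seen.items()
                      if now - t <= self.timeout)

    def lookup(self, path: str, index: int,
               now: Optional[float] = None) -> Tuple[int, List[str]]:
        """Return the chunk handle and the live servers that hold it."""
        handle = self.chunks_by_path[path][index]
        live = set(self.live_servers(now))
        return handle, sorted(self.locations.get(handle, set()) & live)
```

In this sketch a client asks the master only for a chunk handle and its replica locations, then reads the data from a chunk server directly, so file contents never pass through the master.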
GFS features include:
- Fault tolerance
- Critical data replication
- Automatic and efficient data recovery
- High aggregate throughput
- Reduced client and master interaction because of the large chunk size
- Namespace management and locking
- High availability
The largest GFS clusters have more than 1,000 storage nodes and more than 300 TB of disk storage capacity, which is accessed by hundreds of clients on a continuous basis.