Tech moves fast! Stay ahead of the curve with Techopedia!
Join nearly 200,000 subscribers who receive actionable tech insights from Techopedia.
The Hadoop Distributed File System (HDFS) is a distributed file system that runs on standard or low-end hardware. Developed by Apache Hadoop, HDFS works like a standard distributed file system but provides better data throughput and access through the MapReduce algorithm, high fault tolerance and native support of large data sets.
The HDFS stores a large amount of data placed across multiple machines, typically in hundreds and thousands of simultaneously connected nodes, and provides data reliability by replicating each data instance as three different copies - two in one group and one in another. These copies may be replaced in the event of failure.
The HDFS architecture consists of clusters, each of which is accessed through a single NameNode software tool installed on a separate machine to monitor and manage the that cluster's file system and user access mechanism. The other machines install one instance of DataNode to manage cluster storage.
Because HDFS is written in Java, it has native support for Java application programming interfaces (API) for application integration and accessibility. It also may be accessed through standard Web browsers.