Hadoop Distributed File System (HDFS)

Definition - What does Hadoop Distributed File System (HDFS) mean?

The Hadoop Distributed File System (HDFS) is a distributed file system that runs on standard or low-end (commodity) hardware. Developed as part of the Apache Hadoop project, HDFS works like a standard distributed file system but is built for high-throughput data access, particularly by MapReduce jobs that process the stored data, and offers high fault tolerance and native support for large data sets.

Techopedia explains Hadoop Distributed File System (HDFS)

HDFS stores large amounts of data across multiple machines, typically hundreds or thousands of simultaneously connected nodes. It provides data reliability by keeping three copies of each piece of data: two in one group of machines and one in another. If a copy is lost to a failure, it can be replaced from a surviving copy.
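The placement rule described above (three copies, two in one group and one in another) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `place_three_copies`, the group and node names, and the random selection are hypothetical and are not taken from the HDFS source code.

```python
import random


def place_three_copies(groups, rng=None):
    """Choose three distinct nodes: two in one group and one in another.

    `groups` maps a group name to a list of node names (hypothetical layout).
    """
    if rng is None:
        rng = random.Random()

    # The group that receives two copies must contain at least two nodes.
    pair_candidates = [name for name, nodes in groups.items() if len(nodes) >= 2]
    if not pair_candidates:
        raise ValueError("one group must contain at least two nodes")
    pair_group = rng.choice(pair_candidates)

    # The third copy goes to any other group that has at least one node.
    single_candidates = [
        name for name, nodes in groups.items() if name != pair_group and nodes
    ]
    if not single_candidates:
        raise ValueError("a second non-empty group is required")
    single_group = rng.choice(single_candidates)

    pair_nodes = rng.sample(groups[pair_group], 2)
    single_node = rng.choice(groups[single_group])
    return [(pair_group, node) for node in pair_nodes] + [(single_group, single_node)]


# Example layout with two groups of machines (names are made up).
groups = {"group-a": ["n1", "n2", "n3"], "group-b": ["n4", "n5"]}
copies = place_three_copies(groups, random.Random(0))
```

Spreading the copies over two groups means that losing a single machine, or one whole group, still leaves at least one copy available.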

The HDFS architecture consists of clusters, each of which is accessed through a single NameNode, a software tool installed on a separate machine to monitor and manage that cluster's file system and user access mechanism. Each of the other machines runs one instance of DataNode to manage the storage on that machine.
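The division of labor described above can be modeled with a toy in-memory example: one object keeps only the file system metadata, while the others hold the data. Everything here is an assumption made for illustration: the class names `ToyNameNode` and `ToyDataNode`, the 4-byte block size, and the round-robin placement are hypothetical and do not reflect the real Hadoop classes or placement policy.

```python
class ToyDataNode:
    """Stores block contents for one machine."""

    def __init__(self, name):
        self.name = name
        self.blocks = {}   # block id -> bytes
        self.alive = True  # set to False to simulate a failed machine


class ToyNameNode:
    """Keeps metadata only: which DataNodes hold each block of each file."""

    def __init__(self, datanode_names, copies=3):
        self.datanode_names = list(datanode_names)
        self.copies = min(copies, len(self.datanode_names))
        self.files = {}  # path -> [(block id, [datanode names])]

    def allocate(self, path, block_count):
        # Round-robin placement, chosen only to keep the sketch short.
        total = len(self.datanode_names)
        plan = []
        for i in range(block_count):
            holders = [self.datanode_names[(i + k) % total] for k in range(self.copies)]
            plan.append((f"{path}#blk{i}", holders))
        self.files[path] = plan
        return plan

    def locate(self, path):
        return self.files[path]


def write_file(namenode, datanodes, path, data, block_size=4):
    """Split data into blocks and store each block on its assigned DataNodes."""
    chunks = [data[i:i + block_size] for i in range(0, len(data), block_size)]
    for (block_id, holders), chunk in zip(namenode.allocate(path, len(chunks)), chunks):
        for name in holders:
            datanodes[name].blocks[block_id] = chunk


def read_file(namenode, datanodes, path):
    """Ask the NameNode where each block lives, then read from a live DataNode."""
    parts = []
    for block_id, holders in namenode.locate(path):
        source = next(datanodes[n] for n in holders if datanodes[n].alive)
        parts.append(source.blocks[block_id])
    return b"".join(parts)


nodes = {name: ToyDataNode(name) for name in ["dn1", "dn2", "dn3", "dn4"]}
namenode = ToyNameNode(nodes)
write_file(namenode, nodes, "/demo.txt", b"hello hdfs")
nodes["dn1"].alive = False  # one machine fails
recovered = read_file(namenode, nodes, "/demo.txt")
```

In this sketch the read still succeeds after `dn1` is marked as failed, because every block has copies on other machines and the metadata says where to find them.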

Because HDFS is written in Java, it natively supports Java application programming interfaces (APIs) for application integration and accessibility. It can also be accessed through standard Web browsers.
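One common form of HTTP access is the WebHDFS REST interface, whose URLs follow the pattern `http://<host>:<port>/webhdfs/v1/<path>?op=<OPERATION>`. The sketch below only builds such a URL and does not contact a cluster. The helper name `webhdfs_url`, the host name, and the port are hypothetical; the actual port and whether WebHDFS is enabled depend on the cluster's version and configuration.

```python
from urllib.parse import quote, urlencode


def webhdfs_url(host, port, path, op, **params):
    """Build a WebHDFS-style URL for an absolute HDFS path (no request is sent)."""
    if not path.startswith("/"):
        raise ValueError("path must be absolute, for example /data/logs")
    query = urlencode({"op": op, **params})
    # quote() keeps "/" separators and escapes characters such as spaces.
    return f"http://{host}:{port}/webhdfs/v1{quote(path)}?{query}"


# Hypothetical host and port; substitute the values for a real cluster.
listing_url = webhdfs_url("namenode.example.com", 9870, "/data/logs", "LISTSTATUS")
```

Opening a URL of this shape in a browser, or fetching it with any HTTP client, returns a JSON response when the interface is enabled on the cluster.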
