Hadoop Distributed File System


What Does Hadoop Distributed File System Mean?

The Hadoop Distributed File System (HDFS) is a distributed file system designed to run on commodity or low-end hardware. Developed as part of the Apache Hadoop project, HDFS works like a standard distributed file system but provides higher data throughput, integration with the MapReduce programming model, high fault tolerance and native support for very large data sets.


Techopedia Explains Hadoop Distributed File System

HDFS stores large amounts of data across multiple machines, typically hundreds or thousands of simultaneously connected nodes, and provides data reliability by keeping three copies of each data block by default – two on nodes in one rack and one on a node in another. If a copy is lost to a node failure, HDFS automatically re-replicates the block elsewhere in the cluster.
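The number of copies is controlled by the dfs.replication property in the cluster's hdfs-site.xml configuration file. A minimal sketch of that setting follows; the value 3 matches the HDFS default, and the rest of the configuration file is omitted here:

<!-- hdfs-site.xml: how many copies HDFS keeps of each block -->
<property>
  <name>dfs.replication</name>
  <value>3</value>
</property>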

The HDFS architecture consists of clusters, each of which is accessed through a single NameNode software tool installed on a separate machine to monitor and manage that cluster's file system and user access mechanism. Each of the other machines runs an instance of the DataNode software to manage cluster storage.
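On a running cluster, this division of labor can be observed with the standard HDFS command-line tools; the output naturally depends on the installation:

hdfs dfsadmin -report   # the NameNode's view of each DataNode's capacity and health
hdfs fsck /             # checks block placement and replication across DataNodes
hdfs dfs -ls /          # lists the root of the distributed file system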

Because HDFS is written in Java, it has native support for Java application programming interfaces (APIs) for application integration and accessibility. It can also be accessed through standard web browsers.
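A minimal sketch of writing and then reading a file through that Java API is shown below; the NameNode address and file path are placeholders, not values from this article:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class HdfsExample {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // fs.defaultFS points the client at the NameNode; host and port are placeholders
        conf.set("fs.defaultFS", "hdfs://namenode.example.com:8020");
        FileSystem fs = FileSystem.get(conf);

        Path path = new Path("/user/demo/hello.txt");

        // Write a file; the NameNode records the metadata, DataNodes store the blocks
        try (FSDataOutputStream out = fs.create(path, true)) {
            out.writeUTF("Hello, HDFS");
        }

        // Read the file back from whichever DataNodes hold its replicas
        try (FSDataInputStream in = fs.open(path)) {
            System.out.println(in.readUTF());
        }

        fs.close();
    }
}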



Margaret Rouse
Technology expert

Margaret is an award-winning writer and educator known for her ability to explain complex technical topics to a non-technical business audience. Over the past twenty years, her IT definitions have been published by Que in an encyclopedia of technology terms and cited in articles in the New York Times, Time Magazine, USA Today, ZDNet, PC Magazine, and Discovery Magazine. She joined Techopedia in 2011. Margaret's idea of a fun day is to help IT and business professionals learn to speak each other's highly specialized languages.