The NameNode is the most critical piece of the system of an HDFS file system. The NameNode manages the entire HDFS file system metadata (i.e owners of files, file permission, no of blocks, block locations, size etc.) and maintained it in main memory. Clients first contact point is the NameNode for file metadata and then perform actual file I/O directly with the DataNodes. If something goes wrong with the NameNode, then whatever metadata was there in main memory would get lost permanently.
hdfs federation
Hadoop HDFS Federation
In Hadoop 1.x, there is only one NameNode (i.e allow only one Active NameNode) in a cluster, which maintains a single namespace (single directory structure) for the entire cluster. Regarding that Hadoop cluster is becoming larger and larger one enterprise platform and stores the entire file system metadata is in NameNode memory (RAM), when there are more data nodes with many files, the NameNode memory will reach its limit and it becomes the limiting factor for cluster scaling (limiting number of files store in the cluster). Hadoop 1.x, the namespace can only be vertically (add more RAM) scaled on a single NameNode.