If you are tuned in to the latest technology concepts around big data, you’ve likely heard the term “data lake.” The image conjures up a large reservoir of water—and that’s what a data lake is, in concept: a reservoir. Only it’s for data.
Data lake defined
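A data lake holds a vast amount of raw data in its native format. That data can be structured, semi-structured, or unstructured; the lake stores it as-is, without imposing a schema or hierarchy on it up front.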
Because the data is stored as-is, all you need is storage that supports a flat file system, which means you can even use a mainframe if you want. The data is then moved to other servers for processing. Many enterprises go with the Hadoop Distributed File System (HDFS), because it is designed for high-throughput access to large data sets and is common in the big data environments where a data lake is likely to be used.
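To make the "native format, flat storage" idea concrete, here is a minimal sketch in Python. It is only an illustration under stated assumptions, not how any particular data lake product works: the `ingest` function, the `catalog.jsonl` file, and the hash-based object IDs are all hypothetical names chosen for this example, and a local directory stands in for HDFS or whatever storage you actually use.

```python
import hashlib
import json
import shutil
import tempfile
from pathlib import Path


def ingest(source: Path, lake_dir: Path) -> str:
    """Copy a file into the lake unchanged and record basic metadata.

    Hypothetical helper for illustration; a local folder stands in for the lake.
    """
    lake_dir.mkdir(parents=True, exist_ok=True)
    raw = source.read_bytes()
    # Assumption: a content hash is used as the object ID.
    object_id = hashlib.sha256(raw).hexdigest()[:16]
    # Flat layout: every object sits directly under lake_dir, with no subfolders.
    shutil.copyfile(source, lake_dir / object_id)
    # The catalog is what lets you find the data again later.
    entry = {"id": object_id, "original_name": source.name, "bytes": len(raw)}
    with (lake_dir / "catalog.jsonl").open("a", encoding="utf-8") as catalog:
        catalog.write(json.dumps(entry) + "\n")
    return object_id


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        work = Path(tmp)
        # Two sources in different native formats: one CSV, one JSON.
        (work / "sales.csv").write_text("id,amount\n1,9.99\n", encoding="utf-8")
        (work / "clicks.json").write_text('{"page": "/home"}', encoding="utf-8")
        for name in ("sales.csv", "clicks.json"):
            print(name, "->", ingest(work / name, work / "lake"))
```

Note that nothing is parsed, cleaned, or converted on the way in. The CSV stays a CSV and the JSON stays JSON; any structure is applied later, when the data is moved out for processing.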