In today’s world, almost everyone is connected through the internet, mainly via social media, where we upload huge amounts of data every day in the form of videos, text, photos, etc. So the question is: how do these firms manage and store our data?
According to industry studies, more than 500 TB of data is uploaded to social media every day. Here we discuss how these firms manage the data we upload, i.e. how they store it and make it accessible to us.
Facebook processes 2.5 billion pieces of content and 500+ terabytes of data each day. It pulls in 2.7 billion Like actions and 300 million photos per day, and it scans roughly 105 terabytes of data each half-hour. To manage this big data, Facebook uses multiple Hadoop clusters; around 100 petabytes of data are stored by Facebook in a single Hadoop cluster.
The above approach helps Facebook manage and process such a huge amount of data and serve its users in an efficient and optimized way.
Google processes over 40,000 search queries every second on average, which translates to over 3.5 billion searches per day and 1.2 trillion searches per year worldwide.
Google currently processes over 20 petabytes of data per day through an average of 100,000 MapReduce jobs spread across its massive computing clusters. The average MapReduce job ran across approximately 400 machines in September 2007, crunching approximately 11,000 machine years in a single month.
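To make the MapReduce idea concrete, here is a minimal single-machine sketch of the classic word-count pattern in Python. The `map_phase` and `reduce_phase` names are illustrative only, not Google's actual API; in a real cluster each phase runs in parallel across hundreds of machines.

```python
from collections import defaultdict

def map_phase(documents):
    # Map: emit a (word, 1) pair for every word in every document.
    for doc in documents:
        for word in doc.lower().split():
            yield (word, 1)

def reduce_phase(pairs):
    # Shuffle + Reduce: group the pairs by key and sum the counts.
    counts = defaultdict(int)
    for word, count in pairs:
        counts[word] += count
    return dict(counts)

docs = ["big data is big", "data is everywhere"]
print(reduce_phase(map_phase(docs)))
# → {'big': 2, 'data': 2, 'is': 2, 'everywhere': 1}
```

The key property is that the map step has no shared state, which is what lets the framework scale the same logic out to thousands of machines.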
The above approach helps Google deliver search results to users faster. It also helps in storing and processing the data collected from users through various sources like search history, location history, YouTube watch history, etc., and then providing customized recommendations to users based on their past data.
According to statistics, people make 95 million new posts on Instagram every day, about 200 million Instagram users visit business profiles, and as much as 80% of users follow at least one business on this platform.
There are nearly 1 billion monthly active users on Instagram, 500 million daily active users, and the like button is hit an average of 4.2 billion times per day.
Here the term Big Data comes into play when we store data, so let's look at what it is.
Big Data :- Data whose volume is beyond the capacity of our existing storage and processing systems.
How is Big Data Stored?
For this, these firms use a distributed storage model, in which multiple servers are connected to one centralized system so that the data remains easily accessible to users.
Distributed Storage :- In this storage model, a master-slave architecture is used to store and access the data. There is one centralized storage system called the master, and this master is connected to multiple servers called slaves. These slave systems contribute storage to the master, so the combined storage of all the slaves together forms the master's storage pool.
For example, suppose storing 100 GB of data on a single machine takes 100 minutes; that is too slow. Instead, we can take 10 different servers (a server here means separate hardware with its own RAM, CPU, and disk), each with 10 GB of capacity. In this scenario, the 10 systems can store the data in about 10 minutes, or perhaps even less, because each one has its own resources (CPU, RAM, disk), making it faster than a single storage system. This model is called the Distributed Storage model.
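The arithmetic above can be checked with a quick sketch, assuming the ideal case where write speed scales linearly with the number of servers and network overhead is ignored:

```python
def parallel_write_time(total_gb, gb_per_minute_per_server, num_servers):
    # Each server writes an equal share of the data in parallel,
    # so the elapsed time is the time for one server's share.
    per_server_gb = total_gb / num_servers
    return per_server_gb / gb_per_minute_per_server

# One server at 1 GB/min takes 100 minutes for 100 GB.
print(parallel_write_time(100, 1, 1))   # → 100.0
# Ten servers, each writing its own 10 GB share, take ~10 minutes.
print(parallel_write_time(100, 1, 10))  # → 10.0
```

In practice coordination and network transfer add overhead, so the real speedup is somewhat less than this ideal linear scaling.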
This whole master-slave setup, taken together, is called a Hadoop cluster.
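A toy sketch of the master-slave idea: the master splits a file into fixed-size blocks, assigns each block to a slave, and keeps only the metadata (which slave holds which block). The class and node names here are illustrative, and real HDFS additionally replicates each block, typically three times, for fault tolerance.

```python
class Master:
    def __init__(self, slaves, block_size_gb=10):
        self.slaves = slaves               # names of the slave/storage nodes
        self.block_size_gb = block_size_gb
        self.block_map = {}                # metadata: block id -> slave name

    def store_file(self, filename, size_gb):
        # Split the file into blocks and assign them round-robin to slaves.
        num_blocks = -(-size_gb // self.block_size_gb)  # ceiling division
        for i in range(num_blocks):
            slave = self.slaves[i % len(self.slaves)]
            self.block_map[f"{filename}-block{i}"] = slave
        return num_blocks

master = Master(slaves=["slave1", "slave2", "slave3"])
blocks = master.store_file("video.mp4", size_gb=45)
print(blocks)                                # → 5 (blocks of up to 10 GB)
print(master.block_map["video.mp4-block0"])  # → slave1
```

Notice that the master stores no file data itself, only the block map; this is why a single master can coordinate petabytes of storage spread across many slaves.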
Every activity we do over the internet generates data. This means that Big Data and related technologies such as Hadoop, HBase, and the like are here to stay, as long as the data is there.
I hope you found the post informative. If something was missing or you think more could have been added, feel free to provide suggestions in the comments section.