Big data: the 4 Vs (volume, velocity, variety, veracity); data processing (batch and streaming)
Types of data --- 1. Structured data 2. Semi-structured data 3. Unstructured data 4. Disk -- data stored in blocks and sectors
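A small sketch of the three data types listed above, using only the Python standard library. The sample values (names, fields, the sentence) are made up for illustration and are not from any real dataset.

```python
import csv
import io
import json

# Structured: fixed columns, every row follows the same schema (like a SQL table or CSV).
structured_text = "id,name,age\n1,Asha,31\n2,Ravi,28\n"
rows = list(csv.DictReader(io.StringIO(structured_text)))

# Semi-structured: self-describing keys, but fields may differ from record to record (JSON).
semi_structured_text = '[{"id": 1, "tags": ["a", "b"]}, {"id": 2, "email": "x@example.com"}]'
records = json.loads(semi_structured_text)

# Unstructured: free text with no schema; any structure has to be extracted afterwards.
unstructured_text = "Customer called on Monday and asked about a late delivery."
words = unstructured_text.split()

print(rows[0]["name"])            # a column looked up by name
print(sorted(records[1].keys()))  # the second record has different fields from the first
print(len(words))                 # only crude structure (tokens) is available
```

The point is how much schema each form carries: the CSV rows share one schema, the JSON records describe themselves, and the free text has none.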
About Apache Beam
Run an Apache Beam pipeline on GCP cloud (Apache Beam runtime on GCP)
1. Top-level Apache open source project
Started in 2016 (entered the Apache Incubator)
Apache Beam is a unified programming model for building portable data processing pipelines
Beam = Batch + Streaming
Beam supports Python, Java, and Go
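To illustrate the "Beam = Batch + Streaming" idea, here is a minimal plain-Python sketch. It is not the Beam API; the names `to_upper_lengths` and `event_stream` are hypothetical. It only shows the concept of writing a transform once and applying it to both a bounded (batch) input and an unbounded (streaming) input.

```python
import itertools
from typing import Iterable, Iterator, Tuple


def to_upper_lengths(lines: Iterable[str]) -> Iterator[Tuple[str, int]]:
    """One transform, written once, usable on bounded or unbounded input."""
    for line in lines:
        yield line.upper(), len(line)


# Batch: a bounded collection that is fully available up front.
batch_input = ["alpha", "beta", "gamma"]
batch_output = list(to_upper_lengths(batch_input))


def event_stream() -> Iterator[str]:
    """Hypothetical unbounded source: it never ends on its own."""
    for i in itertools.count():
        yield f"event-{i}"


# Streaming: the same transform over the unbounded source.
# islice stops after 3 results so the example terminates.
stream_output = list(itertools.islice(to_upper_lengths(event_stream()), 3))

print(batch_output)
print(stream_output)
```

In real Beam the same idea appears as one pipeline definition that a runner can execute over bounded or unbounded data; the sketch above only mimics that with generators.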
Beam builds on the MapReduce model
Google developed the MapReduce model
Hadoop was developed independently, based on the MapReduce concept
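The MapReduce concept mentioned above can be sketched as a word count in plain Python. This is a single-process illustration under simplifying assumptions, not Hadoop or Google code; the function names `map_phase`, `shuffle`, and `reduce_phase` are chosen here for clarity.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple


def map_phase(line: str) -> Iterator[Tuple[str, int]]:
    """Map: emit a (word, 1) pair for every word in one input line."""
    for word in line.split():
        yield word, 1


def shuffle(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Shuffle: group all emitted values by key."""
    grouped: Dict[str, List[int]] = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key: str, values: List[int]) -> Tuple[str, int]:
    """Reduce: combine all values of one key into a single result."""
    return key, sum(values)


lines = ["big data big pipelines", "data pipelines"]

# In a real cluster each line (or block) would be mapped on a different node.
mapped = [pair for line in lines for pair in map_phase(line)]
grouped = shuffle(mapped)
counts = dict(reduce_phase(k, v) for k, v in grouped.items())

print(counts)
```

Because each map call looks at one line only and each reduce call looks at one key only, both phases can be spread across many machines, which is what makes the model suitable for large datasets.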
Hadoop is open source and can be installed on any Linux platform
Flume, Flink, and Spark are used in real-time use cases
Other Apache projects you can use: HBase, Hive, Pig, and Oozie
Hadoop
HDFS (Hadoop Distributed File System) 500*1024
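HDFS stores a file as fixed-size blocks spread across the cluster. The sketch below estimates how many blocks a file needs. Assumptions, all labeled in the code: a 128 MB block size and a replication factor of 3 (the usual defaults in Hadoop 2.x and later), and the figure 500*1024 from the note treated as a file size in megabytes, which is only a guess at what the note means.

```python
BLOCK_SIZE_MB = 128  # assumption: default HDFS block size (dfs.blocksize) in Hadoop 2.x+
REPLICATION = 3      # assumption: default replication factor (dfs.replication)


def hdfs_block_count(file_size_mb: int, block_size_mb: int = BLOCK_SIZE_MB) -> int:
    """Number of blocks needed; the last block may be only partly filled."""
    # Integer ceiling division avoids floating point rounding.
    return -(-file_size_mb // block_size_mb)


# Assumption: 500*1024 is read as megabytes (about 500 GB). The note does not say this.
file_size_mb = 500 * 1024

blocks = hdfs_block_count(file_size_mb)
raw_storage_mb = file_size_mb * REPLICATION  # space used across the cluster, counting replicas

print(blocks)
print(raw_storage_mb)
print(hdfs_block_count(129))  # one full block plus one small block
```

The block size and replication factor are configurable per cluster, so the numbers change with the configuration.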