Big Data and Data Analytics Training: Big Data Hadoop, MongoDB, Tableau. Cohort 5, 18:00 to 20:30, 9/05 to 9/28
12 x 2.5 hour sessions, total 30 hours
Students from the first cohort said: This is a first-rate course! We got so much out of it!
1. Cloudera review
HDFS
Hive
Impala
HBase
Hue
2. Sqoop
Import data from MySQL to Hive
Import data from Hive to MySQL
Why is Sqoop used?
Where is Sqoop used?
Sqoop-Import
Importing a Table into HDFS
Importing Selected Data from Table
Importing Data from Query
Incremental Exports
Importing Data into Hive
Importing Data into HBase
Sqoop-Import-all-Tables
Sqoop-Export
Sqoop-Job
Sqoop-List-Database
Sqoop-List-Tables
3. PySpark
Basic Interaction with PySpark shell
Using External Database
Transformation and Actions in Apache Spark
RDD Partitions
Caching, Accumulators, and UDF
Running a Spark application in Standalone Mode
Launching Spark Application on a Cluster
Project 1: Emergency - 911 Calls
Project 2: House Prices - Data Exploration and Visualisation
Project 3: Homicide Reports
Project 4: Titanic: Machine Learning from Disaster
Project 5: H-1B Visa Petitions 2011-2016
Project 6: Person of the Year, 1927-Present
Project 7: Full history of airplane crashes throughout the world, from 1908-present
Project 8: Amazon Fine Food Reviews
4. MongoDB
Create Database
Drop Database
Create Collection
Drop Collection
Data Types
Insert Document
Query Document
Update Document
Delete Document
Limiting Records
Sorting Records
Indexing
Aggregation
Replication
Sharding
Create Backup
Deployment
5. Tableau
Data Extraction
Data Blending
Calculation/Parameter
Sort and Filter
Formatting
Visualization
Dashboard