Comprehensive Video Tutorials

Hadoop Fundamentals for Data Scientists Training Video

CareerVision Training
Online

£ 118 - (138 )
+ VAT

Important information

  • Course
  • Online
  • Duration:
    Flexible
  • When:
    To be chosen
  • Online campus
Description

This course, offered by CareerVision, is designed to help you improve your skills and reach your professional goals. The program covers a range of subjects useful to anyone who wants to advance their career. Sign up for more information!

Facilities and dates

Where and when the course is taught

Start: To be chosen
Location: Online

What will you learn in this course?

IT
DVD and Download
Windows PC or Mac

Syllabus

Hadoop Fundamentals for Data Scientists Training Video

  • Duration: 6 hours - 33 tutorial videos
  • Date Released: 2015-04-29
  • Works on: Windows PC or Mac
  • Format: DVD and Download
  • Instructors: Jenny Kim, Benjamin Bengfort

A Practical Training Course That Teaches Real World Skills

In this project-based Hadoop Fundamentals for Data Scientists video tutorial series, you'll quickly have relevant skills for real-world applications.

Follow along with our expert instructors in this training course to get:

  • Concise, informative and broadcast-quality Hadoop Fundamentals for Data Scientists training videos delivered to your desktop
  • The ability to learn at your own pace with our intuitive, easy-to-use interface
  • A quick grasp of even the most complex Hadoop Fundamentals for Data Scientists subjects because they're broken into simple, easy-to-follow tutorial videos

Practical working files further enhance the learning process and provide a degree of retention that is unmatched by any other form of Hadoop Fundamentals for Data Scientists tutorial, online or offline... so you'll know the exact steps for your own projects.


Get a practical introduction to Hadoop, the framework that made big data and large-scale analytics possible by combining distributed computing techniques with distributed storage. In this video tutorial, hosts Benjamin Bengfort and Jenny Kim discuss the core concepts behind distributed computing and big data, and then show you how to work with a Hadoop cluster and program analytical jobs. You'll also learn how to use higher-level tools such as Hive and Spark.

Hadoop is a cluster computing technology that has many moving parts, including distributed systems administration, data engineering and warehousing methodologies, software engineering for distributed computing, and large-scale analytics. With this video, you'll learn how to operationalize analytics over large datasets and rapidly deploy analytical jobs with a variety of toolsets. Once you've completed this video, you'll understand how different parts of Hadoop combine to form an entire data pipeline managed by teams of data engineers, data programmers, data researchers, and data business people.

- Understand the Hadoop architecture and set up a pseudo-distributed development environment
- Learn how to develop distributed computations with MapReduce and the Hadoop Distributed File System (HDFS)
- Work with Hadoop via the command-line interface
- Use the Hadoop Streaming utility to execute MapReduce jobs in Python
- Explore data warehousing, higher-order data flows, and other projects in the Hadoop ecosystem
- Learn how to use Hive to query and analyze relational data using Hadoop
- Use summarization, filtering, and aggregation to move Big Data towards last mile computation
- Understand how analytical workflows including iterative machine learning, feature analysis, and data modeling work in a Big Data context
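
As a rough illustration of the Hadoop Streaming topic listed above, here is a minimal word-count sketch in Python. It is not taken from the course materials: the function names `mapper`, `reducer`, and `run_local` are hypothetical, and the shuffle-and-sort step that Hadoop performs between the two phases is simulated here with a plain in-memory sort.

```python
"""Word count in the Hadoop Streaming style, simulated locally (illustrative only)."""
from itertools import groupby


def mapper(lines):
    """Emit one 'word<TAB>1' record per word, as a streaming mapper would write to stdout."""
    for line in lines:
        for word in line.strip().lower().split():
            yield f"{word}\t1"


def reducer(sorted_records):
    """Sum the counts for each word; assumes records arrive sorted by key."""
    parsed = (record.split("\t", 1) for record in sorted_records)
    for word, group in groupby(parsed, key=lambda kv: kv[0]):
        yield f"{word}\t{sum(int(count) for _, count in group)}"


def run_local(lines):
    """Stand-in for shuffle and sort: sort the mapper output, then reduce it."""
    return list(reducer(sorted(mapper(lines))))


if __name__ == "__main__":
    sample = ["the quick brown fox", "the lazy dog", "The fox"]
    for record in run_local(sample):
        print(record)
```

On a real cluster, the mapper and reducer would typically live in separate scripts that read `sys.stdin` and print to stdout, and would be handed to the Hadoop Streaming jar through its `-mapper` and `-reducer` options; the in-memory version above only mirrors that data flow for experimentation.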

Benjamin Bengfort is a data scientist and programmer in Washington, DC who prefers technology to politics but sees the value of data in every domain. Alongside his work teaching, writing, and developing large-scale analytics with a focus on statistical machine learning, he is finishing his PhD at the University of Maryland, where he studies machine learning and artificial intelligence.

Jenny Kim, a software engineer in the San Francisco Bay Area, develops, teaches, and writes about big data analytics applications. She specializes in large-scale, distributed computing infrastructures and machine-learning algorithms to support recommendation systems.

01. Hadoop Fundamentals For Data Scientists
  Overview Of The Video Course

02. A Distributed Computing Environment
  The Motivation For Hadoop
  A Brief History Of Hadoop
  Understanding The Hadoop Architecture
  Setting Up A Pseudo-Distributed Environment
  The Distributed File System - HDFS
  Distributed Computing With MapReduce
  Word Count - The Hello World Of Hadoop

03. Computing With Hadoop
  0301 How A MapReduce Job Works
  0302 Mappers And Reducers Into Detail
  0303 Working With Hadoop Via The Command Line - Starting HDFS And Yarn
  0304 Working With Hadoop Via The Command Line - Loading Data Into HDFS
  0305 Working With Hadoop Via The Command Line - Running A MapReduce Job
  0306 How To Use Our Github Goodies
  0307 Working Into Python With Hadoop Streaming
  0308 Common MapReduce Tasks
  0309 Spark on Hadoop 2
  0310 Creating A Spark Application With Python

04. The Hadoop Ecosystem
  0401 The Hadoop Ecosystem
  0402 Data Warehousing With Hadoop
  0403 Higher Order Data Flows
  0404 Other Notable Projects

05. Working With Data On Hive
  0501 Introduction To Hive
  0502 Interacting With Data Via The Hive Console
  0503 Creating Databases, Tables, And Schemas For Hive
  0504 Loading Data Into Hive From HDFS
  0505 Querying Data And Performing Aggregations With Hive

06. Towards Last Mile Computing
  0601 Decomposing Large Data Sets To A Computational Space
  0602 Linear Regressions
  0603 Summarizing Documents With TF-IDF
  0604 Classification Of Text
  0605 Parallel Canopy Clustering
  0606 Computing Recommendations Via Linear Log-Likelihoods