Python + PySpark Training

Python and PySpark programming

1. Python & PySpark Programming Gautam Verma
2. Before We Start
• About your Trainer
• About You
• Housekeeping
• Course Details
3. Introduction: Gautam
Gautam is a Microsoft Certified Trainer (MCT) with over 10 years of technology industry experience. He has consulted for and trained companies on Microsoft Azure services across the retail, e-commerce, CPG, and oil and gas industries, in the areas of cloud adoption, digital transformation, artificial intelligence, machine learning, and analytics.
• Corporate experience highlights
  – Worked with companies such as KPMG and Microsoft on Data & Analytics projects in the telecom and utilities space
  – Trained and consulted CXOs across North America, Europe, the UK, the Middle East, and Australasia
  – Started his career as an Associate/Data Analyst for a US e-commerce company
• Other courses
  – Advanced Machine Learning using Python
  – Cloud based Artificial Intelligence on Azure
  – Artificial Intelligence for Executives
• Certifications
  – Microsoft Certified Trainer
  – Microsoft Certified AI Associate
  – Microsoft Certified Data Science Associate
• Hobbies
  – Gautam likes to sing, play the synthesizer, and play with his 2-year-old boy
4. About You
• Your Name
• Your Background
• Your Objectives
  – Have you completed a Pre-Course Questionnaire?
• What is your current experience?
• How will you be using the knowledge you gain on this course?
• Are you currently working on a relevant project/application?
  – Tools
  – Packages
  – …. etc
5. Housekeeping: Schedule
• Start 08:30 GMT
• Finish 15:30 GMT
• Lunch: 45 mins
6. Course Details
Course Title: Python and PySpark for Big Data
• During
  – Learn Python programming
  – Hands-on with fundamental Python packages: pandas
  – Understanding Big Data, Hadoop & Spark
  – Reviewing Spark components
  – Where Spark (Python API) shines over conventional ETL
  – Building ETL & predictive notebooks using PySpark
• After
  – Training Evaluation Form: will be sent to you by one of the training coordinators
  – Course Attendance Certificate: will be sent to you by one of the training coordinators
7. Agenda: Day 1
Deep Dive into Python Programming
  1. Variables & data types
  2. Lists, tuples & dictionaries
  3. Functions
     • Lambda functions
  4. Loops
  5. Classes & objects
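As a rough preview of the Day 1 topics, the sketch below touches each agenda item once. All names (`session_length_hours`, `shout`, `Trainee`) and the sample values are hypothetical illustrations, not material from the course itself.

```python
# Variables & data types (values are illustrative)
course_title = "Python and PySpark for Big Data"  # str
total_days = 4                                    # int
lunch_hours = 0.75                                # float

# Lists, tuples & dictionaries
topics = ["variables", "functions", "loops"]      # list: mutable
topics.append("classes")
schedule = ("08:30", "15:30")                     # tuple: immutable
agenda = {1: "Python", 2: "Pandas", 3: "Spark", 4: "PySpark"}  # dict: key -> value


# Functions
def session_length_hours(start, finish):
    """Return the hours between two HH:MM times on the same day."""
    start_h, start_m = map(int, start.split(":"))
    finish_h, finish_m = map(int, finish.split(":"))
    return ((finish_h * 60 + finish_m) - (start_h * 60 + start_m)) / 60


# Lambda function: a small anonymous function bound to a name
shout = lambda text: text.upper()

# Loops
titles = []
for day, title in agenda.items():
    titles.append(f"Day {day}: {title}")


# Classes & objects (hypothetical class for illustration)
class Trainee:
    def __init__(self, name):
        self.name = name
        self.completed = []

    def complete(self, topic):
        self.completed.append(topic)

    def progress(self, all_topics):
        """Fraction of the given topics this trainee has completed."""
        return len(self.completed) / len(all_topics)


trainee = Trainee("Sam")
trainee.complete("variables")
print(session_length_hours(*schedule), shout("spark"), trainee.progress(topics))
```

The tuple is unpacked with `*schedule` when calling the function, which is one common way lists and tuples feed into function calls.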
8. Agenda: Day 2
Hands-on with Pandas
  1. Pandas overview
  2. DataFrames & Series
  3. Object creation
  4. Indexing and selection operations in pandas
  5. Plotting & visualization
Introduction to Big Data
  1. Distributed processing
  2. Hadoop vs Spark
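The pandas items on the Day 2 agenda (object creation, DataFrames and Series, indexing and selection) might look roughly like the sketch below. The table contents and variable names are made up for illustration, and plotting is left out so the example needs only pandas.

```python
import pandas as pd

# Object creation: a Series is a labelled one-dimensional array
hours = pd.Series([7.0, 7.0, 7.0], index=["day1", "day2", "day3"], name="hours")

# Object creation: a DataFrame is a table of named columns (illustrative data)
df = pd.DataFrame(
    {
        "day": [1, 2, 3, 4],
        "topic": ["Python", "Pandas", "Spark", "PySpark"],
        "hands_on": [True, True, False, True],
    }
)

# Selecting one column returns a Series
topic_col = df["topic"]

# Position-based selection with iloc: the first row
first_row = df.iloc[0]

# Label/condition-based selection with loc: topics where day == 3
day3_topics = df.loc[df["day"] == 3, "topic"]

# Boolean indexing: keep only the hands-on days
hands_on_days = df[df["hands_on"]]

print(hours["day2"])
print(hands_on_days)
```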
9. Agenda: Day 3
Apache Spark Essentials
  1. Apache Spark architecture
  2. Overview of Spark components
  3. RDD vs DataFrames vs Datasets
  4. Introduction to the Python API
  5. Hands-on with PySpark
10. Agenda: Day 4
PySpark Deep Dive
  1. Hands-on with PySpark
  2. ETL jobs
  3. Machine learning model training
  4. Reading and writing Parquet format data
Final Program Assessment
35. “Torture the data and it will confess to anything” (Ronald Coase)
36. Any questions? Q&A (Questions & Answers)
37. Thank you!