Data analysis and visualization with Python and SQL

This introductory-to-intermediate course equips learners with practical data management skills using two of the most important data manipulation technologies, Python and SQL. Students are expected to have a sufficient grounding in mathematics, including descriptive statistics and some regression and correlation. A detailed outline of the training curriculum follows:

Python basics and data types

This module introduces learners to their first Python program, expressions and variable types, strings, lists and tuples, and dictionaries and sets.

Python Functions

This topic introduces learners to a fundamental aspect of Python programming: functions of different types, together with exception handling, objects and classes. A number of practice labs will reinforce the concepts taught.
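As a rough illustration of how these three ideas fit together, the sketch below defines a small class, a function that uses it, and a try/except block that handles bad input. The names Temperature and parse_reading are hypothetical and chosen only for this example; they are not taken from the course materials.

```python
class Temperature:
    """Hypothetical class holding one reading in degrees Celsius."""

    def __init__(self, celsius):
        self.celsius = celsius

    def to_fahrenheit(self):
        # Standard Celsius-to-Fahrenheit conversion.
        return self.celsius * 9 / 5 + 32


def parse_reading(text, default=None):
    """Convert text to a Temperature, returning `default` on bad input."""
    try:
        return Temperature(float(text))
    except ValueError:
        # float() raises ValueError when the text is not a valid number.
        return default


good = parse_reading("25")
bad = parse_reading("not a number")
fahrenheit = good.to_fahrenheit()
print(fahrenheit, bad)
```

Returning a default instead of letting the exception propagate is only one possible design; a lab exercise might equally ask learners to re-raise the error with a clearer message.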

Data Management with Python

This module presents data management with Python and sets students on the path to reading and writing data files. Learners will be able to load and manipulate data with Pandas. One- and two-dimensional NumPy arrays will be explored. Further concepts covered include APIs and web scraping.

Data Management with SQL

Learners will be introduced to databases and supported in creating a database instance both on a local computer and in the cloud. Basic to intermediate SQL statements will be explored. Students will write and run SQL queries hands-on against a live database. Relational databases will be investigated extensively, including working with multiple tables.
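To give a flavour of working with multiple tables, the sketch below creates two related tables and joins them. It assumes SQLite through Python's built-in sqlite3 module purely for convenience; the course description does not name a specific database system, and the customers and orders tables and their rows are made up for this example.

```python
import sqlite3

# An in-memory database keeps the example self-contained.
conn = sqlite3.connect(":memory:")

# Two hypothetical tables linked by a foreign key.
conn.executescript("""
CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE orders (
    id INTEGER PRIMARY KEY,
    customer_id INTEGER NOT NULL REFERENCES customers(id),
    amount REAL NOT NULL
);
INSERT INTO customers (id, name) VALUES (1, 'Ada'), (2, 'Grace');
INSERT INTO orders (customer_id, amount) VALUES (1, 10.0), (1, 15.5), (2, 7.25);
""")

# Join the tables and aggregate the orders per customer.
rows = conn.execute("""
SELECT c.name, COUNT(o.id) AS n_orders, SUM(o.amount) AS total
FROM customers AS c
JOIN orders AS o ON o.customer_id = c.id
GROUP BY c.id, c.name
ORDER BY c.name
""").fetchall()
conn.close()

for name, n_orders, total in rows:
    print(name, n_orders, total)
```

The same SELECT, JOIN and GROUP BY pattern carries over to other relational databases, although connection setup and some syntax details differ between systems.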

Data analysis and visualization with Python

The introduction to this module challenges students not only to see data as something to which analytics can be applied, but also to understand what the data means and to ask how it was collected and what pitfalls can arise when data does not meet data quality dimensions. Data wrangling and other preprocessing techniques are explored extensively. Exploratory data analysis, including descriptive statistics and association between variables, is undertaken, followed by an introduction to modelling (with model evaluation) and predictive analysis. Python data visualization modules such as Matplotlib, Seaborn and others will be outlined before students carry out practical exercises building charts such as area plots, bar charts and histograms. Dashboards receive only a very limited introduction. Further topics will be covered in the advanced data analysis and visualization with Python course.

Capstone project

A final project will enable students to apply the analytics and visualization concepts learnt throughout the course to real-world scenarios. From time to time, project application areas will alternate between financial markets data, disease surveillance data and economic data, among other areas.