Course objectives
After completing this course, students will be able to:
- Write Python code, covering essential programming constructs like data types, control flow, and functions.
- Utilize Python libraries for network data analysis and visualization, including cleaning, transforming, and presenting data.
- Apply statistical methods and algorithms to identify patterns and trends in network data.
- Automate repetitive tasks involving file systems, spreadsheets (Excel & CSV), and email communication.
- Build robust and reusable Python scripts by incorporating error handling and debugging techniques.
- Gain a foundational understanding of the relationship between Data Science and Artificial Intelligence.
Course outline
- Crash course in the Python programming language
- Write the first program
- Data Types and Variables
- Type Conversion (Type Casting)
- Getting Input from the User
- Control Flow: Making Decisions and Repeating Actions
- Conditional Statements (if, elif, else)
- for loops
- while loops
- Loop Control Statements (break, continue)
- Types of Operators
- Arithmetic Operators
- Assignment Operators
- Comparison (Relational) Operators
- Logical Operators
- Bitwise Operators
- Data structures in Python (lists, tuples, sets, and dictionaries)
- Functions: Reusable Blocks of Code
- Defining and Calling Functions
- Arguments and Parameters (Default parameter values, Arbitrary arguments)
- Scope of Variables (Local vs. Global scope)
- Lambda Functions (One-line functions)
- Modules and Packages (Built-in Modules, Creating Your Own Modules)
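As a minimal sketch of the function topics in this unit (default parameter values, arbitrary arguments, and lambda functions), the snippet below uses a hypothetical helper, total_bytes, and made-up packet sizes chosen only for illustration:

```python
def total_bytes(*packet_sizes, overhead=0):
    """Sum any number of packet sizes, adding a fixed overhead per packet.

    *packet_sizes collects arbitrary positional arguments into a tuple;
    overhead is a keyword parameter with a default value of 0.
    """
    return sum(packet_sizes) + overhead * len(packet_sizes)


# Illustrative values only, not real measurements.
sizes = [1500, 64, 512]

total = total_bytes(*sizes)                       # default overhead
with_overhead = total_bytes(*sizes, overhead=20)  # overriding the default

# A lambda is a one-line anonymous function, used here as a sort key.
largest_first = sorted(sizes, key=lambda size: -size)
```

Unpacking the list with *sizes passes each element as a separate positional argument, which is what the *packet_sizes parameter expects.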
- Introduction to Network Data Analysis
- What is Network Data Analysis (NDA)?
- Dealing with open datasets
- Tools used in Data Analysis
- Data analytics applications
- Data Analysis process
- Data analyst versus data scientist
- Who is a data analyst?
- Difference between data analyst and data scientist
- Steps involved in Data Analysis
- Network Data Cleaning, Transformation and Visualization
- Normalizing the network data
- Enriching and structuring the network data to make it suitable for analysis.
- Applying statistical methods, algorithms, and logic to identify patterns, trends, correlations, and anomalies, using Python libraries such as Pandas, NumPy, and Scapy.
- Presenting the insights using graphical formats (charts, dashboards) to aid decision-making.
- The most important Python libraries for data analysis
- NumPy library
- Pandas library
- Matplotlib library
- Seaborn Library
- dpkt and Netmiko libraries
- NumPy library (part 1)
- Advantages of NumPy arrays over pure Python data structures
- Create a 1D array
- Create a 2D array
- Array indexing
- Array slicing
- Array shape and reshape
- NumPy library (part 2)
- Data types in the NumPy library
- Array join and split
- Array sort and array search
- Matrix multiplication (matmul)
- Random numbers in NumPy
- Generate random numbers and random arrays
- Matplotlib and Seaborn Libraries (part 1)
- What is Matplotlib?
- The key concepts in the design of Matplotlib
- Pyplot, plotting and markers in Matplotlib
- Line styles, labels, and grid in Matplotlib
- Understand subplots
- Visualize arrays with Matplotlib
- Matplotlib and Seaborn Libraries (part 2)
- Draw simple line Plots
- Draw scatter Plots
- Histograms, Binning, and Density
- Customizing Plot Legends
- Seaborn Versus Matplotlib
- Pandas library (part 1)
- What is Pandas, and why do we use it?
- Pandas DataFrame and how to create one.
- Reading data from files into a Pandas DataFrame and writing it back to files.
- How to access, modify, add, sort, filter, and delete data.
- Pandas library (part 2)
- Computing statistics using Pandas
- Using Pandas on real-world data
- What is time-series data?
- How to work with time-series data
- Making changes to time-series data (such as adding additional columns)
- Pandas library (part 3)
- Exporting and Importing Data
- Combining DataFrames across rows or columns
- Load Files Into a DataFrame
- Data cleaning and removing duplicates
- Pandas Objects (Histograms, Density Plot and Scatterplot)
- Working With Missing Data
- Network Data Analysis and visualization using Python
- Why we use Python in Network Data Analysis
- Data collection/acquisition.
- Data cleaning and preprocessing.
- Exploratory Data Analysis (EDA).
- Modeling/Advanced analysis.
- Interpretation and communication of results
- How to work with text file format and reading Data from other sources
- Using Pandas, NumPy and Matplotlib packages in the Network Data Analysis process
- Artificial Intelligence and Data Science
- AI Programming Languages
- The branches of Artificial Intelligence (machine learning, deep learning, expert systems, etc.)
- Overview of the main branches of Artificial Intelligence
- Overview of Machine learning
- Overview of Deep learning
- Machine learning versus deep learning
- Link between Data Science and Artificial Intelligence
- What is Data Science?
- Future of Data Science and Artificial Intelligence
- Key Similarities: Data Science and Artificial Intelligence
- Key Differences: Data Science and Artificial Intelligence
- Automating File System Operations
- The os and shutil Modules:
- Understanding file paths (absolute vs. relative).
- Getting the current working directory
- Changing directories
- Creating and deleting folders
- Listing files and folders
- Copying, Moving, and Deleting Files:
- shutil.copy(), shutil.copytree().
- shutil.move().
- os.unlink() (for files)
- os.rmdir() (for empty folders).
- Reading and Writing Plain-Text Files:
- Opening files (open()).
- Reading file content (read(), readline())
- Writing to files (write(), writelines()).
- Managing ZIP Files (zipfile module):
- Creating and adding files to ZIP archives.
- Extracting files from ZIP archives
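The file system topics above (creating folders, writing and reading text files, copying, deleting, and ZIP archives) can be sketched together as follows. This is only an illustration: the folder and file names are hypothetical, archive_text_files is a helper invented for this example, and everything happens inside a temporary directory. shutil.rmtree() is used for the final clean-up because os.rmdir() only removes empty folders.

```python
import os
import shutil
import tempfile
import zipfile


def archive_text_files(source_dir, zip_path):
    """Add every .txt file in source_dir to a new ZIP archive.

    Returns the list of archived file names (hypothetical helper).
    """
    archived = []
    with zipfile.ZipFile(zip_path, "w") as archive:
        for name in sorted(os.listdir(source_dir)):
            if name.endswith(".txt"):
                archive.write(os.path.join(source_dir, name), arcname=name)
                archived.append(name)
    return archived


# Work inside a temporary directory so nothing outside it is touched.
workspace = tempfile.mkdtemp()
reports_dir = os.path.join(workspace, "reports")
os.makedirs(reports_dir)

# Write a plain-text file, then copy it with shutil.copy().
original = os.path.join(reports_dir, "log.txt")
with open(original, "w", encoding="utf-8") as handle:
    handle.write("first line\n")
    handle.writelines(["second line\n", "third line\n"])
shutil.copy(original, os.path.join(reports_dir, "log_backup.txt"))

# Archive the text files, then extract them into a new folder.
zip_path = os.path.join(workspace, "reports.zip")
archived_names = archive_text_files(reports_dir, zip_path)
extract_dir = os.path.join(workspace, "extracted")
with zipfile.ZipFile(zip_path) as archive:
    archive.extractall(extract_dir)

# Read back the first line of the extracted copy.
with open(os.path.join(extract_dir, "log.txt"), encoding="utf-8") as handle:
    first_line = handle.readline()

# Clean up: os.unlink() removes one file, shutil.rmtree() removes a whole tree.
os.unlink(original)
shutil.rmtree(workspace)
```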
- Automating Spreadsheets (Excel & CSV) with Python
- Using openpyxl for Advanced Excel Control:
- Using xlsxwriter for Rich Excel Reports:
- Opening and creating Excel workbooks.
- Accessing worksheets.
- Reading data from and writing data to cells.
- Manipulating rows and columns
- Using the csv module for structured data.
- Reading data with csv.reader.
- Writing data with csv.writer
- Saving modified Excel & CSV files.
- Reading data from multiple sheets/files and consolidating.
- Updating values in spreadsheets based on logic.
- Modifying existing Excel files, formatting cells, adding formulas, conditional
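For the CSV part of this unit, a minimal sketch of reading with csv.reader, updating values based on logic, and writing with csv.writer might look like the following. The column names, sample rows, threshold, and the flag_heavy_hosts helper are all assumptions made for illustration, and an in-memory string stands in for a real file.

```python
import csv
import io

# Hypothetical sample data; a real script would open a .csv file instead.
SAMPLE = "host,bytes\nrouter1,1200\nswitch1,300\n"


def flag_heavy_hosts(csv_text, threshold):
    """Return CSV text with an extra 'heavy' column (yes/no per row)."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader)  # the first row holds the column names

    output = io.StringIO()
    writer = csv.writer(output, lineterminator="\n")
    writer.writerow(header + ["heavy"])

    for host, byte_count in reader:
        # csv.reader yields strings, so convert before comparing.
        is_heavy = "yes" if int(byte_count) > threshold else "no"
        writer.writerow([host, byte_count, is_heavy])
    return output.getvalue()


result = flag_heavy_hosts(SAMPLE, threshold=1000)
```

When writing to a real file instead, the csv documentation recommends opening it with newline="" so the writer controls the line endings.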
- Automating Email Communication (smtplib and email modules)
- Sending plain text and HTML emails (smtplib).
- Attaching files to emails.
- Receiving and reading emails (imaplib)
- Automating report distribution via email.
- Working with Dates and Times (datetime module)
- Getting current date/time.
- Formatting dates and times.
- Performing date/time arithmetic
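The three datetime topics above can be illustrated with a short sketch. The fixed start time is an arbitrary value chosen so the results are reproducible; datetime.now() is shown only to indicate how the current time is obtained.

```python
from datetime import datetime, timedelta

# Getting the current date/time (value changes on every run).
now = datetime.now()

# A fixed, arbitrary timestamp used for the rest of the example.
start = datetime(2024, 1, 31, 22, 30)

# Formatting: strftime() turns a datetime into a string.
formatted = start.strftime("%Y-%m-%d %H:%M")

# Arithmetic: adding a timedelta rolls over days and months correctly.
next_run = start + timedelta(hours=2)

# Subtracting two datetimes gives a timedelta.
elapsed = next_run - start
```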
- Scheduling Tasks
- How to make your Python scripts executable by the scheduler.
- OS-level schedulers (such as Task Scheduler on Windows).
- Using Python libraries like schedule or APScheduler for in-script scheduling.
- Robustness and Reusability of Automated Python Scripts
- Error Handling (try, except, finally):
- Debugging:
- Program Organization & Command Line Arguments
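One possible way to combine error handling with command line arguments is sketched below. The script purpose (counting lines in a file), the option name --fallback, and the file name passed to the parser are hypothetical choices for this example, not part of the course material.

```python
import argparse


def parse_args(argv):
    """Parse command line arguments from a list of strings."""
    parser = argparse.ArgumentParser(description="Count lines in a text file.")
    parser.add_argument("path", help="file to read")
    parser.add_argument("--fallback", type=int, default=0,
                        help="value to report when the file is missing")
    return parser.parse_args(argv)


def count_lines(path, fallback=0):
    """Return the number of lines in path, or fallback if it does not exist."""
    handle = None
    try:
        handle = open(path, encoding="utf-8")
        return sum(1 for _ in handle)
    except FileNotFoundError:
        # Expected failure: report the fallback instead of crashing.
        return fallback
    finally:
        # Runs whether or not an exception occurred.
        if handle is not None:
            handle.close()


# Passing an explicit list keeps the example independent of sys.argv.
args = parse_args(["no_such_file_for_demo.txt", "--fallback", "0"])
line_count = count_lines(args.path, args.fallback)
```

In everyday code a with statement usually replaces the explicit finally block for closing files; the longer form is shown here only to make the try, except, finally flow visible.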