Assignments#
Module 1 Assignment: Cyber ML Model Experiments#
Goal: Step into the role of a cybersecurity data scientist.
Your assignment is to step into the role of a cybersecurity data scientist. You will apply the techniques from the lab to a new slice of the data, making and justifying your own feature engineering decisions to build the most effective classifier you can.
Task 1: Reload the full CICIDS2017 DDoS dataset, not the 2,000-sample subset used in the lab.
Task 2: Perform the same data cleaning steps: replace
infvalues and drop any rows withNaNvalues.Task 3: Select a different pair of two features from the dataset than the ones we used in the lab (’ Fwd Packet Length Mean’ and ‘ Flow Duration’). (Note: Be mindful of the leading spaces in the column names when selecting your features.) In a text block in your notebook, write a brief justification for why you chose these two features.
Task 4: Create a new training and testing split using your chosen features and the corresponding labels.
Task 5: Train two separate models on your new data: a
KNeighborsClassifierand aSVC(Support Vector Classifier). You must use aStandardScalerto scale the data for both models. For theSVC, use the best hyperparameters found during the lab (C=10, gamma=0.01).Task 6: For each of your two trained models, generate and display the
classification_reportand aconfusion_matrixplot.Task 7: In a final text block, write a short paragraph comparing the performance of your kNN and SVM models on your chosen features. Which model performed better? Why do you think that might be? Use the precision, recall, and F1-score from your classification reports to support your conclusion.
Planned Assignments#
(Assignments for later modules will be added here)
EDA and Parameter Optimization
Clustering Exploration Lab
TensorFlow Playground
Phishing Email Detection