CIS4035-N Machine Learning Application and Report
TEESSIDE UNIVERSITY
School of Computing, Engineering and Digital Technologies
Module Title: Machine Learning
Module Code: CIS4035-N
Assignment Title: Machine Learning Application and Report
Personal and Transferable Skills
- Select, apply and defend the selection and application of machine learning methodologies and experiments in academic reports.
- Demonstrate a systematic understanding of machine learning algorithms and their selection for solving a specific problem.
Research, Knowledge and Cognitive Skills
- Investigate state-of-the-art machine learning algorithms.
- Design appropriate representations of machine learning problems for input into machine learning packages and critically evaluate their effectiveness.
- Design and evaluate neural network configurations and learning mechanisms for sample problems.
- Analyse empirical results of the selected machine learning algorithms and justify the performance.
Professional Skills
- Autonomously implement and evaluate appropriate machine learning techniques for particular learning tasks, taking into consideration professional, ethical and legal issues.
Task Description
Problems in machine learning vary from one domain to another. In this assignment, you will select a dataset related to a real-world problem that best suits your area of interest. Many websites provide publicly available datasets. A categorised list of datasets from GitHub can be found at https://github.com/caesar0301/awesome-public-datasets. The UCI Machine Learning Repository at https://archive.ics.uci.edu/ml/index.php is another longstanding source of benchmark datasets for data mining and machine learning research. Kaggle (https://www.kaggle.com/datasets) also hosts interesting real-world problems and datasets.
You can select a dataset from the above sources, or another one that is available online. The dataset should be publicly available. The chosen dataset should have a minimum of 1,000 instances (rows) and a minimum of 5 attributes (columns). You have to complete the following stages in this assignment:
- Define the problem for the selected data set and identify the machine learning algorithms that are applicable to this problem.
- Data exploration and preparation: The nature of the dataset may dictate some data exploration and preparation that can help inform the solutions. For example, high-dimensional datasets (those with too many attributes/columns) may require a dimensionality reduction method such as Principal Component Analysis (PCA).
- Propose solutions: In this step, you will propose three machine learning algorithms that are applicable to the selected data set/problem.
- Design, implementation, modelling and evaluation: design, model and implement the proposed solutions and critically evaluate the solutions. Use appropriate visualisation for the results.
- Reflect on professional, ethical and legal issues in relation to the problem and the data set.
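As an illustration only, the stages above can be sketched in Python with scikit-learn. Everything specific in this sketch is an assumption and not part of the brief: it assumes a classification problem, uses synthetic data from `make_classification` as a stand-in for a real public dataset, picks three classifiers arbitrarily, keeps enough principal components to explain 95% of the variance, and scores with 5-fold cross-validated accuracy. Your own problem, algorithms and metrics should follow from your chosen dataset.

```python
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for a real dataset (assumption: binary classification).
X, y = make_classification(
    n_samples=1000, n_features=20, n_informative=8, random_state=0
)

# Check the minimum size stated in the brief: 1,000 rows and 5 columns.
assert X.shape[0] >= 1000 and X.shape[1] >= 5

# Three candidate algorithms (an arbitrary, illustrative choice).
candidates = {
    "logistic_regression": LogisticRegression(max_iter=1000),
    "k_nearest_neighbours": KNeighborsClassifier(n_neighbors=5),
    "random_forest": RandomForestClassifier(n_estimators=100, random_state=0),
}

results = {}
for name, model in candidates.items():
    # Scale, reduce dimensionality with PCA (keep 95% of variance), then fit.
    # Wrapping the steps in a pipeline refits them inside each CV fold,
    # so no information leaks from the held-out fold.
    pipeline = make_pipeline(
        StandardScaler(),
        PCA(n_components=0.95, svd_solver="full"),
        model,
    )
    scores = cross_val_score(pipeline, X, y, cv=5, scoring="accuracy")
    results[name] = (scores.mean(), scores.std())

for name, (mean, std) in results.items():
    print(f"{name}: accuracy {mean:.3f} +/- {std:.3f}")
```

Reporting the mean and spread across folds for each algorithm gives a like-for-like basis for the critical evaluation; accuracy alone may be misleading on imbalanced data, so other metrics may suit your problem better.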
Element 1 Deliverable – Contributes 50% of the Module Mark
Element 1 will assess learning outcomes LO 2, 3, 4, 5 and 6
What to Hand In
- Online: a PDF file submitted via Blackboard that includes all source code and screenshots from your experiments, appropriately labelled and commented.
You need to demonstrate your code and results in the practical sessions in the last week (w/c 27th April 2020).
The code and experiments will be assessed on:
- Appropriateness of machine learning algorithm selected for the given task
- Quality of software architecture and implementation
- Quantitative performance of application
Element 2 – Contributes 50% of the Module Mark
Element 2 will assess learning outcomes LO 1, 2 and 7
What to Hand In
- A case study report of a maximum of 2,000 words that documents the process of the entire case study, including the data set, problem, data preparation and exploration, selected algorithms, critical evaluation and justification of the algorithms, and findings.
- Online: a PDF file submitted via Turnitin on Blackboard.
The hand-in is electronic, via Blackboard. All deliverables must be labelled with the project name, your name and your university number.
The report will be assessed on:
- understanding of machine learning task
- review of relevant literature
- development methodology
- justification of design decisions
- consideration of professional, ethical and legal issues
The report could broadly include the following sections:
- Abstract
- Introduction (introduce the problem and its significance, with a short literature review of related work)
- Data exploration and feature selection
- Experiments
- Results
- Discussion, Conclusions and Future Work
- References
These are generic section titles, which you may adapt appropriately to the application/problem that is investigated. You may include sections describing modifications of algorithms or developments that are novel and specific to your work.
Marking Criteria
| Grade | SOURCE CODE DOCUMENTATION AND DEMO |
| --- | --- |
| Excellent 70% and above | Clear evidence of running the experiments with code that is excellently organised and commented. Machine learning algorithms selected are appropriate for the given task. Excellent quality of software architecture and implementation. Excellent quantitative performance of application. Deep understanding shown. |
| Very Good 60% - 69% | Very good evidence of running the experiments with code that is well organised and commented. Machine learning algorithms selected are appropriate for the given task. Very good quality of software architecture and implementation. Very good quantitative performance of application. Very good understanding. |
| Satisfactory 50% - 59% | Satisfactory evidence of running the experiments with code that is organised and commented. Machine learning algorithms selected are appropriate for the given task. Satisfactory quality of software architecture and implementation. Satisfactory quantitative performance of application. Satisfactory understanding. |
| Fail Less than 50% | Little evidence of running the experiments with code that is not well organised or commented. Machine learning algorithms selected are not appropriate for the given task. Poor quality of software architecture and implementation. Poor quantitative performance of application. Poor understanding. |
| NS NON-SUBMISSION | N/A |
| Grade | ACADEMIC QUALITY OF THE PAPER - 50% |
| --- | --- |
| Excellent 70% and above | Excellent technical quality (rigour of the experiments, data preparation, justification and correct application of the selected algorithms and suitability of the selection). Produced and demonstrated a comprehensive, high quality solution to the problem. Sufficient information is provided for the reader to reproduce the results. Outstanding evidence of systematic review using multiple high quality academic sources. Logical, clear development of narrative. High quality references and citations. Outstanding evaluation and discussion of the significance of the results (Why are the results important? How does the paper advance the state of the art? How would the results be useful to other researchers or practitioners? Is this a "real" problem or a small "toy" problem?). Legal, social, ethical, security and professional issues fully considered. A paper that could, with minor modifications, be suitable for publication or form the basis for a postgraduate project. There is some element of a novel approach to the problem or novel use of techniques. |
| Very Good 60% - 69% | Very good technical quality. Produced and demonstrated very good quality solution to the problem. Sufficient information is provided for the reader to reproduce the results. Very good evidence of systematic review using multiple high quality academic sources. Logical, clear development of narrative. Appropriate references and citations. Very good evaluation and discussion of the significance of the results. Legal, social, ethical, security and professional issues fully considered. |
| Satisfactory 50% - 59% | Satisfactory technical quality. Produced and demonstrated good quality solution to the problem. Good evidence of reviewing multiple academic sources. Some references and citations. Good evaluation and discussion of the significance of the results. Legal, social, ethical, security and professional issues fully considered. |
| Fail Below 50% | Inadequate technical quality. Produced and demonstrated a solution to the problem which is flawed, despite some effort. Poor evidence of reviewing academic sources. Little evaluation and discussion of the results. Little consideration of legal, social, ethical, security and professional issues. Narrative difficult to follow. Poor quality of references and citations. |
| NS NON-SUBMISSION | N/A |