Subject Code: MA3022 – MA4022 – MA7022
MA3022 / MA4022 / MA7022 Data Mining and Neural Networks

Computational Task 1
Due by 10.02.2025
100 marks available

Psychological predisposition to nicotine use

Your work should answer the question: Does a psychological predisposition to drug consumption exist?

Nowadays, after many years of research and development, psychologists have largely agreed that the personality traits of the modern Five Factor Model (FFM) constitute the most comprehensive and adaptable system for understanding human individual differences. The FFM comprises Neuroticism (N), Extraversion (E), Openness to Experience (O), Agreeableness (A), and Conscientiousness (C).

The five traits can be summarized thus:

N – Neuroticism is a long-term tendency to experience negative emotions such as nervousness, tension, anxiety and depression (associated adjectives: anxious, self-pitying, tense, touchy, unstable, and worrying);

E – Extraversion is manifested in people who are outgoing, warm, active, assertive, talkative, and cheerful; such people are often in search of stimulation (associated adjectives: active, assertive, energetic, enthusiastic, outgoing, and talkative);

O – Openness to experience is associated with a general appreciation for art, unusual ideas, and imaginative, creative, unconventional, and wide interests (associated adjectives: artistic, curious, imaginative, insightful, original, and wide interests);

A – Agreeableness is a dimension of interpersonal relations, characterized by altruism, trust, modesty, kindness, compassion and cooperativeness (associated adjectives: appreciative, forgiving, generous, kind, sympathetic, and trusting);

C – Conscientiousness is a tendency to be organized and dependable, strong-willed, persistent, reliable, and efficient (associated adjectives: efficient, organised, reliable, responsible, and thorough).

Two additional personality characteristics have been shown to be important for the analysis of substance use: Impulsivity (Imp) and Sensation-Seeking (SS).

Imp – Impulsivity is defined as a tendency to act without adequate forethought;

SS – Sensation-Seeking is defined by the search for experiences and feelings that are varied, novel, complex and intense, and by the readiness to take risks for the sake of such experiences.

Seven psychological traits were used to characterise the participants: N, E, O, A, C, Imp, and SS.


Task 0. Prepare data for analysis

The dataset is online:
https://leicester.figshare.com/articles/dataset/Drug_consumption_database_quantified_categorical_attributes/7588409

Database description is available at:
https://leicester.figshare.com/articles/dataset/Drug_consumption_database_description/7588412

The dataset contains many more attributes than you need. Prepare a reduced table: for every participant, keep only the 7 psychological traits and the nicotine user/non-user label (use in the last year).

The user/non-user classification will be the main task.
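A minimal sketch of this preparation step is shown below. The column names (`Nscore`, `Escore`, `Oscore`, `Ascore`, `Cscore`, `Impulsive`, `SS`, `Nicotine`) and the usage codes `CL0` to `CL6` are assumptions for illustration, as is the rule that `CL3` to `CL6` mean "used in the last year or more recently". The downloaded file may name or code these attributes differently, so check everything against the database description before relying on it. The two data rows are made up.

```python
import csv
import io

# Hypothetical column names; check them against the database description.
TRAITS = ["Nscore", "Escore", "Oscore", "Ascore", "Cscore", "Impulsive", "SS"]
# Assumed coding: CL3..CL6 mean "used in the last year" or more recently.
USER_CLASSES = {"CL3", "CL4", "CL5", "CL6"}


def prepare(rows):
    """Keep the 7 traits and a 0/1 nicotine-user label for each participant."""
    table = []
    for row in rows:
        traits = [float(row[name]) for name in TRAITS]
        label = 1 if row["Nicotine"] in USER_CLASSES else 0
        table.append((traits, label))
    return table


# Tiny made-up sample standing in for the downloaded file.
sample = io.StringIO(
    "ID,Nscore,Escore,Oscore,Ascore,Cscore,Impulsive,SS,Nicotine\n"
    "1,0.31,-0.58,-0.58,-0.92,-0.01,-0.22,-1.18,CL2\n"
    "2,-0.68,1.94,1.44,0.76,-0.14,-0.71,-0.22,CL4\n"
)
data = prepare(csv.DictReader(sample))
```

To use it on the real file, pass `csv.DictReader(open(path, newline=""))` instead of the in-memory sample, after adjusting the names and codes.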


Task 1. Descriptive statistics (20 marks)

For both classes (users and non-users), find the mean values of the 7 attributes and their standard deviations. Evaluate the 95% confidence intervals for mean values. (Take the definitions from any elementary textbook in statistics.)

A very simple online tutorial on the 95% confidence interval is here:
http://www.itl.nist.gov/div898/handbook/eda/section3/eda352.htm

A very simple textbook, The Little Handbook of Statistical Practice, is here:
https://forum.disser.ru/index.php?act=attach&type=post&id=638

Create a graphical illustration (psychological profiles of nicotine users and non-users with confidence intervals).
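The numerical part of Task 1 could be sketched as follows for one attribute in one class. This sketch assumes the normal approximation for the interval (mean ± z·s/√n, with z about 1.96), which is reasonable for a large sample; for small samples the Student t quantile with n − 1 degrees of freedom should replace z. The function name `mean_ci` and the scores are made up for illustration.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev


def mean_ci(values, level=0.95):
    """Return (mean, sample SD, CI lower, CI upper), normal approximation."""
    n = len(values)
    m = mean(values)
    s = stdev(values)  # sample standard deviation (n - 1 in the denominator)
    z = NormalDist().inv_cdf(0.5 + level / 2)  # about 1.96 for level=0.95
    half_width = z * s / sqrt(n)
    return m, s, m - half_width, m + half_width


# Made-up scores for one trait in one class.
scores = [0.2, -0.1, 0.5, 0.3, -0.4, 0.1, 0.0, 0.6]
m, s, lo, hi = mean_ci(scores)
```

Calling `mean_ci` once per attribute and per class gives the 14 means and intervals needed for the profile plot.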


Task 2. Significance of differences (10 marks)

Report which differences between these means for users and non-users are significant. For significance evaluation, use p-values.
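One possible way to obtain such p-values is a two-sample test with unequal variances. The sketch below uses the large-sample normal (z) approximation to the two-sided p-value; this choice, the function name `two_sample_p_value`, and the numbers are assumptions for illustration, not a prescribed method. For small samples, Welch's t-test with the appropriate degrees of freedom would be the usual replacement.

```python
from math import sqrt
from statistics import NormalDist, mean, variance


def two_sample_p_value(a, b):
    """Two-sided p-value for a difference in means (large-sample z approximation)."""
    # Standard error of the difference, allowing unequal class variances.
    se = sqrt(variance(a) / len(a) + variance(b) / len(b))
    z = (mean(a) - mean(b)) / se
    return 2 * (1 - NormalDist().cdf(abs(z)))


# Made-up values of one trait for the two classes.
users = [0.9, 1.1, 1.3, 0.8, 1.2, 1.0]
non_users = [0.1, -0.2, 0.3, 0.0, 0.2, -0.1]
p = two_sample_p_value(users, non_users)
```

A difference is then reported as significant when `p` falls below the chosen level, for example 0.05.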


Task 3. One attribute classifier (15 marks)

Build seven single-attribute predictors of user/non-user status, one for each attribute. For this purpose, create histograms for each attribute and each class and select the best threshold a for each attribute x in the decision rule:

  • if x > a then one class (users or non-users), and
  • if x < a then another class (non-users or users)

(the optimal cut). Find the classification error for each attribute. Which attribute gives the best prediction? Rank the attributes by their predictive ability.
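The search for the optimal cut can be sketched as an exhaustive scan. The version below tries every midpoint between neighbouring distinct values as the threshold a, tries both orientations of the rule, and keeps the combination with the fewest misclassified points. The function name `best_threshold` and the data are made up; the error it reports is measured on the same data used to choose the cut, so it is an optimistic estimate.

```python
def best_threshold(values, labels):
    """Return (error rate, threshold a, class predicted when x > a)."""
    xs = sorted(set(values))
    # Candidate cuts: midpoints between neighbouring distinct values.
    cuts = [(lo + hi) / 2 for lo, hi in zip(xs, xs[1:])]
    n = len(values)
    best = None
    for a in cuts:
        for above in (0, 1):  # class assigned when x > a
            errors = sum(
                (above if x > a else 1 - above) != y
                for x, y in zip(values, labels)
            )
            candidate = (errors / n, a, above)
            # Tuples compare by error first, so the smallest error wins.
            if best is None or candidate < best:
                best = candidate
    return best


# Made-up values of one attribute with 0/1 class labels.
x = [-1.2, -0.5, -0.1, 0.3, 0.8, 1.4]
y = [0, 0, 1, 0, 1, 1]
error, a, above = best_threshold(x, y)
```

Running this once per attribute and sorting the seven error rates gives the requested ranking.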


Task 4. k-NN classifier (20 marks)

Test 1-NN and 3-NN classification rules. Present the classification errors. Which rule is better?
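A minimal sketch of such a test is given below. The task does not fix the distance or the validation scheme, so Euclidean distance and leave-one-out testing are assumptions here; the function name `knn_loo_error` and the points are made up. Each point is classified by its k nearest neighbours among all other points. Excluding the point itself matters, because otherwise 1-NN would trivially report zero error.

```python
from collections import Counter
from math import dist  # Euclidean distance, Python 3.8+


def knn_loo_error(points, labels, k):
    """Leave-one-out error rate of a k-NN rule with Euclidean distance."""
    errors = 0
    for i, (p, y) in enumerate(zip(points, labels)):
        # Distances to every other point, excluding the point itself.
        neighbours = sorted(
            (dist(p, q), labels[j])
            for j, q in enumerate(points) if j != i
        )
        # Majority vote among the k nearest neighbours.
        votes = Counter(label for _, label in neighbours[:k])
        predicted = votes.most_common(1)[0][0]
        errors += predicted != y
    return errors / len(points)


# Made-up two-dimensional points; the real data has 7 coordinates per point.
pts = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1), (1.0, 1.0),
       (1.1, 0.9), (0.9, 1.2), (0.3, 0.3)]
lab = [0, 0, 0, 1, 1, 1, 1]
err1 = knn_loo_error(pts, lab, 1)
err3 = knn_loo_error(pts, lab, 3)
```

With an odd k and two classes the vote cannot tie. The scan over all pairs is quadratic in the number of participants, which is acceptable for a dataset of this size.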


Task 5. Fisher’s linear discriminant description (10 marks)

Find in the literature a description and explanation of Fisher’s linear discriminant. Read, understand and write a comprehensive description of the algorithm with main formulas and explanation (not more than 1 page!).


Task 6. Fisher’s linear discriminant usage (15 marks)

Apply Fisher’s linear discriminant to the prepared dataset. Analyse the quality of classification. Compare to 1-NN and 3-NN methods.
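As a starting point, the sketch below computes a Fisher direction w by solving S_W w = m1 − m0, where m0 and m1 are the class means and S_W is the within-class scatter matrix, and then classifies by comparing the projection w·x with a threshold. Placing the threshold at the projected midpoint of the two means is one simple assumption; the threshold can also be tuned along the projection, as in Task 3. The helper names (`solve`, `fisher_direction`, `classify`) and the points are made up, and a library routine for linear systems would normally replace the hand-written solver.

```python
from statistics import mean


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):  # back substitution
        s = sum(M[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (M[r][n] - s) / M[r][r]
    return x


def fisher_direction(class0, class1):
    """Return w solving S_W w = m1 - m0, and the midpoint threshold on w.x."""
    d = len(class0[0])
    m0 = [mean(p[i] for p in class0) for i in range(d)]
    m1 = [mean(p[i] for p in class1) for i in range(d)]
    # Within-class scatter: sum of outer products of centred points.
    S = [[0.0] * d for _ in range(d)]
    for pts, m in ((class0, m0), (class1, m1)):
        for p in pts:
            diff = [p[i] - m[i] for i in range(d)]
            for i in range(d):
                for j in range(d):
                    S[i][j] += diff[i] * diff[j]
    w = solve(S, [m1[i] - m0[i] for i in range(d)])
    # Assumed threshold: projection of the midpoint between the class means.
    c = sum(w[i] * (m0[i] + m1[i]) / 2 for i in range(d))
    return w, c


def classify(w, c, p):
    """Predict class 1 when the projection of p onto w exceeds c."""
    return 1 if sum(wi * pi for wi, pi in zip(w, p)) > c else 0


# Made-up two-dimensional points; the real data has 7 coordinates per point.
c0 = [(0.0, 0.1), (0.2, -0.1), (-0.1, 0.3), (0.1, 0.0)]
c1 = [(1.0, 1.2), (1.3, 0.9), (0.8, 1.1), (1.1, 1.0)]
w, c = fisher_direction(c0, c1)
```

The classification error is then the fraction of points for which `classify` disagrees with the true label, which can be set beside the 1-NN and 3-NN errors.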


An extra 10 marks are available for a clear and well-written report.
