CLPsych 2021 Shared Task

Despite years of research seeking to understand risk factors and improve prevention, suicide remains a leading cause of death worldwide [1,2]. Recent work using NLP and machine learning approaches shows strong potential to help, particularly in its ability to tap into the everyday thoughts, feelings, and experiences of individuals by looking at their activity on social media [3,4]. However, data connected with mental health is sensitive and difficult to obtain, and lack of community-level access to shared datasets is an obstacle to progress.

The 2021 Workshop on Computational Linguistics and Clinical Psychology (CLPsych 2021) will be held on June 11 in conjunction with NAACL 2021. For the CLPsych 2021 Shared Task, we have created an opportunity for secure and ethical access to sensitive data in order to work as a community on the problem of predicting suicide risk from social media. The dataset for the task includes de-identified Twitter posts and ground-truth outcomes from individuals who have attempted suicide or died by suicide, along with control individuals who have not; these data were donated for research purposes at OurDataHelps. Teams participating in the shared task will do their experimentation on the UMD/NORC Mental Health Data Enclave, a secure computing environment that brings researchers to the data rather than vice-versa.

Resources in support of this task are being provided by Qntfy (which runs OurDataHelps and is providing the dataset), NORC at the University of Chicago (which operates the UMD/NORC Mental Health Data Enclave), and by Amazon (which has contributed AWS computing credits via an Amazon Machine Learning Research Award).

Timeline

Data

Teams will work on the Enclave with social media data donated at OurDataHelps.org by people who attempted suicide or by loved ones of those lost to suicide. As discussed below, teams will also have access to a Practice Dataset that can be used at their own sites for development and debugging.

Data Format

The data are provided in JSON-lines files (one for train and one for test), where each line represents a single user and their tweets. The format is as follows:

{
	"id": str, # anonymized user ID, used for submission
	"label": bool, # 1 for users with a known attempt, 0 for control (in the practice dataset: true for depression hashtag, false for control)
	"date_of_attempts": str, # the known date of attempt or empty string if no attempt
	"tweets": [
		{
			"id": str,
			"text": str,
			"created_at": str
		}
	]
}

Naturally, the date_of_attempts fields are not available in the test set.
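As a rough illustration of the format above, the following Python sketch reads a JSON-lines file one user at a time. The function names (parse_lines, iter_users, summarize) are hypothetical and not part of any official tooling. The sketch assumes that the label may appear as either 1/0 or true/false, and that fields such as label or date_of_attempts may simply be missing or empty in some files; adjust it to whatever the actual files contain.

```python
import json
from typing import Any, Dict, Iterable, Iterator, Optional


def parse_lines(lines: Iterable[str]) -> Iterator[Dict[str, Any]]:
    """Yield one user record per non-empty line of JSON-lines input."""
    for line in lines:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines (an assumption, not a guarantee)
        yield json.loads(line)


def iter_users(path: str) -> Iterator[Dict[str, Any]]:
    """Stream user records from a JSON-lines file without loading it whole."""
    with open(path, encoding="utf-8") as handle:
        yield from parse_lines(handle)


def summarize(record: Dict[str, Any]) -> Dict[str, Any]:
    """Reduce a user record to a few fields that are handy for sanity checks."""
    label: Optional[bool] = None
    if "label" in record:
        # Normalize 1/0 and true/false to a Python bool.
        label = bool(record["label"])
    return {
        "id": record["id"],
        "label": label,
        # An empty string or a missing key both count as "no known date".
        "has_attempt_date": bool(record.get("date_of_attempts")),
        "n_tweets": len(record.get("tweets", [])),
    }
```

A call such as `for user in iter_users("train.jsonl"): print(summarize(user))` would then print one summary per user; the file name here is only a placeholder.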

Data Access (Enclave)

After signing a Data Use Agreement with NORC, participating teams will be given login credentials on the UMD/NORC Mental Health Data Enclave. Access to the Enclave is accomplished via a secure desktop client the participant will install locally. From their desktop, participants will be able to log in to a well-outfitted AWS EC2 instance allocated for their use. (Each team will be allocated a generous allowance of AWS credits that should be sufficient for participating in the shared task.) Within the Enclave on the desktop and AWS environment, no copying of data out of the environment is possible (not even via copy/paste).

To avoid wasting AWS credits while designing your systems, we recommend using the Practice Dataset.

Detailed information about access to the Enclave, the computing environment, available packages and resources, support, etc., will be provided to teams when they receive their login credentials.

Scoring and Submission Format

The shared task includes two subtasks:

The official task metrics are F1 score, F2 score (weights recall higher than precision), True Positive Rate, False Alarm Rate, and AUC. We provide our official scoring script here: scoring script. The script takes in the source file and a TSV file with your results. The TSV file should be formatted as follows:

[USER_ID] \t [LABEL] \t [SCORE]

Here USER_ID is the ID field from the source file, LABEL is either 1 for suicide or 0 for control, and SCORE is a real-valued output score from your system, where larger numbers indicate the SUICIDE class and lower numbers indicate CONTROL. The scores allow us to compute AUC and ROC curves, whereas the label specifies the particular operating point you choose for your submission.
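To make the relationship between SCORE, LABEL, and the point metrics concrete, here is a minimal Python sketch. It is not the official scoring script, and its function names (to_rows, format_submission, point_metrics) are hypothetical. It assumes that LABEL is obtained by thresholding SCORE, that the file contains plain tab-separated lines with no header, and that False Alarm Rate means FP / (FP + TN); the official script may differ on any of these points.

```python
from typing import Dict, List, Tuple

Row = Tuple[str, int, float]  # (USER_ID, LABEL, SCORE)


def to_rows(scores: Dict[str, float], threshold: float) -> List[Row]:
    """Pick an operating point: LABEL is 1 when SCORE >= threshold (assumed rule)."""
    return [(uid, int(score >= threshold), score) for uid, score in scores.items()]


def format_submission(rows: List[Row]) -> str:
    """Render rows as tab-separated lines, one user per line, no header (assumed)."""
    return "".join(f"{uid}\t{label}\t{score}\n" for uid, label, score in rows)


def point_metrics(gold: Dict[str, int], rows: List[Row]) -> Dict[str, float]:
    """Compute F1, F2, true positive rate, and false alarm rate from the labels."""
    tp = fp = fn = tn = 0
    for uid, label, _score in rows:
        if gold[uid] == 1:
            if label == 1:
                tp += 1
            else:
                fn += 1
        else:
            if label == 1:
                fp += 1
            else:
                tn += 1
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0

    def fbeta(beta: float) -> float:
        # F-beta weights recall beta times as heavily as precision.
        denom = beta * beta * precision + recall
        return (1 + beta * beta) * precision * recall / denom if denom else 0.0

    return {
        "f1": fbeta(1.0),
        "f2": fbeta(2.0),
        "tpr": recall,
        # Assumed definition: fraction of controls flagged as positive.
        "far": fp / (fp + tn) if fp + tn else 0.0,
    }
```

Writing `format_submission(to_rows(scores, threshold))` to a file yields a submission in the layout above. Because F2 favors recall, a lower threshold than the one that maximizes F1 will often score better on F2, which is why the choice of operating point matters separately from the ranking measured by AUC.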

Practice Data

We provide a practice dataset to help build your system outside of the enclave. This practice dataset is based on a modified version of swcwang/depression-detection. The task is to identify users who have tweeted with a #depression (or similar) hashtag.

Note that although we performed spot checks to make sure this dataset seems reasonable, the practice dataset has not been validated by the community, so results from it should be approached with skepticism.

More information about using the Practice Dataset is found here

Baseline System

We provide a baseline system here. You are free to use or build upon this system as you wish.

Additional Data

We will also be making a copy of the UMD Reddit Suicidality Dataset available on the Enclave, in case some teams wish to make use of it, e.g. for feature selection or transfer learning. Because you will only have access to this dataset on the secure Enclave as part of the shared task, you will not be required to go through the standard application and approval process (e.g. obtaining your own organization’s IRB approval). (Teams are also welcome to go through the standard application process at any time to get a copy of the Reddit Suicidality Dataset on their own site; see How to Request Access on the dataset page linked above.)

Publication

All shared task participants will have the opportunity to contribute a short system description paper for inclusion in the official workshop proceedings. Note that the timeline, particularly for paper writing/reviewing/revision, is quite compressed, because we want to provide shared task participants with an official publication in the workshop proceedings and we have been given a strict, unmovable deadline by the conference organizers for sending final camera-ready shared task papers.

Organizers

You can contact the organizers here