Description
This project applies natural language processing (NLP) to Dark Web discussion board data to examine how the language surrounding crime and offending changes over time.
Duration
Total project length: 175 hours
Task ideas
- Extract and preprocess textual data from online forum posts for NLP analysis.
- Develop and train NLP models, such as LSTM or BERT, to understand the context, sentiment, and thematic elements of forum discussions.
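As a rough illustration of the first task, the preprocessing step might look like the minimal sketch below. It assumes posts arrive as raw HTML fragments; the record fields `post_id` and `body`, the helper name `preprocess_post`, and the sample post are hypothetical and do not come from the project data. A real pipeline would likely add language detection, deduplication, and a proper tokenizer.

```python
import html
import re

# Patterns are deliberately simple; they are illustrative assumptions,
# not rules taken from the project itself.
URL_RE = re.compile(r"https?://\S+|www\.\S+")
TAG_RE = re.compile(r"<[^>]+>")
TOKEN_RE = re.compile(r"[a-z0-9']+")


def preprocess_post(raw: str) -> list[str]:
    """Turn one raw forum post into a list of lowercase tokens."""
    text = html.unescape(raw)      # decode entities such as &amp;
    text = TAG_RE.sub(" ", text)   # drop HTML tags
    text = URL_RE.sub(" ", text)   # drop URLs
    return TOKEN_RE.findall(text.lower())


# Hypothetical input record, for demonstration only.
posts = [
    {"post_id": 1, "body": "<p>Check THIS out: http://example.com &amp; reply!</p>"},
]

processed = [
    {"post_id": p["post_id"], "tokens": preprocess_post(p["body"])}
    for p in posts
]
print(processed)
```

The token lists produced this way could then be fed to a model-specific tokenizer or vectorizer, depending on whether an LSTM or a BERT-style model is chosen.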
Expected results
- Create a processed dataset of textual content from online forum posts, ready for NLP tasks.
- Train an NLP model capable of identifying key themes, sentiments, or user engagement patterns within forum discussions, based on the linguistic features of the posts.
- If time allows: Analyze the linguistic relationships and communication patterns between forum participants to uncover insights into community dynamics and discourse trends.
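One simple way to get a first look at discourse trends over time is to track how large a share of all tokens a given term takes up in each year. The sketch below assumes posts have already been tokenized and tagged with a year; the function name `term_share_by_year` and the sample data are hypothetical placeholders, and this is a baseline for orientation rather than the project's prescribed method.

```python
from collections import Counter


def term_share_by_year(posts, term):
    """Return {year: share of all tokens in that year equal to `term`}.

    `posts` is an iterable of (year, tokens) pairs.
    """
    term_counts = Counter()
    total_counts = Counter()
    for year, tokens in posts:
        total_counts[year] += len(tokens)
        term_counts[year] += tokens.count(term)
    return {
        year: term_counts[year] / total_counts[year]
        for year in sorted(total_counts)
        if total_counts[year] > 0  # skip years with no tokens at all
    }


# Made-up tokenized posts, for demonstration only.
sample = [
    (2019, ["new", "vendor", "here"]),
    (2019, ["vendor", "is", "reliable"]),
    (2020, ["anyone", "seen", "this"]),
    (2020, ["vendor", "went", "quiet", "last", "week"]),
]

shares = term_share_by_year(sample, "vendor")
for year, share in shares.items():
    print(year, round(share, 3))
```

Normalizing by the total token count per year keeps the comparison fair when forum activity differs between years. Trained models such as those listed under the task ideas could replace the raw term match with predicted themes or sentiment labels.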
Requirements
Ability to code in R or Python; understanding of machine learning and/or natural language processing.
Project difficulty level
Intermediate
Test
Please use this link to access the test for this project.
Mentors
Please DO NOT contact mentors directly by email. Instead, please email human-ai@cern.ch with the project title, and include your CV and test results. The mentors will then get in touch with you.
Corresponding Project
Participating Organizations