Research Assistant in Machine Learning/Natural Language Processing

Date: Sep 12, 2022

Location: Washington, DC, US, 20016

Company: American University

About the Department/Unit

  • Countering Contemporary Antisemitism—A Data-Centered Approach.


The initiative is subdivided into three sub-teams: the Data Team, the Software Team, and the Analysis Team. In this position, the candidate will work with the Software Team to design a machine learning approach that takes a small, carefully curated set of labeled social media posts as a seed and induces a large repository of such posts using NLP and machine learning methods. An important aspect of the problem is its evolving nature: new terms, turns of phrase, and so on emerge over time, so the project may use continuous learning techniques to handle that issue. The RA will also be expected to provide light administrative support for the three groups.

Essential Functions

Research: the candidate will have to

  1) Design and optimize an antisemitic text detection system using a lexicon of extremist terms, sentiment analysis techniques, and a transformer-based deep learning architecture such as BERT or RoBERTa.
  2) Expand the approach proposed by Liu/Boukouvalas/Japkowicz to retrieve posts closely related to those present in the seed data set.
  3) Adapt the Baron/Corizzo/Japkowicz project on continuous/lifelong learning to the text domain.
  4) Combine 1), 2), and 3) to create a continuous learning antisemitic text detection and retrieval approach.
  5) Possibly extend the approach to the simultaneous processing of text and images, using multi-modal approaches such as those developed in Dr. Boukouvalas’ lab.

Administration:

  • Create and manage a repository of documents for use by the team.
  • Create a document to track project deadlines.

Work Environment

  • Hybrid.
  • The candidate will be given space in the Computer Science Lab as well as access to some of the Computer Science High Performance Computing equipment.
  • The candidate can work partly in the lab and partly remotely.
  • In-person and/or Zoom meetings will be scheduled to discuss the progress of the project.

Position Type/Expected Hours of Work

  • 16 hours/week.

Preferred Education and Experience

  • The candidate should have a solid background in Machine Learning with good knowledge of Python, NumPy, and scikit-learn.
  • The candidate should have some experience with Deep Learning and/or Deep Language Models.
  • Familiarity with natural language processing, computer vision, and/or continuous learning techniques is desirable but can also be acquired during the course of the project.

