Research

Query Word Labeling using Supervised Machine Learning
FIRE2014 Shared Task on Transliterated Search in collaboration with I.R.S.I. and Microsoft Research
Identify each word in a sentence written in Roman script as belonging to an Indian language (L) or English (E); if the word belongs to the Indian language (L), transliterate it into its Devanagari script equivalent.
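
The labeling half of the task can be illustrated with a minimal sketch. It assumes a character-bigram Naive Bayes classifier, which is only one possible supervised model and is not necessarily the one described in the working notes. The class `NgramLabeler`, the helper names, and the toy training words are all hypothetical, and the transliteration step is not covered.

```python
import math
from collections import Counter, defaultdict


def char_ngrams(word, n=2):
    """Return character n-grams of a lowercased word padded with boundary markers."""
    padded = "^" + word.lower() + "$"
    return [padded[i:i + n] for i in range(len(padded) - n + 1)]


class NgramLabeler:
    """Tiny multinomial Naive Bayes over character bigrams (illustrative assumption)."""

    def __init__(self):
        self.counts = defaultdict(Counter)  # label -> n-gram counts
        self.label_totals = Counter()       # label -> number of training words
        self.vocab = set()                  # all n-grams seen in training

    def fit(self, words, labels):
        for word, label in zip(words, labels):
            grams = char_ngrams(word)
            self.counts[label].update(grams)
            self.label_totals[label] += 1
            self.vocab.update(grams)
        return self

    def predict(self, word):
        total_words = sum(self.label_totals.values())
        best_label, best_score = None, -math.inf
        for label in self.label_totals:
            # Log prior plus add-one smoothed log likelihood of each n-gram.
            score = math.log(self.label_totals[label] / total_words)
            denom = sum(self.counts[label].values()) + len(self.vocab)
            for gram in char_ngrams(word):
                score += math.log((self.counts[label][gram] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


def label_sentence(model, sentence):
    """Label every whitespace-separated word as 'L' or 'E'."""
    return [(word, model.predict(word)) for word in sentence.split()]


# Hypothetical toy training data, not taken from the shared-task corpus.
TRAIN_WORDS = ["kya", "hai", "nahi", "mera", "tum", "kahan",
               "the", "song", "lyrics", "what", "where", "movie"]
TRAIN_LABELS = ["L"] * 6 + ["E"] * 6

model = NgramLabeler().fit(TRAIN_WORDS, TRAIN_LABELS)
print(label_sentence(model, "kya song hai"))
```

A real system would need far more training data and, most likely, richer features such as dictionary lookups or context from neighbouring words.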

Read the entire working notes here.

Normalization-based stop-word approach to source code plagiarism detection
FIRE2015 Task on Cross Language Plagiarism detection
 
We approach this task as a text-document plagiarism task, without considering the formal grammatical structure of the programming language.
We normalize commonly used identifiers to detect pairs of programs that have the same objective. We also find that removing these normalized operations entirely improves the system.
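
The idea can be sketched roughly as follows. This is a minimal illustration under several assumptions: the mapping in `NORMALIZATION` is a hypothetical stand-in for the actual list of commonly used identifiers, the regex tokenizer is deliberately naive, and Jaccard overlap is used as the similarity measure only for simplicity. None of these details are confirmed by the working notes.

```python
import re

# Hypothetical mapping of language-specific identifiers to shared tokens.
NORMALIZATION = {
    "printf": "PRINT", "print": "PRINT", "println": "PRINT", "cout": "PRINT",
    "scanf": "READ", "input": "READ", "cin": "READ",
}

# Identifiers, integer literals, or any single non-space symbol.
TOKEN_RE = re.compile(r"[A-Za-z_]\w*|\d+|\S")


def tokenize(code):
    """Split source code into tokens, treating it as plain text."""
    return TOKEN_RE.findall(code)


def normalize(tokens, remove=False):
    """Replace common identifiers with shared tokens, or drop them like stop words."""
    result = []
    for token in tokens:
        if token in NORMALIZATION:
            if not remove:
                result.append(NORMALIZATION[token])
        else:
            result.append(token)
    return result


def jaccard(tokens_a, tokens_b):
    """Jaccard overlap between two token collections, compared as sets."""
    set_a, set_b = set(tokens_a), set(tokens_b)
    if not set_a and not set_b:
        return 1.0
    return len(set_a & set_b) / len(set_a | set_b)


def similarity(code_a, code_b, remove=False):
    """Similarity of two programs after normalization (or stop-word removal)."""
    return jaccard(normalize(tokenize(code_a), remove),
                   normalize(tokenize(code_b), remove))


c_snippet = 'printf("%d", total);'
py_snippet = 'print("%d", total)'
print(similarity(c_snippet, py_snippet))               # normalized identifiers kept
print(similarity(c_snippet, py_snippet, remove=True))  # normalized identifiers removed
```

Setting `remove=True` treats the normalized identifiers as stop words, which mirrors the removal variant mentioned above; whether it helps on real data depends on the corpus.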

Read the entire working notes here.