Open Source and Personal Projects

Google Summer of Code 2014: CPython IDLE Improvements Project
Will be working on extending test coverage, creating a non-buildbot test framework for IDLE, adding line numbering with breakpoints, and creating a generic API for plugging third-party code checkers into IDLE.
More information available here and here

Cachediff - A tool for localized cache analysis
Code here. Paper here.
Cachediff is a tool for comparing the cache performance of two versions of the same C/C++ program that differ by a small diff/delta. This is useful to students, educators and professionals. Cachediff presents both a localized and a global view of the cache and its statistics. It uses cache simulation based on instruction/memory tracing during execution. It can be extended to support n versions of the same program.
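As a rough illustration of trace-driven cache simulation (a minimal sketch, not Cachediff's actual code), the snippet below replays two memory traces through a small direct-mapped cache and compares hit and miss counts. The function name, the cache geometry and the two synthetic traces (row-major and column-major walks over a 16x16 array of 4-byte integers) are all assumptions made for the example.

```python
def simulate_direct_mapped(trace, num_lines=4, line_size=64):
    """Replay byte addresses through a direct-mapped cache.

    Returns a (hits, misses) tuple. Hypothetical helper for illustration.
    """
    tags = [None] * num_lines          # one tag slot per cache line
    hits = misses = 0
    for addr in trace:
        block = addr // line_size      # memory block containing the address
        index = block % num_lines      # cache line the block maps to
        tag = block // num_lines       # identifies which block occupies the line
        if tags[index] == tag:
            hits += 1
        else:
            misses += 1
            tags[index] = tag          # evict the old block, load the new one
    return hits, misses


# Two "versions" of the same loop nest, expressed as synthetic address traces.
N, WORD = 16, 4
row_major = [(i * N + j) * WORD for i in range(N) for j in range(N)]
col_major = [(i * N + j) * WORD for j in range(N) for i in range(N)]

for name, trace in (("row-major", row_major), ("column-major", col_major)):
    hits, misses = simulate_direct_mapped(trace)
    print(f"{name}: hits={hits} misses={misses}")
```

With this tiny cache the row-major walk misses once per 64-byte line, while the column-major walk evicts every line before it is reused, which is the kind of difference a per-version comparison is meant to surface.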

FIRE2015 Shared Task on Source code plagiarism detection
System to detect source code plagiarism across programming languages (C and Java). Uses vector space modelling, similarity measures, and normalization. For the shared task, it had a precision of 100% and a recall of 74%.
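The general idea of normalization followed by a vector space similarity measure can be sketched as below. This is an illustrative example under assumptions, not the submitted system: the keyword set, the tokenizer, and the choice of cosine similarity over raw token counts are all stand-ins chosen for brevity.

```python
import math
import re
from collections import Counter

# Assumed, deliberately small set of keywords shared by C and Java.
KEYWORDS = {"int", "for", "while", "if", "else", "return", "void"}


def normalize(source):
    """Tokenize source code and replace identifiers and numbers with placeholders."""
    tokens = re.findall(r"[A-Za-z_]\w*|\d+|\S", source)
    normalized = []
    for tok in tokens:
        if tok in KEYWORDS:
            normalized.append(tok)      # keep language keywords as-is
        elif re.match(r"[A-Za-z_]", tok):
            normalized.append("ID")     # any other identifier
        elif tok.isdigit():
            normalized.append("NUM")    # numeric literal
        else:
            normalized.append(tok)      # operators and punctuation
    return normalized


def cosine_similarity(tokens_a, tokens_b):
    """Cosine similarity between two token-count vectors."""
    va, vb = Counter(tokens_a), Counter(tokens_b)
    dot = sum(va[t] * vb[t] for t in va.keys() & vb.keys())
    norm = math.sqrt(sum(c * c for c in va.values())) * math.sqrt(
        sum(c * c for c in vb.values())
    )
    return dot / norm if norm else 0.0


c_src = "int total = 0; for (int i = 0; i < n; i++) total += a[i];"
java_src = "int sum = 0; for (int k = 0; k < len; k++) sum += arr[k];"
other_src = "if (x) return y; else return z;"

print(cosine_similarity(normalize(c_src), normalize(java_src)))
print(cosine_similarity(normalize(c_src), normalize(other_src)))
```

Because renamed variables collapse to the same placeholder, the first pair scores close to 1.0 while the unrelated snippet scores noticeably lower.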

FIRE2014 Shared Task on Transliterated Search in collaboration with I.R.S.I. and Microsoft Research
Classifies words written in Roman script as Indian-language or English, and transliterates the Indian-language words into their Devanagari script equivalents. Built using Python, scikit-learn and nltk.

Developer for IDLE, the Python IDE that is part of CPython. I have contributed to IDLE's test framework and its human test framework. I have also submitted patches to enable code checking from within IDLE and to display line numbers (with breakpoint integration).

TBookSummr at Ayana hackathon
An application that automatically produces notes and summaries directly from physical media such as textbooks and magazines. Uses Tesseract for OCR and nltk for summarization. Received a judge's special mention and a consolation prize.