Bio
I have been a Researcher at Microsoft Research Lab India since 2007. My research interests cut across the areas of Linguistics, Cognition and Computation. Currently, I am working on script-mixing and code-mixing, especially in social media and web search. We have introduced the notion of Mixed-Script Information Retrieval, where the query and the documents can be in different scripts, and possibly in more than one script, but in the same language; the task is to retrieve the relevant documents across scripts. Such situations arise quite commonly for Indian languages, where the documents (say, song lyrics or posts on discussion forums) can be written either in the native script or in Romanized form. In fact, a large amount of Indian language (and also Greek, Arabic, etc.) content on the Web is available in Romanized form. Mixed-script IR entails challenges such as cross-script indexing, handling transliteration-induced spelling variations in queries and documents, code-mixed query understanding and query completion.