Wednesday, July 20, 2011

A Technical Briefing on Text Analytics for Software Engineering at ESEC/FSE 2011

Register now for Technical Briefings at ESEC/FSE 2011, including

Lin Tan and Tao Xie. Text Analytics for Software Engineering: Applications of Natural Language Processing. A Technical Briefing at the 8th joint meeting of the Software Engineering Conference and the ACM SIGSOFT Symposium on the Foundations of Software Engineering (ESEC/FSE 2011), Szeged, Hungary, September 2011.


Software engineering data contains a rich amount of natural language text: requirements documents, code comments, identifier names, commit logs, release notes, mailing list discussions, etc. The natural language text is essential in the software engineering process to help software engineers and researchers better understand and maintain software. Given the overwhelming amount of available natural language text, there is a high demand of text analytics including natural language processing (NLP) and text mining techniques to automatically analyze the natural language text to improve software quality and productivity. The history of applying NLP and text mining techniques to analyze software engineering data can date back to about a decade ago. In recent five years, text analytics for software engineering has become an emerging topic in the software engineering area. Various recent studies showed that automated analysis of natural language text can improve software reliability, programming productivity, software maintenance, and software quality in general.

This technical briefing (1) provides a quick overview of major text mining techniques as well as NLP techniques (e.g., Part-Of-Speech tagging, chunking, semantic labeling, semantic pattern matching, and negative-expression identification), machine learning techniques (e.g., clustering and decision-tree-based classification), and data mining techniques (e.g., frequent itemset mining); (2) introduces popular text analysis tools (e.g., WordNet and Weka); (3) summarizes major research work done in the area of text analytics for software engineering; and (4) outlines future research directions and highlights research challenges. More information on the technical briefing could be found at

The ESEC/FSE program includes a complementary technical briefing on “Management of Unstructured Information during Software Evolution: Applications of Text Retrieval”, by Andrian Marcus. We recommend attending both of them.


  1. Thanks for sharing your info. I really appreciate your efforts and I will be waiting for your further write ups thanks once again.
    Vee Eee Technologies| Vee Eee Technologies|

  2. very nice thanks for sharing

    hey friend see snow on google
    Type “Let It Snow” on @Google If you click and drag you can wipe the snow away. It is great. source:

  3. Thank you for taking the time and sharing this information with us. It was indeed very helpful and insightful while being straight forward and to the point.