finished
Duration: 2011-01 – 2011-11
In this thesis we examine the automatic generation of training data for a machine learning algorithm. We use a rule-based approach, built with the GATE natural language processing framework, to generate the training data. The machine learning algorithm uses a statistical model, in our case the maximum entropy model (MEM), to perform the information extraction task. We introduce an architecture and an application for the automatic generation of training data. To test our approach, we introduce and adapt evaluation metrics. We elaborate on the implications of using automatically generated test data for the structure of the result. We partition the error into different regions and examine its impact. We find that under certain circumstances the statistical model outperforms the rule-based extraction algorithm that was used to train it.
Duration: 2009-04 – 2011-11
Duration: 2011-10 – 2011-10
A MediaWiki is a Social Web application that supports a group of users in collaboratively creating, maintaining, and sharing content. The main functionalities of a MediaWiki are creating, editing, and linking articles to provide navigation between them, and adding articles to categories ([Barrett, 2009]).
In some cases it is necessary to measure or to increase the quality of articles in a MediaWiki. According to [Wang & Strong, 1996], poor data quality has a significant social and economic influence. This paper analyzes and summarizes scientific literature that deals with quality criteria for articles and data. Based on the result of that analysis, and in collaboration with a company, a list of features was formulated that measure the quality of MediaWiki articles or help MediaWiki users increase the quality of articles. Finally, a prototype in the form of a toolbar was developed that contains the identified quality indicator features. The prototype was implemented in Adobe Flex and can be easily integrated into a MediaWiki as an extension.
Duration: 2010-03 – 2011-09
Duration: 2009-02 – 2011-04
Duration: 2010-04 – 2011-04
Duration: 2010-03 – 2011-01
Duration: 2009-02 – 2010-11
Duration: 2009-02 – 2010-11
Duration: 2010-03 – 2010-10
Duration: 2008-11 – 2010-08
Duration: 2009-05 – 2010-06
Duration: 2009-02 – 2010-03
Duration: 2009-02 – 2010-02
Duration: 2008-09 – 2010-01
Duration: 2009-02 – 2009-10
Duration: 2008-10 – 2009-09
Duration: 2009-02 – 2009-09
Duration: 2008-07 – 2009-07
Duration: 2008-03 – 2009-05
Duration: 2008-04 – 2009-03
Duration: 2008-04 – 2009-01
Duration: 2008-04 – 2008-10
Duration: 2008-03 – 2008-09
Duration: 2008-04 – 2008-09
Duration: 2008-04 – 2008-09
Duration: 2008-04 – 2008-09
Duration: 2008-04 – 2008-09
Duration: 2008-04 – 2008-09
Duration: 2008-03 – 2008-08
Duration: 2008-03 – 2008-08
Duration: 2008-04 – 2008-07
Duration: 2008-02 – 2008-07
Duration: 2008-02 – 2008-07
Duration: 2007-02 – 2007-10
Duration: 2007-03 – 2007-08
Duration: 2006-03 – 2006-09
Duration: 2005-04 – 2005-09
ongoing
Begin: 2009-02
Begin: 2010-04


