Dr Diana Maynard
School of Computer Science
Senior Research Fellow
Deputy Head of the Natural Language Processing research group
d.maynard@sheffield.ac.uk
Regent Court (DCS)
Full contact details
Dr Diana Maynard
School of Computer Science
Regent Court (DCS)
211 Portobello
Sheffield
S1 4DP
- Research interests
 - Information extraction
 - GATE
 - Social media analysis
 - Sentiment analysis
 - Online abuse and misinformation detection
 - Term recognition
 - Ontologies and semantic web
 - Freedom of the media
 - NLP for scientometrics
 
 
- Publications
Books
- The Chilling: A global study of online violence against women journalists. ICFJ.
 - . Springer International Publishing.
 - . Morgan & Claypool Publishers.
 - Text Processing with Gate (Version 6). GATE.
 - Preface.
 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics): Preface.
 - Preface.
 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics): Preface.
 
Journal articles
- . Engineering Applications of Artificial Intelligence, 142, 109931-109931.
 - . Information Sciences, 647.
 - . Frontiers in Artificial Intelligence, 3.
 - . PLoS ONE, 16(2).
 - . Scientometrics, 125(2), 1275-1290.
 - . Media and Communication, 8(1), 89-100.
 - . Technological Forecasting and Social Change, 137, 61-75.
 - . Journal of Web Science, 3(1).
 - . Journal of Web Semantics, 44, 75-88.
 - . Semantic Web Journal.
 - , 65-86.
 - . Information Processing & Management, 51(2), 32-49.
 - , 139-155.
 - . Journal of Web Semantics, 24, 1-2.
 - . Journal of Web Semantics.
 - . Lecture Notes in Computer Science Including Subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics, 7117 LNCS, 88-99.
 - Automatic detection of political opinions in tweets. CEUR Workshop Proceedings, 718, 81-92.
 - . Journal of Web Semantics, 9(3), 315.
 - Using lexico-syntactic ontology design patterns for ontology creation and population. CEUR Workshop Proceedings, 516, 39-52.
 - NLP-based support for ontology lifecycle development. CEUR Workshop Proceedings, 514.
 - Information extraction: Algorithms and prospects in a retrieval context. Computational Linguistics, 34(2), 315-317.
 - NLP techniques for term extraction and ontology population. Frontiers in Artificial Intelligence and Applications, 167(1), 107-127.
 - . New Review of Hypermedia and Multimedia, 13(2), 211-237.
 - . IBM Syst. J., 45, 3-6.
 - . Literary and Linguistic Computing, 19(4), 509-524.
 - . Natural Language Engineering, 10(3-4), 349-373.
 - . Journal of Natural Language Engineering, 8(2-3), 257-274.
 - . Journal of Natural Language Processing, 8(1), 101-125.
 - . SSRN Electronic Journal.
 - . Future Internet, 6(3), 457-481.
 - . Future Internet, 6(3), 433-456.
 
Book chapters
- , Cognitive Technologies (pp. 127-130). Springer International Publishing
 - , Unlocking Environmental Narratives: Towards Understanding Human Environment Interactions through Computational Text Analysis (pp. 133-160). Ubiquity Press
 - Preface (pp. V-VII).
 - Challenges in Analysing Social Media. In Dusa A, Nelle D, Stock G & Wagner G (Ed.), Facing the Future: European Research Infrastructures for the Humanities and Social Sciences Berlin: SCIVERO Verlag.
 - Natural language processing, Perspectives on Ontology Learning (pp. 51-67).
 - In Weller K, Bruns A, Burgess J, Mahrt M & Puschmann C (Ed.), Twitter and Society USA: Peter Lang.
 - , Natural Language Processing (pp. 261-278). John Benjamins Publishing Company
 
Conference proceedings
- Increasing the Difficulty of Automatically Generated Questions via Reinforcement Learning with Synthetic Preference. NLP4DH 2024: 4th International Conference on Natural Language Processing for Digital Humanities, Proceedings of the Conference (pp 450-462)
 - . Findings of the Association for Computational Linguistics: EMNLP 2023 (pp 12194-12209). Singapore, 6 December 2023 - 6 December 2023.
 - Development of a benchmark corpus to support entity recognition in job descriptions. Proceedings of the Thirteenth Language Resources and Evaluation Conference (pp 1201-1208). Marseille, France
 - . Text, Speech, and Dialogue: 23rd International Conference on Text, Speech and Dialogue (TSD 2020), Vol. 12284 (pp 3-10). Brno, Czech Republic, 8 September 2020 - 8 September 2020.
 - . Proceedings of 24th European Conference on Artificial Intelligence (ECAI 2020), Vol. 325 (pp 2054-2061). Santiago de Compostela, Spain, 29 August 2020 - 29 August 2020.
 - Using ontologies to map between research and policy data: opportunities and challenges. Proceedings of the 17th International Conference on Scientometrics & Informetrics, Vol. 1 (pp 535-540). Rome, Italy, 2 September 2019 - 2 September 2019.
 - Team Bertha von Suttner at SemEval-2019 Task 4: Hyperpartisan News Detection using ELMo Sentence Representation Convolutional Network. Proceedings of the 13th International Workshop on Semantic Evaluation (pp 840-844). Minneapolis, Minnesota, USA, 6 July 2019 - 6 July 2019.
 - . Proceedings of the 13th International Workshop on Semantic Evaluation, June 2019 - June 2019.
 - Exploring knowledge production in Europe. The KNOWMAK tool. Proceedings of the 17th Conference of the International Society for Scientometrics and Informetrics (ISSI 2019), Vol. II (pp 2561-2562). Rome, Italy, 2 September 2019 - 2 September 2019.
 - . Procedia Computer Science, Vol. 137 (pp 102-108). Vienna, Austria, 10 September 2018 - 10 September 2018.
 - Twits, Twats and Twaddle: Trends in Online Abuse towards UK Politicians. Proceedings of the Twelfth International Conference on Web and Social Media (pp 600-603). California, USA, 25 June 2018 - 25 June 2018.
 - Helping crisis responders find the informative needle in the tweet haystack. Proceedings of the 15th ISCRAM Conference (pp 649-662). Rochester, NY, USA, 20 May 2018 - 20 May 2018.
 - Ontologies as bridges between data sources and user queries: the KNOWMAK project experience. Proceedings of Science, Technology and Innovation indicators 2017. Paris, 6 September 2017 - 6 September 2017.
 - Comparing Attitudes to Climate Change in the Media using sentiment analysis based on Latent Dirichlet Allocation. Proc. of EMNLP Workshop "Natural Language Meets Journalism"
 - Towards an infrastructure for understanding and interlinking knowledge co-creation in European research. CEUR Workshop Proceedings, Vol. 1878. Portoroz, Slovenia
 - . The semantic web: 14th International Conference, ESWC 2017, Portorož, Slovenia, May 28 – June 1, 2017, Proceedings, Part II, Vol. 10250 (pp V-VII). Portorož, Slovenia, 28 May 2017 - 28 May 2017.
 - GATE-time: Extraction of temporal expressions and events. Proceedings of the 10th International Conference on Language Resources and Evaluation, LREC 2016 (pp 3702-3708)
 - . Proceedings of the 8th ACM Conference on Web Science (pp 85-94)
 - Challenges of Evaluating Sentiment Analysis Tools on Social Media. Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC 2016) (pp 1142-1148). Portorož, 23 May 2016 - 23 May 2016.
 - . Proceedings of the 10th International Conference on Ubiquitous Information Management and Communication (pp 1-6)
 - Extracting Relations between Non-Standard Entities using Distant Supervision and Imitation Learning. Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing (pp 747-757). Lisbon, Portugal, 17 September 2015 - 17 September 2015.
 - . Proceedings of the ACM Web Science Conference (pp 1-2)
 - (pp 26-41)
 - Introduction. SWAIE 2014: 3rd Workshop on Semantic Web and Information Extraction, Proceedings of the Workshop (pp III)
 - . HT 2013: Proceedings of the 24th ACM Conference on Hypertext and Social Media (pp 21-30)
 - TwitIE: An Open-Source Information Extraction Pipeline for Microblog Text. Proceedings of the International Conference on Recent Advances in Natural Language Processing
 - . Procedia Computer Science, Vol. 22 (pp 231-240)
 - Multimodal sentiment analysis of social media. CEUR Workshop Proceedings, Vol. 1110 (pp 47-58)
 - Entity extraction and consolidation for social web content preservation. CEUR Workshop Proceedings, Vol. 912 (pp 18-29)
 - Knowledge extraction and consolidation from social media (KECSM 2012): Preface. CEUR Workshop Proceedings, Vol. 895 (pp I-II)
 - Large Scale Semantic Annotation, Indexing, and Search at The National Archives. LREC 2012 - Eighth International Conference on Language Resources and Evaluation (pp 3487-3494)
 - Using events for content appraisal and selection in Web archives. CEUR Workshop Proceedings, Vol. 779 (pp 98-107)
 - . 2009 IEEE Conference on Commerce and Enterprise Computing CEC 2009 (pp 476-482)
 - Evaluating Evaluation Metrics for Ontology-Based Applications: Infinite Reflection. LREC
 - Benchmarking Textual Annotation Tools for the Semantic Web. Sixth International Conference on Language Resources and Evaluation, LREC 2008 (pp 20-25)
 - Ontology-based information extraction for business intelligence. Semantic Web, Proceedings, Vol. 4825 (pp 843-856)
 - Natural language technology for information integration in business intelligence. Business Information Systems, Proceedings, Vol. 4439 (pp 366-380)
 - Metrics for evaluation of ontology-based information extraction. EON 2006: Evaluation of Ontologies for the Web, 4th International Workshop, located at the 15th International World Wide Web Conference, WWW 2006
 - Metrics for evaluation of ontology-based information extraction. CEUR Workshop Proceedings, Vol. 179
 - Creating tools for morphological analysis of Sumerian. Proceedings of the 5th International Conference on Language Resources and Evaluation, LREC 2006 (pp 1762-1765)
 - Ontology-based information extraction for market monitoring and technology watch. CEUR Workshop Proceedings, Vol. 137 (pp 33-42)
 - Extracting a domain ontology from linguistic resource based on relatedness measurements. 2005 IEEE/WIC/ACM International Conference on Web Intelligence, Proceedings (pp 345-351)
 - A lightweight approach to coreference resolution for named entities in text. Anaphora Processing, Vol. 263 (pp 97-111)
 - Using parallel texts to improve recall in botany. Recent Advances in Natural Language Processing III, Vol. 260 (pp 237-246)
 - . Data & Knowledge Engineering, Vol. 48(2) (pp 247-264)
 - Populating a database from parallel texts using ontology-based information extraction. Natural Language Processing and Information Systems, Vol. 3136 (pp 254-264)
 - Automatic language-independent induction of gazetteer lists. Proceedings of the 4th International Conference on Language Resources and Evaluation, LREC 2004 (pp 709-712)
 - Creation of reusable components and language resources for Named Entity Recognition in Russian. Proceedings of the 4th International Conference on Language Resources and Evaluation, LREC 2004 (pp 309-312)
 - Automatic creation and monitoring of semantic metadata in a dynamic knowledge portal. Artificial Intelligence: Methodology, Systems, and Applications, Proceedings, Vol. 3192 (pp 65-74)
 - . ACM Trans. Asian Lang. Inf. Process., Vol. 2 (pp 295-300)
 - . Proceedings of the tenth conference on European chapter of the Association for Computational Linguistics - EACL '03, Vol. 2 (pp 219-219), 12 April 2003 - 17 April 2003.
 - NE recognition without training data on a language you don’t speak. ACL Workshop on Multilingual and Mixed-language Named Entity Recognition: Combining Statistical and Symbolic Models. Sapporo, Japan
 - GATE: A Unicode-based Infrastructure Supporting Multilingual Information Extraction. Proceedings of Workshop on Information Extraction for Slavonic and other Central and Eastern European Languages (IESL’03). Borovets, Bulgaria
 - . Proceedings of the HLT-NAACL 2003 workshop on Software engineering and architecture of language technology systems - SEALTS '03, Vol. 8 (pp 17-24), 31 May 2003 - 31 May 2003.
 - . Proceedings of the HLT-NAACL 2003 workshop on Analysis of geographic references, Vol. 1 (pp 1-9), 31 May 2003.
 - Access to multimedia information through multisource and multilanguage information extraction. Natural Language Processing and Information Systems, Vol. 2553 (pp 160-171)
 - (pp 613-625)
 - Adapting a robust multi-genre NE system for automatic content extraction. Artificial Intelligence: Methodology, Systems and Applications, Proceedings, Vol. 2443 (pp 264-273)
 - GATE: A Framework and Graphical Development Environment for Robust NLP Tools and Applications. Proceedings of the 40th Anniversary Meeting of the Association for Computational Linguistics (ACL’02). Philadelphia, USA
 - A framework and graphical development environment for robust NLP tools and applications. ACL (pp 168-175)
 - A unicode-based environment for creation and use of language resources. Proceedings of the 3rd International Conference on Language Resources and Evaluation, LREC 2002 (pp 66-71)
 - . Proceedings of the ACL-02 Workshop on Effective tools and methodologies for teaching natural language processing and computational linguistics, Vol. 1 (pp 54-62), 7 July 2002 - 7 July 2002.
 - Extracting information for automatic indexing of multimedia material. Proceedings of the 3rd International Conference on Language Resources and Evaluation, LREC 2002 (pp 669-676)
 - How feasible is the reuse of grammars for Named Entity Recognition? Proceedings of the 3rd International Conference on Language Resources and Evaluation, LREC 2002 (pp 1412-1418)
 - . Proceedings of the ACL-02 Workshop on Automatic Summarization, Vol. 4 (pp 19-26), 11 July 2002 - 12 July 2002.
 - GATE: an architecture for development of robust HLT applications. 40th Annual Meeting of the Association for Computational Linguistics, Proceedings of the Conference (pp 168-175)
 - Developing reusable and robust language processing components for information systems using GATE. 13th International Workshop on Database and Expert Systems Applications, Proceedings (pp 223-227)
 - Named Entity Recognition from Diverse Text Types. Recent Advances in Natural Language Processing 2001 Conference (pp 257-274). Tzigov Chark, Bulgaria
 - . Proceedings of the 40th Annual Meeting on Association for Computational Linguistics - ACL '02 (pp 168-168), 7 July 2002 - 12 July 2002.
 - Experience of using GATE for NLP R&D. Proceedings of the Workshop on Using Toolsets and Architectures To Build NLP Systems at COLING-2000. Luxembourg
 - Creating and using domain-specific ontologies for terminological applications. 2nd International Conference on Language Resources and Evaluation, LREC 2000
 - . Proceedings of the 18th conference on Computational linguistics, Vol. 1 (pp 530-536), 31 July 2000 - 4 August 2000.
 - . The Semantic Web – ISWC 2018, Vol. 11136 (pp 617-633). Monterey, CA, USA, 8 October 2018 - 8 October 2018.
 - . Advances in Computer Science Research, 7 September 2015 - 9 September 2015.
 - Who cares about sarcastic tweets? Investigating the impact of sarcasm on sentiment analysis. LREC 2014 Proceedings. Reykjavik, Iceland, 26 May 2014 - 26 May 2014.
 
Datasets
- .
 
Preprints
- , arXiv.
 - , arXiv.
 - Examining Temporal Bias in Abusive Language Detection.
 - , arXiv.
 - , arXiv.
 - , arXiv.
 
 
- Research group
Member of the research group.
 
- Grants
Current grants
- Influencing policy work on human rights violations against journalists, Research England, 09/2024 - 06/2025, £34,667, as PI
 - Toolkit for Analysing and Visualising Online Violence Against Female Journalists, EPSRC, 04/2024 - 03/2025, £45,363, as PI
 - Atrium: Advancing FronTier Research In the Arts and hUManities, Horizon Europe, 01/2024 - 12/2027, £370,950, as PI
 
Previous grants
- RISIS2: , EC H2020, 01/2019 - 12/2022, £476,741, as Co-PI
 - Visualising the environmental impacts of plant-based recipes in Europe, Research England, 12/2021 - 05/2022, £18,407, as PI
 - Calculating the environmental impact of plant based recipes, Industrial, 01/2021 - 12/2021, £2,500, as PI
 - Pilot project on developing and trialling a toolkit for strengthening national context monitoring of violations against journalists, Free Press, 06/2020 - 12/2020, £29,094, as Co-PI
 - Pilot project on developing a database for the improved collection and systematisation of information on incidents of violations against journalists, Free Press, 04/2019 - 11/2019, £29,030, as Co-I
 - The Intelligent Automation of Contract Analysis of Collateral Warranties, Innovate UK, 03/2019 - 08/2020, £114,552, as PI
 - Social Understandings of Scale: The role of Print and Social Media in the EU Referendum Debate, British Academy, 01/2018 - 06/2019, £49,716, as Co-PI
 - Improving the monitoring of violence against journalists, Free Press, 12/2017 - 10/2018, £26,589, as Co-I
 - KNOWMAK: , EC H2020, 01/2017 - 12/2019, £196,654, as PI
 - COMRADES: , EC H2020, 01/2016 - 12/2018, £257,000, as PI