Professor Kalina Bontcheva
School of Computer Science
Professor of Text Analysis
Member of the Natural Language Processing (NLP) research group
Full contact details
School of Computer Science
Regent Court (DCS)
211 Portobello
Sheffield
S1 4DP
- Profile
Kalina Bontcheva is a Professor in Natural Language Processing (NLP) in the School of Computer Science at the University of Sheffield, co-affiliated with Sheffield's Centre for Freedom of the Media. Her research lies at the intersection of Natural Language Processing (NLP), computational social media analysis, and web science, with applications to digital humanities, social science, digital journalism, business intelligence, and data science.
Between 2014 and 2016 Kalina conceived and led the , which was amongst the first EU projects to study computational methods for the detection and tracking of disinformation in social media. She is currently the scientific director of the project, and previously the , and international projects. Since 2015 Kalina has also been working on real-time social media analysis around United Kingdom general elections and the Brexit referendum, and on monitoring and analysis of online abuse towards UK politicians. She co-authored the ITU/UNESCO study , two , and the UNESCO study .
Kalina's research on online disinformation and abuse targeting public figures (especially MPs and female journalists) has been covered by many UK media organisations (incl. the BBC, ITV, the Guardian and Buzzfeed UK) with reach exceeding 17.7 million. This includes real-time analysis of political engagement on social media for the 2015 UK general election and the UK EU membership referendum; analysing Twitter abuse towards politicians in the 2017 UK general election campaign; commentary on the role of Russian bots and sock-puppet accounts in the UK EU membership referendum campaign; and online abuse towards UK MPs (2019 election), which was featured on the news and in an accompanying BBC article.
Since 2006 Kalina has been awarded 17 grants as a PI (seven of these as a sole investigator), with a total award value for her team exceeding £8.8 million, from diverse funding sources: EU funding (Frameworks 6 and 7, Horizon 2020, and Horizon Europe), UK EPSRC and ESRC funding councils, a Google Faculty Award, a Google Digital News Grant, a UK Department of Health and Social Care grant, and two active grants from the UK Foreign, Commonwealth & Development Office (FCDO).
- Research interests
Social media, misinformation, and abuse analysis: AI-generated misinformation detection; machine learning methods for misinformation and disinformation detection, online abuse analysis, and hate speech detection; large-scale, real-time analysis of social media content; longitudinal research on online abuse.
- Publications
Books
- Preface.
- Preface.
- Preface.
- . Springer International Publishing.
- . Morgan & Claypool Publishers.
- Text Processing with GATE (Version 6). GATE.
Journal articles
- . EPJ Data Science, 14.
- . Expert Systems with Applications, 243.
- . CIKM '24: Proceedings of the 33rd ACM International Conference on Information and Knowledge Management, 5380-5384.
- . PLoS ONE, 19(5).
- Examining temporalities on stance detection towards COVID-19 vaccination. 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation, LREC-COLING 2024 - Main Conference Proceedings, 6732-6738.
- Navigating prompt complexity for zero-shot classification: a study of large language models in computational social science. Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024), 12074-12086.
- Examining the limitations of computational rumor detection models trained on static datasets. 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation, LREC-COLING 2024 - Main Conference Proceedings, 6739-6751.
- Large language models offer an alternative to the traditional approach of topic modelling. Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024), 10160-10171.
- . Online Information Review, 48(5), 1045-1062.
- . EPJ Data Science, 12(1).
- . Findings of the Association for Computational Linguistics: EMNLP 2023, 5347-5355.
- . Proceedings of the 14th International Conference on Recent Advances in Natural Language Processing, 648-657.
- . Computational Linguistics, 49(3), 767-772.
- . Proceedings of the International AAAI Conference on Web and Social Media, 17(1), 1052-1062.
- . PLoS ONE, 16(9).
- . Semantic Web, 12(3), 403-421.
- . PLoS ONE, 16(2).
- . Scientific Reports, 10(1).
- Toxic language detection in social media for Brazilian Portuguese: new dataset and multilingual analysis. Proceedings of the 1st Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics and the 10th International Joint Conference on Natural Language Processing, 914-924.
- Measuring what counts: the case of rumour stance classification. Proceedings of the 1st Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics and the 10th International Joint Conference on Natural Language Processing, 925-932.
- . Journal of Computational Social Science, 3, 401-443.
- . EPJ Data Science, 9.
- Towards an interoperable ecosystem of AI and LT platforms: a roadmap for the implementation of different levels of interoperability. Proceedings of the 1st International Workshop on Language Technology Platforms, 96-107.
- . Information Processing & Management, 56(6).
- . Online Social Networks and Media, 13.
- . ACM Transactions on Information Systems, 37(2).
- . ACM Computing Surveys, 51(2).
- . Semantic Web, 9, 291-302.
- . Information Processing & Management, 54, 273-290.
- . Computer Speech & Language, 44, 61-83.
- . Information Processing & Management, 53(4), 989-1003.
- . Journal of Web Semantics, 44, 75-88.
- . Polibits, 54, 79-85.
- . ACM Transactions on Information Systems, 34(3), 1-5.
- . Information Processing & Management, 51(2), 32-49.
- Estimating collective judgement of rumours in social media. CoRR, abs/1506.00468.
- . Journal of Web Semantics, 30, 52-68.
- . Language Resources and Evaluation, 47(4), 1007-1029.
- Making sense of social media streams through semantics: a Survey. Semantic Web Journal.
- Improving Habitability of Natural Language Interfaces for Querying Ontologies with Feedback and Clarification Dialogues. Journal of Web Semantics.
- Getting More out of Biomedical Documents with GATE's Full Lifecycle Open Source Text Analytics. PLoS Computational Biology.
- GATECloud.net: a Platform for Large-Scale, Open-Source Text Processing on the Cloud. Philosophical Transactions of the Royal Society A. Mathematical, Physical and Engineering Sciences.
- Transition of legacy systems to semantically enabled applications: TAO method and tools. Semantic Web, 3(2).
- , 61-78.
- , 113-127.
- , 37-49.
- . Natural Language Engineering, 15(2), 241-271.
- Service-finder: Web scale semantic discovery. Ceur Workshop Proceedings, 367.
- . ACM Transactions on Asian Language Information Processing, 7(2).
- , 733-752.
- , 139-169.
- . Journal of Knowledge Management, 9(5), 64-84.
- . Journal of Knowledge Management, 9(5), 108-131.
- . Int. J. Digit. Libr., 5, 309-316.
- . USER MODEL USER-ADAP, 15(1), 135-168.
- Corpus Linguistics and South Asian Languages: Corpus Creation and Tool Development. Lit. Linguistic Comput., 19, 509-524.
- . Natural Language Engineering, 10(3-4), 349-373.
- . Journal of Natural Language Engineering, 8(2-3), 257-274.
- . ACM Transactions on Intelligent Systems and Technology.
- . Language Resources and Evaluation.
- . Proceedings of the AAAI Conference on Human Computation and Crowdsourcing, 7, 135-143.
- . Journal of the Association for Information Science and Technology.
- . D-Lib Magazine, 21(1/2).
- Transition of Legacy Systems to Semantically Enabled Applications: TAO Method and Tools. Semantic Web Journal.
Book chapters
- In Maria Aiello L, Chakraborty T & Gaito S (Ed.), Lecture Notes in Computer Science (pp. 243-253). Springer Nature Switzerland
- , Cognitive Technologies (pp. 233-254). Springer International Publishing
- , Cognitive Technologies (pp. 127-130). Springer International Publishing
- , Handbook of Linguistic Annotation (pp. 229-256). Springer Netherlands
- , Working with Text (pp. ix-x). Elsevier
- , The SAGE Handbook of Social Media Research Methods (pp. 499-511). SAGE Publications Ltd
- , Working with Text (pp. 133-158). Elsevier
- In Bontcheva K, Ricci F, Conlan O & Lawless S (Ed.), User Modeling, Adaptation and Personalization (pp. V-VI). Springer International Publishing
- Natural language processing, Perspectives on Ontology Learning (pp. 51-67).
- Summarization of UGC, Mining User Generated Content (pp. 259-287).
- , Studies on the Semantic Web IOS Press
- (pp. 31-53).
- Crowdsourcing Named Entity Recognition and Entity Linking Corpora, The Handbook of Linguistic Annotation (Nancy Ide and James Pustejovsky, eds) Berlin: Springer
- , Handbook of Semantic Web Technologies (pp. 77-116). Springer Berlin Heidelberg
- In Devedzic V & Gasevic D (Ed.), Web 2.0 & Semantic Web (pp. 105-133). Springer
- , Current Issues in Linguistic Theory (pp. 35-44). John Benjamins Publishing Company
- Semantic Annotation and Human Language Technology In Davies J, Studer R & Warren P (Ed.), Semantic Web Technology: Trends and Research John Wiley and Sons
Conference proceedings
- . Companion Proceedings of the ACM on Web Conference 2025 (pp 1100-1103)
- . 2025 IEEE/CVF Winter Conference on Applications of Computer Vision Workshops (WACVW) (pp 1253-1263), 28 February 2025 - 4 March 2025.
- . Proceedings of the Ninth Conference on Machine Translation (pp 1004-1010). Miami, Florida, USA, 15 November 2024 - 15 November 2024.
- A Lightweight Approach for User and Keyword Classification in Controversial Topics. ASONAM (2), Vol. 15212 (pp 243-253)
- . Proceedings of the 18th International Workshop on Semantic Evaluation (SemEval-2024) (pp 2051-2056), June 2024 - June 2024.
- Navigating Prompt Complexity for Zero-Shot Classification: A Study of Large Language Models in Computational Social Science. LREC/COLING (pp 12074-12086)
- Large Language Models Offer an Alternative to the Traditional Approach of Topic Modelling. LREC/COLING (pp 10160-10171)
- Examining the Limitations of Computational Rumor Detection Models Trained on Static Datasets. LREC/COLING (pp 6739-6751)
- Examining Temporalities on Stance Detection towards COVID-19 Vaccination. LREC/COLING (pp 6732-6738)
- EUvsDisinfo: A Dataset for Multilingual Detection of Pro-Kremlin Disinformation in News Articles. CIKM (pp 5380-5384)
- . Proceedings of the 14th International Conference on Recent Advances in Natural Language Processing (pp 556-567). Varna, Bulgaria, 8 September 2023 - 8 September 2023.
- . Proceedings of the 17th Conference of the European Chapter of the Association for Computational Linguistics: System Demonstrations (pp 145-151), May 2023 - May 2023.
- It's about Time: Rethinking Evaluation on Rumor Detection Benchmarks using Chronological Splits. EACL 2023 - 17th Conference of the European Chapter of the Association for Computational Linguistics, Findings of EACL 2023 (pp 724-731)
- . Proceedings of the 17th International Workshop on Semantic Evaluation (SemEval-2023) (pp 1995-2008), July 2023 - July 2023.
- . Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing (pp 5729-5741), December 2023 - December 2023.
- . Findings of the Association for Computational Linguistics: EACL 2023 (pp 736-743), May 2023 - May 2023.
- Don't waste a single annotation: improving single-label classifiers through soft labels. EMNLP (Findings) (pp 5347-5355)
- Classifying COVID-19 Vaccine Narratives. RANLP (pp 648-657)
- A Large-Scale Comparative Study of Accurate COVID-19 Information versus Misinformation. CoRR, Vol. abs/2304.04811
- . Social Informatics: 13th International Conference, SocInfo 2022, Glasgow, UK, October 19-21, 2022, Proceedings (pp 128-143). Glasgow, UK, 19 October 2022 - 19 October 2022.
- . Findings of the Association for Computational Linguistics: EMNLP 2022 (pp 2302-2317), December 2022 - December 2022.
- . Findings of the Association for Computational Linguistics: EMNLP 2022 (pp 4039-4054), December 2022 - December 2022.
- eTranslation's Submissions to the WMT22 General Machine Translation Task. Conference on Machine Translation Proceedings (pp 346-351)
- Comparative Analysis of Engagement, Themes, and Causality of Ukraine-Related Debunks and Disinformation. SocInfo, Vol. 13618 (pp 128-143)
- . Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: System Demonstrations (pp 221-230), April 2021 - April 2021.
- Introduction. NAACL-HLT 2021 - 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Tutorials (pp III)
- . 2020 - 5th International Conference on Information Technology (InCIT) (pp 185-190), 21 October 2020 - 22 October 2020.
- . 2020 IEEE International Conference on Multimedia & Expo Workshops (ICMEW) (pp 1-4), 6 July 2020 - 10 July 2020.
- The European language technology landscape in 2020: language-centric and human-centric AI for cross-cultural communication in multilingual Europe. Proceedings of the 12th Language Resources and Evaluation Conference (pp 3322-3332). Marseille, France, 11 May 2020 - 11 May 2020.
- . Computational Processing of the Portuguese Language, Vol. 12037 (pp 313-320). Evora, Portugal, 2 March 2020 - 2 March 2020.
- Using deep neural networks with intra- and inter-sentence context to classify suicidal behaviour. LREC 2020 - 12th International Conference on Language Resources and Evaluation, Conference Proceedings (pp 1303-1310)
- Measuring the impact of readability features in fake news detection. LREC 2020 - 12th International Conference on Language Resources and Evaluation, Conference Proceedings (pp 1404-1413)
- . Proceedings of the 1st Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics and the 10th International Joint Conference on Natural Language Processing (pp 914-924), December 2020 - December 2020.
- Towards an Interoperable Ecosystem of AI and LT Platforms: A Roadmap for the Implementation of Different Levels of Interoperability. IWLTP@LREC (pp 96-107)
- The European Language Technology Landscape in 2020: Language-Centric and Human-Centric AI for Cross-Cultural Communication in Multilingual Europe. PROCEEDINGS OF THE 12TH INTERNATIONAL CONFERENCE ON LANGUAGE RESOURCES AND EVALUATION (LREC 2020) (pp 3322-3332)
- Social Informatics - 12th International Conference, SocInfo 2020, Pisa, Italy, October 6-9, 2020, Proceedings. SocInfo, Vol. 12467
- Proceedings of the 1st International Workshop on Language Technology Platforms, IWLTP@LREC 2020, Marseille, France, May 2020. IWLTP@LREC
- On the Linguistic Linked Open Data Infrastructure. IWLTP@LREC (pp 8-15)
- . Proceedings of the 1st Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics and the 10th International Joint Conference on Natural Language Processing (pp 925-932), December 2020 - December 2020.
- European Language Grid: An Overview. LREC (pp 3366-3380)
- . Proceedings of the Conference for Truth and Trust Online 2019
- . Proceedings of the Conference for Truth and Trust Online 2019
- Team Bertha von Suttner at SemEval-2019 Task 4: Hyperpartisan News Detection using ELMo Sentence Representation Convolutional Network. Proceedings of the 13th International Workshop on Semantic Evaluation (pp 840-844). Minneapolis, Minnesota, USA, 6 July 2019 - 6 July 2019.
- (pp i-iii)
- . Proceedings of the 13th International Workshop on Semantic Evaluation, June 2019 - June 2019.
- . Proceedings of the 13th International Workshop on Semantic Evaluation, June 2019 - June 2019.
- . Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP): System Demonstrations (pp 115-120), November 2019 - November 2019.
- . Proceedings of the Fourth Conference on Machine Translation (Volume 2: Shared Task Papers, Day 1) (pp 320-326), August 2019 - August 2019.
- Credibility and Transparency of News Sources: Data Collection and Feature Analysis. CEUR Workshop Proceedings, Vol. 2411
- Investigating stability and reliability of crowdsourcing output. CEUR Workshop Proceedings, Vol. 2276 (pp 83-87). Zürich, Switzerland, 5 July 2018 - 5 July 2018.
- . Social Informatics, Vol. 11185 (pp 274-290). St. Petersburg, Russia, 25 September 2018 - 25 September 2018.
- Twits, Twats and Twaddle: Trends in Online Abuse towards UK Politicians. Proceedings Of The Twelfth International Conference On Web And Social Media (pp 600-603). California, USA, 25 June 2018 - 25 June 2018.
- Helping crisis responders find the informative needle in the tweet haystack. Proceedings of the 15th ISCRAM Conference (pp 649-662). Rochester, NY, USA, 20 May 2018 - 20 May 2018.
- . Companion of the The Web Conference 2018 on The Web Conference 2018 - WWW '18 (pp 437-438), 23 April 2018 - 27 April 2018.
- 2nd International Workshop on Rumours and Deception in Social Media: Preface. CIKM Workshops, Vol. 2482
- The Semantic Web - ISWC 2018 - 17th International Semantic Web Conference, Monterey, CA, USA, October 8-12, 2018, Proceedings, Part II. ISWC (2), Vol. 11137
- The Semantic Web - ISWC 2018 - 17th International Semantic Web Conference, Monterey, CA, USA, October 8-12, 2018, Proceedings, Part I. ISWC (1), Vol. 11136
- Can rumour stance alone predict veracity? COLING 2018 - 27th International Conference on Computational Linguistics, Proceedings (pp 3360-3370)
- Argumentation mining: Exploiting multiple sources and background knowledge. Proceedings of the 12th South - East European Doctoral Student Conference (pp 66-74). Thessaloniki, Greece, 9 May 2018 - 9 May 2018.
- . Proceedings of the 1st Workshop on Natural Language Processing and Information Retrieval associated with RANLP 2017 (pp 19-27). Varna, Bulgaria, 7 September 2017 - 7 September 2017.
- . Proceedings of the International Conference Recent Advances in Natural Language Processing, RANLP 2017 (pp 31-39). Varna, Bulgaria, 4 September 2017 - 4 September 2017.
- . Social Informatics, Vol. 10540 (pp 53-64). Oxford, UK, 13 September 2017 - 13 September 2017.
- . Proceedings of the 11th International Workshop on Semantic Evaluation (SemEval-2017) (pp 69-76). Vancouver, Canada, 3 August 2017 - 3 August 2017.
- SemEval-2017 Task 8: RumourEval: Determining rumour veracity and support for rumours. Proceedings of the 11th International Workshop on Semantic Evaluation (SemEval-2017) (pp 69-76). Vancouver, Canada, 3 August 2017 - 3 August 2017.
- . Proceedings of the 2017 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining 2017 (pp 195-202)
- . HT 2017 - Proceedings of the 28th ACM Conference on Hypertext and Social Media (pp 45-54). NY, USA, 4 July 2017 - 4 July 2017.
- Gold Standard Online Debates Summaries and First Experiments Towards Automatic Summarization of Online Debate Data. CICLing (2), Vol. 10762 (pp 495-505)
- . Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing (pp 876-885), 1 November 2016 - 5 November 2016.
- Challenges of Evaluating Sentiment Analysis Tools on Social Media. Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC 2016) (pp 1142-1148). Portorož, 23 May 2016 - 23 May 2016.
- . Proceedings of the First Workshop on NLP and Computational Social Science (pp 43-48), November 2016 - November 2016.
- Broad twitter corpus: A diverse named entity recognition resource. Coling 2016 26th International Conference on Computational Linguistics Proceedings of Coling 2016 Technical Papers (pp 1169-1179)
- . Proceedings of the 10th International Workshop on Semantic Evaluation (SemEval-2016), June 2016 - June 2016.
- Monolingual social media datasets for detecting contradiction and entailment. Proceedings of the 10th International Conference on Language Resources and Evaluation Lrec 2016 (pp 4602-4605)
- . Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers), August 2016 - August 2016.
- USFD: Twitter NER with Drift Compensation and Linked Data. Proceedings of the ACL 2015 Workshop on Noisy User-generated Text (pp 48-53). Beijing, China
- . Proceedings of the 26th ACM Conference on Hypertext & Social Media - HT '15 (pp 95-99), 1 September 2015 - 4 September 2015.
- . Proceedings of the ACM Web Science Conference (pp 1-2)
- (pp 171-186)
- . WWW '15 Companion Proceedings of the 24th International Conference on World Wide Web (pp 347-353), 18 May 2015 - 22 May 2015.
- . Proceedings of the 24th International Conference on World Wide Web (pp 1111-1115)
- . Proceedings of the 24th International Conference on World Wide Web
- Towards Detecting Rumours in Social Media. AAAI 2015 Workshop On AI For Cities. Austin, Texas, USA
- . Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (Volume 2: Short Papers), July 2015 - July 2015.
- USFD: Twitter NER with Drift Compensation and Linked Data. NUT@IJCNLP (pp 48-53)
- Towards Detecting Rumours in Social Media. AAAI Workshop: AI for Cities, Vol. WS-15-04
- Topic models and n-gram language models for author profiling. CEUR Workshop Proceedings, Vol. 1391
- Recent Advances in Natural Language Processing, RANLP 2015, 7-9 September, 2015, Hissar, Bulgaria. RANLP
- . Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, September 2015 - September 2015.
- Efficient named entity annotation through pre-empting. International Conference Recent Advances in Natural Language Processing, RANLP, Vol. 2015-J (pp 123-130). Hissar, Bulgaria, 5 September 2015 - 5 September 2015.
- Topic Models and n-gram Language Models for Author Profiling - Notebook for PAN at CLEF 2015. CLEF (Working Notes), Vol. 1391
- User profile modelling in online communities. Ceur Workshop Proceedings, Vol. 1275 (pp 35-48)
- PHEME: Veracity in digital social networks. Ceur Workshop Proceedings, Vol. 1181 (pp 19-22)
- Passive-Aggressive Sequence Labeling with Discriminative Post-Editing for Recognising Person Entities in Tweets. Proceedings of the European chapter of the Association for Computational Linguistics. ACL
- The GATE Crowdsourcing Plugin: Crowdsourcing Annotated Corpora Made Easy. Proceedings of the European chapter of the Association of Computational Linguistics. ACL
- Corpus Annotation through Crowdsourcing: Towards Best Practice Guidelines. Proceedings of the International Conference on Language Resources and Evaluation. ELRA
- The GATE Crowdsourcing Plugin: Crowdsourcing Annotated Corpora Made Easy. Eacl 2014 Proceedings of the Demonstrations at the 14th Conference of the European Chapter of the Association for Computational Linguistics (pp 93-96)
- Preface. Proceedings of the Annual Meeting of the Association for Computational Linguistics, Vol. 2014-June (pp III)
- Twitter Part-of-Speech Tagging for All: Overcoming Sparse and Noisy Data. Proceedings of the International Conference on Recent Advances in Natural Language Processing
- Reliably evaluating summaries of twitter timelines. AAAI 2013 Spring Symposium on Analyzing Microtext. Stanford
- . ACM International Conference Proceeding Series
- AnnoMarket: An Open Cloud Platform for NLP. Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics: System Demonstrations (pp 19-24)
- . Ht 2013 Proceedings of the 24th ACM Conference on Hypertext and Social Media (pp 21-30)
- Where's @wally? A Classification Approach to Geolocating Users Based on their Social Ties. Proceedings of the 24th ACM Conference on Hypertext and Social Media (pp 11-20)
- Recent Advances in Natural Language Processing, RANLP 2013, 9-11 September, 2013, Hissar, Bulgaria. RANLP
- Recognising and interpreting named temporal expressions. International Conference Recent Advances in Natural Language Processing Ranlp (pp 113-121)
- TwitIE: An Open-Source Information Extraction Pipeline for Microblog Text. Proceedings of the International Conference on Recent Advances in Natural Language Processing
- Crowdsourcing research opportunities: lessons from natural language processing. I-KNOW (pp 17-17)
- Named entity disambiguation using linked data. 9th Extended Semantic Web Conference (ESWC2012)
- Reputation Profiling with GATE. CLEF (Online Working Notes/Labs/Workshop), Vol. 1178
- Recent Advances in Natural Language Processing, RANLP 2011, 12-14 September, 2011, Hissar, Bulgaria. RANLP
- Ontology-Based Categorization of Web Services with Machine Learning. LREC
- CA manager framework: creating customised workflows for ontology population and semantic annotation. K-CAP (pp 177-178)
- Large-scale, parallel automatic patent annotation. PaIR (pp 1-8)
- Opinion analysis for business intelligence applications. OBI, Vol. 308 (pp 3-3)
- A natural language query interface to structured information. SEMANTIC WEB: RESEARCH AND APPLICATIONS, PROCEEDINGS, Vol. 5021 (pp 361-375)
- COLING 2008, 22nd International Conference on Computational Linguistics, Demo Proceedings, 18-22 August 2008, Manchester, UK. COLING (Demos)
- A Text-based Query Interface to OWL Ontologies. LREC
- Proceedings of the First International Workshop on Ontology-supported Business Intelligence, OBI 2008, Karlsruhe, Germany, October 27, 2008. OBI, Vol. 308
- RoundTrip Ontology Authoring. SEMANTIC WEB - ISWC 2008, Vol. 5318 (pp 50-65)
- CLOnE: Controlled language for ontology editing. SEMANTIC WEB, PROCEEDINGS, Vol. 4825 (pp 142-155)
- Ontology-based information extraction for business intelligence. SEMANTIC WEB, PROCEEDINGS, Vol. 4825 (pp 843-856)
- Hierarchical, perceptron-like learning for ontology-based information extraction. WWW (pp 777-786)
- Experiments of Opinion Analysis on the Corpora MPQA and NTCIR-6. NTCIR
- SVM Based Learning System for F-term Patent Classification. NTCIR
- Natural language technology for information integration in business intelligence. BUSINESS INFORMATION SYSTEMS, PROCEEDINGS, Vol. 4439 (pp 366-380)
- Creating tools for morphological analysis of sumerian. Proceedings of the 5th International Conference on Language Resources and Evaluation Lrec 2006 (pp 1762-1765)
- User-friendly ontology authoring using a controlled language. Proceedings of the 5th International Conference on Language Resources and Evaluation Lrec 2006 (pp 35-40)
- Automatic extraction of hierarchical relations from text. SEMANTIC WEB: RESEARCH AND APPLICATIONS, PROCEEDINGS, Vol. 4011 (pp 215-229)
- Mining information for instance unification. Semantic Web - ISWC 2006, Proceedings, Vol. 4273 (pp 329-342)
- Extracting a domain ontology from linguistic resource based on relatedness measurements. 2005 IEEE/WIC/ACM International Conference on Web Intelligence, Proceedings (pp 345-351)
- Perceptron learning for Chinese word segmentation. Sighan@ijcnlp 2005 4th Sighan Workshop on Chinese Language Processing Proceedings of the Workshop (pp 154-157)
- . Conll 2005 Proceedings of the Ninth Conference on Computational Natural Language Learning (pp 72-79)
- Indexing and querying linguistic metadata and document content. International Conference Recent Advances in Natural Language Processing Ranlp, Vol. 2005-January (pp 74-81)
- Generating tailored textual summaries from ontologies. SEMANTIC WEB: RESEARCH AND APPLICATIONS, PROCEEDINGS, Vol. 3532 (pp 531-545)
- SVM based learning system for Information Extraction. DETERMINISTIC AND STATISTICAL METHODS IN MACHINE LEARNING, Vol. 3635 (pp 319-339)
- . Int. J. Artif. Intell. Tools, Vol. 13 (pp 299-331)
- A lightweight approach to coreference resolution for named entities in text. Anaphora Processing, Vol. 263 (pp 97-111)
- . DATA & KNOWLEDGE ENGINEERING, Vol. 48(2) (pp 247-264)
- Automatic Language-Independent Induction of Gazetteer Lists. LREC
- Large Scale Experiments for Semantic Labeling of Noun Phrases in Raw Text. LREC
- Open-source Tools for Creation, Maintenance, and Storage of Lexical Resources for Language Generation from Ontologies. LREC
- Web Services Architecture for Language Resources. Fourth International Conference on Language Resources and Evaluation (LREC2004). Lisbon, Portugal
- Recent Advances in Natural Language Processing III, Selected Papers from RANLP 2003, Borovets, Bulgaria. RANLP, Vol. 260
- Automatic report generation from ontologies: the MIAKT approach. NATURAL LANGUAGE PROCESSING AND INFORMATION SYSTEMS, Vol. 3136 (pp 324-335)
- . Proceedings of the EACL 2003 Workshop on Evaluation Initiatives in Natural Language Processing are evaluation methods, metrics and resources reusable? - Evalinitiatives '03 (pp 3-9), 14 April 2003 - 14 April 2003.
- . Proceedings of the HLT-NAACL 2003 workshop on Analysis of geographic references -, Vol. 1 (pp 1-9), 31 May 2003.
- . Proceedings of the HLT-NAACL 2003 workshop on Software engineering and architecture of language technology systems - SEALTS '03, Vol. 8 (pp 17-24), 31 May 2003 - 31 May 2003.
- GATE: A Unicode-based Infrastructure Supporting Multilingual Information Extraction. Proceedings of Workshop on Information Extraction for Slavonic and other Central and Eastern European Languages (IESL03). Borovets, Bulgaria
- . ACM Trans. Asian Lang. Inf. Process., Vol. 2 (pp 295-300)
- Robust Generic and Query-based Summarization. EACL (pp 235-238)
- Multilingual adaptations of a reusable information extraction tool. EACL (pp 219-222)
- The use of conceptual graphs for interactive student modelling and adaptive Web explanations. KNOWLEDGE-BASED INTELLIGENT INFORMATION AND ENGINEERING SYSTEMS, PT 2, PROCEEDINGS, Vol. 2774 (pp 230-237)
- Access to multimedia information through multisource and multilanguage information extraction. NATURAL LANGUAGE PROCESSING AND INFORMATION SYSTEMS, Vol. 2553 (pp 160-171)
- Using Human Language Technology for Automatic Annotation and Indexing of Digital Library Content. ECDL, Vol. 2458 (pp 613-625)
- Human Language Technology for Automatic Annotation and Indexing of Digital Library Content. ECDL, Vol. 2458 (pp 658-658)
- Adapting a robust multi-genre NE system for automatic content extraction. ARTIFICIAL INTELLIGENCE: METHODOLOGY, SYSTEMS AND APPLICATIONS, PROCEEDINGS, Vol. 2443 (pp 264-273)
- GATE: A Framework and Graphical Development Environment for Robust NLP Tools and Applications. Proceedings of the 40th Anniversary Meeting of the Association for Computational Linguistics (ACL02). Philadelphia, USA
- Adaptivity, Adaptability, and Reading Behaviour: Some Results from the Evaluation of a Dynamic Hypertext System. AH, Vol. 2347 (pp 69-78)
- . Proceedings of the ACL-02 Workshop on Automatic Summarization -, Vol. 4 (pp 19-26), 11 July 2002 - 12 July 2002.
- Extracting Information for Automatic Indexing of Multimedia Material. LREC
- . Proceedings of the ACL-02 Workshop on Effective tools and methodologies for teaching natural language processing and computational linguistics -, Vol. 1 (pp 54-62), 7 July 2002 - 7 July 2002.
- A Unicode-based Environment for Creation and Use of Language Resources. 3rd Language Resources and Evaluation Conference. Las Palmas, Canary Islands Spain
- A framework and graphical development environment for robust NLP tools and applications. ACL (pp 168-175)
- GATE: an architecture for development of robust HLT applications. 40TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, PROCEEDINGS OF THE CONFERENCE (pp 168-175)
- Developing reusable and robust language processing components for information systems using GATE. 13TH INTERNATIONAL WORKSHOP ON DATABASE AND EXPERT SYSTEMS APPLICATIONS, PROCEEDINGS (pp 223-227)
- The Impact of Empirical Studies on the Design of an Adaptive Hypertext Generation System. OHS-7/SC-3/AH-3, Vol. 2266 (pp 201-214)
- Dealing with Dependencies between Content Planning and Surface Realisation in a Pipeline Generation Architecture. IJCAI (pp 1235-1240)
- Tailoring the content of dynamically generated explanations. USER MODELING 2001, PROCEEDINGS, Vol. 2109 (pp 213-215)
- . Proceedings of the 40th Annual Meeting on Association for Computational Linguistics - ACL '02 (pp 168-168), 7 July 2002 - 12 July 2002.
- . Proceedings of the workshop on Human Language Technology and Knowledge Management -, Vol. 2001 (pp 1-8), 6 July 2001 - 7 July 2001.
- Software Infrastructure for Language Resources: a Taxonomy of Previous Work and a Requirements Analysis. Proceedings of the 2nd International Conference on Language Resources and Evaluation (LREC-2). Athens
- Experience of using GATE for NLP R&D. Proceedings of the Workshop on Using Toolsets and Architectures To Build NLP Systems at COLING-2000. Luxembourg
- Uniform language resource access and distribution. LREC (pp 13-20)
- Generation of multilingual explanations from conceptual graphs. RECENT ADVANCES IN NATURAL LANGUAGE PROCESSING, Vol. 136 (pp 365-374)
- Task-dependent aspects of knowledge acquisition: A case study in a technical domain. CONCEPTUAL STRUCTURES: FULFILLING PEIRCE'S DREAM, Vol. 1257 (pp 183-197)
- Menu-based interfaces to conceptual graphs: The CGLex approach. CONCEPTUAL STRUCTURES: FULFILLING PEIRCE'S DREAM, Vol. 1257 (pp 603-606)
- DB-MAT: Knowledge Acquisition, Processing and NL Generation Using Conceptual Graphs. ICCS, Vol. 1115 (pp 115-129)
- NL Domain Explanations in Knowledge Based MAT. COLING (pp 1016-1019)
- DB-MAT: A NL based interface to domain knowledge. ARTIFICIAL INTELLIGENCE: METHODOLOGY, SYSTEMS, APPLICATIONS, Vol. 35 (pp 218-227)
- . Advances in Computer Science Research, 7 September 2015 - 9 September 2015.
- Classifying Tweet Level Judgements of Rumours in Social Media. Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing (pp 2590-2595). Lisbon, Portugal, 17 September 2015 - 17 September 2015.
Reports
- Social Media and Information Overload: Survey Results
Software, code or databases
- GATE, a General Architecture for Text Engineering. Sheffield, UK: University of Sheffield Retrieved from
Datasets
- Spatio-temporal grounding of claims made on the web, in PHEME.
Preprints
- Examining Temporal Bias in Abusive Language Detection.
- Grants
-
- Atrium: , Horizon Europe, 01/2024 - 12/2027, £370,950, as Co-PI
- VIGILANT: , Horizon Europe, 11/2022 - 10/2025, £476,955, as Co-PI
- SoBigData PPP: , Horizon Europe, 10/2022 - 09/2025, £60,326, as PI
- vera.ai: , Horizon Europe, 09/2022 - 11/2025, £776,703, as PI
- , EPSRC, 04/2019 - 09/2027, £5,508,850, as Co-PI
- Ireland Hub, EC, 09/2021 - 03/2024, £211,990, as PI
- SAI: , EPSRC, 02/2021 - 01/2024, £366,348, as Co-PI
- Responsible AI for Inclusive, Democratic Societies: , ESRC, 02/2020 - 01/2024, £508,135, as PI
- SoBigData ++: , EC H2020, 01/2020 - 12/2024, £720,926, as PI
- RISIS2: , EC H2020, 01/2019 - 12/2023, £476,741, as Co-PI
- XAIvsDisinfo: , UKRI, 06/2021 - 03/2023, £288,337, as PI
- ELE, EC H2020, 01/2021 - 06/2022, £14,140, as PI
- Studying the spread and impact of COVID-19 anti-vaccine disinformation in the UK, Research England, 12/2020 - 03/2021, £41,164, as PI
- Online Abuse towards Public Figures, Government Officials, and Scientists During the COVID-19 Crisis, Research England, 07/2020 - 03/2021, £55,479, as PI
- ELG: , EC H2020, 01/2019 - 06/2022, £656,631, as PI
- WeVerify: , EC H2020, 12/2018 - 11/2021, £403,577, as PI
- Journalist-in-the-Loop Machine Learning as a Service for Rumour Analysis, Industrial, 11/2018 - 12/2019, £44,642, as PI
- Automatic Detection of Online Misinformation, Industrial, 03/2018 - 12/2020, £43,077, as PI
- , EC H2020, 09/2015 - 08/2019, £649,690, as Co-PI
- ChatBot: The development of a CHATBOT to support successful transition to adult care of young people with Type 1 Diabetes Mellitus, NIHR, 12/2020 - 05/2023, £18,622, as Co-PI
- COMRADES: , EC H2020, 01/2016 - 12/2018, £257,000, as PI
- OpenMinTed: , EC H2020, 06/2015 - 05/2018, £418,388, as Co-PI
- Individual Profiling through Text Analysis, Air Force Office of Scientific Research USA, 09/2014 - 09/2015, £10,746, as Co-PI
- PHEME: , EC FP7, 10/2013 - 12/2016, £489,421, as PI
- DecarboNET: , EC FP7, 10/2013 - 09/2016, £253,753, as PI
- uComp: , EPSRC, 11/2012 - 05/2016, £375,621, as Co-PI
- , EC FP7, 06/2012 - 05/2014, £394,226, as Co-PI
- Linked Data for Environmental Science, Joint Information Systems Committee, 06/2012 - 01/2013, £40,234, as PI
- TrendMiner: , EC FP7, 11/2011 - 10/2014, £400,991, as PI
- GATE Cloud Exploratory: , EPSRC, 02/2011 - 10/2011, £71,677, as Co-PI
- , EPSRC, 10/2010 - 05/2018, £591,755, as PI
- ServiceFinder: , EC FP7, 01/2008 - 12/2009, £206,407, as PI
- MUSING: , EC FP6, 04/2006 - 04/2010, £776,082, as PI
- TAO: , EC FP6, 03/2006 - 02/2009, £581,515, as PI
- Professional activities and memberships
-
Member of the Natural Language Processing (NLP) research group