PhageAI and natural language processing for in silico characterization of the phage life cycle

Issue 190 | August 26, 2022
12 min read
Capsid and Tail

This week Marcin Lubocki and Piotr Tynecki explain how AI tools like PhageAI can help us characterize our phages faster and more accurately (but only with the help of the phage community!).

What’s New

Paper: The presence of plasmids in bacterial hosts alters phage isolation and infectivity.

Phage-host interactions, Phage-plasmid interactions, Research paper

Paper: Phage-like particle vaccines are highly immunogenic and protect against pathogenic coronavirus infection and disease.

COVID, Phage-based vaccines, Research paper

Paper: New vista into origins of viruses from a prototypic ssDNA phage.

Evolution, Research paper, ssDNA phages

Paper: From kill the winner to eliminate the winner in open phage-bacteria systems.

Phage ecology, Phage-host interactions, Research paper

Paper: Immunological and safety profile of bacteriophages therapy: a preclinical study.

Phage Therapy, Phage-immune interactions, Research paper

Preprint: Hecatomb: an end-to-end research platform for viral metagenomics.

Bioinformatics Tool, Viral metagenomics

Latest Jobs

Gut microbiome, Post Doc
Postdoc in phages for gut microbiome manipulation at Sandia National Laboratories, Williams Lab in Livermore CA, USA. Develop methods for editing genomes in any microbiome. Work involves phage isolation/engineering, possible bioinformatics and mouse biology. Pay: $99,000/yr. Requires biology PhD, record of technical accomplishment, communication skills. Useful skills: Microbiology: phages, anaerobic bacteria, microbial communities, Molecular biology, Bioinformatics, Mouse biology skills not required. Email Kelly Williams ([email protected]) or apply at, Job 685140.
PhD project, Phage-host interactions
PhD project to study genomics of phage-host interactions in the Viral Ecology and Omics Group of Prof. Bas E. Dutilh in Jena, Germany.

Community Board

Anyone can post a message to the phage community — and it could be anything from collaboration requests, post-doc searches, sequencing help — just ask!

Dear ISVM Members,

We are happy to announce upcoming elections for the ISVM Executive Board:

  • Candidates’ applications: deadline September 9th, 2022
  • Voting: September 16th-30th, 2022
  • Results: October 10th-15th, 2022
  • Transition period: October 16th, 2022 - January 15th, 2023
  • Changing of the board: January 15th, 2023

We are now gathering the list of candidates to stand for election.

As an Officer, you can help shape the future of ISVM, and we hope that you will step forward and express your interest in one of the positions.

The positions open in this round (2023/24) are:

  • President-Elect in 2023/2024 (preparation for becoming the next president in 2025/2026)
  • Vice-President (replacement of President in case of absence)
  • Treasurer (financial management & overview of the society)
  • Information Officer (reporting and communication)
  • Assistant Secretary (social media, newsletter; replacement of Information Officer in case of absence)
  • Industry Outreach (industry contacts & sponsorship)
  • Website Administrator (maintenance & design)
  • Membership Secretary (updating memberships and statistics)

The election is open to all ISVM members.

Please see for more information on ISVM governance.

Please contact Rob Lavigne ([email protected]) or Zuzanna Drulis-Kawa ([email protected]) for additional information.

All candidates should complete the candidate registration form (members should have received it by email) and send it to [email protected] BEFORE SEPTEMBER 9th 2022.

Voting will be announced with an email that includes a link to the poll.

Thanks for your continued support.

The ISVM Board


Welcome new Phage Directory members!

  • 10 phage professionals
    Adam Yang, Isabella Burdon, Yan Li, Louise Hock, Saieeda Fabia Ali, Karthik L, Joseph Ancla, Marcin Lubocki, Lyman Ngiam, Jinny Liu

  • 7 phage labs
    Nixon Lab, AZTI - Bacteriophage Research Group, Computational Phage Biodiscovery, SRM IST, University of Oklahoma, National Research Council Canada, Microbial Self-healing Research Group, UST BEATS Research Group

  • 3 phage organizations
    Creative Diagnostics, Institute of Phage, CreatiPhage Biotech

Sign up here to join the community!

Phage Directory, New Members

PhageAI and natural language processing for in silico characterization of the phage life cycle

Laboratory of Virus Molecular Biology, PhageAI, University of Gdańsk

Bioinformatics, Biotechnology, Molecular Biology, Machine Learning / AI, Teaching

Polish PhD candidate in Medical Sciences at the Intercollegiate Faculty of Biotechnology of University of Gdańsk and Medical University of Gdańsk. Researcher and Life Science Business Developer at PhageAI. Big enthusiast of AI applications in biomedicine.


PhD Candidate, Independent Researcher, Co-founder, Machine Learning Developer
SLAVIC AI, Bialystok University of Technology
Twitter @ptynecki

Bioinformatics, Data Analytics, Machine Learning / AI, Software Engineering

Research & Development Manager, Software Engineer (AI), PhD student at Bialystok University of Technology, PhageAI co-creator and promoter of the Phages2050 movement.

Support our idea of applying Artificial Intelligence to tomorrow’s phage therapy. We’re open to cooperation on many levels (data analysis, data mining, machine/deep learning, software development).

The outlook for phage therapy in the near future

Phage therapy is a promising strategy for treating bacterial infections. With the growing threat of antimicrobial resistance, interest in phage therapy clinical trials is increasing. The use of phages in clinical settings is expected to accelerate further in the near future, as regulations have started to evolve to accommodate novel approaches. One example is the recent approval of the STAMP (Standardized Treatment and Monitoring of Phage Therapy) protocol in Australia. The ongoing program to implement such clinical trials on a broad scale, named Phage Australia, will be based on monitoring and standardizing the therapy process rather than on a specific phage product. With such changes emerging, we can expect phage therapy to attract even more attention in the near future. We should therefore act now and prepare our phage collections for the opportunities to come.

The importance of phage characterization

Despite evolving clinical regulations, implementing phage therapy is still challenging. Every medical product has to be characterized as thoroughly as possible before administration to reduce potential adverse effects. For phages, the definition of characterization may differ depending on the researcher you ask, as shown by the recently published results of the ‘State of Phage’ survey conducted by Phage Directory. In general, phage characterization may mean annotating as much information about a phage as possible. For clinical applications, it usually means determining host range and phage life cycle (lytic vs. lysogenic), and screening for any genes detrimental to phage therapy, such as toxins, antibiotic resistance genes, or genes related to horizontal gene transfer.

The problem with phage characterization is that we still have much to discover in phage genomics. Viruses are masters at packing genes into their relatively small genomes, and they frequently use gene-coding mechanisms that are completely novel to us; for example, some genes can be completely embedded within other genes. Many phage genes are also highly variable, which makes them hard to compare with entries in sequence databases.

Another issue is understanding the so-called phage dark matter. The phages discovered so far are believed to represent only 1% of the total phage repertoire existing worldwide.

These facts imply that characterization with sequence alignment-based comparative methods usually fails. As a consequence, implementing phage therapy, especially within the criteria of personalized medicine, is very expensive and time-consuming. This is consistent with another finding of the ‘State of Phage’ survey: many collected phages still await characterization.

As PhageAI, we would like to invite you to help us in our efforts to overcome this obstacle.

The potential of Artificial Intelligence

Artificial Intelligence (AI) can be a useful tool for overcoming the limitations of alignment-based methods in phage characterization. In recent years, AI applications in biology and medicine have not only gained popularity but also achieved significant successes, for example AlphaFold. The growing capabilities of state-of-the-art AI algorithms help us achieve what is currently out of reach for the human mind and, from a biological perspective, for sequence alignment-based methods.

Broadly, AI is a term covering many technologies that learn patterns from the data they are given. This makes it a good fit for biology and genomics in the era of high-throughput sequencing. Because phage research relies so heavily on genome sequencing these days, it is a natural home for biomedical AI.

PhageAI and NLP for phage characterization

Natural Language Processing (NLP) is a branch of AI developed to understand the context and meaning of words in natural language. It is used in spam filters and in virtual assistants such as Siri, Cortana, and Alexa, and it applies wherever text processing is needed. Interestingly, it can also be used in biology.

PhageAI applies and adapts state-of-the-art NLP technologies to phage genomics, offering solutions to many biological problems in phage research. The more data AI algorithms receive, the better they perform their tasks. PhageAI therefore takes advantage of the trend toward high-throughput phage genome sequencing to build a high-quality phage data repository.

Based on phage and bacterial data, we ‘teach’ NLP models to understand phage genomics. Put simply, these models treat genome sequences as text written in a four-letter alphabet (A, C, T, G) and learn to read phage genomes just like we read books. Such models can be used to build a set of classifiers that answer different biological questions, e.g., is my phage virulent or temperate, or does my phage encode any toxins?
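To make the "genome as text" idea more concrete, here is a minimal sketch of one common way to turn a DNA sequence into "words": splitting it into overlapping k-mers. This is only an illustration. The function names, the choice of k, and the toy sequence are assumptions made for this example, and it is not a description of PhageAI's actual preprocessing.

```python
from collections import Counter


def kmer_tokens(sequence: str, k: int = 6) -> list[str]:
    """Split a DNA sequence into overlapping k-mer 'words'.

    Hypothetical helper for illustration only; k=6 is an arbitrary choice.
    """
    seq = sequence.upper()
    tokens = []
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        # Skip windows that contain ambiguous bases such as N.
        if set(kmer) <= set("ACGT"):
            tokens.append(kmer)
    return tokens


def kmer_counts(sequence: str, k: int = 6) -> Counter:
    """Count how often each k-mer 'word' occurs in the sequence."""
    return Counter(kmer_tokens(sequence, k))


# A made-up toy sequence, not a real phage genome.
print(kmer_tokens("ATGCGTAC", k=3))
# ['ATG', 'TGC', 'GCG', 'CGT', 'GTA', 'TAC']
```

Once a genome is represented as a list of such words, it can be fed to the same kinds of models that are used for ordinary text.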

Today, with NLP applied to phage genomics, PhageAI achieves highly accurate predictions of the phage life cycle based solely on a phage genome sequence in a couple of seconds. The current version of PhageAI Life Cycle Classifier supports three life cycle predictions: lytic, lysogenic, and chronic (in which the phage progeny is released from the bacterium without its lysis). Importantly, PhageAI predictions are verified in a wet lab in cooperation with our clients, ensuring higher quality of future predictions.

The results to date have encouraged us to start working on new classifiers that will answer other important questions about phage biology. Currently, apart from the Life Cycle Classifier, the PhageAI web platform also offers the Phage Taxonomy Classifier, an alignment-free method for finding the phages most similar to your phage of interest.
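For readers unfamiliar with the term, "alignment-free" means comparing sequences without aligning them base by base. A simple, generic way to do this is to compare k-mer frequency profiles with cosine similarity, as in the sketch below. This is a textbook technique swapped in for illustration, not the method behind the Phage Taxonomy Classifier; the function names, the value of k, and the toy sequences are all assumptions.

```python
import math
from collections import Counter


def kmer_profile(sequence: str, k: int = 4) -> Counter:
    """Count overlapping k-mers in a sequence (hypothetical helper)."""
    seq = sequence.upper()
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine similarity between two k-mer count profiles (0.0 to 1.0)."""
    dot = sum(a[key] * b[key] for key in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # at least one sequence is shorter than k
    return dot / (norm_a * norm_b)


def most_similar(query: str, references: dict[str, str], k: int = 4):
    """Rank reference sequences by k-mer profile similarity to the query."""
    query_profile = kmer_profile(query, k)
    scored = [
        (name, cosine_similarity(query_profile, kmer_profile(seq, k)))
        for name, seq in references.items()
    ]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Made-up toy sequences, not real phage genomes.
references = {"ref_a": "ACGTACGTACGT", "ref_b": "GGGGCCCCGGGG"}
for name, score in most_similar("ACGTACGTAC", references):
    print(name, round(score, 3))
```

Because no alignment is computed, this kind of comparison still returns a score for sequences that share little or no alignable region, which is the situation the phage dark matter creates.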

The results obtained with NLP technology are so promising that a whole-genome annotator built on several classifiers is currently under development. It will first address the problems directly related to the use of phages in therapy.


The phage life cycle is a key feature in determining a phage’s utility in phage therapy. Before a phage is used in a clinical setting, two aspects of its characterization are important: its ability to kill via the lytic cycle and a reduced risk of horizontal gene transfer via the lysogenic cycle. NLP-based characterization from a phage’s genome sequence is a promising approach that bypasses alignment methods, which fail when confronted with the unknown phage dark matter.

PhageAI explores the possibilities of NLP in phage research and develops dedicated software for phage researchers. A researcher can determine the phage life cycle quickly and accurately with only a phage genome sequence. Importantly, PhageAI implements NLP methods to determine not only the phage life cycle but also phage taxonomy and other phage-related features.

Gathering more phage data improves characterization results. To this end, the PhageAI platform aims to become a new standard for phage data repositories and quality control, which NLP technology for phage genomics will use to explore new areas of phage research and development.

Invitation to participate on the PhageAI web platform

Although PhageAI technology significantly accelerates the characterization of phages, exploring phage dark matter requires the combined forces of the entire phage community. We have therefore made it possible for every phage scientist to take part in the characterization of phages on our web platform.

We invite you to tell us, through the expert panel on the PhageAI platform, how consistent our predictions are with the results you obtain in the laboratory. This feedback will make the tools we provide even more accurate and help them serve the phage community better by accelerating global phage research.
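If you keep your own records of predicted and lab-observed life cycles, a quick consistency check before giving feedback could look like the sketch below. It is a minimal, hypothetical example: the record format and function name are assumptions, the sample records are invented, and it does not use or represent any PhageAI API.

```python
from collections import Counter

# The three life cycle classes named in this article.
LIFE_CYCLES = ("lytic", "lysogenic", "chronic")


def agreement_report(records):
    """Summarize agreement between predicted and lab-observed life cycles.

    `records` is an iterable of (predicted, observed) label pairs;
    this format is an assumption made for the example.
    """
    confusion = Counter()
    for predicted, observed in records:
        if predicted not in LIFE_CYCLES or observed not in LIFE_CYCLES:
            raise ValueError(f"unknown life cycle label: {predicted!r}, {observed!r}")
        confusion[(predicted, observed)] += 1
    total = sum(confusion.values())
    agreeing = sum(n for (p, o), n in confusion.items() if p == o)
    return {
        "total": total,
        "agreement": agreeing / total if total else 0.0,
        "confusion": dict(confusion),
    }


# Invented example records, for illustration only.
records = [
    ("lytic", "lytic"),
    ("lysogenic", "lysogenic"),
    ("lytic", "lysogenic"),
    ("chronic", "chronic"),
]
report = agreement_report(records)
print(report["total"], report["agreement"])
# 4 0.75
```

The pairs that disagree, here a phage predicted as lytic but observed to be lysogenic, are the cases most worth reporting.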


Capsid & Tail

Follow Capsid & Tail, the periodical that reports the latest news from the phage therapy and research community.

We send Phage Alerts to the community when doctors require phages to treat their patients’ infections. If you need phages, please email us.

Sign up for Phage Alerts

In collaboration with

Mary Ann Liebert PHAGE

Supported by

Leona M. and Harry B. Helmsley Charitable Trust
