Crop Bioinformatics Adelaide (CroBiAd)

Crop Bioinformatics Adelaide (CroBiAd) provides bioinformatics expertise to the agriculture industry by developing novel bioinformatics and biometrics methods for wheat genetics and breeding.

Our bioinformatics activities include:

Research - in silico experimentation, such as the development of novel methods, tools and pipelines.
Consultancy - recommendations on experimental design for projects such as RNA-Seq and various types of DNA sequencing.
Bioinformatics service provision - data analysis solutions and hands-on bioinformatics workshops on a fee-for-service basis.

Based within the University's School of Agriculture, Food and Wine, we are the South Australian node of the EMBL-Australia Bioinformatics Resource (EMBL-ABR).

Biology as a data science

Biology is becoming ever more data driven. This is partly due to technological advances in DNA sequencing, which mean that many labs can now generate DNA sequence data at low cost. This has brought new opportunities to the field of crop biology, but also challenges arising from the size and complexity of the data.

Bioinformatics is inherently multidisciplinary, sitting at the intersection of biology, computer science and statistics. Our group reflects this, with members who bring different career paths and areas of expertise.

Research
Industry and engagement
People
Resources

Genomics for wheat breeding

Our goal is to develop novel bioinformatics and biometrics methods for wheat genetics and breeding.

The size and complexity of the wheat genome pose some unique problems for bioinformatics. These require novel approaches to data handling and analysis in areas including:

Wheat genomics and transcriptomics
Marker development
Software development
Pipeline and algorithm development

Our research is aligned to Program 5 of the Wheat Hub - the ARC Industrial Transformation Research Hub for Wheat in a Hot and Dry Climate.

Wheat Hub program

Our group has strong collaborations with local biology-focused groups as well as national and international industry partners. This has been recognised by an Emerging Industry Research Partnership Award.

Key industry and international collaborations

DuPont Pioneer - a 10-year collaboration
ARC Industrial Transformation Research Hub for Wheat in a Hot and Dry Climate (Wheat Hub)
Biogemma
International Wheat Genome Sequencing Consortium (IWGSC)
Wheat Information System Expert Working Group

Ute Baumann - Group Leader
Elena Kalashyan - Programmer
Nick Warnock - Bioinformatician
Paul Eckermann - Biometrics Officer
Melissa Garcia - Research Fellow
Charity Chidzanga - PhD student
Amritha Amalraj - PhD student (Visiting group member)

Alumni

Andreas Schreiber - Research Fellow
Andy Timmins - Postdoctoral Fellow in Bioinformatics
Angus Wallace - Postdoctoral Fellow in Bioinformatics
Timo Tiirikka - Bioinformatics Officer
John Toubia - Bioinformatics Officer
Virginie Perlo - Bioinformatics Officer
Patrick Laffy - Bioinformatics Officer
Joseph Sclauzero - Summer scholar, honours student
Ngoc Vo - Summer scholar
Anthony Clissold - Summer scholar
Daniel Menadue - Summer scholar
Jia Truong - Summer scholar

Training

Our group has expertise in the development and delivery of hands-on bioinformatics workshops.

Contact Ute Baumann

Software - public

CroBiAd develops a wide range of software, predominantly aimed at biology end-users in order to democratise crop genomic resources.

Because of the nature of our funding and research, much of our code is developed internally and maintained in git repositories on a local GitLab installation. As projects approach publication, these resources are published to our public GitHub account.

POTAGE

POPSEQ Ordered Triticum aestivum Gene Expression (POTAGE) is a visualisation tool for speeding up gene discovery in wheat.

Access POTAGE

DAWN: Diversity Among Wheat geNomes

DAWN provides access to a variety of public wheat data sets in the context of the International Wheat Genome Sequencing Consortium’s (IWGSC) RefSeq v1.0 genome assembly^[1]. This is achieved by pre-processing the data and making it available through a JBrowse genome browser.

Resources

JBrowse

Coordinate converter - convert coordinates from parts to/from whole pseudomolecules

Open data at figshare
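
The coordinate conversion listed above, between chromosome parts and whole pseudomolecules, can be illustrated with a minimal Python sketch. It is not the converter's actual implementation. It assumes each pseudomolecule is split at a single position into two parts named like `chr1A_part1` and `chr1A_part2`, and that positions are 1-based; the naming scheme, the function names and the split length in `PART1_LENGTH` are hypothetical placeholders, not values taken from the assembly.

```python
# Hypothetical length (bp) of the first part of each pseudomolecule.
# Placeholder value for illustration only; real values must come from
# the part definitions that accompany the assembly.
PART1_LENGTH = {"chr1A": 400_000_000}


def part_to_whole(part_name, pos):
    """Convert a 1-based position on a part to whole-pseudomolecule coordinates."""
    chrom, _, part = part_name.rpartition("_part")
    if part == "1":
        return chrom, pos
    if part == "2":
        # Positions on the second part are offset by the length of the first.
        return chrom, pos + PART1_LENGTH[chrom]
    raise ValueError(f"unrecognised part name: {part_name!r}")


def whole_to_part(chrom, pos):
    """Convert a 1-based whole-pseudomolecule position to part coordinates."""
    split = PART1_LENGTH[chrom]
    if pos <= split:
        return f"{chrom}_part1", pos
    return f"{chrom}_part2", pos - split


if __name__ == "__main__":
    print(part_to_whole("chr1A_part2", 1))
    print(whole_to_part("chr1A", 400_000_001))
```

Under these assumptions the two functions are inverses of each other, so converting a position to whole-pseudomolecule coordinates and back returns the original part name and position.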

Help

Help (Gitter chat)

Bug report / Feature request

DAWN source data sets

The following data sets are available through DAWN:

Reference sequence

IWGSC RefSeq v1.0 genome assembly^[1].

Gene annotations

IWGSC v1.0

IWGSC v1.1

Markers (as part of IWGSC v1.0 annotation)

DArT

EST

MAS

SNP platforms

SSR

Whole genome shotgun data

16 wheat accessions (BPA^[2])

Chinese Spring (ENA: PRJNA392179^[3])

Transcriptomic data

INRA GDEC

Exome data

62 wheat accessions (ENA: SRP032974^[4])

Citing DAWN

Watson-Haigh, N.S., Suchecki, R., Kalashyan, E., Garcia, M. & Baumann, U. DAWN: a resource for yielding insights into the diversity among wheat genomes. BMC Genomics 2018 19:941.

External resources

You may also find the following wheat-related resources useful:

Ensembl Plants

Wheat@URGI portal

Wheat Expression Browser

Genetic Resources Information System (GRIS)

Software - internal

Access type	Resource	Description
School of Agriculture, Food & Wine only	agwine-blast	A BLAST server containing the IWGSC RefSeq v1.0 assembly.
Internal network only	coching
Internal network only	bwpf	Bread Wheat Promoter Finder
Internal network only	blast	A BLAST server containing a mix of published and unpublished data sets
Internal network only	POTAGE	A POTAGE server containing published and unpublished data sets
Internal network only	fetch	Simple sequence retrieval for selected genome assemblies
Internal network only	dev-DAWN	Diversity Among Wheat geNomes
Internal network only	blast-dev
Internal network only	GitLab	Project/code/software development resources
Internal network only	bareos	Backup server
Internal network only	Genome Ribbon	Web server for visualising structural variants generated from PacBio alignments. Modified to load decompressed CSI BAM index files.

Scientific computing infrastructure

Research involving large, complex, polyploid genomes such as that of wheat often requires computers with large amounts of RAM.

However, such infrastructure is not normally available from typical high-performance computing (HPC) providers. We therefore operate our own scientific computing infrastructure, which includes several large-memory nodes.

Current infrastructure

A Slurm-based cluster consisting of 2 compute nodes with the following specs:
- 72 CPUs (Intel Xeon E5-2699v3 @ 2.30GHz)
- 755 GB RAM
- 880 GB fast local storage (2 x SSDs in RAID 0)
- Access to 120 TB of clustered storage

A stand-alone compute node with the following specs:
- 64 CPUs (Intel Xeon E7-4830 @ 2.13GHz)
- 512 GB RAM
- 1.7 TB fast local storage
- Access to 120 TB of clustered storage
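
As an illustration of how a large-memory job might be requested on a Slurm-based cluster like the one described above, the following Python sketch builds an `sbatch` command line. It is an example under stated assumptions rather than the group's actual workflow: the helper name `build_sbatch_command`, the job name, the script path and the resource figures are all hypothetical. Only the standard `sbatch` options `--job-name`, `--nodes`, `--cpus-per-task`, `--mem` and `--wrap` are used.

```python
import shlex


def build_sbatch_command(job_name, command, cpus, mem_gb):
    """Return an sbatch argument list for a single-node, large-memory job."""
    if cpus < 1 or mem_gb < 1:
        raise ValueError("cpus and mem_gb must be positive")
    return [
        "sbatch",
        f"--job-name={job_name}",
        "--nodes=1",                  # keep the job on one node
        f"--cpus-per-task={cpus}",
        f"--mem={mem_gb}G",           # memory per node, in gigabytes
        f"--wrap={command}",          # wrap a shell command in a batch script
    ]


# Hypothetical job: names and resource figures are placeholders.
cmd = build_sbatch_command("wheat-assembly", "./run_assembly.sh", cpus=32, mem_gb=500)

if __name__ == "__main__":
    # Print the command instead of running it, so the sketch works without Slurm.
    print(shlex.join(cmd))
```

Passing the returned list to `subprocess.run` on a login node would submit the job. Building the command as a list rather than a single string avoids shell-quoting problems when the wrapped command contains spaces.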

We also have access to resources made available through:

NeCTAR Research Cloud (2 allocations: one for training and one for the group)
- 430 CPUs
- 1.7 TB RAM
- 5 TB of object storage

University of Adelaide’s Phoenix HPC
